From: "Robert C. Leif"
Newsgroups: comp.lang.ada
Subject: RE: Character Sets
Date: Sun, 15 Dec 2002 15:26:15 -0800
In-Reply-To: <665e587a.0212141453.42386f5d@posting.google.com>

I believe that we need to change to Latin_9. The European Economic Community needs to have a Euro character.
In the long run, an XML_Io or Unicode_Io package will have to be created. However, it should be an Application Programming Interface, rather than being part of the core language or an annex.
Bob Leif

-----Original Message-----
From: comp.lang.ada-admin@ada.eu.org [mailto:comp.lang.ada-admin@ada.eu.org] On Behalf Of Vadim Godunko
Sent: Saturday, December 14, 2002 2:54 PM
To: comp.lang.ada@ada.eu.org
Subject: Re: Character Sets

starner@okstate.edu (David Starner) wrote in message news:<81f70ac6.0212131927.4fa6b642@posting.google.com>...
>
> > This seems reasonable if we don't want to have to amend Ada each time a
> > bunch of characters are added to 10646.
>
> Why would you have to amend Ada? Add a Unicode version constant, and
> define the data in terms of its Unicode properties. Then the
> recentness of the characters is just a quality of implementation
> issue.
>
How much memory is required to store all the data from the Unicode Character Database? What do you do if this constant changes? Retest all existing applications?

> From: Robert Dewar
> > We certainly
> > put in a lot of work in GNAT in implementing wide character with many
> > different representation schemes,
>
> GNAT supports input files in a dozen mostly bizarre or archaic
> formats. It doesn't strike me as very useful, especially considering
> as it supports Latin-1, Latin-2 (both useful), but also Latin-4
> (completely unused) and Latin-3 (good for Maltese and Esperanto, and
> most Esperanto users don't use it). It doesn't support ISO-8859-5 or
> KOI8-R (Russian), or ISO-8859-7 (Greek).

The latest public GNAT version and GCC3/GNAT both support the ISO-8859-5 encoding in identifiers. And I don't know of any GNAT users who use the KOI8-R/U/B encodings outside comments, character literals, and string literals.

> It doesn't support changing
> formats on the fly - many users have multiple encodings around,
> besides the fact that having to compile a different binary for each
> user is a pain.
>
Can you propose any method for detecting the encoding of an Ada source file "on the fly"?

> From: Pascal Leroy
> > Remember, we are talking Ada applications here. There are probably many
> > applications out there that deal with mathematical symbols or with Tengwar,
> > but I doubt that they are written in Ada.
>
> Mathematical symbols and Tengwar are text. Any text handling system
> that supports Unicode should handle them like any other text, because
> sooner or later users will expect it to handle them. (If you're
> unlucky, it will be the day that you're showing your system off in
> Hong Kong, and the potential buyer decides to put in his name that
> isn't in the BMP.) If people don't want Ada to be a general-purpose
> programming language, then that's fine; but it's not acceptable for a
> general-purpose programming language not to be able to handle text,
> and for a modern language, that means Unicode.

The main problem with encodings in Ada is history. Many programs assume that Character is Latin-1. If we change the semantics of Ada.Characters.Handling, what results will we get?

Ada83 defines type Character as an enumeration. The order of the symbols is defined by their position in this enumeration, not by their actual codes. This allows simple porting of programs from, for example, ASCII to EBCDIC. Ada95 simply extends 7-bit ASCII to 8-bit ISO-8859-1.

The difference between the logical code order of an encoding and the collation order of the current user's language environment is another problem. Neither Ada9X nor AI-00285 solves this.

The best way to implement localization/internationalization support in Ada is to define a Specialized Needs Annex rather than change existing interfaces, because (1) this does not affect portability and (2) it allows new applications (for which internationalization is critical) to use the new interfaces.

Vadim Godunko
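
As a rough illustration of three points raised in this thread, here is a minimal sketch. It is written in Python rather than Ada only because Python's standard codecs make the byte values easy to check. All variable and function names are hypothetical and come from no poster. It shows that Latin-9 has a euro sign where Latin-1 does not, that a single byte is valid in both ISO-8859-5 and KOI8-R (so the bytes alone do not identify the encoding), and that a character outside the BMP needs two UTF-16 code units.

```python
# Illustration only; names are hypothetical and not taken from the thread.

EURO = "\u20ac"  # EURO SIGN

# Latin-9 (ISO-8859-15) assigns the euro sign to byte 0xA4.
latin9_euro = EURO.encode("iso8859_15")

# Latin-1 (ISO-8859-1) reads that same byte as CURRENCY SIGN (U+00A4).
latin1_reading = latin9_euro.decode("iso8859_1")


def can_encode(text, codec):
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# One byte, two legitimate Cyrillic readings: nothing in the byte itself
# says which encoding a source file was written in.
AMBIGUOUS = b"\xc1"
as_iso_8859_5 = AMBIGUOUS.decode("iso8859_5")  # CYRILLIC CAPITAL LETTER ES
as_koi8_r = AMBIGUOUS.decode("koi8_r")         # CYRILLIC SMALL LETTER A

# A character outside the Basic Multilingual Plane (here U+20000, a CJK
# ideograph) takes two 16-bit code units in UTF-16, i.e. four bytes.
NON_BMP = "\U00020000"
utf16_units = len(NON_BMP.encode("utf-16-be")) // 2

if __name__ == "__main__":
    print("euro in Latin-9:", latin9_euro)
    print("euro encodable in Latin-1:", can_encode(EURO, "iso8859_1"))
    print("0xC1 as ISO-8859-5 / KOI8-R:", as_iso_8859_5, "/", as_koi8_r)
    print("UTF-16 code units for U+20000:", utf16_units)
```

The ambiguity in the second part is why any "on the fly" detection would need either an explicit marker in the source file or a heuristic; the sketch does not attempt either.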