comp.lang.ada
From: vgodunko@vipmail.ru (Vadim Godunko)
Subject: Re: Character Sets
Date: 14 Dec 2002 14:53:43 -0800
Message-ID: <665e587a.0212141453.42386f5d@posting.google.com>
In-Reply-To: 81f70ac6.0212131927.4fa6b642@posting.google.com

starner@okstate.edu (David Starner) wrote in message news:<81f70ac6.0212131927.4fa6b642@posting.google.com>...
> 
> > This seems reasonable if we don't want to have to amend Ada each time a
> > bunch of characters are added to 10646.
> 
> Why would you have to amend Ada? Add a Unicode version constant, and
> define the data in terms of its Unicode properties. Then the
> recentness of the characters is just a quality of implementation
> issue.
> 
How much memory is required to store all the data from the Unicode
Character Database? What do you do if this constant changes? Retest
all existing applications?

> From: Robert Dewar
> > We certainly
> > put in a lot of work in GNAT in implementing wide character with many
> > different representation schemes,
> 
> GNAT supports input files in a dozen mostly bizarre or archaic
> formats. It doesn't strike me as very useful, especially considering
> as it supports Latin-1, Latin-2 (both useful), but also Latin-4
> (completely unused) and Latin-3 (good for Maltese and Esperanto, and
> most Esperanto users don't use it). It doesn't support ISO-8859-5 or
> KOI8-R (Russian), or ISO-8859-7 (Greek).
The latest public GNAT version and GCC3/GNAT both support the
ISO-8859-5 encoding in identifiers. And I don't know of any GNAT users
who use the KOI8-R/U/B encodings outside comments and character and
string literals.

> It doesn't support changing
> formats on the fly - many users have multiple encodings around,
> besides the fact that having to compile a different binary for each
> user is a pain. 
> 
Can you propose any method for detecting the encoding of an Ada source
file "on the fly"?
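
To illustrate why detection is hard, here is a minimal sketch in
Python (not Ada, and not anything GNAT does). The byte values are
chosen only for the example; the codec names are Python's names for
encodings mentioned in this thread. The same bytes decode without
error under all three, so the bytes alone do not say which one the
author meant.

```python
# Illustration only: three arbitrary bytes from the upper half of an
# 8-bit code page.
SAMPLE = bytes([0xE4, 0xC1, 0xD2])

# Python codec names for Latin-1, ISO-8859-5 and KOI8-R.
CANDIDATES = ["latin-1", "iso8859_5", "koi8_r"]


def decode_all(data):
    """Decode the same bytes under every candidate encoding."""
    return {name: data.decode(name) for name in CANDIDATES}


if __name__ == "__main__":
    # Every decode succeeds, and each one yields different letters.
    for name, text in decode_all(SAMPLE).items():
        print(f"{name:10} -> {text!r}")
```

Any detector would have to guess from context (for example, letter
frequencies), and short identifiers give it very little to work with.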

> From: Pascal Leroy
> > Remember, we are talking Ada applications here.  There are probably many
> > applications out there that deal with mathematical symbols or with Tengwar, 
> > but I doubt that they are written in Ada.
> 
> Mathematical symbols and Tengwar are text. Any text handling system
> that supports Unicode should handle them like any other text, because
> sooner or later users will expect it to handle them. (If you're
> unlucky, it will be the day that you're showing your system off in
> Hong Kong, and the potential buyer decides to put in his name that
> isn't in the BMP.) If people don't want Ada to be a general-purpose
> programming language, then that's fine; but it's not acceptable for a
> general-purpose programming language not to be able to handle text,
> and for a modern language, that means Unicode.

The main problem with encodings in Ada is history.

Many programs assume that Character is Latin-1. If we change the
semantics of Ada.Characters.Handling, what results do we get?

Ada83 defines type Character as an enumeration. The order of symbols
is defined by their order in this enumeration, not by the actual code.
This allows simple porting of programs from, for example, ASCII to
EBCDIC encodings. Ada95 simply extends 7-bit ASCII to 8-bit ISO-8859-1.
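
The point about enumeration order versus code order can be sketched
as follows. This is Python, not Ada, and the three-value
"enumeration" is hypothetical; cp037 is Python's name for one EBCDIC
code page. Ordering by position in the enumeration gives the same
answer everywhere, while ordering by the raw code depends on the
encoding.

```python
# Hypothetical three-value "enumeration"; declaration order defines "<".
ENUM_ORDER = ["0", "A", "a"]


def by_position(ch):
    """Sort key: position in the enumeration (encoding-independent)."""
    return ENUM_ORDER.index(ch)


def by_code(encoding):
    """Return a sort key that uses the raw code in the given encoding."""
    return lambda ch: ch.encode(encoding)[0]


chars = ["a", "0", "A"]
print(sorted(chars, key=by_position))       # ['0', 'A', 'a']
print(sorted(chars, key=by_code("ascii")))  # ['0', 'A', 'a']
print(sorted(chars, key=by_code("cp037")))  # ['a', 'A', '0']
```

A program that compares characters by position keeps its behaviour
when moved to an EBCDIC machine; one that compares raw codes does not.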

The difference between the logical code order in an encoding and the
collation order of the current user's language environment is another
problem. Neither Ada9X nor AI-00285 solves this.
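
A small Python sketch of that difference. The collation key here is a
toy (strip accents, ignore case) invented for the example; a real
locale's collation rules are more involved and differ between
languages.

```python
import unicodedata

# "\u00c4" is LATIN CAPITAL LETTER A WITH DIAERESIS.
words = ["zebra", "\u00c4pfel", "apple"]


def toy_collation_key(word):
    """Hypothetical key: drop accents, ignore case. Not a real locale."""
    decomposed = unicodedata.normalize("NFD", word)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return base.casefold()


# Code order: the accented capital sorts after "z".
print(sorted(words))
# Toy collation order: it sorts with the other words starting with "a".
print(sorted(words, key=toy_collation_key))
```

Comparing characters with "<" gives the first ordering; a user reading
a sorted list usually expects something closer to the second.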

The best way to implement localization/internationalization support
in Ada is to define a Specialized Needs Annex rather than change the
existing interfaces, because (1) this does not affect portability and
(2) it allows new applications (if internationalization is critical)
to use the new interfaces.


Vadim Godunko


