comp.lang.ada
From: dewar@merv.cs.nyu.edu (Robert Dewar)
Subject: Re: Ada and UNICODE?
Date: 1998/05/20
Message-ID: <dewar.895707155@merv>
In-Reply-To: 35622857.77912B4@cl.cam.ac.uk


Markus said

<<But in a way that violates the Ada95 standard: The GNAT conversion
routines only work if the Wide_Character encoding used in the
Ada program is also JIS/EUC. The Ada95 standard however requires
that the Wide_Character encoding is the ISO 10646 BMP. Strictly
speaking, the library would have to include the huge Unicode<->JIS
conversion tables on ftp.unicode.org in order to provide a
conforming implementation. UTF-8 instead of EUC and Shift-JIS
is clearly the right encoding to use here.
>>

A common misconception is that the reference manual has something to
say about the representation of source programs. That is ENTIRELY wrong:
the standard has nothing whatsoever to say about the representation
of source programs. So the claim that *any* program representation
method violates the standard is simply wrong from the start. When
I chaired the CRG (the group attached to ISO WG9 that
decided on these matters for Ada 9X), we found constant confusion
on this issue.

There is a requirement that any Ada 95 compiler have *some* representation
for all possible programs. Incomplete representations like EUC and
Shift-JIS, though exactly what a lot of users want, clearly do not meet
this requirement, so a compiler that had ONLY these methods would be
non-compliant. However, GNAT supports a number of different encoding
methods, and in particular the "brackets" notation (used, for
example, in the distribution format of the ACVC tests) is complete and
is supported.
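
As a rough illustration of why the brackets notation counts as complete,
here is a minimal sketch in Python (not Ada, and not GNAT's own code) that
rewrites a string so every non-ASCII character in the BMP appears as
["XXXX"], with XXXX the four hex digits of its code. The function name
to_brackets is hypothetical, and always emitting four uppercase digits is
an assumption made for simplicity. The sketch also does not deal with
source text that already contains a literal [" sequence.

```python
def to_brackets(text: str) -> str:
    """Sketch: rewrite non-ASCII BMP characters in brackets form.

    Hypothetical helper for illustration only; the name and the choice
    to always emit four uppercase hex digits are assumptions.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            # Plain 7-bit ASCII passes through unchanged.
            out.append(ch)
        elif cp <= 0xFFFF:
            # Any other BMP character becomes ["XXXX"].
            out.append('["%04X"]' % cp)
        else:
            # Wide_Character in Ada 95 covers only the BMP.
            raise ValueError("code point outside the BMP: %#x" % cp)
    return "".join(out)


if __name__ == "__main__":
    # U+03C0 is GREEK SMALL LETTER PI.
    print(to_brackets("Pi : constant Wide_Character := '\u03c0';"))
```

Since every 16-bit code has a four-digit hex spelling, any program can be
written this way using only 7-bit characters, which is the sense in which
the notation is complete and EUC or Shift-JIS are not.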


Just to emphasize how little the standard specifies here, an implementation
that used B to represent the character A, and A to represent B would be
highly annoying, but would not violate the standard.

In fact this freedom is completely intentional. For example, it is expected
that a compiler for Ada 95 on an IBM mainframe might accept *only* EBCDIC
input, since such a decision would make perfect sense in this
environment.




