comp.lang.ada
* Ada and UNICODE?
@ 1998-05-15  0:00 William A Whitaker
  1998-05-15  0:00 ` Robert Dewar
  0 siblings, 1 reply; 12+ messages in thread
From: William A Whitaker @ 1998-05-15  0:00 UTC



Does anyone out there have experience using UNICODE to express other
languages (Greek, Hebrew), in Ada programs or elsewhere?

It was my understanding that the ISO standard was shaking out to be
identical to, or compatible with, UNICODE for "wide characters", and that
this was what Ada was to support.  But I have never seen an actual example
in or with Ada.

There is also a question about support tools, e.g., keyboard entry that
does not require four fingers.

What about UNICODE use in commercial word processors - any tales to
tell?

Whitaker





* Re: Ada and UNICODE?
  1998-05-15  0:00 Ada and UNICODE? William A Whitaker
@ 1998-05-15  0:00 ` Robert Dewar
  1998-05-18  0:00   ` Joel VanLaven
  1998-05-20  0:00   ` Markus Kuhn
  0 siblings, 2 replies; 12+ messages in thread
From: Robert Dewar @ 1998-05-15  0:00 UTC



W A Whitaker asks

<<There is also a question about support tools, e.g., keyboard entry that
does not require four fingers.
>>

GNAT supports all the coding methods that are commonly used in Japan
and China, including EUC and Shift-JIS.






* Re: Ada and UNICODE?
  1998-05-15  0:00 ` Robert Dewar
@ 1998-05-18  0:00   ` Joel VanLaven
  1998-05-19  0:00     ` Robert Dewar
  1998-05-20  0:00   ` Markus Kuhn
  1 sibling, 1 reply; 12+ messages in thread
From: Joel VanLaven @ 1998-05-18  0:00 UTC



Robert Dewar <dewar@merv.cs.nyu.edu> wrote:
: W A Whitaker asks

: <<There is also a question about support tools, e.g., keyboard entry that
: does not require four fingers.
: >>

: GNAT supports all the coding methods that are commonly used in Japan
: and China, including EUC and Shift-JIS.

Since we are telling what we support... :)

OC Systems supports the UTF-8 coding method for UNICODE (i.e. ISO).  We
felt that the ISO standard was the intention for wide_characters and so
chose the standard text (non-wide character) representation of UNICODE.
UTF-8 is supported by Netscape and might become part of the next revision
of HTML.

Other than Netscape I am unsure about the availability of tools and such
related to UTF-8 (or Unicode at all).  I think that newer versions of
operating systems are coming out (or will come out) with some sort of
support for wide characters, but despite the amount of time this has
been an issue, that support is still in its infancy.
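
To make "the standard text (non-wide character) representation" concrete,
here is a minimal sketch in Python (used only for illustration; it is not
OC Systems code). It prints the UTF-8 byte sequence for a few BMP
characters. The sample characters and the helper name utf8_bytes are
assumptions chosen for this example.

```python
# Sketch: UTF-8 stores each ISO 10646 code point as one or more 8-bit bytes,
# so wide characters can travel through ordinary (non-wide) text channels.
# The sample characters below are illustrative choices, not from the thread.
samples = {
    "LATIN CAPITAL LETTER A": "A",
    "GREEK SMALL LETTER ALPHA": "\u03b1",
    "HEBREW LETTER ALEF": "\u05d0",
    "HIRAGANA LETTER A": "\u3042",
}


def utf8_bytes(ch):
    """Return the UTF-8 encoding of a single character as a list of ints."""
    return list(ch.encode("utf-8"))


for name, ch in samples.items():
    encoded = " ".join("%02X" % b for b in utf8_bytes(ch))
    print("U+%04X %s: %s" % (ord(ch), name, encoded))
```

ASCII characters stay one byte long, Greek and Hebrew letters take two
bytes, and most CJK characters in the BMP take three.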

-- Joel VanLaven
-- (only partially speaking for OC Systems)





* Re: Ada and UNICODE?
  1998-05-18  0:00   ` Joel VanLaven
@ 1998-05-19  0:00     ` Robert Dewar
  1998-05-19  0:00       ` Ronald Cole
  0 siblings, 1 reply; 12+ messages in thread
From: Robert Dewar @ 1998-05-19  0:00 UTC



Joel says

<<OC Systems supports the UTF-8 coding method for UNICODE (i.e. ISO).  We
felt that the ISO standard was the intention for wide_characters and so
chose the standard text (non-wide character) representation of UNICODE.
UTF-8 is supported by Netscape and might become part of the next revision
of html.

Other than Netscape I am unsure about the availability of tools and such
related to UTF-8 (or unicode at all).  I think that newer versions of
operating systems are coming out (or will come out) with some sort of
support for wide_characters but I think that despite the amount of time
this has been an issue it is still in its infancy.
>>

So far we have not found any of our Japanese or Chinese users using
UTF-8. There are already plenty of operating systems that fully
support Japanese and Chinese characters (I have sitting on my shelf
the Japanese version of Windows). The most common coding methods
we have run into in Japan are EUC and Shift-JIS, and in China, the
modified upper bit approach is used (the 80h bit signals a wide character).

It sure would be nice if UTF-8 would become a standard; supporting
umpteen different coding methods for wide characters is a pain!
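
As an illustration of the "80h bit signals wide character" idea, here is
a rough sketch in Python (not GNAT code). It assumes a GB2312-style
scheme in which any byte with the upper bit set starts a two-byte
character; the function name split_upper_bit and the sample string are
made up for this example.

```python
# Sketch of an "upper bit" scanner, assuming a GB2312-style encoding:
# bytes below 80h are single-byte characters, and a byte with the 80h bit
# set is the first byte of a two-byte character.


def split_upper_bit(data):
    """Split a byte string into one-byte and two-byte character units."""
    units = []
    i = 0
    while i < len(data):
        if data[i] & 0x80:
            # Upper bit set: this byte and the next form one wide character.
            if i + 1 >= len(data):
                raise ValueError("truncated two-byte character at offset %d" % i)
            units.append(data[i:i + 2])
            i += 2
        else:
            units.append(data[i:i + 1])
            i += 1
    return units


# "A", then U+4E2D, then "B", encoded with Python's gb2312 codec.
sample = "A\u4e2dB".encode("gb2312")
print(split_upper_bit(sample))
```

Shift-JIS needs a different scanner, because its second byte can fall in
the ASCII range. That is part of why supporting several coding methods
means several code paths.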






* Re: Ada and UNICODE?
  1998-05-19  0:00     ` Robert Dewar
@ 1998-05-19  0:00       ` Ronald Cole
  1998-05-19  0:00         ` Robert Dewar
  1998-05-20  0:00         ` Markus Kuhn
  0 siblings, 2 replies; 12+ messages in thread
From: Ronald Cole @ 1998-05-19  0:00 UTC



dewar@merv.cs.nyu.edu (Robert Dewar) writes:
> So far we have not found any of our Japanese or Chinese users using
> UTF-8. There are already plenty of operating systems that fully
> support Japanese and Chinese characters (I have sitting on my shelf
> the Japanese version of Windows). The most common coding methods
> we have run into in Japan are EUC and Shift-JIS, and in China, the
> modified upper bit approach is used (the 80h bit signals a wide character).
> 
> It sure would be nice if UTF-8 would become a standard; supporting
> umpteen different coding methods for wide characters is a pain!

Probably won't happen.  The Japanese don't like the "han-unification"
in Unicode or UTF-x.  The O'Reilly book, "Understanding Japanese
Information Processing," covers the subject in good detail.

-- 
Forte International, P.O. Box 1412, Ridgecrest, CA  93556-1412
Ronald Cole <ronald@forte-intl.com>      Phone: (760) 499-9142
President, CEO                             Fax: (760) 499-9152
My PGP fingerprint: E9 A8 E3 68 61 88 EF 43  56 2B CE 3E E9 8F 3F 2B





* Re: Ada and UNICODE?
  1998-05-19  0:00       ` Ronald Cole
@ 1998-05-19  0:00         ` Robert Dewar
  1998-05-24  0:00           ` Ronald Cole
  1998-05-20  0:00         ` Markus Kuhn
  1 sibling, 1 reply; 12+ messages in thread
From: Robert Dewar @ 1998-05-19  0:00 UTC



Ronald said

<<Probably won't happen.  The Japanese don't like the "han-unification"
in Unicode or UTF-x.  The O'Reilly book, "Understanding Japanese
Information Processing," covers the subject in good detail.
>>

Indeed! The Koreans are not very enthusiastic about this unification
either. However ... no one ever thought that the unification could succeed
at all, let alone the unification with Unicode, and it did. So who knows,
maybe we can make more progress here than one might guess!






* Re: Ada and UNICODE?
  1998-05-15  0:00 ` Robert Dewar
  1998-05-18  0:00   ` Joel VanLaven
@ 1998-05-20  0:00   ` Markus Kuhn
  1998-05-20  0:00     ` Robert Dewar
  1 sibling, 1 reply; 12+ messages in thread
From: Markus Kuhn @ 1998-05-20  0:00 UTC



Robert Dewar wrote:
> GNAT supports all the coding methods that are commonly used in Japan
> and China, including EUC and Shift-JIS.

But in a way that violates the Ada95 standard: the GNAT conversion
routines only work if the Wide_Character encoding used in the
Ada program is also JIS/EUC. The Ada95 standard, however, requires
that the Wide_Character encoding be the ISO 10646 BMP. Strictly
speaking, the library would have to include the huge Unicode<->JIS
conversion tables on ftp.unicode.org in order to provide a
conforming implementation. UTF-8 instead of EUC and Shift-JIS
is clearly the right encoding to use here.
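
To show what such a table-driven conversion buys, here is a small sketch
in Python (for illustration only, not GNAT code). Python's built-in
shift_jis and euc_jp codecs stand in for the Unicode<->JIS tables. The
helper name to_bmp_code_points and the sample bytes are assumptions made
for this example.

```python
# Sketch: map externally encoded bytes to ISO 10646 BMP code points, the
# values a Wide_Character is expected to hold. Python's codecs play the role
# of the Unicode<->JIS conversion tables here.


def to_bmp_code_points(raw, encoding):
    """Decode raw bytes and return their code points, rejecting non-BMP values."""
    points = [ord(ch) for ch in raw.decode(encoding)]
    if any(p > 0xFFFF for p in points):
        raise ValueError("character outside the BMP")
    return points


# The same hiragana character in two external encodings.
sjis_bytes = b"\x82\xa0"   # Shift-JIS
euc_bytes = b"\xa4\xa2"    # EUC-JP

print([hex(p) for p in to_bmp_code_points(sjis_bytes, "shift_jis")])
print([hex(p) for p in to_bmp_code_points(euc_bytes, "euc_jp")])
```

Both byte sequences decode to the same code point, so a program working
on BMP values does not need to know which external encoding was used.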

Markus

-- 
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org,  home page: <http://www.cl.cam.ac.uk/~mgk25/>





* Re: Ada and UNICODE?
  1998-05-19  0:00       ` Ronald Cole
  1998-05-19  0:00         ` Robert Dewar
@ 1998-05-20  0:00         ` Markus Kuhn
  1998-05-20  0:00           ` Larry Kilgallen
  1 sibling, 1 reply; 12+ messages in thread
From: Markus Kuhn @ 1998-05-20  0:00 UTC



Ronald Cole wrote:
[Unicode in Japan]
> Probably won't happen.  The Japanese don't like the "han-unification"
> in Unicode or UTF-x.

Only those Japanese programmers who haven't really understood Unicode.
There are many popular misconceptions in the Japanese community about
Unicode. Check old comp.std.internat and comp.software.international
postings (via dejanews), where this subject is discussed in great
detail. After all, the editor of the ISO 10646-1 standard was Japanese.

Markus

-- 
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org,  home page: <http://www.cl.cam.ac.uk/~mgk25/>





* Re: Ada and UNICODE?
  1998-05-20  0:00         ` Markus Kuhn
@ 1998-05-20  0:00           ` Larry Kilgallen
  0 siblings, 0 replies; 12+ messages in thread
From: Larry Kilgallen @ 1998-05-20  0:00 UTC



In article <3562A240.150F17BC@cl.cam.ac.uk>, Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk> writes:
> Ronald Cole wrote:
> [Unicode in Japan]
>> Probably won't happen.  The Japanese don't like the "han-unification"
>> in Unicode or UTF-x.
> 
> Only those Japanese programmers who haven't really understood Unicode.

So I could say that Ada has been universally embraced by C programmers,
except for those C programmers who haven't really understood Ada?

Larry Kilgallen





* Re: Ada and UNICODE?
  1998-05-20  0:00   ` Markus Kuhn
@ 1998-05-20  0:00     ` Robert Dewar
  0 siblings, 0 replies; 12+ messages in thread
From: Robert Dewar @ 1998-05-20  0:00 UTC



Markus said

<<But in a way that violates the Ada95 standard: The GNAT conversion
routines only work if the Wide_Character encoding used in the
Ada program is also JIS/EUC. The Ada95 standard however requires
that the Wide_Character encoding is the ISO 10646 BMP. Strictly
speaking, the library would have to include the huge Unicode<->JIS
conversion tables on ftp.unicode.org in order to provide a
conforming implementation. UTF-8 instead of EUC and Shift-JIS
is clearly the right encoding to use here.
>>

A common misconception is that the reference manual has something to
say about representation of source programs. That is ENTIRELY wrong:
the standard has nothing whatsoever to say about the representation
of source programs. So the claim that *any* program representation
method violates the standard is simply wrong from the start. When
I chaired the CRG (which is the group attached to ISO WG9 that
decided on these matters for Ada 9X), we found constant confusion
on this issue.

There is a requirement that any Ada 95 compiler have *some* representation
for all possible programs. Clearly, incomplete representations like
EUC and Shift-JIS, though exactly what a lot of users want, do not meet
this requirement. So a compiler that had ONLY these methods would be
non-compliant. However, GNAT supports a number of different encoding
methods, and in particular the "brackets" notation (which is used, for
example, in the distribution format of the ACVC tests) is complete and
is supported.
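
For readers who have not seen the brackets notation, here is a rough
sketch in Python (not GNAT source). It assumes the form described in the
GNAT documentation as far as I understand it: a wide character is written
as ["xxxx"], where xxxx is its four-digit hexadecimal code. The function
names are hypothetical, and for simplicity the sketch always emits four
hex digits and leaves 7-bit characters alone.

```python
import re

# Sketch of the brackets notation, assuming the ["xxxx"] form with four
# hexadecimal digits per BMP character. The source stays pure 7-bit ASCII,
# yet every BMP code point can be written, so the representation is complete.


def to_brackets(text):
    """Rewrite every non-ASCII BMP character as ["xxxx"]."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append('["%04X"]' % cp)
        else:
            raise ValueError("character outside the BMP: U+%X" % cp)
    return "".join(out)


_BRACKETS = re.compile(r'\["([0-9A-Fa-f]{4})"\]')


def from_brackets(text):
    """Replace every ["xxxx"] sequence with the character it denotes."""
    return _BRACKETS.sub(lambda m: chr(int(m.group(1), 16)), text)


print(to_brackets("\u03b1 = A"))
```

The sketch does not handle text that already contains a literal ["xxxx"]
sequence; a real implementation has to decide how to treat that case.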


Just to emphasize how little the standard specifies here, an implementation
that used B to represent the character A, and A to represent B would be
highly annoying, but would not violate the standard.

In fact this freedom is completely intentional. For example, it is expected
that a compiler for Ada 95 on an IBM mainframe might accept *only* EBCDIC
input, since such a decision would make perfect sense in this
environment.






* Re: Ada and UNICODE?
  1998-05-19  0:00         ` Robert Dewar
@ 1998-05-24  0:00           ` Ronald Cole
  1998-05-25  0:00             ` Robert Dewar
  0 siblings, 1 reply; 12+ messages in thread
From: Ronald Cole @ 1998-05-24  0:00 UTC



dewar@merv.cs.nyu.edu (Robert Dewar) writes:
> Indeed! The Koreans are not very enthusiastic about this unification
> either. However ... no one ever thought that the unification could succeed
> at all, let alone the unification with Unicode, and it did. So who knows,
> maybe we can make more progress here than one might guess!

Has it succeeded?  I seem to recall at least one kanji character where
the Japanese and the Chinese disagree on the radical and therefore
its KUTEN position.  And there still appear to be many
as-yet-ununified kanji in the Unicode encoding.  Personally, I observe a
seeming reluctance on the part of kanji users to fully embrace it, and
so I conclude Unicode to be just another step on the way towards a
truly unified kanji.  However, not having been formally schooled in
kanji, I am unable to offer any actual personal insight or opinion on
the "han unification".

-- 
Forte International, P.O. Box 1412, Ridgecrest, CA  93556-1412
Ronald Cole <ronald@forte-intl.com>      Phone: (760) 499-9142
President, CEO                             Fax: (760) 499-9152
My PGP fingerprint: E9 A8 E3 68 61 88 EF 43  56 2B CE 3E E9 8F 3F 2B





* Re: Ada and UNICODE?
  1998-05-24  0:00           ` Ronald Cole
@ 1998-05-25  0:00             ` Robert Dewar
  0 siblings, 0 replies; 12+ messages in thread
From: Robert Dewar @ 1998-05-25  0:00 UTC



Ronald Cole said

<<Has it succeeded?  I seem to recall at least one kanji character where
the Japanese and the Chinese disagree on the radical and therefore
its KUTEN position.  And there still appear to be many
as-yet-ununified kanji in the Unicode encoding.  Personally, I observe a
seeming reluctance on the part of kanji users to fully embrace it, and
so I conclude Unicode to be just another step on the way towards a
truly unified kanji.  However, not having been formally schooled in
kanji, I am unable to offer any actual personal insight or opinion on
the "han unification".
>>

By success here, I mean ratification of the ISO 10646 standard. For
a while it looked like this would be derailed.






end of thread

Thread overview: 12+ messages
1998-05-15  0:00 Ada and UNICODE? William A Whitaker
1998-05-15  0:00 ` Robert Dewar
1998-05-18  0:00   ` Joel VanLaven
1998-05-19  0:00     ` Robert Dewar
1998-05-19  0:00       ` Ronald Cole
1998-05-19  0:00         ` Robert Dewar
1998-05-24  0:00           ` Ronald Cole
1998-05-25  0:00             ` Robert Dewar
1998-05-20  0:00         ` Markus Kuhn
1998-05-20  0:00           ` Larry Kilgallen
1998-05-20  0:00   ` Markus Kuhn
1998-05-20  0:00     ` Robert Dewar
