From: dvdeug@x8b4e53cd.dhcp.okstate.edu (David Starner)
Newsgroups: comp.lang.ada
Subject: Re: Hebrew language character set
Date: 5 Apr 2001 18:35:44 GMT
Organization: Oklahoma State University
Message-ID: <9aidu0$9281@news.cis.okstate.edu>
References: <3ACA11B0.9AAFDDDD@lmco.com> <3ACB85DF.9E6DBD03@lmco.com> <3ACC6E83.5B860AC5@free.fr> <3ACC9597.14D3B23D@lmco.com>
Reply-To: dstarner98@aasaa.ofe.org

On Thu, 05 Apr 2001 08:56:07 -0800, Paul Storm wrote:
>Apparently this is not the trivial problem I thought it was. As a Ada
>newbie it makes me wonder about Ada's capabilities for program
>internationalization. In our internet age that is important.

Try this in C. (1) It's impossible to use Hebrew characters, since
wchar_t is opaque - you can only portably use ASCII from inside the
code. (2) If you just stick the Unicode Hebrew values in there, it
might work (since wchar_t is often some form of Unicode), but it
probably will print out UTF-8 or UTF-16, which will look like noise
on your screen or even Ehud Lamm's system. The encouraged solution in C
is to store all non-ASCII characters external to the program and treat
them opaquely, if at all.
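
A small sketch of point (2), in Python rather than C or Ada purely for brevity. The sample word and the helper names `encode_all` and `misread` are made up for illustration, and the three encodings are just plausible candidates, not an exhaustive list. It encodes the same four Hebrew code points three ways, then shows what a display that assumes Latin-1 would make of each byte stream.

```python
# Hypothetical sample text: the Hebrew word "shalom", written as Unicode
# escapes so the source file itself stays pure ASCII.
HEBREW = "\u05e9\u05dc\u05d5\u05dd"

# Candidate output encodings a program might pick (an assumption for this
# sketch; real systems of the period used others as well).
CANDIDATES = ("utf-8", "utf-16-le", "iso8859_8")


def encode_all(text):
    """Return the bytes the same text becomes under each candidate encoding."""
    return {name: text.encode(name) for name in CANDIDATES}


def misread(raw, assumed="latin-1"):
    """Decode bytes the way a display with the wrong assumption would."""
    return raw.decode(assumed)


if __name__ == "__main__":
    for name, raw in encode_all(HEBREW).items():
        # Same characters, different bytes; none of them reads as Hebrew
        # once the receiving side assumes Latin-1.
        print(name, raw.hex(), repr(misread(raw)))
```

The same four letters come out as 8 bytes in UTF-8, 8 bytes in UTF-16 and 4 bytes in ISO-8859-8, and nothing in the bytes themselves tells the receiving side which of the three it got.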
The correct solutions if you need more are hotly debated.

I'm sorry - you picked a hard problem in any programming language. I
was about to blame the Americans (who invented ASCII and other
English-only, 7/8-bit codes, and spread them round the world), but
equal blame falls on the Japanese (who use three different codes, and
oppose one unified code), the Europeans (who hardcoded 8-bit codes and
especially Latin-1 everywhere), and generally anyone who invented a
solution that worked for their language and quit. There's no way to
output a non-ASCII character and expect that it will work in any more
than a small subset of places. There's no way for any language to
portably guess what the appropriate encoding of output would be.

GNAT could possibly do better, and hopefully I will find time to write
up a decent proposal on how it could do better, but the current
solution differs little from what almost any programming language
would do.

-- 
David Starner - dstarner98@aasaa.ofe.org
Pointless website: http://dvdeug.dhis.org
"I don't care if Bill personally has my name and reads my email and
laughs at me. In fact, I'd be rather honored." - Joseph_Greg