From: Ludovic Brenta
Newsgroups: comp.lang.ada
Subject: Re: Ada & Unicode support
Date: Sun, 21 Sep 2014 02:17:36 +0200
Message-ID: <871tr5x1wf.fsf@ludovic-brenta.org>
References: <20140920233647.06c82dde@atmarama.ddns.net>

Gour writes on comp.lang.ada:
> So, can someone provide some explanation what is the meaning of e.g.
> full unicode support in 2005 and what is the significance of 'String
> Encoding Package' in this context?

You need to distinguish between the character set and encoding of the
Ada text (which you give to the compiler) and the character set(s) and
encoding(s) that your program must support.

ARM 2.1 specifies that the character set for Ada text is Unicode
(precisely, ISO/IEC 10646:2011 Universal Multiple-Octet Coded Character
Set) and that the encoding is implementation-defined.  For example, GNAT
supports several encodings, including UTF-8 with command-line options.

As for the text that your program must process, that's really up to you.
Ada 95 added Wide_Character and Wide_String to help you use 16-bit
characters (not exactly UTF-16: they cover only the first plane of the
Unicode character set, the Basic Multilingual Plane); Ada 2005 added
Wide_Wide_Character and Wide_Wide_String for 32-bit characters (i.e.
UTF-32 encoding).

The String Encoding package (Ada.Strings.UTF_Encoding and its children,
which became standard in Ada 2012) is there to help you transcode text
between 8-bit Latin_1, UTF-8, proper UTF-16 and UTF-32.  The new
packages are there to help you, but they don't do anything that wasn't
possible in previous versions of Ada (i.e. you could reimplement them in
Ada 95 if you so wished).

GTK+ and GtkAda treat all strings as UTF-8.  If your program uses
GtkAda, then don't bother transcoding anything; just specify that all
Strings really contain UTF-8.

HTH

-- 
Ludovic Brenta.
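
P.S. To make the transcoding mentioned above a bit more concrete, here
is a minimal sketch. It assumes an Ada 2012 compiler that provides the
standard package Ada.Strings.UTF_Encoding; the procedure name
Transcode_Demo and the two sample code points are mine, picked only for
illustration. The text is built from code point values so that the
example does not depend on how the source file itself is encoded.

```ada
with Ada.Text_IO;
with Ada.Strings.UTF_Encoding;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

--  Hypothetical demo: round-trip a Wide_Wide_String through UTF-8.
procedure Transcode_Demo is
   package UTF renames Ada.Strings.UTF_Encoding;
   package WW  renames Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

   --  U+00E9 fits in the first plane; U+1F600 does not, so it cannot
   --  be stored in a single Wide_Character but fits a
   --  Wide_Wide_Character.
   Text : constant Wide_Wide_String :=
     (Wide_Wide_Character'Val (16#E9#),
      Wide_Wide_Character'Val (16#1F600#));

   --  UTF_8_String is a subtype of String: each element is one octet.
   Encoded : constant UTF.UTF_8_String := WW.Encode (Text);

   --  Decode raises UTF.Encoding_Error on malformed input.
   Decoded : constant Wide_Wide_String := WW.Decode (Encoded);
begin
   Ada.Text_IO.Put_Line
     ("Characters :" & Integer'Image (Text'Length));
   Ada.Text_IO.Put_Line
     ("UTF-8 bytes:" & Integer'Image (Encoded'Length));
   Ada.Text_IO.Put_Line
     ("Round trip : " & Boolean'Image (Decoded = Text));
end Transcode_Demo;
```

The two characters need two and four octets respectively in UTF-8, so
the encoded String is longer than the character count. That is exactly
why a String handed to GtkAda should be thought of as a sequence of
octets rather than a sequence of Latin_1 characters.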