From mboxrd@z Thu Jan 1 00:00:00 1970
Newsgroups: comp.lang.ada
Date: Wed, 27 Dec 2017 14:33:56 -0800 (PST)
Message-ID: <13ee2a99-1908-48e0-a308-9466e606135b@googlegroups.com>
Subject: Re: unicode and wide_text_io
From: Mehdi Saada <00120260a@gmail.com>
Content-Type: text/plain; charset="UTF-8"

On Wednesday, 27 December 2017 at 23:32:52 UTC+1, Mehdi Saada wrote:
> > Wide_Text_IO is UCS-2. Keep on using UTF-8. You probably
> > meant output of code points. That is a different beast. Convert a code
> > point to UTF-8 string and output that. E.g.
> Sure, I'll look at your work, but ...
> Fundamentally, how can a UTF-8 string even represent code points beyond the 255th??
> Superscripts and subscripts mean more changes in the IO package.
> Before, I could simply use the generic Integer_IO, but I have no clue how to output a specific code point for each digit in a specific base... wouldn't that mean rewriting part of Integer_IO?
>
> I may have a rather shallow understanding of character encoding and representation, and that's quite an understatement, but you said: "Ada's Character has Latin-1 encoding which differs from UTF-8 in the code positions greater than 127"
> Really?? You're saying that a position such as Wide_Character'Val(X) doesn't correspond to the Xth character in the Unicode standard??
> And I know peanuts about the UCS-2 thing. I'm too ignorant to get one bit of what you're saying, except that it sounds like heresy to the ears of the Ada Church. Burn them all!!
> Ada.Streams permits output of bytes without any formatting, right? I have never studied streams so far. It sounds too early, but I'll look at it.
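
For what it is worth, here is a minimal sketch of the quoted suggestion ("convert a code point to UTF-8 string and output that"). It assumes an Ada 2012 compiler, since it relies on the standard package Ada.Strings.UTF_Encoding.Wide_Wide_Strings, and a terminal that interprets output bytes as UTF-8. The procedure name Put_Code_Points and the two sample code points are only illustrative. It also hints at the answer to the "beyond the 255th" question: a UTF-8 string is an ordinary String used as a byte container, and one code point may occupy several of its elements.

```ada
with Ada.Text_IO;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

procedure Put_Code_Points is
   use Ada.Strings.UTF_Encoding;

   --  Two sample code points, chosen only for illustration:
   --  U+00B2 SUPERSCRIPT TWO and U+2075 SUPERSCRIPT FIVE.
   Text : constant Wide_Wide_String :=
     (Wide_Wide_Character'Val (16#00B2#),
      Wide_Wide_Character'Val (16#2075#));

   --  Encode returns a plain String whose elements are UTF-8 bytes.
   --  U+00B2 takes two bytes and U+2075 takes three.
   Encoded : constant UTF_8_String := Wide_Wide_Strings.Encode (Text);
begin
   --  Text_IO passes the bytes through; whether they display as
   --  superscripts depends on the terminal's encoding (an assumption).
   Ada.Text_IO.Put_Line (Encoded);
   Ada.Text_IO.Put_Line (Natural'Image (Encoded'Length) & " bytes");
end Put_Code_Points;
```

With this approach Integer_IO would not need rewriting: one could format the number into a String as before, then map each digit to the desired code point and encode the result.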