From: Mehdi Saada <00120260a@gmail.com>
Newsgroups: comp.lang.ada
Subject: Re: unicode and wide_text_io
Date: Wed, 27 Dec 2017 14:32:50 -0800 (PST)

> Wide Text_IO is UCS-2. Keep on using UTF-8. You probably
> meant output of code points. That is a different beast. Convert a code
> point to UTF-8 string and output that. E.g.

Sure, I'll look at your work, but... Fundamentally, how can a UTF-8 string even represent code points beyond the 255th?? Superscripts and subscripts mean more changes in the IO package.
Before, I could simply use the generic Integer_IO, but I have no clue how to output a specific code point for each digit in a specific base... wouldn't that mean rewriting part of Integer_IO?

I have a very shallow understanding of character encoding and representation, and that's quite an understatement, but you said:
"Ada's Character has Latin-1 encoding which differs from UTF-8 in the code positions greater than 127"
Really?? Are you saying that a position such as Wide_Character'Val(X) doesn't correspond to the Xth character in the Unicode standard?? And I know peanuts about the UCS-2 thing. I'm too ignorant to understand one bit of what you're saying, except that it sounds like heresy to the ears of the Ada Church. Burn them all!!

Ada.Streams permits output of raw bytes without any formatting, right? If so, it might do.
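
For what it's worth, here is a minimal sketch of how I understand the quoted "convert a code point to UTF-8 string and output that" advice, applied to superscript digits. It assumes an Ada 2012 compiler (for Ada.Strings.UTF_Encoding) and a terminal that interprets the output as UTF-8. Superscript_Demo, Digit_Base, Superscript and To_Superscript are hypothetical names made up for the illustration, not part of any library. The point is that the code point values stay the Unicode ones; only their byte encoding uses more than one byte above 127, so Integer_IO itself would not need rewriting.

```ada
with Ada.Text_IO;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

procedure Superscript_Demo is
   use Ada.Strings.UTF_Encoding;

   --  Hypothetical: only bases with plain decimal digits are handled here.
   subtype Digit_Base is Positive range 2 .. 10;

   --  Hypothetical helper: Unicode superscript code point for one digit.
   --  Superscript 1, 2, 3 live in Latin-1 (U+00B9, U+00B2, U+00B3);
   --  the others are U+2070 and U+2074 .. U+2079.
   function Superscript (Digit : Natural) return Wide_Wide_Character is
   begin
      case Digit is
         when 1          => return Wide_Wide_Character'Val (16#00B9#);
         when 2          => return Wide_Wide_Character'Val (16#00B2#);
         when 3          => return Wide_Wide_Character'Val (16#00B3#);
         when 0 | 4 .. 9 => return Wide_Wide_Character'Val (16#2070# + Digit);
         when others     => raise Constraint_Error;
      end case;
   end Superscript;

   --  Hypothetical helper: digits of Value in Base, as superscripts,
   --  encoded as a UTF-8 byte string that plain Text_IO can output.
   function To_Superscript
     (Value : Natural;
      Base  : Digit_Base := 10) return UTF_8_String
   is
      Buffer : Wide_Wide_String (1 .. Integer'Size);  --  enough for base 2
      First  : Natural := Buffer'Last + 1;
      Rest   : Natural := Value;
   begin
      --  Fill the buffer from the right, least significant digit first.
      loop
         First := First - 1;
         Buffer (First) := Superscript (Rest mod Base);
         Rest := Rest / Base;
         exit when Rest = 0;
      end loop;

      --  Each code point above 127 becomes a multi-byte UTF-8 sequence.
      return Wide_Wide_Strings.Encode (Buffer (First .. Buffer'Last));
   end To_Superscript;

begin
   --  "x" followed by superscript 4 and superscript 2
   Ada.Text_IO.Put_Line ("x" & To_Superscript (42));

   --  "2" followed by the binary digits of 5 (1, 0, 1) as superscripts
   Ada.Text_IO.Put_Line ("2" & To_Superscript (5, Base => 2));
end Superscript_Demo;
```

The result of To_Superscript is an ordinary String holding UTF-8 bytes, so its 'Length counts bytes, not characters. Bases above 10 are left out on purpose, since I'm not sure every letter has a superscript form in Unicode.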