From: Simon Wright
Newsgroups: comp.lang.ada
Subject: Re: Community Input for the Maintenance and Revision of the Ada Programming Language
Date: Thu, 31 Aug 2017 16:45:57 +0100

"Dmitry A. Kazakov" writes:

> On 31/08/2017 16:09, Jacob Sparre Andersen wrote:
>> As I see it, there is nothing wrong with reading a sequence of octets
>> containing an UTF-8 encoded string, mapping it to the internal
>> encoding, and *then* parse the text.
>
> UTF-8 *is* the internal encoding. It is the best representation for
> most cases.

But see the thread beginning at [1], and specifically [2], for the
effect of different normalization forms ..

[1] https://groups.google.com/d/msg/comp.lang.ada/ZhDARPQ8deQ/fubEjsggBAAJ
[2] https://groups.google.com/d/msg/comp.lang.ada/ZhDARPQ8deQ/6v-c9SmNAQAJ
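
A rough illustration of the normalization point, as a sketch only: it is not taken from the linked thread, and it uses Python's standard unicodedata module purely for convenience (the character chosen is an arbitrary example). The same visible text can be held as different code point sequences, and therefore as different UTF-8 octet sequences, depending on the normalization form:

```python
import unicodedata

# "é" as a single precomposed code point (U+00E9); this is its NFC form.
composed = "\u00e9"

# The same character in NFD: "e" followed by COMBINING ACUTE ACCENT (U+0301).
decomposed = unicodedata.normalize("NFD", composed)

# The two strings render alike but differ in code points and in UTF-8 octets.
composed_octets = composed.encode("utf-8")      # 2 octets: C3 A9
decomposed_octets = decomposed.encode("utf-8")  # 3 octets: 65 CC 81

print(len(composed), composed_octets.hex())
print(len(decomposed), decomposed_octets.hex())

# A naive comparison treats them as different strings ...
print(composed == decomposed)

# ... unless both sides are brought to the same normalization form first.
print(unicodedata.normalize("NFC", decomposed) == composed)
```

So even with UTF-8 as the internal encoding, comparing or searching text octet by octet only behaves as expected if the normalization form is agreed on beforehand.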