From: Jacob Sparre Andersen
Newsgroups: comp.lang.ada
Subject: Re: Community Input for the Maintenance and Revision of the Ada Programming Language
Date: Thu, 31 Aug 2017 14:49:45 +0200
Organization: JSA Research & Innovation
Message-ID: <87val3aoly.fsf@jacob-sparre.dk>

Dmitry A. Kazakov wrote:

> You need a view of a string as an array of code points / unicode
> characters *and* another view as an array of encoding items,
> e.g. octet for UTF-8 or word for UTF-16 etc.

But the encoding stuff is (mostly) on the outside of the application.
I don't mind having routines for mapping to and from various
encodings, but the encoded types should not have character or string
literals; they should just be arrays of octets with certain
characteristics.

> UTF-16 and UTF-8 strings are equivalent types in the view of code
> point arrays. UCS-2 is a constrained subtype of both. ASCII string is
> a constrained subtype of any.

Yes.

> In the second view ASCII string is a subtype of only UTF-8 string. It
> is an unrelated type to UCS-2 and UTF-16.

Don't worry so much about the encoding view. Push the encoding
troubles to the edge of your application, and work in a consistent
form inside the application.

> You cannot handle this in present Ada.

You can, if you harmonize to a single encoding for the character and
string view, and only see specific encodings as serializations of
(subsets of) the general character and string types.

The place I expect to see trouble is if some source text assumes that
Standard.Character and Interfaces.C.char are the same.

Greetings,

Jacob
-- 
"In space, no-one can press CTRL-ALT-DEL"
                                        -- An Ada programmer
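
The "decode at the edge, work in one form inside" approach argued for above could be sketched roughly as follows. This is only an illustration, written in Python rather than Ada because its str (sequence of code points) and bytes (sequence of octets) types happen to mirror the two views under discussion. The helper names decode_at_edge and encode_at_edge are hypothetical and come from neither the thread nor any library.

```python
# Illustrative sketch only: the helper names below are assumptions.

def decode_at_edge(raw: bytes, encoding: str) -> str:
    """Map encoding items (octets) from outside into a code-point string."""
    return raw.decode(encoding)


def encode_at_edge(text: str, encoding: str) -> bytes:
    """Serialize a code-point string into octets for the outside world."""
    return text.encode(encoding)


# The same text arriving in two different serializations.
from_utf8 = decode_at_edge(b"\xc3\x86ble", "utf-8")
from_utf16 = decode_at_edge(b"\xc6\x00b\x00l\x00e\x00", "utf-16-le")

# Code-point view: inside the application the two are the same string.
same_inside = from_utf8 == from_utf16
code_points = [ord(c) for c in from_utf8]

# Encoding view: an ASCII serialization is also a valid UTF-8
# serialization, but it is unrelated to the UTF-16 one.
ascii_items = encode_at_edge("Ada", "ascii")
utf8_items = encode_at_edge("Ada", "utf-8")
utf16_items = encode_at_edge("Ada", "utf-16-le")
```

Here only the two helpers ever see octets; everything between them compares and indexes code points, so the subtype relations between encodings never leak into the application logic.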