From: "Dmitry A. Kazakov"
Newsgroups: comp.lang.ada
Subject: Re: Community Input for the Maintenance and Revision of the Ada Programming Language
Date: Sat, 19 Aug 2017 10:47:47 +0200
Organization: Aioe.org NNTP Server
References: <79e06550-67d7-45b3-88f8-b7b3980ecb20@googlegroups.com>
 <9d4bc8aa-cc44-4c30-8385-af0d29d49b36@googlegroups.com>
 <1395655516.524005222.638450.laguest-archeia.com@nntp.aioe.org>
 <4527d955-a6fe-4782-beea-e59c3bb69f21@googlegroups.com>
 <22c5d2f4-6b96-4474-936c-024fdbed6ac7@googlegroups.com>

On 2017-08-18 22:33, Robert Eachus wrote:

> There are three decisions about character sets made every time you
> create Ada source files. It is nice to have them all the same, but
> people used to that get their minds wrapped around the axle when it is
> not true. The three cases are:
>
> 1) The character set used to write the Ada parts of the program.
> 2) The character sets the executing computer will use when interpreting characters and string values.

?

Computer does not interpret anything. It executes machine code.
The semantics of that code is beyond the computer and irrelevant, so long
as the compiler generates it correctly.

> 3) The character sets used by the programmer to express character and string constants.

The issue has nothing to do with character sets. It has to do with
encodings of the same set. There is no reason to have anything but
Unicode code points. All others are constrained subsets of it.

> I hope that Ada is (slowly) moving toward UTF-8 as the default--with
> the emphasis on slowly. For example, a standard conversion from UTF-8 to
> Long_Long_Character and back would be very nice.

I suppose. "Conversion" from string to code point is called array indexing:

    Text (Index)

(I don't think one should bother considering things called "character" in
Unicode. Code point is good enough for most purposes.)

> (What about Long_Character? It should be Unicode, and if there is any
> deviation between UTF-16 and Unicode, stick with Unicode.)

Hmm, UTF-16 is an encoding. AFAIK, per Unicode design there cannot be any
deviation from it; that is the "uni" part of the name...

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de