From: "Dmitry A. Kazakov"
Newsgroups: comp.lang.ada
Subject: Re: Community Input for the Maintenance and Revision of the Ada Programming Language
Date: Thu, 31 Aug 2017 16:41:02 +0200

On 31/08/2017 16:09, Jacob Sparre Andersen wrote:
> Dmitry A. Kazakov wrote:
>
>> Not really. E.g. parsing is done in octets for obvious reasons. That
>> was the reason why UTF-8 was designed this way.
>
> What obvious reasons? Performance?

Performance and simplicity.

> As I see it, there is nothing wrong with reading a sequence of octets
> containing a UTF-8 encoded string, mapping it to the internal
> encoding, and *then* parsing the text.

UTF-8 *is* the internal encoding. It is the best representation for
most cases.

>> What is the use of a string type without literals?
>
> The point is that you shouldn't ever treat an encoded string as a
> string. If you need to treat it as a string, you map it to
> Standard.String, and do what you have to do.

There is no such thing as a non-encoded string. String encoding =
string representation, and all objects have a representation. [A
non-encoded string is a string value, not an object.]

>> It is all about having an ability to choose a representation
>> (encoding) rather than getting it enforced upon you by the
>> language.
>
> The whole point is that enforcing a single internal representation
> simplifies things.

Nobody ever uses Wide_Wide_String, which is exactly such a single
representation now.

> Encoding of characters is purely an interfacing/serialization issue.
> It isn't something the programmer should have to worry about when not
> interfacing.

Everything in computing is about, and exists in, an encoding. A
program is encoded semantics. There is nothing else.

>> It is no solution if you simply create yet another type with the
>> required representation, losing the original type's interface and
>> being forced to convert forth and back between the two types all
>> over the place.
>
> Not all over the place. Only where you need to (de)serialize the
> strings.

Table tokens, constants, keys, parameters of subprograms are all in
their corresponding encodings. Most of them are in UTF-8, of course,
and are parsed and compared as octets, as in the sketch below.
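A minimal sketch of what I mean (the names UTF8_Octets_Demo,
Is_Keyword and Next_Colon are made up for illustration; only
Ada.Strings.UTF_Encoding and its UTF_8_String subtype are standard
Ada 2012). Tokens stay in UTF-8, and an ASCII delimiter is found by
plain octet comparison, which is safe precisely because UTF-8 was
designed so that octets below 16#80# never occur inside a multi-octet
sequence:

with Ada.Strings.UTF_Encoding;  use Ada.Strings.UTF_Encoding;
with Ada.Text_IO;

procedure UTF8_Octets_Demo is

   Line : constant UTF_8_String := "key:value";

   --  A token table kept in its encoding, UTF-8 here. UTF_8_String
   --  is a subtype of String, so literals and "=" compare octets
   --  directly, with no conversion anywhere.
   function Is_Keyword (Token : UTF_8_String) return Boolean is
   begin
      return Token = "if" or Token = "then" or Token = "loop";
   end Is_Keyword;

   --  Octet-wise scan for an ASCII delimiter, no decoding needed
   function Next_Colon (Item : UTF_8_String) return Natural is
   begin
      for Index in Item'Range loop
         if Item (Index) = ':' then
            return Index;
         end if;
      end loop;
      return 0;  --  not found
   end Next_Colon;

begin
   Ada.Text_IO.Put_Line ("colon at" & Natural'Image (Next_Colon (Line)));
   Ada.Text_IO.Put_Line ("keyword? " & Boolean'Image (Is_Keyword ("then")));
end UTF8_Octets_Demo;

With a mandated Wide_Wide_String internal form, both functions would
first have to pay for decoding the whole buffer, for nothing.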
>> Many, if not most, applications never care about code points.
>
> They usually do. They just tend to call them "characters".

Yes.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de