From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00
	autolearn=unavailable autolearn_force=no version=3.4.4
Path: 
 eternal-september.org!reader01.eternal-september.org!reader02.eternal-september.org!feeder.eternal-september.org!aioe.org!.POSTED!not-for-mail
From: "J-P. Rosen" <rosen@adalog.fr>
Newsgroups: comp.lang.ada
Subject: Re: Strange crash on custom iterator
Date: Wed, 4 Jul 2018 11:55:06 +0200
Organization: Adalog
Message-ID: <phi5hp$fbv$1@gioia.aioe.org>
References: <70c11a71-3832-4f57-8127-f3f1c48a052f@googlegroups.com>
 <ly1scotsqq.fsf@pushface.org>
 <62e38ee4-f72f-4ed8-bef1-952040fb7f8d@googlegroups.com>
 <lytvpks65b.fsf@pushface.org>
 <64d8b4a1-a92c-4b90-b95c-e821749de969@googlegroups.com>
 <lya7rc9iw0.fsf@pushface.org>
 <887212304.552080112.848502.laguest-archeia.com@nntp.aioe.org>
 <87muvan83x.fsf@adaheads.home> <ly4lhiafs8.fsf@pushface.org>
 <1449870001.552246132.581310.laguest-archeia.com@nntp.aioe.org>
 <lyzhz98lvh.fsf@pushface.org>
 <b0d7482d-3c02-4e0b-8720-58ee5b65af03@googlegroups.com>
 <phg0h7$10dd$1@gioia.aioe.org>
 <c980d621-6d5d-4a23-8005-733bb024285d@googlegroups.com>
 <phg5nk$1a46$1@gioia.aioe.org> <phg6cg$1ba2$1@gioia.aioe.org>
 <bd52280b-662a-49b3-891d-e39044e2bf32@googlegroups.com>
 <phg8lo$1fnq$1@gioia.aioe.org> <phht7f$1vj2$1@gioia.aioe.org>
 <phhuei$1v6$1@gioia.aioe.org>
NNTP-Posting-Host: vtydEJu0RziDZHka7ZZ6bg.user.gioia.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Complaints-To: abuse@aioe.org
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101
 Thunderbird/52.8.0
Content-Language: fr
Openpgp: preference=signencrypt
X-Notice: Filtered by postfilter v. 0.8.3
Xref: reader02.eternal-september.org comp.lang.ada:53568
Date: 2018-07-04T11:55:06+02:00
List-Id: <comp.lang.ada>

On 04/07/2018 at 09:53, Dmitry A. Kazakov wrote:
> On 2018-07-04 09:33, J-P. Rosen wrote:
>> On 03/07/2018 at 18:36, Dmitry A. Kazakov wrote:
>>> E.g. UTF8_String and String must share interfaces but have
>>> different representations.
>> No. UTF_8 is useful only for I/O; as soon as you want to use a UTF
>> string, you need to convert it to a Wide_String.
> 
> I cannot. Wide_String is UCS-2 which is not full Unicode.
For most purposes, Wide_String is sufficient, since it covers the Basic
Multilingual Plane. You need more only if you really have to support
emojis or ancient Chinese. In those cases, decode to Wide_Wide_String,
no problem.
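
To illustrate the decoding step, here is a minimal sketch using the
standard Ada 2012 package Ada.Strings.UTF_Encoding.Wide_Wide_Strings.
The procedure name and the sample octets are only illustrative
assumptions, not anything from the code discussed in this thread.

```ada
with Ada.Strings.UTF_Encoding;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Text_IO;

procedure Decode_Demo is
   use Ada.Strings.UTF_Encoding;

   --  U+20AC (euro sign): three octets in UTF-8, inside the BMP.
   Euro : constant UTF_8_String :=
     Character'Val (16#E2#) & Character'Val (16#82#)
     & Character'Val (16#AC#);

   --  U+1F600 (an emoji): four octets in UTF-8, outside the BMP.
   Grin : constant UTF_8_String :=
     Character'Val (16#F0#) & Character'Val (16#9F#)
     & Character'Val (16#98#) & Character'Val (16#80#);

   --  UTF_8_String is a subtype of String, so this is a plain String.
   Input : constant UTF_8_String := "a" & Euro & Grin;

   --  One decoding pass; afterwards 'Length and indexing are O(1).
   Decoded : constant Wide_Wide_String := Wide_Wide_Strings.Decode (Input);
begin
   Ada.Text_IO.Put_Line ("Octets:     " & Natural'Image (Input'Length));
   Ada.Text_IO.Put_Line ("Code points:" & Natural'Image (Decoded'Length));
   Ada.Text_IO.Put_Line
     ("Last code point:"
      & Natural'Image (Wide_Wide_Character'Pos (Decoded (Decoded'Last))));
end Decode_Demo;
```

The input holds 8 octets but only 3 code points, and the last one
(position 16#1F600#) is the kind of character that would not fit in a
Wide_String.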

> Anyway, whatever conversion of representations needed it must be 
> transparent to the user.
> 
>> Why? Because even the simplest operations (Length, Indexing) are
>> O(N) and are mostly equivalent to decoding the whole string.
> 
> Premature optimization, huh? And you still need a UTF-8 string type
> even if you are going to convert it to something else. Back to
> square one: how to design a UTF-8 string type?
> 
Choosing a representation that allows a more efficient algorithm is
proper design, not premature optimization.
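
To make the cost argument concrete, here is a rough sketch of what
Length has to do on a UTF-8 octet string. Code_Point_Count is a
hypothetical helper written for this example, and it assumes the input
is already valid UTF-8.

```ada
with Ada.Text_IO;

procedure Length_Demo is

   --  Hypothetical helper: counts code points in a UTF-8 octet string
   --  by skipping continuation octets (2#10xxxxxx#, i.e. 16#80#..16#BF#).
   --  Every octet has to be visited, so this is O(N); no validation is
   --  done, the input is assumed to be well-formed UTF-8.
   function Code_Point_Count (Item : String) return Natural is
      Count : Natural := 0;
   begin
      for C of Item loop
         if Character'Pos (C) not in 16#80# .. 16#BF# then
            Count := Count + 1;
         end if;
      end loop;
      return Count;
   end Code_Point_Count;

   --  "a" followed by U+20AC (euro sign, three octets in UTF-8).
   Sample : constant String :=
     "a" & Character'Val (16#E2#) & Character'Val (16#82#)
     & Character'Val (16#AC#);
begin
   --  'Length is O(1) but counts octets, not characters.
   Ada.Text_IO.Put_Line ("Octets:     " & Natural'Image (Sample'Length));
   Ada.Text_IO.Put_Line
     ("Code points:" & Natural'Image (Code_Point_Count (Sample)));
end Length_Demo;
```

Finding the Nth character needs the same kind of scan, whereas on a
Wide_String or Wide_Wide_String both 'Length and indexing are constant
time.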

And the point is that when you receive a string, you don't know before
looking at the BOM (or applying some other recognition technique)
whether the octets you received are pure Latin-1 or UTF_8 encoded. So
you need to store it in a plain String.
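
As a sketch of that workflow: keep the received octets in a plain
String, look for the UTF-8 BOM, and only then choose how to decode.
BOM_8 and the conversion functions come from the standard library;
Has_UTF_8_BOM and To_Text are hypothetical names for this example, and
treating "no BOM" as Latin-1 is a simplifying assumption (real input
may need other recognition techniques).

```ada
with Ada.Characters.Conversions;
with Ada.Strings.UTF_Encoding;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Text_IO;

procedure Bom_Demo is
   use Ada.Strings.UTF_Encoding;

   --  Hypothetical helper: True if Raw starts with the UTF-8 BOM.
   function Has_UTF_8_BOM (Raw : String) return Boolean is
     (Raw'Length >= BOM_8'Length
      and then Raw (Raw'First .. Raw'First + BOM_8'Length - 1) = BOM_8);

   --  Hypothetical helper: decodes Raw as UTF-8 when a BOM is present,
   --  and otherwise (by assumption) treats it as Latin-1.
   function To_Text (Raw : String) return Wide_Wide_String is
   begin
      if Has_UTF_8_BOM (Raw) then
         --  Drop the BOM ourselves, then decode the remaining octets.
         return Wide_Wide_Strings.Decode
           (Raw (Raw'First + BOM_8'Length .. Raw'Last));
      else
         return Ada.Characters.Conversions.To_Wide_Wide_String (Raw);
      end if;
   end To_Text;

   --  U+00E9 (e with acute accent) as one Latin-1 octet ...
   Latin_1_Input : constant String := (1 => Character'Val (16#E9#));

   --  ... and as a BOM followed by its two-octet UTF-8 encoding.
   UTF_8_Input : constant String :=
     BOM_8 & Character'Val (16#C3#) & Character'Val (16#A9#);
begin
   Ada.Text_IO.Put_Line
     ("Same text: " & Boolean'Image
        (To_Text (Latin_1_Input) = To_Text (UTF_8_Input)));
end Bom_Demo;
```

Both inputs are ordinary String values on arrival, which is the reason
a distinct UTF-8 type would force a conversion before you even know
what the octets contain.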

We discussed that point, and the agreement was that making a different
type would force users into many conversions that would bring nothing
but trouble, and make Ada once again look impractical out of excessive
purism.

-- 
J-P. Rosen
Adalog
2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX
Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00
http://www.adalog.fr