comp.lang.ada
From: "Yannick Duchêne (Hibou57)" <yannick_duchene@yahoo.fr>
Subject: Re: Ada 2012 and Unicode package (UTF-nn encodings handling)
Date: Sat, 21 Aug 2010 10:12:11 +0200
Message-ID: <op.vhr3qlivule2fv@garhos>
In-Reply-To: i4ntld$njs$1@news.eternal-september.org

> I still fail to see the benefit of encoding 31 bits values into 32 bits
> values...
UTF-32 is not formally an encoding format; it would be better described as
a matter of byte order. But this byte order is not system-dependent: it
depends on the data, which is exchanged across platforms.

> And even if implementation is not a nightmare, it always has a cost.
> Implementers are reluctant to spend money for features that nobody will
> use. (Wide_Wide_Character was forced on us by ISO).
I suppose ISO forced the introduction of Wide_Wide_Character because
it is part of the Unicode standard and, as you know, conformance requires
full conformance. There is no partial conformance here, because as soon as
something is defined, it may really occur in practice.

Imagine a web crawler: it would have to be designed with this option in
mind. Designers could not say “We do not feel UTF-32 is useful, so our
crawler will not be able to handle such documents”.

I just thought this was a bit of a pity: if one wants to rely on the
capabilities of the standard packages, one will only be able to do so
partially. This would be a bit like having a doubly linked list without
the singly linked one (or the opposite). A matter of completeness.

> A package provides functionnalities. It should not presume how it is
> used. Since this package is clearly in the "string handling" class, it
> makes sense to handle this with strings.
Right, this is defined in *String*_Encoding.

> For files, the usage is to have a BOM on the first line of the file. The
> way the functions are defined makes it easy to not process the first
> line specially; see the use case in the AI.
I just had a look back at
http://www.ada-auth.org/standards/12aarm/html/AA-A-4-11.html
Only Encode has this capability (via Output_BOM : Boolean). Decode/Convert
have nothing similar and will always skip any 16#FEFF#, which will be
interpreted as a BOM rather than as a character (there is nothing like an
Interpret_BOM : Boolean).
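
A minimal sketch of how this could be checked, assuming the package
Ada.Strings.UTF_Encoding.Wide_Wide_Strings with the Encode and Decode
profiles of the published Ada 2012 text (the draft discussed here may
differ). The procedure name BOM_Check and the sample text are only
illustrative. It encodes the same text with and without a BOM, decodes the
BOM-prefixed form, and prints the lengths; the last number shows whether
the leading BOM was consumed or kept as a character.

```ada
with Ada.Text_IO;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

procedure BOM_Check is
   use Ada.Strings.UTF_Encoding;
   package WWS renames Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

   --  Three ordinary characters; the text itself contains no U+FEFF.
   Text : constant Wide_Wide_String := "Ada";

   --  The same text as UTF-8, without and with a leading BOM.
   Plain    : constant UTF_8_String := WWS.Encode (Text);
   With_BOM : constant UTF_8_String :=
     WWS.Encode (Text, Output_BOM => True);

   --  Decode has no parameter saying how a leading BOM is to be treated.
   Decoded : constant Wide_Wide_String := WWS.Decode (With_BOM);
begin
   Ada.Text_IO.Put_Line
     ("Octets without BOM :" & Integer'Image (Plain'Length));
   Ada.Text_IO.Put_Line
     ("Octets with BOM    :" & Integer'Image (With_BOM'Length));
   --  Text'Length is 3; a larger value here would mean that the BOM
   --  was decoded as a character instead of being skipped.
   Ada.Text_IO.Put_Line
     ("Decoded characters :" & Integer'Image (Decoded'Length));
end BOM_Check;
```

This only exercises a BOM at the very start of the string; checking what
happens to a 16#FEFF# in the middle of the text would need a second test
string.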

But maybe I am missing something. I will have a deeper look at it and at
the AI that comes with it (I saw UTF-32 was at least “mentioned” during
the discussion).




Thread overview: 10+ messages
2010-08-20 21:38 Ada 2012 and Unicode package (UTF-nn encodings handling) Yannick Duchêne (Hibou57)
2010-08-20 21:41 ` Yannick Duchêne (Hibou57)
2010-08-21  6:21 ` Dmitry A. Kazakov
2010-08-21  7:01 ` J-P. Rosen
2010-08-21  8:12   ` Yannick Duchêne (Hibou57) [this message]
2010-08-22 18:51     ` J-P. Rosen
2010-08-22 19:48       ` Georg Bauhaus
2010-08-22 20:40         ` J-P. Rosen
2010-08-23 10:32           ` Georg Bauhaus
2010-08-23 22:28 ` Randy Brukardt
