From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.4
X-Google-Thread: 103376,5bcc293dc5642650
X-Google-NewGroupId: yes
X-Google-Attributes: gida07f3367d7,domainid0,public,usenet
X-Google-Language: ENGLISH,ASCII
Received: by 10.68.8.135 with SMTP id r7mr3070958pba.8.1318950151938;
        Tue, 18 Oct 2011 08:02:31 -0700 (PDT)
Path: 
 d5ni27503pbc.0!nntp.google.com!news2.google.com!postnews.google.com!l10g2000pra.googlegroups.com!not-for-mail
From: Adam Beneschan <adam@irvine.com>
Newsgroups: comp.lang.ada
Subject: Re: Why no Ada.Wide_Directories?
Date: Tue, 18 Oct 2011 08:02:31 -0700 (PDT)
Organization: http://groups.google.com
Message-ID: 
 <d831c4d8-3540-44cb-8976-e588e22b4c59@l10g2000pra.googlegroups.com>
References: <9937871.172.1318575525468.JavaMail.geo-discussion-forums@prib32>
 <418b8140-fafb-442f-b91c-e22cc47f8adb@y22g2000pri.googlegroups.com>
 <j7i6va$nso$1@munin.nbi.dk>
 <7156122c-b63f-487e-ad1b-0edcc6694a7a@u10g2000prl.googlegroups.com>
 <ffeeb5d0-5685-42ff-a141-72bea410f239@u10g2000prl.googlegroups.com>
 <409c81ab-bd54-493b-beb4-a0cca99ec306@p27g2000prp.googlegroups.com>
NNTP-Posting-Host: 66.126.103.122
Mime-Version: 1.0
X-Trace: posting.google.com 1318950151 23438 127.0.0.1 (18 Oct 2011 15:02:31
 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: Tue, 18 Oct 2011 15:02:31 +0000 (UTC)
Complaints-To: groups-abuse@google.com
Injection-Info: l10g2000pra.googlegroups.com; posting-host=66.126.103.122;
 posting-account=duW0ogkAAABjRdnxgLGXDfna0Gc6XqmQ
User-Agent: G2/1.0
X-Google-Web-Client: true
X-Google-Header-Order: ARLUEHNKC
X-HTTP-UserAgent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; WOW64;
 Trident/4.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR
 3.5.21022; .NET CLR 3.5.30729; .NET CLR 3.0.30618; .NET4.0C),gzip(gfe)
Xref: news2.google.com comp.lang.ada:14053
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Date: 2011-10-18T08:02:31-07:00
List-Id: <comp.lang.ada>

On Oct 17, 7:32 pm, ytomino <aghi...@gmail.com> wrote:
>
> I'm not confused. Your misreading.

I think we have a terminology problem.  To me, Latin-1 is a set of
characters (a subset of the full Unicode character set).  So I get
confused when people talk about Latin-1 versus UTF-8 strings as if
they were mutually exclusive.  They're not, the way I understand the
terms.  You can have a string composed of Latin-1 characters that's
represented using UTF-8 encoding; and the bits in that string would be
different from a string of the same Latin-1 characters using the
"regular" encoding, if any character in the string is in the 16#80#..
16#FF# range.
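
To make the difference concrete, here is a minimal sketch.  To_UTF_8
and Dump are hypothetical helpers written only for this illustration
(they are not from the Standard), and the sample text is my own
choice.  It prints the bytes of the same four Latin-1 characters in
both representations:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Show_Encodings is

   --  "caf" followed by e-acute, which is Latin-1 position 16#E9#.
   Text : constant String := "caf" & Character'Val (16#E9#);

   --  Hypothetical helper: re-encode Latin-1 characters as UTF-8.
   --  Positions below 16#80# stay one byte; 16#80# .. 16#FF#
   --  become two bytes.
   function To_UTF_8 (S : String) return String is
      Result : String (1 .. 2 * S'Length);
      Last   : Natural := 0;
      Code   : Natural;
   begin
      for I in S'Range loop
         Code := Character'Pos (S (I));
         if Code < 16#80# then
            Last := Last + 1;
            Result (Last) := S (I);
         else
            Result (Last + 1) := Character'Val (16#C0# + Code / 64);
            Result (Last + 2) := Character'Val (16#80# + Code mod 64);
            Last := Last + 2;
         end if;
      end loop;
      return Result (1 .. Last);
   end To_UTF_8;

   --  Hypothetical helper: print each 8-bit element in hexadecimal.
   procedure Dump (Label : String; S : String) is
      Hex : constant String := "0123456789ABCDEF";
      Pos : Natural;
   begin
      Put (Label);
      for I in S'Range loop
         Pos := Character'Pos (S (I));
         Put (' ' & Hex (Pos / 16 + 1) & Hex (Pos mod 16 + 1));
      end loop;
      New_Line;
   end Dump;

begin
   Dump ("one byte per character:", Text);
   Dump ("UTF-8:                 ", To_UTF_8 (Text));
end Show_Encodings;
```

The first line should show 63 61 66 E9 and the second 63 61 66 C3 A9:
the three characters below 16#80# are identical in both, and only the
last one differs.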

However, everyone else seems to be using "Latin-1" to talk about the
*representation* in addition to the subset of characters that's being
represented---in particular, the representation in which each symbol
is represented as one 8-bit byte.  And I guess we don't really have a
good term to describe that representation.  I think UCS-1 is best, but
it doesn't seem to be commonly used.  So I guess I'll have to learn to
live with the misuse of the term "Latin-1" to refer to a
representation (encoding)---just as we older programmers have learned
to live with the terms "Julian Date" and "Gregorian Date" to mean
dates in year/day-of-year form and in year/month/day form despite the
fact that this has nothing to do with the Julian or Gregorian
calendar.  OK, then.  I apologize for assuming that this was a sign of
your misunderstanding.

On the other hand, I was confused by your statement
"Ada.Character.Handling.To_Upper breaks UTF-8".  I don't even see a
way for this to make sense.  Ada.Characters.Handling works on
character types, and a character type is an enumeration type; but a
UTF-8 "character" can't be an enumeration type at all, since it's a
variable-length sequence of 8-bit bytes.  I'm not quite sure what you
meant here.
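
The only reading I can come up with, and this is purely a guess at
what was meant, is that the String version of To_Upper is applied to
a String whose 8-bit elements happen to hold UTF-8.  A minimal sketch
of that case (the euro sign is my own choice of example, and Dump is a
hypothetical helper):

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Characters.Handling;

procedure Upper_On_UTF_8 is

   --  UTF-8 encoding of U+20AC (the euro sign): E2 82 AC.
   Euro : constant String :=
     Character'Val (16#E2#)
     & Character'Val (16#82#)
     & Character'Val (16#AC#);

   --  To_Upper sees three Latin-1 characters here, not one
   --  encoded character.
   Upper : constant String :=
     Ada.Characters.Handling.To_Upper (Euro);

   --  Hypothetical helper: print each 8-bit element in hexadecimal.
   procedure Dump (Label : String; S : String) is
      Hex : constant String := "0123456789ABCDEF";
      Pos : Natural;
   begin
      Put (Label);
      for I in S'Range loop
         Pos := Character'Pos (S (I));
         Put (' ' & Hex (Pos / 16 + 1) & Hex (Pos mod 16 + 1));
      end loop;
      New_Line;
   end Dump;

begin
   Dump ("before:", Euro);
   Dump ("after: ", Upper);
end Upper_On_UTF_8;
```

Here 16#E2# is a lower-case letter in Latin-1 (a-circumflex), so
To_Upper changes it to 16#C2#, and the result C2 82 AC is no longer
well-formed UTF-8: C2 announces a two-byte sequence, which leaves AC
as a stray continuation byte.  That is To_Upper doing its job on
Latin-1 characters, though, not operating on a UTF-8 "character".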

As to having utilities such as versions of Ada.Strings.Unbounded or
Ada.Strings.Fixed that work directly on UTF-8-encoded strings (and
versions of Ada.Characters that operate on single UTF-8-encoded
characters): it's certainly possible to write a package like that, and
anyone is free to do so, but I just don't think they'd be widely used
enough to add to the Standard.  I could be wrong.
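
Just to show the flavor of what such a utility might look like, here
is a rough sketch of one operation.  UTF_8_Length is a hypothetical
name, and the sketch assumes its argument is already well-formed
UTF-8 (it does no validation):

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure UTF_8_Length_Demo is

   --  Bytes in this range continue a multi-byte sequence; every
   --  other byte starts a new encoded character.
   subtype Continuation is Character
     range Character'Val (16#80#) .. Character'Val (16#BF#);

   --  Hypothetical helper: number of encoded characters in a
   --  string assumed to hold well-formed UTF-8.
   function UTF_8_Length (S : String) return Natural is
      Count : Natural := 0;
   begin
      for I in S'Range loop
         if S (I) not in Continuation then
            Count := Count + 1;
         end if;
      end loop;
      return Count;
   end UTF_8_Length;

   --  "caf" plus e-acute encoded as UTF-8 (C3 A9).
   Sample : constant String :=
     "caf" & Character'Val (16#C3#) & Character'Val (16#A9#);

begin
   Put_Line ("bytes:     " & Natural'Image (Sample'Length));
   Put_Line ("characters:" & Natural'Image (UTF_8_Length (Sample)));
end UTF_8_Length_Demo;
```

For this sample the two counts are 5 bytes and 4 characters.  A real
package would also need indexing, slicing, and searching by character
rather than by byte, which is where most of the work would be.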

                            -- Adam