From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Randy Brukardt" <randy@rrsoftware.com>
Newsgroups: comp.lang.ada
Subject: Re: Unicode string comparision functions
Date: Thu, 12 Nov 2015 13:46:20 -0600
Organization: JSA Research & Innovation
Message-ID: <n22qac$6np$1@loke.gir.dk>
References: <00aab01c-7d18-408a-9a4c-feb80ac9a1e1@googlegroups.com>
List-Id: <comp.lang.ada>

"Shark8" <onewingedshark@gmail.com> wrote in message 
news:00aab01c-7d18-408a-9a4c-feb80ac9a1e1@googlegroups.com...
>I thought I had come across a unicode Equals_Case_Insensitive
>(and less than) for unicode using Wide_Wide_Strings some time
>ago, but I cannot seem to find them again; am I misremembering,
>or were they in a really odd place?

Not an odd place, but they have their own subclause (A.4.10).

>For this particular application I would rather use Wide_Wide_String than
> Wide_String so I wouldn't have to worry about invalid character [sequences]
> for the non-ASCII characters. (And, while UTF-8 encoded strings have the
> nice property of being endian agnostic, they still have that property.) --
> But I suppose the main thing is to have a good case insensitive compare
> such that PRUSSIAN and Prußian are considered equal.

Sorry, the language-defined equality won't do that. It uses 
"locale-independent simple case folding", which maps each character to 
exactly one character, so strings of different lengths are always 
different; "ß" can never match "SS". (That's the same case comparison 
that's used for Ada identifiers.)

The much more complex "locale-independent full case folding" is not provided 
by the language; we didn't want to inflict that level of pain on Ada 
implementers (especially as the need was unclear).
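
To make the difference concrete, here is a rough sketch in Python rather 
than Ada, since Python's str.casefold() happens to implement Unicode full 
case folding. The helper names are made up for this illustration, and 
approx_simple_fold is only an approximation of simple case folding (it keeps 
a character unchanged whenever its full folding is longer than one 
character); it is not what any Ada implementation actually does.

```python
def approx_simple_fold(s: str) -> str:
    """Approximate simple case folding: one character in, one character out.

    Assumption: a character whose full folding expands to several
    characters (such as "ß" -> "ss") is left unchanged. Real simple case
    folding has its own mapping table, so this is only a sketch.
    """
    out = []
    for ch in s:
        folded = ch.casefold()
        out.append(folded if len(folded) == 1 else ch)
    return "".join(out)


def equal_simple(a: str, b: str) -> bool:
    # Length-preserving comparison, in the spirit of the language-defined one.
    return approx_simple_fold(a) == approx_simple_fold(b)


def equal_full(a: str, b: str) -> bool:
    # Full case folding may change the length of a string.
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    print(equal_simple("PRUSSIAN", "Prußian"))
    print(equal_full("PRUSSIAN", "Prußian"))
```

With the simple variant the two strings keep their different lengths and 
compare unequal; only the full variant treats them as equal.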

The AARM note A.4.10(3.a/3) gives a bit of background.

                                       Randy.


Thanks.