comp.lang.ada
From: Shark8 <onewingedshark@gmail.com>
Subject: Re: Unicode string comparision functions
Date: Thu, 12 Nov 2015 12:07:07 -0800 (PST)
Message-ID: <fdb68ece-f102-481c-af22-6999d29be7a1@googlegroups.com>
In-Reply-To: <n22qac$6np$1@loke.gir.dk>

On Thursday, November 12, 2015 at 12:46:22 PM UTC-7, Randy Brukardt wrote:
> "Shark8" wrote in message 
> 
> >I thought I had come across a unicode Equals_Case_Insensitive
> >(and less than) for unicode using Wide_Wide_Strings some time
> >ago, but I cannot seem to find them again; am I misremembering,
> >or were they in a really odd place?
> 
> Not an odd place, but they have their own subclause (A.4.10).

Thank you for the ref.

> 
> >For this particular application I would rather use Wide_Wide_String than
> > Wide_String so I wouldn't have to worry about invalid character
> > [sequences] for the non-ASCII characters. (And, while UTF-8 encoded
> > strings have the nice property of being endian agnostic, they still have
> > that property.) -- But I suppose the main thing is to have a good case
> > insensitive compare such that PRUSSIAN and Prußian are considered equal.
> 
> Sorry, the language-defined equality won't do that. It uses 
> "locale-independent simple case folding", which means that strings of 
> different lengths are always different. (That's the same case comparison 
> that's used for Ada identifiers.)
> 
> The much more complex "locale-independent full case folding" is not provided 
> by the language, we didn't want to inflict that level of pain on Ada 
> implementers (especially as the need was unclear).

I can see why, and certainly don't begrudge that decision -- Unicode is, IMO, a terrible 'solution' to the problem of multiple languages.

I thought I read something in the Rationale implying that full case folding was to be used, at least with respect to identifiers in Ada's own source code... and so mistakenly assumed that Equal_Case_Insensitive would do so as well (after all, if the compiler itself requires that functionality, there's little reason not to provide access to it).
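To make the distinction concrete, here is a rough sketch in Python rather than Ada. It is an illustration only, not the language-defined algorithm: `str.casefold()` is Python's built-in full case folding, while `equal_length_preserving` is a hypothetical stand-in I made up to approximate a simple (one code point in, one code point out) comparison. It is not an exact implementation of Unicode simple case folding.

```python
def equal_full_fold(left: str, right: str) -> bool:
    """Compare with full case folding.

    Full folding may change a string's length: 'ß' folds to 'ss'.
    """
    return left.casefold() == right.casefold()


def equal_length_preserving(left: str, right: str) -> bool:
    """Approximate a simple-folding compare (assumption: illustration only).

    Each character is compared with exactly one character of the other
    string, so strings of different lengths can never be equal.
    """
    if len(left) != len(right):
        return False
    return all(
        # Accept identical characters, or characters whose fold is a single
        # code point that matches the other character's fold.
        a == b or (len(a.casefold()) == 1 and a.casefold() == b.casefold())
        for a, b in zip(left, right)
    )
```

With these definitions, `equal_full_fold("PRUSSIAN", "Prußian")` is true, whereas `equal_length_preserving("PRUSSIAN", "Prußian")` is false because the strings differ in length, which matches the behaviour Randy describes for the language-defined comparison.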

> 
> The AARM note A.4.10(3.a/3) gives a bit of background.

I'll have to read that.

Thank you.


