From: "Randy Brukardt"
Newsgroups: comp.lang.ada
Subject: Re: Unicode string comparision functions
Date: Thu, 12 Nov 2015 13:46:20 -0600
Organization: JSA Research & Innovation
References: <00aab01c-7d18-408a-9a4c-feb80ac9a1e1@googlegroups.com>

"Shark8" wrote in message
news:00aab01c-7d18-408a-9a4c-feb80ac9a1e1@googlegroups.com...
> I thought I had come across a unicode Equals_Case_Insensitive
> (and less than) for unicode using Wide_Wide_Strings some time
> ago, but I cannot seem to find them again; am I misremembering,
> or were they in a really odd place?

Not an odd place, but they have their own subclause (A.4.10).

> For this particular application I would rather use Wide_Wide_String than
> Wide_String so I wouldn't have to worry about invalid character
> [sequences] for the non-ASCII characters. (And, while UTF-8 encoded
> strings have the nice property of being endian agnostic, they still have
> that property.)
> -- But I suppose the main thing is to have a good case insensitive
> compare such that PRUSSIAN and Prußian are considered equal.

Sorry, the language-defined equality won't do that. It uses
"locale-independent simple case folding", which maps each character to
exactly one character, so strings of different lengths are always
different. (That's the same case comparison that's used for Ada
identifiers.) The much more complex "locale-independent full case folding"
is not provided by the language; we didn't want to inflict that level of
pain on Ada implementers (especially as the need was unclear). The AARM
note A.4.10(3.a/3) gives a bit of background.

Randy.

Thanks.
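
To make the simple-versus-full distinction above concrete, here is a rough sketch in Python rather than Ada, since Python's built-in str.casefold performs Unicode full case folding. The helper names (simple_fold_approx, equal_simple, equal_full) are made up for this illustration, and simple_fold_approx is only an approximation of Unicode simple case folding: it keeps a per-character mapping only when the result is a single character. It is not the algorithm any Ada implementation uses.

```python
def simple_fold_approx(s: str) -> str:
    """Approximate simple case folding: one character in, one character out.

    Assumption for illustration only: a character is folded when its
    full folding (or, failing that, its lowercase form) is a single
    character; otherwise it is left unchanged. The real mapping is
    defined by the Unicode CaseFolding.txt data file.
    """
    out = []
    for ch in s:
        folded = ch.casefold()
        if len(folded) != 1:
            lowered = ch.lower()
            folded = lowered if len(lowered) == 1 else ch
        out.append(folded)
    return "".join(out)


def equal_simple(left: str, right: str) -> bool:
    """Length-preserving comparison, in the spirit of simple case folding."""
    return simple_fold_approx(left) == simple_fold_approx(right)


def equal_full(left: str, right: str) -> bool:
    """Comparison under full case folding, where one character may expand."""
    return left.casefold() == right.casefold()


if __name__ == "__main__":
    # The sharp s has no single-character folding, so the lengths differ
    # and the simple comparison fails.
    print(equal_simple("PRUSSIAN", "Prußian"))
    # Full folding expands the sharp s to "ss", so the strings match.
    print(equal_full("PRUSSIAN", "Prußian"))
```

With the simple-style comparison, the 8-character and 7-character strings can never match, because folding never changes the length. Only the full folding, which may expand one character into several, treats them as equal.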