comp.lang.ada
From: Georg Bauhaus <rm.dash-bauhaus@futureapps.de>
Subject: Re: The letter Sharp S and the English language
Date: Mon, 25 Mar 2013 20:48:02 +0100
Message-ID: <5150a9f2$0$6567$9b4e6d93@newsspool4.arcor-online.net>
In-Reply-To: <792f8298-4502-40cf-acef-bda706555738@googlegroups.com>

On 25.03.13 16:23, Adam Beneschan wrote:
> On Saturday, March 23, 2013 2:22:26 PM UTC-7, Georg Bauhaus wrote:
>> In case you remember a heated discussions of what ß is,
>> whether it is an S-Z ligature or an S-S, and how to (not)
>> downcase "ACCESS", more evidence comes from Ireland of 1759,
>> in the signature of Arthur Guinneß,
>>
>> http://home.arcor.de/bauhaus/Ada/GUINNESS.jpg
> 
> That pretty clearly looks like two separate letters to me, although the two s's are in different styles.  But it isn't a ligature.  I'm not sure what your point is since I don't remember the original thread very well.

I'm investigating how Unicode-enabled Ada can help me "export"
street names to Switzerland. Thus,

  To_Upper ("Xyz-Straße");  -- String or Wide_String

What interests me is whether this might work in the future, i.e. with
Ada 2012, in the light of recent developments in ISO/IEC 10646:

First, you'd typically not write 'ß' in Switzerland; instead, you'd
replace every occurrence with "ss". That's for both lower case and
upper case (and also when using small caps). So, To_Upper's definition
won't help.
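
To make the Swiss convention concrete, here is a minimal sketch of my
own. Swiss_Upper is a hypothetical helper, not part of any standard
library, and it assumes an Ada 2012 compiler that provides
Ada.Wide_Characters.Handling. It writes "SS" for every 'ß' and leaves
everything else to To_Upper:

```ada
with Ada.Wide_Text_IO;
with Ada.Wide_Characters.Handling;

procedure Swiss_Upper_Demo is

   --  U+00DF, LATIN SMALL LETTER SHARP S
   Sharp_S : constant Wide_Character := Wide_Character'Val (16#DF#);

   --  Hypothetical helper: replace each sharp s by "SS" and
   --  upper-case every other character with the standard To_Upper.
   function Swiss_Upper (Item : Wide_String) return Wide_String is
      --  Worst case: every character is a sharp s and doubles in size.
      Result : Wide_String (1 .. 2 * Item'Length);
      Last   : Natural := 0;
   begin
      for I in Item'Range loop
         if Item (I) = Sharp_S then
            Result (Last + 1 .. Last + 2) := "SS";
            Last := Last + 2;
         else
            Last := Last + 1;
            Result (Last) :=
              Ada.Wide_Characters.Handling.To_Upper (Item (I));
         end if;
      end loop;
      return Result (1 .. Last);
   end Swiss_Upper;

begin
   Ada.Wide_Text_IO.Put_Line
     (Swiss_Upper ("Xyz-Stra" & Sharp_S & "e"));
end Swiss_Upper_Demo;
```

The sharp s is built with 'Val rather than typed as a literal, so the
result does not depend on the source file's encoding.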

But! Since Ada 2005 there have been two new twists. In 2008, ISO/IEC 10646
added an official upper case character for 'ß', U+1E9E. And as of
2010, official spelling (read: government; "amtlich") requires U+1E9E
in geographical names. These include street names.

http://141.74.33.52/stagn/Portals/0/101125_TopR5.pdf

Currently, GNAT's implementation of
 Ada.Wide_Characters.Handling.To_Upper
gives Wide_Character'Val (223) for To_Upper ('ß'), AFAICS.
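
A small probe along these lines is one way to check this on a given
compiler. It is only a sketch: the procedure name is made up, and the
number printed depends on the implementation, so none is promised here.

```ada
with Ada.Text_IO;
with Ada.Wide_Characters.Handling;

procedure Probe_Sharp_S is
   --  U+00DF, LATIN SMALL LETTER SHARP S, i.e. position 223
   Small_Sharp_S : constant Wide_Character := Wide_Character'Val (16#DF#);

   Upper : constant Wide_Character :=
     Ada.Wide_Characters.Handling.To_Upper (Small_Sharp_S);
begin
   --  Print the position of the result: 223 would mean "unchanged",
   --  7838 (16#1E9E#) would mean the capital sharp s.
   Ada.Text_IO.Put_Line
     ("To_Upper (U+00DF) ->" & Integer'Image (Wide_Character'Pos (Upper)));
end Probe_Sharp_S;
```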

Unicode's CaseFolding.txt, if applicable, has two lines pertaining
to the matter,

1E9E; F; 0073 0073; # LATIN CAPITAL LETTER SHARP S
1E9E; S; 00DF; # LATIN CAPITAL LETTER SHARP S

So I'm wondering if Simple Case Mapping might mean that

  To_Upper (Wide_Character'('ß'))

should return Wide_Character'Val (16#1E9E#) in Ada 2012.
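
Note that the two CaseFolding.txt lines quoted above fold U+1E9E down
to "ss" or U+00DF; whether To_Upper should go the other way is exactly
the open question. If one wants U+1E9E regardless of what the library
does, a thin wrapper would do. This is a sketch under that assumption,
and To_Upper_Capital_Sharp_S is a name I made up:

```ada
with Ada.Wide_Characters.Handling;

--  Hypothetical wrapper: force U+00DF to U+1E9E and defer to the
--  standard To_Upper for every other character.
function To_Upper_Capital_Sharp_S
  (Item : Wide_Character) return Wide_Character
is
   Small_Sharp_S   : constant Wide_Character := Wide_Character'Val (16#00DF#);
   Capital_Sharp_S : constant Wide_Character := Wide_Character'Val (16#1E9E#);
begin
   if Item = Small_Sharp_S then
      return Capital_Sharp_S;
   else
      return Ada.Wide_Characters.Handling.To_Upper (Item);
   end if;
end To_Upper_Capital_Sharp_S;
```

U+1E9E fits in Wide_Character, so no Wide_Wide_Character is needed for
this particular mapping.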


('ß' will thus continue to cause problems originating in web-based form
entry fields and elsewhere, I'm almost sure. Just one out of many
experiences: a major fruit company's customer invoices have consistently
shown what looks like junk HTML right after "Stra" in my address for years.)




