Path: bga.com!news.sprintlink.net!howland.reston.ans.net!Germany.EU.net!zib-berlin.de!tfh-berlin.de!cha01!weberwu
From: weberwu@cha01.tfh-berlin.de (Debora Weber-Wulff)
Newsgroups: comp.lang.ada
Subject: Re: icelandic eth thorn
Date: 9 Mar 1995 11:20:32 GMT
Organization: TFH-Berlin (Berlin, Germany)
Message-ID: <3jmoa0$f2k@sun24.tfh-berlin.de>
References: <3iuu5p$15qe@info4.rus.uni-stuttgart.de>
NNTP-Posting-Host: cha01.tfh-berlin.de
X-Newsreader: TIN [version 1.2 PL2]

Peter Hermann (ucaa2385@iris2.csv.ica.uni-stuttgart.de) wrote:
: icelandic question:
: For a simple sorting feature running under both Ada83 and Ada95
: I am mapping the latin-1 character code for
: a" (a with umlaut) (diaeresis) german-ae
: as well as a` a with grave accent, a' a with acute accent,
: a^ a with circumflex, a~ a with tilde, ao a with ring, ae ae diphthong
: to the code of character a.
: Correspondingly I map german_sharp_s to s, or n_tilde to n.
: For my application this is the best fit for my sorting purposes.
: However, I do not know whether the characters
: icelandic_eth and icelandic_thorn are usually sorted into
: the lower set of ascii-characters in a typical icelandic
: telephone book or whether they are characters entirely in their
: own right.

Now that's a "thorn"y problem ;-) as not all dictionaries sort
the same way. Some sort a' and a the same, many sort first the a
and then all the a' letters. The "funny stuff" like thorn and
o-umlaut gets sorted after z.
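
One of the conventions described above can be sketched as a two-level
sort key. This is only an illustration in Python rather than Ada, and
it rests on assumptions not stated in the thread: the names ALPHABET,
RANK and sort_key are made up for the example, accented vowels are
folded onto their base letter with the plain letter winning ties, eth
is placed directly after d, and thorn, ae and o-umlaut follow z in that
order.

```python
import unicodedata

# Assumed ordering: plain Latin letters, eth right after d, and the
# letters said to come after z (thorn, ae, o-umlaut) at the end.
ALPHABET = "abcdðefghijklmnopqrstuvwxyzþæö"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}


def sort_key(word):
    """Return (primary, secondary) lists usable as a key for sorted()."""
    primary, secondary = [], []
    # Compose first so that each accented letter is a single character.
    for ch in unicodedata.normalize("NFC", word.lower()):
        if ch in RANK:
            # A letter with its own place in the assumed alphabet.
            primary.append(RANK[ch])
            secondary.append(0)
        else:
            # Fold an accented letter onto its base letter; the secondary
            # flag makes the plain letter sort first when words tie.
            base = unicodedata.normalize("NFD", ch)[0]
            primary.append(RANK.get(base, len(ALPHABET)))
            secondary.append(1)
    return (primary, secondary)


words = ["þak", "zeta", "öl", "ás", "as", "æra", "bað", "bad"]
print(sorted(words, key=sort_key))
```

A dictionary that treats each accented vowel as a separate letter
would instead list those vowels in ALPHABET itself, directly after
their base letters. Other foldings mentioned in the question, such as
sharp s to s, are left out of this sketch.
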
There's an Icelandic-English dictionary that apparently just did
a "SELECT ALL WORDS FROM A* to Z*" and left out ALL of the thorn
words. Note that eth can never be a first letter (so in practice
it turns up only in lowercase), whereas thorn can be both upper
and lowercase and can be a first letter.

Bless og saelir!

--
Debora Weber-Wulff (Professorin fuer Softwaretechnik und Programmiersprachen)
Technische Fachhochschule Berlin, FB Informatik,
Luxemburger Str. 10, 13353 Berlin, Germany
email: weberwu@tfh-berlin.de