From: Björn Persson
Newsgroups: comp.lang.ada
Subject: Re: wide_string and assertions
References: <47SdnXI-D-3icyLdRVn-uQ@megapath.net> <1224046.crrTJmpIeA@linux1.krischik.com>
Message-ID: <1towc.94710$dP1.304947@newsc.telia.net>
Date: Sat, 05 Jun 2004 18:41:01 GMT

Georg Bauhaus wrote:
> Martin Krischik wrote:
>
> : As I said the functions are based on the Unicode part of XML/Ada. Not point
> : reinventing the Wheel.
>
> I think I haven't actually, as the wheels are slightly different ;-)

I'm inventing wheels too. My wheel is spherical so it rolls in all
directions. ;-)

I just finished a first, limited version of a library that might be of
interest here. I call it EAstrings, for encoding-aware strings. It keeps
track of how each string is encoded and transcodes them automatically
when necessary.
It uses Iconv for transcoding, so it supports all
encodings that Iconv supports (and that's a *lot* on my Fedora box).

My aim is to make it possible to use it almost as a drop-in replacement
for both Unbounded_String and Unbounded_Wide_String. Many operations
aren't implemented yet, but here are some things you can already do:

   EA_1, EA_2 : EAstrings.EAstring;
   --  The bytes below are the UTF-8 encoding of "1 € ≠".
   UTF_8_String : EAstrings.Byte_Sequence :=
     (49, 32, 226, 130, 172, 32, 226, 137, 160);
   Latin_1_String : String := "price: 1 £";
   UCS_2_String : Ada.Strings.Wide_Unbounded.Unbounded_Wide_String;
   ...
   EA_1 := EAstrings.To_EAstring(UTF_8_String, "UTF-8");
   EA_2 := EAstrings.Latin_1.To_EAstring(Latin_1_String);
   EAstrings.Append(EA_1, EAstrings.Tail(EA_2, 4));
   UCS_2_String := EAstrings.UCS_2.To_Unbounded_Wide_String(EA_1);

One drawback is that it only works on Unix (and probably only fairly
modern Unixes). I hope to get it ported to Windows some day.

For those who want to have a look, I have uploaded the files to
http://rombobeorn.webhop.net/eastrings (only temporarily, so don't link
there).

There are also wrappers for Ada.Command_Line.Command_Name and
Ada.Command_Line.Argument that return EAstrings correctly marked with
the encoding that is used in the environment. I have vague plans for
more wrappers for subprograms that take String parameters that may be
interpreted as some other encoding than Latin 1. There's no wrapper for
exceptions yet, but here's a way an exception could be raised with a
message from an EAstring:

   Ada.Exceptions.Raise_Exception
     (The_Error'Identity,
      EAstrings.Byte_Sequence_To_Fake_String
        (EAstrings.Bytes
           (EAstrings.Transcode(EAstring_Message,
                                EAstrings.OS.OS_Encoding))));

This would convert the message to whichever encoding is set in the
environment.

(Transcode can raise exceptions, but a wrapper for Raise_Exception could
of course catch those.)

-- 
Björn Persson

jor ers @sv ge.
b n_p son eri nu