From: "Björn Persson" <spam-away@nowhere.nil>
Subject: Re: wide_string and assertions
Date: Sat, 05 Jun 2004 18:41:01 GMT
Message-ID: <1towc.94710$dP1.304947@newsc.telia.net>
In-Reply-To: <c9sbah$abt$1@a1-hrz.uni-duisburg.de>
Georg Bauhaus wrote:
> Martin Krischik <krischik@users.sourceforge.net> wrote:
>
> : As I said the functions are based on the Unicode part of XML/Ada. Not point
> : reinventing the Wheel.
>
> I think I haven't actually, as the wheels are slightly different ;-)
I'm inventing wheels too. My wheel is spherical so it rolls in all
directions. ;-)
I just finished a first, limited version of a library that might be of
interest here. I call it EAstrings, for encoding-aware strings. It keeps
track of how each string is encoded and transcodes it automatically
when necessary. It uses Iconv for transcoding, so it supports all
encodings that Iconv supports (and that's a *lot* on my Fedora box).
My aim is to make it possible to use it almost as a drop-in replacement
for both Unbounded_String and Unbounded_Wide_String. Many operations
aren't implemented yet, but here are some things you can already do:

   EA_1, EA_2     : EAstrings.EAstring;
   UTF_8_String   : EAstrings.Byte_Sequence :=
     (49, 32, 226, 130, 172, 32, 226, 137, 160);
   Latin_1_String : String := "price: 1 £";
   UCS_2_String   : Ada.Strings.Wide_Unbounded.Unbounded_Wide_String;
   ...
   EA_1 := EAstrings.To_EAstring(UTF_8_String, "UTF-8");
   EA_2 := EAstrings.Latin_1.To_EAstring(Latin_1_String);
   EAstrings.Append(EA_1, EAstrings.Tail(EA_2, 4));
   UCS_2_String := EAstrings.UCS_2.To_Unbounded_Wide_String(EA_1);

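For readers without EAstrings at hand, the same data flow can be approximated with the standard library alone. This is only a sketch under assumptions: it needs an Ada 2012 compiler, it uses Ada.Strings.UTF_Encoding instead of Iconv, and the names Decode_Demo, Byte_Array, Bytes and Encoded are invented for the illustration. It decodes the nine UTF-8 bytes from the example above (they spell "1 € ≠"), appends the last four characters of the Latin 1 string, and prints the code points of the combined wide string.

```ada
with Ada.Characters.Conversions;
with Ada.Strings.Fixed;
with Ada.Strings.UTF_Encoding.Wide_Strings;
with Ada.Text_IO;

procedure Decode_Demo is
   --  Assumption: Ada 2012 standard library only; EAstrings and Iconv
   --  are not involved, and all names here are hypothetical.
   type Byte_Array is array (Positive range <>) of Natural range 0 .. 255;

   --  The same nine bytes as UTF_8_String in the post.
   Bytes : constant Byte_Array :=
     (49, 32, 226, 130, 172, 32, 226, 137, 160);

   Encoded : Ada.Strings.UTF_Encoding.UTF_8_String (Bytes'Range);

   --  The pound sign is written as Character'Val (163) so that the
   --  example does not depend on the encoding of the source file.
   Latin_1_String : constant String := "price: 1 " & Character'Val (163);
begin
   --  Copy the raw bytes into a String so that Decode can read them.
   for I in Bytes'Range loop
      Encoded (I) := Character'Val (Bytes (I));
   end loop;

   declare
      --  Decode the UTF-8 bytes, then append the widened Latin 1 tail.
      Result : constant Wide_String :=
        Ada.Strings.UTF_Encoding.Wide_Strings.Decode (Encoded)
        & Ada.Characters.Conversions.To_Wide_String
            (Ada.Strings.Fixed.Tail (Latin_1_String, 4));
   begin
      --  Print one code point per character of the combined text.
      for C of Result loop
         Ada.Text_IO.Put (Integer'Image (Wide_Character'Pos (C)));
      end loop;
      Ada.Text_IO.New_Line;
   end;
end Decode_Demo;
```

Unlike EAstrings, this handles only the two encodings that are hard-coded into it; the point of the library is that the caller does not have to pick the conversion by hand.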
One drawback is that it only works on Unix (and probably only fairly
modern Unixes). I hope to get it ported to Windows some day.
For those who want to have a look, I have uploaded the files to
http://rombobeorn.webhop.net/eastrings (only temporarily, so don't link
there).
There are also wrappers for Ada.Command_Line.Command_Name and
Ada.Command_Line.Argument that return EAstrings correctly marked with
the encoding that is used in the environment. I have vague plans for
more wrappers for subprograms whose String parameters may be
interpreted in some encoding other than Latin 1. There's no wrapper for
exceptions yet, but here's a way an exception could be raised with a
message from an EAstring:

   Ada.Exceptions.Raise_Exception
     (The_Error'Identity,
      EAstrings.Byte_Sequence_To_Fake_String
        (EAstrings.Bytes
           (EAstrings.Transcode(EAstring_Message,
                                EAstrings.OS.OS_Encoding))));

This would convert the message to whichever encoding is set in the
environment.
(Transcode can raise exceptions, but a wrapper for Raise_Exception could
of course catch those.)
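A wrapper of that kind might look roughly like the sketch below. It is self-contained, so the EAstrings calls are replaced by a stand-in: To_OS_Encoding, Transcoding_Failed, Safe_Message and Raise_With_Message are all hypothetical names, and the stand-in simply pretends that the environment uses Latin 1. The message is converted inside its own function so that a transcoding failure is handled before the real exception is raised.

```ada
with Ada.Characters.Conversions;
with Ada.Exceptions;
with Ada.Text_IO;

procedure Raise_Wrapper_Demo is
   The_Error          : exception;
   Transcoding_Failed : exception;  --  hypothetical, not from EAstrings

   --  Hypothetical stand-in for the Transcode/Bytes/Fake_String chain.
   --  It assumes the environment encoding is Latin 1 and fails on
   --  anything that does not fit.
   function To_OS_Encoding (Message : Wide_String) return String is
   begin
      if not Ada.Characters.Conversions.Is_String (Message) then
         raise Transcoding_Failed;
      end if;
      return Ada.Characters.Conversions.To_String (Message);
   end To_OS_Encoding;

   --  Convert the message, falling back to a fixed text on failure.
   function Safe_Message (Message : Wide_String) return String is
   begin
      return To_OS_Encoding (Message);
   exception
      when Transcoding_Failed =>
         return "(message could not be transcoded)";
   end Safe_Message;

   --  The wrapper: the caller never sees a transcoding exception.
   procedure Raise_With_Message
     (E       : Ada.Exceptions.Exception_Id;
      Message : Wide_String) is
   begin
      Ada.Exceptions.Raise_Exception (E, Safe_Message (Message));
   end Raise_With_Message;
begin
   Raise_With_Message (The_Error'Identity, "disk full");
exception
   when Occurrence : The_Error =>
      Ada.Text_IO.Put_Line (Ada.Exceptions.Exception_Message (Occurrence));
end Raise_Wrapper_Demo;
```

Doing the conversion in a separate function keeps the handler from accidentally catching the exception that the wrapper itself is supposed to raise.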
--
Björn Persson
jor ers @sv ge.
b n_p son eri nu