comp.lang.ada
From: "Björn Persson" <spam-away@nowhere.nil>
Subject: Re: wide_string and assertions
Date: Sat, 05 Jun 2004 18:41:01 GMT
Message-ID: <1towc.94710$dP1.304947@newsc.telia.net>
In-Reply-To: <c9sbah$abt$1@a1-hrz.uni-duisburg.de>

Georg Bauhaus wrote:

> Martin Krischik <krischik@users.sourceforge.net> wrote:
> 
> : As I said the functions are based on the Unicode part of XML/Ada. Not point
> : reinventing the Wheel.
> 
> I think I haven't actually, as the wheels are slightly different ;-)

I'm inventing wheels too. My wheel is spherical so it rolls in all 
directions. ;-)

I just finished a first, limited version of a library that might be of 
interest here. I call it EAstrings, for encoding-aware strings. It keeps 
track of how each string is encoded and transcodes them automatically 
when necessary. It uses Iconv for transcoding, so it supports all 
encodings that Iconv supports (and that's a *lot* on my Fedora box).

My aim is to make it possible to use it almost as a drop-in replacement 
for both Unbounded_String and Unbounded_Wide_String. Many operations 
aren't implemented yet, but here are some things you can already do:

    EA_1, EA_2     : EAstrings.EAstring;
    --  The bytes below are "1 € ≠" encoded in UTF-8.
    UTF_8_String   : EAstrings.Byte_Sequence :=
                       (49, 32, 226, 130, 172, 32, 226, 137, 160);
    Latin_1_String : String := "price: 1 £";
    UCS_2_String   : Ada.Strings.Wide_Unbounded.Unbounded_Wide_String;
    ...
    --  Tag the raw bytes with the encoding they are in.
    EA_1 := EAstrings.To_EAstring(UTF_8_String, "UTF-8");
    EA_2 := EAstrings.Latin_1.To_EAstring(Latin_1_String);
    --  The two strings differ in encoding; transcoding is automatic.
    EAstrings.Append(EA_1, EAstrings.Tail(EA_2, 4));
    UCS_2_String := EAstrings.UCS_2.To_Unbounded_Wide_String(EA_1);
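
For anyone who wants to check what those nine bytes stand for without 
EAstrings, here is a minimal sketch. It assumes an Ada 2012 compiler and 
uses the standard package Ada.Strings.UTF_Encoding.Wide_Strings, which 
is not part of EAstrings. Byte, Byte_Sequence and To_UTF_8 are 
hypothetical local helpers written for this illustration only.

```ada
with Ada.Strings.UTF_Encoding.Wide_Strings;
with Ada.Text_IO;

procedure Decode_Demo is
   package UTF renames Ada.Strings.UTF_Encoding;

   --  Hypothetical local types; not the EAstrings declarations.
   type Byte is mod 2 ** 8;
   type Byte_Sequence is array (Positive range <>) of Byte;

   Bytes : constant Byte_Sequence :=
     (49, 32, 226, 130, 172, 32, 226, 137, 160);

   --  Copy the raw bytes into a String so the standard decoder accepts them.
   function To_UTF_8 (Item : Byte_Sequence) return UTF.UTF_8_String is
      Result : UTF.UTF_8_String (1 .. Item'Length);
   begin
      for I in Item'Range loop
         Result (I - Item'First + 1) := Character'Val (Item (I));
      end loop;
      return Result;
   end To_UTF_8;

   Decoded : constant Wide_String :=
     UTF.Wide_Strings.Decode (To_UTF_8 (Bytes));
begin
   --  Print the number of characters, then each code point in decimal.
   Ada.Text_IO.Put_Line (Integer'Image (Decoded'Length));
   for C of Decoded loop
      Ada.Text_IO.Put_Line (Integer'Image (Wide_Character'Pos (C)));
   end loop;
end Decode_Demo;
```

The nine bytes decode to five characters with code points 49, 32, 8364, 
32 and 8800, that is "1", space, the euro sign, space and the 
not-equal sign.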

One drawback is that it only works on Unix (and probably only fairly 
modern Unixes). I hope to get it ported to Windows some day.

For those who want to have a look, I have uploaded the files to 
http://rombobeorn.webhop.net/eastrings (only temporarily, so don't link 
there).

There are also wrappers for Ada.Command_Line.Command_Name and 
Ada.Command_Line.Argument that return EAstrings correctly marked with 
the encoding that is used in the environment. I have vague plans for 
more wrappers around subprograms whose String parameters may be 
interpreted in an encoding other than Latin-1. There's no wrapper for 
exceptions yet, but here's one way an exception could be raised with a 
message from an EAstring:

    Ada.Exceptions.Raise_Exception
      (The_Error'Identity,
       EAstrings.Byte_Sequence_To_Fake_String
         (EAstrings.Bytes
            (EAstrings.Transcode(EAstring_Message,
                                 EAstrings.OS.OS_Encoding))));

This would convert the message to whichever encoding is set in the 
environment.

(Transcode can raise exceptions, but a wrapper for Raise_Exception could 
of course catch those.)
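
A rough sketch of what such a wrapper might look like, using only the 
standard library so that it compiles without EAstrings. To_Latin_1 is a 
hypothetical stand-in for Transcode, Transcoding_Error is a made-up 
exception name, and falling back to '?' substitution is just one 
possible policy. It assumes Ada 2005 or later for 
Ada.Characters.Conversions.

```ada
with Ada.Characters.Conversions;
with Ada.Exceptions;
with Ada.Text_IO;

procedure Raise_Demo is
   package Conv renames Ada.Characters.Conversions;

   The_Error         : exception;
   Transcoding_Error : exception;  --  hypothetical name

   --  Stand-in for a real transcoder: it succeeds only when every
   --  character of the message exists in Latin-1.
   function To_Latin_1 (Message : Wide_String) return String is
   begin
      if not Conv.Is_String (Message) then
         raise Transcoding_Error;
      end if;
      return Conv.To_String (Message);
   end To_Latin_1;

   --  Raise Id with the transcoded message. If transcoding fails,
   --  raise Id anyway, with unrepresentable characters replaced by '?'.
   procedure Raise_With_Message
     (Id      : Ada.Exceptions.Exception_Id;
      Message : Wide_String) is
   begin
      Ada.Exceptions.Raise_Exception (Id, To_Latin_1 (Message));
   exception
      when Transcoding_Error =>
         Ada.Exceptions.Raise_Exception
           (Id, Conv.To_String (Message, Substitute => '?'));
   end Raise_With_Message;

begin
   --  The euro sign (16#20AC#) is not in Latin-1, so the fallback runs.
   Raise_With_Message
     (The_Error'Identity, "cost: 1 " & Wide_Character'Val (16#20AC#));
exception
   when E : The_Error =>
      Ada.Text_IO.Put_Line (Ada.Exceptions.Exception_Message (E));
end Raise_Demo;
```

The point of the design is that the caller always gets the exception it 
asked for; a transcoding failure only degrades the message text.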

-- 
Björn Persson

jor ers @sv ge.
b n_p son eri nu



