comp.lang.ada
From: Ludovic Brenta <ludovic.brenta@insalien.org>
Subject: Re: UTF-8 in strings - a bug?
Date: 06 May 2004 09:25:53 GMT
Message-ID: <200456-112553-85684@foorum.com>
In-Reply-To: lMmmc.58280$mU6.237078@newsb.telia.net


Björn Persson wrote:
> Recompiling is not a workable solution. The encoding isn't known
> until run time. Software is frequently distributed in precompiled
> form, you know, and the users may use many different encodings. It
> might even be that different users on the same system use different
> encodings. So I guess a transcoding library will have to be wrapped
> around Ada.Command_Line, and probably around
> Ada.Command_Line.Environment and the standard input, output and
> error files too.

You are correct: the encoding depends not only on the operating system
but also on the particular user who runs the software.  You can find
out which encoding is currently in effect by calling setlocale(3) and
then nl_langinfo(3) with the CODESET item.  glibc also has transcoding
facilities, which you can import into your Ada program; the most
powerful and general one is iconv(3).

I am not aware of a thick binding to either the locale functions or
iconv (both are in glibc).  If you write such a binding, it would be
nice to release it under the GMGPL.
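
For reference, here is roughly what the C side of such a binding would
wrap.  This is only a sketch, written in C because that is the interface
an Ada program would import: setlocale(3), nl_langinfo(3) and iconv(3)
are the real library calls, while user_codeset and transcode are
hypothetical helper names of my own.  It also assumes a byte-oriented
target encoding such as UTF-8 and sizes the output buffer generously
instead of growing it on demand.

```c
#include <assert.h>
#include <iconv.h>
#include <langinfo.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: adopt the locale from the user's environment and
   return the name of its character encoding, e.g. "UTF-8". */
const char *user_codeset(void)
{
    setlocale(LC_ALL, "");
    return nl_langinfo(CODESET);
}

/* Hypothetical helper: convert the NUL-terminated string `in` from
   encoding `from` to encoding `to`.  Returns a malloc'ed NUL-terminated
   string that the caller must free, or NULL on any failure. */
char *transcode(const char *from, const char *to, const char *in)
{
    assert(from != NULL && to != NULL && in != NULL);

    iconv_t cd = iconv_open(to, from);      /* note the order: to, from */
    if (cd == (iconv_t)-1)
        return NULL;                        /* unknown encoding name */

    size_t in_left = strlen(in);
    size_t cap = 4 * in_left + 8;           /* assumption: at most 4 bytes per character */
    char *out = malloc(cap);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *src = (char *)in;                 /* iconv does not modify the input bytes */
    char *dst = out;
    size_t out_left = cap - 1;              /* keep one byte for the terminator */

    size_t rc = iconv(cd, &src, &in_left, &dst, &out_left);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &dst, &out_left);   /* flush any shift state */
    iconv_close(cd);

    if (rc == (size_t)-1) {                 /* invalid input or buffer too small */
        free(out);
        return NULL;
    }
    *dst = '\0';
    return out;
}
```

A program would typically call transcode(user_codeset(), "UTF-8", text)
on data coming in (command line, environment, standard input) and the
reverse on data going out.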

In the general case, though, you do not necessarily have to transcode
unless you want to manipulate the string data with algorithms that
depend on the internal encoding.
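
To illustrate what "depends on the internal encoding" means, here is a
small sketch in C.  Copying or concatenating a string works on bytes and
does not care about the encoding, but counting characters does.  The
function name utf8_length is hypothetical, and the code assumes its
input is already valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>   /* strlen, used for comparison in the note below */

/* Hypothetical helper: count the code points in a valid UTF-8 string.
   Every code point has exactly one byte that is not a continuation
   byte (continuation bytes have the bit pattern 10xxxxxx). */
size_t utf8_length(const char *s)
{
    assert(s != NULL);

    size_t count = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For the UTF-8 string "Bj\xc3\xb6rn", strlen reports 6 bytes while
utf8_length reports 5 characters; a program that only passes the string
along never notices the difference.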

Whenever your program interacts with GTK+, it must use UTF-8 as the
internal encoding.  Even if you don't use GTK+, I'd recommend you use
gettext for all user-visible strings and store them in UTF-8 in .po
file(s).  There is a thick binding to Gettext as part of GtkAda, FWIW.
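
As a sketch of the gettext side (again in C, the interface a binding
wraps): bind_textdomain_codeset asks gettext to hand back every
translated message in UTF-8, whatever the user's locale encoding is.
The library calls are real; the function name init_messages and the
domain and directory used in the note below are made-up examples.

```c
#include <assert.h>
#include <libintl.h>
#include <locale.h>

/* Hypothetical helper: select message catalog `domain` under `localedir`
   and request UTF-8 for all translated strings.
   Returns 0 on success, -1 on failure. */
int init_messages(const char *domain, const char *localedir)
{
    assert(domain != NULL && localedir != NULL);

    /* Adopt the user's locale; if it is unsupported, "C" stays in effect. */
    setlocale(LC_ALL, "");

    if (bindtextdomain(domain, localedir) == NULL)
        return -1;
    if (bind_textdomain_codeset(domain, "UTF-8") == NULL)
        return -1;
    if (textdomain(domain) == NULL)
        return -1;
    return 0;
}
```

After init_messages("myapp", "/usr/share/locale"), a call such as
gettext("Hello") returns the translation in UTF-8, or the original
string if no catalog is installed for the user's language.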

So, I would personally depart from the Ada standard in this respect,
and declare that all Strings are in UTF-8, both internally and
externally.  GtkAda does this explicitly with a separate type,
UTF8_String.

-- 
Ludovic Brenta.




