comp.lang.ada
* [GNAT-specific] Using the Form parameter/-gnatW switch
From: deadlyhead @ 2010-06-25  2:08 UTC


I've been messing around a bit with files of various encodings, and
just recently I've become aware of the Form parameter to Open and
Create and the -gnatW switch for handling character encoding.

This is a pretty big deal to me.  For a long time I've been a bit...
frustrated? ... by the fact that the Ada standard specifically gives
us Wide_ and Wide_Wide_Characters and their associated strings, but
actually _using_ them seemed pretty much worthless.  I mean, if you
can't actually _talk_ with them to a modern system (UTF-8 or UTF-16
encoding seems to be pretty much the way it goes), what's the point in
using them?

So I'm pretty happy with using either the WCEM=8 or -gnatW8 method of
setting the encoding to get UTF-8 input and output.  What I'm
wondering now is whether I can get other UTF outputs to work.

I actually have the peculiar case of dealing with UTF-32 encoded
files, which need to be translated to UTF-8 for editing, and back to
UTF-32 for machine use again.  It seems that it would be pretty
straightforward to just pull the file in with plain
Wide_Wide_Text_IO.Open/Get_Line calls, then output via
Wide_Wide_Text_IO.Put on a file opened with Form => "WCEM=8".  So far,
though, I'm having trouble since the encoding for GNAT defaults to
brackets notation, not a raw binary dump of the characters.  Also, if
I want output printed to the terminal in UTF-8, I have to set the
-gnatW8 switch, which means that _now_ the default encoding for all
unspecified files is UTF-8.  Any ideas on how to get around this?
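
For reference, the per-file approach described above might look roughly
like the following. This is only a minimal sketch, assuming GNAT (the
Form string "WCEM=8" is GNAT-specific); the procedure name Write_UTF8
and the file name out.txt are hypothetical. It covers only the UTF-8
output side, not reading UTF-32.

```ada
with Ada.Wide_Wide_Text_IO;

procedure Write_UTF8 is
   package WWIO renames Ada.Wide_Wide_Text_IO;

   Output : WWIO.File_Type;
begin
   --  GNAT-specific: "WCEM=8" selects UTF-8 for this one file only,
   --  independently of the -gnatW switch used at compile time.
   WWIO.Create (File => Output,
                Mode => WWIO.Out_File,
                Name => "out.txt",      --  hypothetical file name
                Form => "WCEM=8");

   --  U+00E9 is built with 'Val so the example does not depend on
   --  the encoding of this source file.
   WWIO.Put_Line (Output, "caf" & Wide_Wide_Character'Val (16#E9#));

   WWIO.Close (Output);
end Write_UTF8;
```

Because the encoding is attached to the file through Form, other files
and standard output keep whatever default the program was built with.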

And, just for giggles, is it _possible_ to use the Upper_Half encoding
"WCEM=u" to encode UTF-16?  Or is that something completely different?
It seems it might be, from the little that's said in the GNAT
Reference Manual.

I'm okay with giving up on this method and using the XML/Ada Unicode
libraries for the text translation.  It'd be nice if I didn't have to,
though.




* Re: [GNAT-specific] Using the Form parameter/-gnatW switch
From: Georg Bauhaus @ 2010-06-25  7:03 UTC


On 6/25/10 4:08 AM, deadlyhead wrote:
> I've been messing around a bit with files of various encodings, and
> just recently I've become aware of the Form parameter to Open and
> Create and the -gnatW switch for handling character encoding.

(Sometimes I think that Ada designers should, as part of their
"engineering awareness", work in a "web shop" for a few
months.  The experience of working with real encoded data
might make them look again at character encoding, but less,
uhm, condescendingly.  Character encoding (or string
encoding) is a representation issue and should be treated at
this level.  ISO 10646 deals with UTF.  A character is
ubiquitously a fundamental piece of data.

In my dream, then, there is enough motivation to make
character encoding a solid part of the language proper
and thus have Ada be the first language that makes character
representation well defined and easy to use!)

> I'm okay with giving up on this method and using the XML/Ada Unicode
> libraries for the text translation.  It'd be nice if I didn't have to,
> though.

Does GNAT 2010 support the Ada 2012 strings encoding package?

http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai05s/ai05-0137-1.txt?rev=1.5&raw=Y
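
The package proposed in that AI was standardized in Ada 2012 as
Ada.Strings.UTF_Encoding, with child packages for each string type.
A minimal sketch of how it can be used, assuming a compiler that
ships the Ada 2012 library (the procedure name Encode_Demo is
hypothetical):

```ada
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Text_IO;

procedure Encode_Demo is
   package UTF renames Ada.Strings.UTF_Encoding;
   package WWS renames Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

   --  "cafe" with an acute accent; U+00E9 is built with 'Val so the
   --  example does not depend on the source file encoding.
   Text : constant Wide_Wide_String :=
     "caf" & Wide_Wide_Character'Val (16#E9#);

   --  The result type selects the overload: UTF-8 in a String,
   --  UTF-16 in a Wide_String.
   As_UTF_8   : constant UTF.UTF_8_String       := WWS.Encode (Text);
   As_UTF_16  : constant UTF.UTF_16_Wide_String := WWS.Encode (Text);

   --  Decoding the UTF-8 form should give back the original text.
   Round_Trip : constant Wide_Wide_String := WWS.Decode (As_UTF_8);
begin
   Ada.Text_IO.Put_Line
     ("UTF-8 code units: " & Natural'Image (As_UTF_8'Length));
   Ada.Text_IO.Put_Line
     ("UTF-16 code units:" & Natural'Image (As_UTF_16'Length));

   if Round_Trip = Text then
      Ada.Text_IO.Put_Line ("Round trip OK");
   end if;
end Encode_Demo;
```

The conversions are done in memory, so the encoded String can be
written with ordinary stream or Text_IO calls without relying on
-gnatW or a Form string.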

Another alternative might be EAstrings (encoding aware strings).
It has an IO child package. It's part of AdaCL at
http://adacl.sourceforge.net/




* Re: Using the Form parameter/-gnatW switch
From: deadlyhead @ 2010-06-26  6:59 UTC


On Jun 25, 12:03 am, Georg Bauhaus
<rm-host.bauh...@maps.futureapps.de> wrote:
> On 6/25/10 4:08 AM, deadlyhead wrote:
>
> > I've been messing around a bit with files of various encodings, and
> > just recently I've become aware of the Form parameter to Open and
> > Create and the -gnatW switch for handling character encoding.
>
> (Sometimes I think that Ada designers should, as part of their
> "engineering awareness", work in a "web shop" for a few
> months.  The experience of working with real encoded data
> might make them look again at character encoding, but less,
> uhm, condescendingly.  Character encoding (or string
> encoding) is a representation issue and should be treated at
> this level.  ISO 10646 deals with UTF.  A character is
> ubiquitously a fundamental piece of data.
>
> In my dream, then, there is enough motivation to make
> character encoding a solid part of the language proper
> and thus have Ada be the first language that makes character
> representation well defined and easy to use!)
>
> > I'm okay with giving up on this method and using the XML/Ada Unicode
> > libraries for the text translation.  It'd be nice if I didn't have to,
> > though.
>
> Does GNAT 2010 support the Ada 2012 strings encoding package?
>
> http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai05s/ai05-0137-1.txt?rev=...
>
> Another alternative might be EAstrings (encoding aware strings).
> It has an IO child package. It's part of AdaCL at
> http://adacl.sourceforge.net/

I've searched around AdaCore's site, googling ath manually, but
haven't found any references to their supporting Strings.Encodings in
GNAT 2010.  Still, it's very encouraging that ARG has proposed making
encodings a standard part of the language.  I don't know of any other
languages that make character encoding part of the standard, though
I'm pretty ignorant when it comes to the vast majority of languages
out there.

Also, I keep forgetting about AdaCL. I've been meaning to try it out
for a long time.  Perhaps this is just the right place to start.

-- deadlyhead



