comp.lang.ada
From: mtrenkmann <martin.trenkmann@googlemail.com>
Subject: Re: Does OpenToken support Unicode
Date: Mon, 23 Jan 2012 14:48:55 -0800 (PST)
Message-ID: <b05bce9a-d9bd-484f-8787-22264c77c6ec@y10g2000vbn.googlegroups.com>
In-Reply-To: 82vcpgf1zl.fsf@stephe-leake.org

Just to close this thread, here is what I have done.

Beginning at the Text_Feeder level, I changed all occurrences of
Character/String variables that are involved in storing parsing data
(buffers, lexemes, etc.) to the Wide_Wide_Character/Wide_Wide_String
types.

Then I provided a derivation of Text_Feeder that reads UTF-8
(multibyte) characters from Ada.Text_IO and decodes them into
Wide_Wide_Characters. The decoding is currently based on
System.WCh_Con (GNAT).
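
For readers who want to try the decoding step in isolation, here is a
minimal sketch. It is not the System.WCh_Con-based code described above;
it assumes an Ada 2012 compiler and uses the standard package
Ada.Strings.UTF_Encoding.Wide_Wide_Strings instead. The procedure name
Decode_Demo and the sample bytes are made up for illustration.

```ada
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Text_IO;

procedure Decode_Demo is
   package UTF renames Ada.Strings.UTF_Encoding;

   --  Sample input (an assumption for this sketch): "a", then U+00E9
   --  and U+20AC written out as raw UTF-8 bytes.
   Raw : constant UTF.UTF_8_String :=
     "a"
     & Character'Val (16#C3#) & Character'Val (16#A9#)
     & Character'Val (16#E2#) & Character'Val (16#82#)
     & Character'Val (16#AC#);

   --  Decode the byte sequence into one Wide_Wide_Character per code point.
   Decoded : constant Wide_Wide_String :=
     UTF.Wide_Wide_Strings.Decode (Raw);
begin
   pragma Assert (Raw'Length = 6);
   pragma Assert (Decoded'Length = 3);
   pragma Assert (Wide_Wide_Character'Pos (Decoded (Decoded'Last)) = 16#20AC#);

   --  Print the code point of every decoded character.
   for C of Decoded loop
      Ada.Text_IO.Put_Line (Integer'Image (Wide_Wide_Character'Pos (C)));
   end loop;
end Decode_Demo;
```

A feeder built this way would read a String with Ada.Text_IO.Get_Line
and pass it through Decode before handing it to the lexer.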

As suggested by Stephe, I also tried to implement a solution that is
generic in the character type, but that wasn't completely possible. For
instance, in the top-level OpenToken package there are constants for
EOL and EOF that are of type Character. Text_Feeder.Text_IO uses
Ada.Text_IO.Get_Line, which is not generic. Furthermore, as far as I
know, Ada exceptions cannot carry Wide_Wide_Strings to report the
lexemes of unexpected tokens ...
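
One possible workaround for the exception issue, sketched here under the
assumption of an Ada 2012 compiler: encode the lexeme as UTF-8 before
raising, since an exception message is a plain String, and decode it
again in the handler. The names Syntax_Error and Exception_Demo are
hypothetical and not part of OpenToken. Note that an implementation may
truncate long exception messages, so this suits short lexemes only.

```ada
with Ada.Exceptions;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Wide_Wide_Text_IO;

procedure Exception_Demo is
   package UTF renames Ada.Strings.UTF_Encoding;

   Syntax_Error : exception;  --  hypothetical, not an OpenToken name

   --  A one-character lexeme outside Latin-1 (U+20AC).
   Lexeme : constant Wide_Wide_String :=
     (1 => Wide_Wide_Character'Val (16#20AC#));
begin
   --  The message must be a String, so carry the lexeme as UTF-8 bytes.
   raise Syntax_Error with UTF.Wide_Wide_Strings.Encode (Lexeme);
exception
   when E : Syntax_Error =>
      declare
         --  Recover the wide lexeme from the UTF-8 encoded message.
         Text : constant Wide_Wide_String :=
           UTF.Wide_Wide_Strings.Decode
             (Ada.Exceptions.Exception_Message (E));
      begin
         pragma Assert (Text = Lexeme);
         Ada.Wide_Wide_Text_IO.Put_Line ("Unexpected token: " & Text);
      end;
end Exception_Demo;
```

How the final Put_Line renders on the terminal depends on the
compiler's wide-character output settings.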

To support constants and non-generic Ada procedures, one has to turn
them into formal parameters of generic OpenToken packages, right?
Maybe this could end in a generic instantiation nightmare. This leads
me to the question of why some packages in Ada are prefixed with
Wide_Wide_ rather than being generic. (Sorry for this question, but I
come from the C++ universe.)
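
To make the formal-parameter idea concrete, here is a rough sketch of
what such a generic could look like. Everything in it (Generic_Feeder,
Fake_Get_Line, the parameter names) is hypothetical and not taken from
OpenToken; it only shows the character type, the string type, the EOL
constant and the line reader all being passed in as generic formals.

```ada
with Ada.Text_IO;

procedure Generic_Feeder_Demo is

   --  Hypothetical feeder: everything character-specific is a formal.
   generic
      type Char_Type is (<>);
      type String_Type is array (Positive range <>) of Char_Type;
      EOL : Char_Type;
      with procedure Get_Line (Item : out String_Type; Last : out Natural);
   package Generic_Feeder is
      --  Reads one line and appends EOL; Buffer must not be empty.
      procedure Get (Buffer : out String_Type; Last : out Natural);
   end Generic_Feeder;

   package body Generic_Feeder is
      procedure Get (Buffer : out String_Type; Last : out Natural) is
      begin
         --  Leave room for the trailing EOL marker.
         Get_Line (Buffer (Buffer'First .. Buffer'Last - 1), Last);
         Last := Last + 1;
         Buffer (Last) := EOL;
      end Get;
   end Generic_Feeder;

   --  Stand-in line reader so the demo needs no input (an assumption).
   procedure Fake_Get_Line (Item : out Wide_Wide_String; Last : out Natural) is
      Line : constant Wide_Wide_String := "ab";
   begin
      Last := Item'First + Line'Length - 1;
      Item (Item'First .. Last) := Line;
   end Fake_Get_Line;

   package Wide_Feeder is new Generic_Feeder
     (Char_Type   => Wide_Wide_Character,
      String_Type => Wide_Wide_String,
      EOL         => Wide_Wide_Character'Val (10),
      Get_Line    => Fake_Get_Line);

   --  The same generic instantiated for plain Character input.
   package Narrow_Feeder is new Generic_Feeder
     (Char_Type   => Character,
      String_Type => String,
      EOL         => Character'Val (10),
      Get_Line    => Ada.Text_IO.Get_Line);

   Buffer : Wide_Wide_String (1 .. 16);
   Last   : Natural;
begin
   Wide_Feeder.Get (Buffer, Last);
   pragma Assert (Last = 3);
   pragma Assert (Buffer (Last) = Wide_Wide_Character'Val (10));
   Ada.Text_IO.Put_Line ("Characters delivered:" & Natural'Image (Last));
end Generic_Feeder_Demo;
```

Even this small feeder needs four actuals per instantiation, which
gives a feeling for how quickly the parameter lists would grow across
a whole library.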

Ok, thanks again for your previous hints. If there is any interest I
will provide the modified OpenToken code with UTF-8 support after
finishing my thesis.

-- Martin




Thread overview: 7+ messages
2011-12-15 14:09 Does OpenToken support Unicode mtrenkmann
2011-12-15 15:16 ` Dmitry A. Kazakov
2011-12-17  0:58 ` Stephen Leake
2012-01-23 22:03   ` mtrenkmann
2012-01-23 22:48   ` mtrenkmann [this message]
2012-01-24 10:40     ` Georg Bauhaus
2012-01-24 13:47     ` Stephen Leake