From: Stephen Leake <stephen_leake@stephe-leake.org>
Subject: Re: Does OpenToken support Unicode
Date: Tue, 24 Jan 2012 08:47:51 -0500
Message-ID: <824nvlfbzs.fsf@stephe-leake.org>
In-Reply-To: b05bce9a-d9bd-484f-8787-22264c77c6ec@y10g2000vbn.googlegroups.com
mtrenkmann <martin.trenkmann@googlemail.com> writes:
> Just for closing this thread, here is what I have done.
Thanks for the update.
> Beginning at the Text_Feeder level I changed all occurrences of
> Character/String variables that are involved in storing parsing data
> (buffers, lexemes, etc) to the Wide_Wide_Character/Wide_Wide_String
> type.
>
> Then I provided a derivation of Text_Feeder that reads UTF-8
> (multibyte) characters from Ada.Text_IO and decodes them into
> Wide_Wide_Characters. The decoding is currently based on
> System.WCh_Con (GNAT).
>
> As mentioned by Stephe I also tried to implement a generic solution
> regarding the character type, but that wasn't completely possible. For
> instance in the top-level OpenToken package there are constants for
> EOL and EOF that are of type Character.
Yes, that's an annoying hack. You could try moving them down lower.
> Text_Feeder.Text_IO uses Ada.Text_IO.Get_Line which is not generic.
You'd have to write a generic wrapper for Ada.Text_IO. That might be
useful in other contexts, but it is a lot of work.
> Furthermore, as far as I know, Ada exceptions cannot carry
> Wide_Wide_Strings to report the lexemes of unexpected tokens ...
True, but they can carry UTF-8.
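A minimal sketch of that idea, assuming an Ada 2012 compiler that
provides Ada.Strings.UTF_Encoding. The exception name and the lexeme
are hypothetical and are not taken from OpenToken:

```ada
with Ada.Exceptions;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Text_IO;

procedure Utf8_Exception_Demo is
   package UTF renames Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

   --  Hypothetical exception, standing in for a parser's syntax error.
   Unexpected_Token : exception;

   --  Hypothetical lexeme with characters outside Latin-1
   --  (Greek small alpha and beta).
   Lexeme : constant Wide_Wide_String :=
     (Wide_Wide_Character'Val (16#03B1#),
      Wide_Wide_Character'Val (16#03B2#));
begin
   --  Exception messages are plain String, so the lexeme is encoded
   --  as UTF-8 before it is attached to the exception.
   raise Unexpected_Token with "unexpected token: " & UTF.Encode (Lexeme);
exception
   when E : Unexpected_Token =>
      declare
         Message : constant String := Ada.Exceptions.Exception_Message (E);
      begin
         --  The handler gets the UTF-8 bytes back; UTF.Decode (Message)
         --  would turn them into a Wide_Wide_String again.
         Ada.Text_IO.Put_Line (Message);
      end;
end Utf8_Exception_Demo;
```

One caveat: an implementation is allowed to truncate exception
messages (to no less than 200 characters), so a long lexeme could be
cut in the middle of a multibyte sequence.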
> To support constants and non-generic Ada procedures one has to turn
> them into formal parameters of generic OpenToken packages, right?
Right.
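Roughly, that could look like the sketch below. Generic_Feeder and its
Get procedure are hypothetical names, not OpenToken's actual packages,
and line feed is only assumed as the EOL marker. The character type,
the string type, the EOL constant and Get_Line all become generic
formals:

```ada
with Ada.Text_IO;
with Ada.Wide_Wide_Text_IO;

procedure Generic_Feeder_Demo is

   --  Hypothetical generic: everything that depends on the character
   --  type is a formal parameter.
   generic
      type Char_Type is (<>);
      type String_Type is array (Positive range <>) of Char_Type;
      EOL : in Char_Type;
      with procedure Get_Line (Item : out String_Type; Last : out Natural);
   package Generic_Feeder is
      --  Reads one line into Buffer and appends EOL.
      --  Buffer must not be empty.
      procedure Get (Buffer : out String_Type; Last : out Natural);
   end Generic_Feeder;

   package body Generic_Feeder is
      procedure Get (Buffer : out String_Type; Last : out Natural) is
      begin
         --  Leave room for the EOL marker in the last slot.
         Get_Line (Buffer (Buffer'First .. Buffer'Last - 1), Last);
         Last := Last + 1;
         Buffer (Last) := EOL;
      end Get;
   end Generic_Feeder;

   --  One instantiation per character type.
   package Narrow_Feeder is new Generic_Feeder
     (Char_Type   => Character,
      String_Type => String,
      EOL         => Character'Val (10),
      Get_Line    => Ada.Text_IO.Get_Line);

   package Wide_Feeder is new Generic_Feeder
     (Char_Type   => Wide_Wide_Character,
      String_Type => Wide_Wide_String,
      EOL         => Wide_Wide_Character'Val (10),
      Get_Line    => Ada.Wide_Wide_Text_IO.Get_Line);

begin
   --  The instantiations are the point of this sketch; a real feeder
   --  would call Narrow_Feeder.Get or Wide_Feeder.Get here.
   null;
end Generic_Feeder_Demo;
```

Every package layered on top of such a feeder needs the same formals
or an instance passed in, which is where the complexity comes from.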
> Maybe this could end in a generic instantiation nightmare.
Well, complicated anyway :).
> This leads me to the question of why some packages in Ada are
> prefixed with Wide_Wide_ rather than being generic. (Sorry for this
> question, but I come from the C++ universe.)
Good point. For example, Ada.Numerics.Generic_Elementary_Functions is
generic, and instantiations are provided for the predefined float types.
There may be a problem with the functions that convert to other string
types, but those could be moved to child packages.
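For comparison, a short sketch of the Elementary_Functions pattern.
The type Meters is a made-up example; the two library packages are
the standard ones:

```ada
with Ada.Numerics.Elementary_Functions;
with Ada.Numerics.Generic_Elementary_Functions;
with Ada.Text_IO;

procedure Elementary_Demo is
   --  Hypothetical user-defined floating point type.
   type Meters is digits 12;

   --  The generic is instantiated for the user-defined type ...
   package Meter_Functions is
     new Ada.Numerics.Generic_Elementary_Functions (Meters);

   X : constant Meters := Meter_Functions.Sqrt (2.0);

   --  ... while the predefined type Float gets a ready-made package
   --  with the same operations.
   Y : constant Float := Ada.Numerics.Elementary_Functions.Sqrt (2.0);
begin
   Ada.Text_IO.Put_Line (Meters'Image (X));
   Ada.Text_IO.Put_Line (Float'Image (Y));
end Elementary_Demo;
```

A string library organized the same way would have one generic
parameterized by the character type, plus predefined instantiations
for Character, Wide_Character and Wide_Wide_Character.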
--
-- Stephe