comp.lang.ada
From: "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de>
Subject: Re: Is there a lex utility for Ada that handles unicode?
Date: Thu, 27 Oct 2005 19:57:37 +0200
Date: 2005-10-27T19:57:26+02:00
Message-ID: <ndfasdmq13aa$.gi0uh66a0di5$.dlg@40tude.net>
In-Reply-To: 1130433435.410224.186300@g49g2000cwa.googlegroups.com

On 27 Oct 2005 10:17:15 -0700, brian.b.mcguinness@lmco.com wrote:

> Is there some equivalent of the lex utility that produces
> Ada code rather than C code, and is capable of handling
> any character in the Unicode basic code plane?  I am
> thinking of using it on strings read from a GUI created
> with GtkAda, so it would probably be best if it accepted
> UTF-8 strings, but I could convert the input to a wide
> string if necessary.

Why do you wish to convert it to wide? You can parse UTF-8 encoded text
as-is. After all, that was the idea behind UTF-8. For example, my unit
compiler parses UTF-8 directly. The advantage is that I can use the same
parser for units spelt both in pure ASCII and in full UTF-8. I simply flag
the UTF-8 tokens in the table if I don't want them recognized. The one
catch is that 8-bit tokens need to be replaced with their two-octet UTF-8
equivalents, but they are rare. BTW, the parser is table-driven, so I don't
need lex.
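
To illustrate the idea, here is a minimal sketch in Python (not Ada, and
not the code of the parser mentioned above). The token table, the names
latin1_token_to_utf8 and match_token, and the ascii_only flag are all
hypothetical, and the 8-bit tokens are assumed to be Latin-1. Matching is
done on raw octets, which is safe because an ASCII octet never occurs
inside a multi-octet UTF-8 sequence.

```python
def latin1_token_to_utf8(token: bytes) -> bytes:
    """Re-encode an 8-bit token (assumed to be Latin-1) as UTF-8."""
    return token.decode("latin-1").encode("utf-8")


# Hypothetical token table keyed by the token's UTF-8 octets.
TOKENS = {
    b"m": "metre",
    b"um": "micrometre",                             # pure ASCII spelling
    latin1_token_to_utf8(b"\xb5m"): "micrometre",    # micro sign + "m": C2 B5 6D
    latin1_token_to_utf8(b"\xb0"): "degree",         # degree sign: C2 B0
}


def match_token(source: bytes, pos: int, ascii_only: bool = False):
    """Return (name, next_pos) for the longest table token at pos, else None.

    The input is never decoded; tokens are compared octet by octet.
    With ascii_only=True, tokens containing non-ASCII octets are skipped,
    so the same table serves both pure ASCII and full UTF-8 input.
    """
    best = None
    for token, name in TOKENS.items():
        if ascii_only and any(octet > 0x7F for octet in token):
            continue  # flagged: not recognized in ASCII-only mode
        if source.startswith(token, pos):
            end = pos + len(token)
            if best is None or end > best[1]:
                best = (name, end)
    return best
```

Calling match_token on the UTF-8 octets of a string such as "5 µm" at
offset 2 finds the three-octet micrometre token, while the same call with
ascii_only=True finds nothing there. A real table-driven parser would use
a sorted table or a trie instead of the linear scan shown here.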

For UTF-8 handling in Ada you can take a look at:
http://www.dmitry-kazakov.de/ada/strings_edit.htm

It, along with table-driven parsers in Ada, is included in the components:
http://www.dmitry-kazakov.de/ada/components.htm

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


