comp.lang.ada
* Is there a lex utility for Ada that handles unicode?
@ 2005-10-27 17:17 brian.b.mcguinness
  2005-10-27 17:33 ` Martin Dowie
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: brian.b.mcguinness @ 2005-10-27 17:17 UTC


Is there some equivalent of the lex utility that produces
Ada code rather than C code, and is capable of handling
any character in the Unicode basic code plane?  I am
thinking of using it on strings read from a GUI created
with GtkAda, so it would probably be best if it accepted
UTF-8 strings, but I could convert the input to a wide
string if necessary.

Thanks.

--- Brian





* Re: Is there a lex utility for Ada that handles unicode?
  2005-10-27 17:17 Is there a lex utility for Ada that handles unicode? brian.b.mcguinness
@ 2005-10-27 17:33 ` Martin Dowie
  2005-10-27 18:33   ` Frank J. Lhota
  2005-10-27 17:57 ` Dmitry A. Kazakov
  2005-10-28  1:49 ` Steve
  2 siblings, 1 reply; 5+ messages in thread
From: Martin Dowie @ 2005-10-27 17:33 UTC


brian.b.mcguinness@lmco.com wrote:
> Is there some equivalent of the lex utility that produces
> Ada code rather than C code, and is capable of handling
> any character in the Unicode basic code plane?  I am
> thinking of using it on strings read from a GUI created
> with GtkAda, so it would probably be best if it accepted
> UTF-8 strings, but I could convert the input to a wide
> string if necessary.

Aflex and Ayacc:
http://www.iste.uni-stuttgart.de/ps/ada-software/html/tools.html

<Warning>
I've never even looked at them! :-)
</Warning>

Cheers

-- Martin




* Re: Is there a lex utility for Ada that handles unicode?
  2005-10-27 17:17 Is there a lex utility for Ada that handles unicode? brian.b.mcguinness
  2005-10-27 17:33 ` Martin Dowie
@ 2005-10-27 17:57 ` Dmitry A. Kazakov
  2005-10-28  1:49 ` Steve
  2 siblings, 0 replies; 5+ messages in thread
From: Dmitry A. Kazakov @ 2005-10-27 17:57 UTC


On 27 Oct 2005 10:17:15 -0700, brian.b.mcguinness@lmco.com wrote:

> Is there some equivalent of the lex utility that produces
> Ada code rather than C code, and is capable of handling
> any character in the Unicode basic code plane?  I am
> thinking of using it on strings read from a GUI created
> with GtkAda, so it would probably be best if it accepted
> UTF-8 strings, but I could convert the input to a wide
> string if necessary.

Why do you wish to convert it to wide? You can parse UTF-8 encoded text
as-is. After all, that was the idea behind UTF-8. For example, my unit
compiler parses UTF-8 directly. The advantage is that I can use the same
parser for units spelt both in pure ASCII and in full UTF-8. I simply flag
the UTF-8 tokens in the table if I don't want to recognize them. The one
trick is that 8-bit tokens need to be replaced with their two-byte UTF-8
equivalents, but those are rare. BTW, the parser is table-driven, so I don't
need lex.
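
A minimal sketch of that idea, in Python rather than Ada and not taken from
the unit compiler itself: the token table below is hypothetical, its keys are
UTF-8 byte strings, and matching is done on raw bytes with no decoding step.
It also shows the 8-bit trick: the micro sign is the single byte 0xB5 in
Latin-1 but the two bytes 0xC2 0xB5 in UTF-8, so the table stores the latter.

```python
# Hypothetical token table: keys are UTF-8 spellings, values are token names.
# "\u00b5" is the micro sign, "\u03a9" is Greek capital omega.
TOKENS = {
    "m".encode("utf-8"): "METRE",
    "mm".encode("utf-8"): "MILLIMETRE",
    "\u00b5m".encode("utf-8"): "MICROMETRE",  # bytes C2 B5 6D
    "\u03a9".encode("utf-8"): "OHM",          # bytes CE A9
    "Ohm".encode("utf-8"): "OHM",
}


def match_token(data, pos, allow_non_ascii=True):
    """Return (token_name, next_pos) for the longest table entry found
    at byte offset pos, or None if no entry matches there."""
    best = None
    for spelling, name in TOKENS.items():
        # "Flag" non-ASCII spellings: skip them when only ASCII is wanted.
        if not allow_non_ascii and any(b >= 0x80 for b in spelling):
            continue
        if data.startswith(spelling, pos):
            if best is None or len(spelling) > len(best[0]):
                best = (spelling, name)
    if best is None:
        return None
    return best[1], pos + len(best[0])


def tokenize(data, allow_non_ascii=True):
    """Split space-separated UTF-8 bytes into a list of token names."""
    pos, names = 0, []
    while pos < len(data):
        if data[pos:pos + 1] == b" ":
            pos += 1
            continue
        hit = match_token(data, pos, allow_non_ascii)
        if hit is None:
            raise ValueError("no token at byte offset %d" % pos)
        name, pos = hit
        names.append(name)
    return names
```

Calling tokenize("mm \u00b5m \u03a9".encode("utf-8")) walks the bytes once;
the same call with allow_non_ascii=False rejects the micro sign because only
the ASCII spellings remain visible in the table.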

For UTF-8 handling in Ada you can take a look at:
http://www.dmitry-kazakov.de/ada/strings_edit.htm

Both it and the table-driven parsers for Ada are included in the components:
http://www.dmitry-kazakov.de/ada/components.htm

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de




* Re: Is there a lex utility for Ada that handles unicode?
  2005-10-27 17:33 ` Martin Dowie
@ 2005-10-27 18:33   ` Frank J. Lhota
  0 siblings, 0 replies; 5+ messages in thread
From: Frank J. Lhota @ 2005-10-27 18:33 UTC


Martin Dowie wrote:
> 
> 
> Aflex and Ayacc:
> http://www.iste.uni-stuttgart.de/ps/ada-software/html/tools.html
> 
> <Warning>
> I've never even looked at them! :-)
> </Warning>
> 
> Cheers

I've used Aflex and Ayacc. They are rather close to Lex / Yacc. Lex and
Yacc, however, are designed to work with 8-bit characters, and Aflex /
Ayacc inherit that design. Parsing Unicode sources requires a
different set of tools.
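
One way to read "a different set of tools" is a scanner that classifies code
points instead of 8-bit characters, which is also the wide-string route from
the original question. The following is only a sketch in Python, not Ada, and
the function names are made up for illustration: it decodes UTF-8 once,
rejects anything outside the Basic Multilingual Plane, and then classifies
each code point with the standard unicodedata module.

```python
import unicodedata


def to_bmp_code_points(raw):
    """Decode UTF-8 bytes into a list of code points, all within the BMP.

    Malformed UTF-8 raises UnicodeDecodeError; a code point above U+FFFF
    raises ValueError, since a 16-bit wide character cannot hold it.
    """
    points = [ord(ch) for ch in raw.decode("utf-8")]
    for cp in points:
        if cp > 0xFFFF:
            raise ValueError("U+%04X is outside the BMP" % cp)
    return points


def is_letter(cp):
    """True if the code point is in a Unicode letter category (Lu, Ll, ...)."""
    return unicodedata.category(chr(cp)).startswith("L")
```

A lexer built this way needs character classes that span 65536 values rather
than 256, which is the part a generator designed around 8-bit tables does not
provide.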

> -- Martin


-- 
"All things extant in this world,
Gods of Heaven, gods of Earth,
Let everything be as it should be;
Thus shall it be!"
- Magical chant from "Magical Shopping Arcade Abenobashi"

"Drizzle, Drazzle, Drozzle, Drome,
Time for this one to come home!"
- Mr. Lizard from "Tutor Turtle"




* Re: Is there a lex utility for Ada that handles unicode?
  2005-10-27 17:17 Is there a lex utility for Ada that handles unicode? brian.b.mcguinness
  2005-10-27 17:33 ` Martin Dowie
  2005-10-27 17:57 ` Dmitry A. Kazakov
@ 2005-10-28  1:49 ` Steve
  2 siblings, 0 replies; 5+ messages in thread
From: Steve @ 2005-10-28  1:49 UTC


<brian.b.mcguinness@lmco.com> wrote in message 
news:1130433435.410224.186300@g49g2000cwa.googlegroups.com...
> Is there some equivalent of the lex utility that produces
> Ada code rather than C code, and is capable of handling
> any character in the Unicode basic code plane?  I am
> thinking of using it on strings read from a GUI created
> with GtkAda, so it would probably be best if it accepted
> UTF-8 strings, but I could convert the input to a wide
> string if necessary.
>
> Thanks.
>
> --- Brian
>
I don't know if it helps, but if I were looking (based on what I have seen 
in the past) I would look at:
  Aflex
  Ayacc
  OpenToken

Source code is available.  It probably wouldn't be that much work to make a
Unicode version.

Steve
(The Duck) 





