comp.lang.ada
* Regular expressions???
@ 2001-06-29  9:23 Michael Andersson
  2001-06-29 10:05 ` David C. Hoos, Sr.
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Michael Andersson @ 2001-06-29  9:23 UTC (permalink / raw)


Hi!
I'm trying to write a simple XML parser and I wonder how regular
expressions in Ada work. Can I use variables as in Perl, so that they
are assigned values according to what is matched in the string? Say,
for example, that I have a file looking like this:
<Name="Michael" Phone="7980438" Age="22"/>
I want to extract Michael, 7980438 and 22 from the string above and
assign these to some variables. Is it possible to use regular
expressions in Ada to do this, or do I have to use procedures/functions
from the Ada.Strings package?

Thanks!
/Michael Andersson




* Re: Regular expressions???
  2001-06-29  9:23 Regular expressions??? Michael Andersson
@ 2001-06-29 10:05 ` David C. Hoos, Sr.
  2001-06-29 16:13 ` Ray Blaak
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: David C. Hoos, Sr. @ 2001-06-29 10:05 UTC (permalink / raw)
  To: comp.lang.ada; +Cc: michael

There is nothing defined in the language for regular expressions.

However, if you are using GNAT, you can use the GNAT.Regexp
package to do what you want.






* Re: Regular expressions???
  2001-06-29  9:23 Regular expressions??? Michael Andersson
  2001-06-29 10:05 ` David C. Hoos, Sr.
@ 2001-06-29 16:13 ` Ray Blaak
  2001-07-02  9:54   ` M. A. Alves
  2001-06-30 14:49 ` Florian Weimer
  2001-06-30 20:38 ` R. Srinivasan
  3 siblings, 1 reply; 8+ messages in thread
From: Ray Blaak @ 2001-06-29 16:13 UTC (permalink / raw)


Michael Andersson <michael@ida.his.se> writes:


> I'm trying to write a simple XML parser and I wonder how regular
> expressions in Ada work. Can I use variables as in Perl, so that they
> are assigned values according to what is matched in the string? Say,
> for example, that I have a file looking like this:
> <Name="Michael" Phone="7980438" Age="22"/>
> I want to extract Michael, 7980438 and 22 from the string above and
> assign these to some variables. Is it possible to use regular
> expressions in Ada to do this, or do I have to use procedures/functions
> from the Ada.Strings package?

For this problem, it is in fact easier and faster to parse the XML properly to
extract the relevant bits. That is, make a pseudo-state machine that looks for
<, >, =, ", />, etc, performing callbacks or inserting into an XML tree
structure when the relevant bits are recognized.
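
As a rough illustration of the state-machine idea, here is a minimal sketch in
plain Ada. The names (Scan_Attributes, State, Current) and the sample input
are invented for this example, and it assumes double-quoted attribute values
with no entities, comments or nested elements. The Put_Line call marks the
spot where a callback or a tree insertion would go.

```ada
with Ada.Text_IO;             use Ada.Text_IO;
with Ada.Characters.Handling; use Ada.Characters.Handling;
with Ada.Characters.Latin_1;

procedure Scan_Attributes is
   LF : constant Character := Ada.Characters.Latin_1.LF;

   --  Hypothetical input: attributes spread over several lines, in an
   --  order different from the original example.
   Input : constant String :=
     "<Person" & LF & "   Age=""22""" & LF & "   Name=""Michael"" />";

   type State is (Skip_Space, In_Name, Expect_Quote, In_Value);

   Current     : State    := Skip_Space;
   Name_First  : Positive := Input'First;
   Name_Last   : Natural  := 0;
   Value_First : Positive := Input'First;
begin
   for I in Input'Range loop
      case Current is
         when Skip_Space =>
            --  A letter starts a name (element name or attribute name).
            if Is_Letter (Input (I)) then
               Name_First := I;
               Current    := In_Name;
            end if;

         when In_Name =>
            if Input (I) = '=' then
               Name_Last := I - 1;
               Current   := Expect_Quote;
            elsif not Is_Alphanumeric (Input (I)) then
               --  No '=' followed, so this was the element name: skip it.
               Current := Skip_Space;
            end if;

         when Expect_Quote =>
            if Input (I) = '"' then
               Value_First := I + 1;
               Current     := In_Value;
            end if;

         when In_Value =>
            if Input (I) = '"' then
               --  A complete attribute has been recognized.
               Put_Line (Input (Name_First .. Name_Last) & " => "
                         & Input (Value_First .. I - 1));
               Current := Skip_Space;
            end if;
      end case;
   end loop;
end Scan_Attributes;
```

Because whitespace (including line breaks) is simply skipped between
attributes, the layout and the order of the attributes do not matter here.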

If you are going to hunt around for a RE package in Ada, you can also hunt
around for an XML package. 

Otherwise, writing an XML parser is a lot easier than writing a RE processor.

Regular expressions are often misused where a real parser would be more
appropriate. Consider an RE that can extract bits from this line:

  <Person Name="Michael" Phone="7980438" Age="22"/>

Fine. Now, would that RE work for this:

  <Person
     Name="Michael" 
     Phone="7980438"
     Age="22"/>

or this:

  <Person Age="22" Phone="7980438" Name="Michael" />

both of which are semantically equivalent to the first?

REs work best with line-oriented, fixed-format data. Free-form data tends to
break REs.
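
To make this concrete, here is a small sketch that assumes a GNAT compiler and
its GNAT.Regpat package (Perl-like patterns). The pattern and the names
Fixed_Order and Try are made up for the illustration. The pattern hard-codes
the attribute order Name, Phone, Age, each separated by a single space.

```ada
with Ada.Text_IO; use Ada.Text_IO;
with GNAT.Regpat; use GNAT.Regpat;

procedure Fixed_Order is
   --  Illustrative pattern: expects Name, Phone, Age in exactly this order.
   Matcher : constant Pattern_Matcher :=
     Compile ("Name=""([^""]*)"" Phone=""([^""]*)"" Age=""([^""]*)""");

   procedure Try (Data : String) is
      --  Index 0 is the whole match, 1 .. 3 are the parenthesized groups.
      Matches : Match_Array (0 .. 3);
   begin
      Match (Matcher, Data, Matches);
      if Matches (0) = No_Match then
         Put_Line ("no match");
      else
         Put_Line ("Name = " & Data (Matches (1).First .. Matches (1).Last));
      end if;
   end Try;
begin
   --  Same data, two different attribute orders.
   Try ("<Person Name=""Michael"" Phone=""7980438"" Age=""22""/>");
   Try ("<Person Age=""22"" Phone=""7980438"" Name=""Michael"" />");
end Fixed_Order;
```

The first call finds the name; the second reports no match, although both
lines describe the same element. A pattern can be made more tolerant, but each
new variation (line breaks, extra attributes, single quotes) needs another
special case.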

-- 
Cheers,                                        The Rhythm is around me,
                                               The Rhythm has control.
Ray Blaak                                      The Rhythm is inside me,
blaak@infomatch.com                            The Rhythm has my soul.




* Re: Regular expressions???
  2001-06-29  9:23 Regular expressions??? Michael Andersson
  2001-06-29 10:05 ` David C. Hoos, Sr.
  2001-06-29 16:13 ` Ray Blaak
@ 2001-06-30 14:49 ` Florian Weimer
  2001-06-30 20:38 ` R. Srinivasan
  3 siblings, 0 replies; 8+ messages in thread
From: Florian Weimer @ 2001-06-30 14:49 UTC (permalink / raw)


Michael Andersson <michael@ida.his.se> writes:

> I'm trying to write a simple XML parser and I wonder how regular
> expressions in Ada work.

You cannot parse XML in general with regular expressions: elements can
nest to arbitrary depth, and regular expressions are not powerful enough
to describe that kind of structure.
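
One way to see the limitation: deciding whether tags are balanced needs a
counter (or a stack of names) that can grow without bound, which a classical
regular expression has no way to express. The sketch below is plain Ada; the
name Nesting_Depth and the sample input are invented, and it assumes input
that contains only simple tags (no attributes containing '<' or '>', no
comments, no CDATA).

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Nesting_Depth is
   --  Hypothetical input with three levels of nesting.
   Input : constant String := "<a><b><c/></b><b/></a>";

   Depth     : Integer := 0;
   Max_Depth : Integer := 0;
begin
   for I in Input'Range loop
      if Input (I) = '<' then
         if I < Input'Last and then Input (I + 1) = '/' then
            --  Closing tag such as </b>.
            Depth := Depth - 1;
         else
            --  Opening tag <b> or self-closing tag <b/>.
            Depth     := Depth + 1;
            Max_Depth := Integer'Max (Max_Depth, Depth);
         end if;
      elsif Input (I) = '>'
        and then I > Input'First
        and then Input (I - 1) = '/'
      then
         --  End of a self-closing tag: it opens and closes at once.
         Depth := Depth - 1;
      end if;
   end loop;

   Put_Line ("Maximum nesting depth:" & Integer'Image (Max_Depth));
   Put_Line ("Balanced: " & Boolean'Image (Depth = 0));
end Nesting_Depth;
```

A real parser would also check that each closing tag names the element it
closes, which takes a stack rather than a single counter.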




* Re: Regular expressions???
  2001-06-29  9:23 Regular expressions??? Michael Andersson
                   ` (2 preceding siblings ...)
  2001-06-30 14:49 ` Florian Weimer
@ 2001-06-30 20:38 ` R. Srinivasan
  3 siblings, 0 replies; 8+ messages in thread
From: R. Srinivasan @ 2001-06-30 20:38 UTC (permalink / raw)


You may want to take a look at the "expat" library and the binding I
developed for it. The expat library is available from SourceForge. The
binding is available from http://alibrowse.sourceforge.net
(look for adabind-expat.zip).







* Re: Regular expressions???
  2001-06-29 16:13 ` Ray Blaak
@ 2001-07-02  9:54   ` M. A. Alves
  2001-07-03  8:25     ` Emmanuel Briot
  0 siblings, 1 reply; 8+ messages in thread
From: M. A. Alves @ 2001-07-02  9:54 UTC (permalink / raw)
  To: comp.lang.ada

> If you are going to hunt around for a RE package in Ada, you can also hunt
> around for an XML package. 

Hunt no more:

  lexis.di.fct.unl.pt/ADaLIB/xml.htm

Also note that GNAT comes with an 'alternative' pattern matching facility
that is more powerful than REs: GNAT.Spitbol.

-- 
   ,
 M A R I O   data miner, LIACC, room 221   tel 351+226078830, ext 121
 A M A D O   Rua Campo Alegre, 823         fax 351+226003654
 A L V E S   P-4150 PORTO, Portugal        mob 351+939354002





* Re: Regular expressions???
  2001-07-02  9:54   ` M. A. Alves
@ 2001-07-03  8:25     ` Emmanuel Briot
  2001-07-04  4:53       ` Ray Blaak
  0 siblings, 1 reply; 8+ messages in thread
From: Emmanuel Briot @ 2001-07-03  8:25 UTC (permalink / raw)


"M. A. Alves" <maa@liacc.up.pt> writes:
> > If you are going to hunt around for a RE package in Ada, you can also hunt
> > around for an XML package.


You could have a look at
    http://libre.act-europe.fr.

We have made one public release of a beta XML parser that accepts all
constructions in XML. It has support for SAX and DOM as well (i.e. you can
either use callbacks to parse the XML stream or build a tree and then
manipulate it).

It was noted earlier that an XML parser is much easier to program than a RE
parser. Obviously, this is true. However, writing a parser from scratch is
actually not so simple, since the XML standard is pretty extensive. Support for
entities (&name;), DTDs, ... was kind of a time-consuming task :-)

Feel free to contribute to this package if you think there are some things
missing. I will release a new version, hopefully soon, with a much improved
parser (twice as fast) and better support for DTDs (as well as conditional
sections in DTDs, which were not supported in that first release).



> Also note that GNAT comes with an 'alternative' pattern matching facility
> that is more powerful than REs: GNAT.Spitbol.

There are three regular expression (or similar) packages in GNAT:

  GNAT.Regexp: simple regular expressions (you can't get the string matched by
       a parenthesis pair), matched against a whole string (not part of a
       string).

  GNAT.Regpat: Perl-like regular expressions (basically, we were really
       inspired by the C code from Perl). There are a few known issues in
       3.13 that are fixed in 3.14.

  GNAT.Spitbol: even more powerful syntax. I have never used it personally, so
       I can't comment on it.

Have a look at the files g-regexp.ads, g-regpat.ads and g-spitbol.ads for full
and extensive documentation on these three packages.
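
For the original question (pulling Michael, 7980438 and 22 out of the line), a
minimal sketch with GNAT.Regpat might look like the following. It assumes a
GNAT compiler. The procedure name Extract_Attributes and the pattern are made
up for this illustration, and the sketch only handles double-quoted attribute
values; g-regpat.ads remains the authoritative description of the package.

```ada
with Ada.Text_IO; use Ada.Text_IO;
with GNAT.Regpat; use GNAT.Regpat;

procedure Extract_Attributes is
   --  Sample line taken from earlier in the thread.
   Line : constant String :=
     "<Person Name=""Michael"" Phone=""7980438"" Age=""22""/>";

   --  Group 1 is the attribute name, group 2 the text between the quotes.
   Matcher : constant Pattern_Matcher :=
     Compile ("([A-Za-z]+)=""([^""]*)""");

   --  Index 0 holds the whole match, 1 and 2 the two groups.
   Matches : Match_Array (0 .. 2);
   Start   : Positive := Line'First;
begin
   while Start <= Line'Last loop
      --  Search for the next attribute, beginning at position Start.
      Match (Matcher, Line, Matches, Start);
      exit when Matches (0) = No_Match;

      Put_Line (Line (Matches (1).First .. Matches (1).Last) & " => "
                & Line (Matches (2).First .. Matches (2).Last));

      --  Continue just after the attribute that was found.
      Start := Matches (0).Last + 1;
   end loop;
end Extract_Attributes;
```

Matching one attribute at a time, rather than the whole element in a single
pattern, keeps the sketch independent of the attribute order. The slices could
be copied into variables instead of being printed.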

Emmanuel




* Re: Regular expressions???
  2001-07-03  8:25     ` Emmanuel Briot
@ 2001-07-04  4:53       ` Ray Blaak
  0 siblings, 0 replies; 8+ messages in thread
From: Ray Blaak @ 2001-07-04  4:53 UTC (permalink / raw)


Emmanuel Briot <briot@gnat.com> writes:
> It was noted earlier that an XML parser is much easier to program than a RE
> parser. Obviously, this is true. However, writing a parser from scratch is
> actually not so simple, since the XML standard is pretty extensive. Support
> for entities (&name;), DTDs, ... was kind of a time-consuming task :-)

True enough. However, a simple XML parser that ignores encodings, validation,
and entities is still useful enough to process a significant number of files
(i.e. most files are in ASCII, don't use &entities too much, and grammars
are irrelevant to final client side processing that needs the grammars "built
in" to the evaluation code anyway).

Certainly such a thing is good enough for educational purposes, and for
"private" application storage formats.

Regular expressions, however, quickly go wrong with the slightest
perturbation of the data, due to the free-form, nested nature of XML.

-- 
Cheers,                                        The Rhythm is around me,
                                               The Rhythm has control.
Ray Blaak                                      The Rhythm is inside me,
blaak@infomatch.com                            The Rhythm has my soul.




end of thread, other threads:[~2001-07-04  4:53 UTC | newest]

Thread overview: 8+ messages
2001-06-29  9:23 Regular expressions??? Michael Andersson
2001-06-29 10:05 ` David C. Hoos, Sr.
2001-06-29 16:13 ` Ray Blaak
2001-07-02  9:54   ` M. A. Alves
2001-07-03  8:25     ` Emmanuel Briot
2001-07-04  4:53       ` Ray Blaak
2001-06-30 14:49 ` Florian Weimer
2001-06-30 20:38 ` R. Srinivasan
