From: "Robert C. Leif" <rleif@rleif.com>
To: <comp.lang.ada@ada-france.org>
Subject: Experiences of XML parser generators for Ada?
Date: Sat, 4 Dec 2004 12:37:07 -0800
Message-ID: <mailman.171.1102192671.10401.comp.lang.ada@ada-france.org>
The "HUGE overhead. e.g.: <detectionTreshold>84</detectionTreshold>" is
being solved by the creation of "XML Binary Characterization Properties"
http://www.w3.org/TR/xbc-properties/.
From Section 4.3.2, Description:
"Furthermore, a schema-based encoding of an XML document can achieve a
degree of compactness by using prior knowledge about the structure and
content of a document. A serialization is schema-based if it uses
information from the document's schema to achieve a better degree of
compactness. This information could be used later as the document is
processed or reconstituted. It is worth pointing out that although not self
contained, a schema-based encoding is not inherently lossy given that, in
principle, a decoder can reproduce the data model using both the encoding
and the schema. Thus, as with other techniques, a schema-based encoding can
be lossy or loss-less."
If the schema data types are the same as the Ada data types, a schema-based
encoding should require approximately the same space as the Ada
representation. The real problem is that the Ada community has not been
involved in setting W3C standards. Ada needs a complete set of XML_IO
packages, including the ability to create XHTML Strict.
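
As a rough illustration of the schema-based idea quoted above (not of any actual W3C binary XML format): if both ends hold the same schema, only the values need to travel, and the decoder can rebuild the XML text from the payload plus the schema. The schema layout, the sampleCount field, and the helper names below are hypothetical.

```python
import struct

# Hypothetical shared schema: (element name, struct format code).
# Encoder and decoder must hold identical copies (assumption).
SCHEMA = [("detectionTreshold", "B"), ("sampleCount", "H")]
LAYOUT = ">" + "".join(code for _, code in SCHEMA)  # big-endian, no padding

def encode(values):
    """Pack the values in schema order; no element names are sent."""
    return struct.pack(LAYOUT, *(values[name] for name, _ in SCHEMA))

def decode_to_xml(payload):
    """Rebuild the XML text from the payload plus the shared schema."""
    fields = struct.unpack(LAYOUT, payload)
    return "".join(
        f"<{name}>{value}</{name}>"
        for (name, _), value in zip(SCHEMA, fields)
    )

payload = encode({"detectionTreshold": 84, "sampleCount": 1000})
xml_text = decode_to_xml(payload)
```

Here the payload occupies one byte for the first field and two for the second, which is about what a record with matching component types would need, while the element names live only in the schema.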
Bob Leif
-------
Adrien Plisson wrote:
Message: 2
Date: Sat, 04 Dec 2004 00:33:22 +0100
From: Adrien Plisson <aplisson-news@stochastique.net>
Subject: Re: Experiences of XML parser generators for Ada?
To: comp.lang.ada@ada-france.org
Message-ID: <41b0f749$0$25068$ba620e4c@news.skynet.be>
Content-Type: text/plain; charset=us-ascii; format=flowed
Daniel W wrote:
> Thank you for your succinct clarification. More specifically I'm asking for
> persons with experience of the parser generator. I actually have XMLBooster
> downloaded, but as I said, I'm sort of short on experience.... :-)
well, i don't have any experience with parser generators (except with
lex), but i would like to share my experience:
i designed a piece of software composed of 2 parts. each part was written
in a different language, and each part executed in its own context
(think of 2 different computers). i chose XML as the format for
marshaled data across the communication medium.
i first downloaded a standard XML parser (Xerces) and tried it. it was
so slow that i could not continue with it. since i was only using a
subset of XML (no dtd, no validation, no entity references, only one
encoding), i decided to write my own XML parser and XML generator. i got
a 70x performance boost.
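
To give an idea of how small a parser for such a restricted subset can be, here is a minimal sketch. It is not the parser described above: it assumes elements and text only (no attributes, comments, DTD, or entity references), and the names TOKEN and parse are made up for the example. It makes no claim about speed.

```python
import re

# A token is either a tag (<name> or </name>) or a run of text.
TOKEN = re.compile(r"<(/?)([A-Za-z_][\w.-]*)>|([^<]+)")

def parse(text):
    """Parse the restricted subset into nested (name, children) tuples.

    Each child is either a str (text) or another (name, children) tuple.
    Whitespace-only text between elements is dropped.
    """
    root = ("#document", [])
    stack = [root]
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            # Anything outside the subset (attributes, comments, ...) ends up here.
            raise ValueError(f"unsupported markup at offset {pos}")
        pos = m.end()
        closing, name, chars = m.groups()
        if chars is not None:
            if chars.strip():
                stack[-1][1].append(chars)
        elif closing:
            if len(stack) == 1 or stack[-1][0] != name:
                raise ValueError(f"unexpected </{name}>")
            stack.pop()
        else:
            node = (name, [])
            stack[-1][1].append(node)
            stack.append(node)
    if len(stack) != 1:
        raise ValueError(f"unclosed <{stack[-1][0]}>")
    return root[1]

tree = parse("<msg><detectionTreshold>84</detectionTreshold></msg>")
```

Anything the subset excludes is rejected with a ValueError rather than handled, which is where most of the savings over a general-purpose parser come from.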
now if i look back, i think it would have been better if i had defined
my own protocol and not used XML:
- the xml fragments were all generated and then parsed by software under my
control, with no user intervention, so there was no need for something
human readable.
- i was mostly transmitting numeric values. since xml is a text format,
performance was dragged down by all the conversions from binary to
string and back to binary.
- since i was mostly transmitting numeric values, all my text nodes were
shorter than the xml element type names enclosing those values. this leads
to HUGE overhead. e.g.: <detectionTreshold>84</detectionTreshold>
encoded in Unicode (UTF-16 with a byte order mark) is 84 bytes long, but
the value expressed here is only 1 byte long.
- the only thing xml gave me was extensibility at no cost, in a case
where i did not really need it.
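
The size figures in the third point can be checked with a few lines. The 84-byte count is reproduced if "Unicode" is read as UTF-16 with a byte order mark; that reading is an assumption, and the variable names are only illustrative.

```python
fragment = "<detectionTreshold>84</detectionTreshold>"

chars = len(fragment)                         # characters in the fragment
utf8_size = len(fragment.encode("utf-8"))     # ASCII-only text: 1 byte per character
utf16_size = len(fragment.encode("utf-16"))   # Python's "utf-16" codec prepends a 2-byte BOM
binary_size = len(bytes([84]))                # the value itself fits in a single byte

print(chars, utf8_size, utf16_size, binary_size)  # prints: 41 41 84 1
```

Even in UTF-8 the markup costs 41 bytes for a 1-byte value, so the overhead argument holds whichever encoding is assumed.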
so here comes my advice: think twice before using xml.
xml is a very powerful tool for DYNAMICALLY STRUCTURED HUMAN READABLE
TEXT. for everything else, a basic binary protocol with some well
defined rules to follow (endianness, size of data) will really be more
efficient. plus, a basic binary protocol does not need complicated
parsers...
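
A minimal sketch of what "well defined rules" could look like in practice: byte order and field sizes are fixed once, and parsing reduces to a single unpack call. The message layout below is hypothetical and is not the protocol from the project described above.

```python
import struct

# Hypothetical wire format: big-endian ("network order"), no padding.
#   B  message type        1 byte, unsigned
#   B  detectionTreshold   1 byte, unsigned
#   I  sequence number     4 bytes, unsigned
#   d  measurement         8 bytes, IEEE 754 double
MESSAGE = struct.Struct(">BBId")

def encode(msg_type, threshold, sequence, measurement):
    """Serialize one message into exactly MESSAGE.size bytes."""
    return MESSAGE.pack(msg_type, threshold, sequence, measurement)

def decode(payload):
    """Deserialize one message, rejecting payloads of the wrong length."""
    if len(payload) != MESSAGE.size:
        raise ValueError(f"expected {MESSAGE.size} bytes, got {len(payload)}")
    return MESSAGE.unpack(payload)

wire = encode(1, 84, 7, 0.5)
fields = decode(wire)
```

Because the ">" prefix pins both the byte order and the absence of padding, the two ends can be written in different languages and still agree on the 14-byte layout.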
that was my experience; i hope you find it useful.
--
rien