From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-2.9 required=5.0 tests=BAYES_00,MAILING_LIST_MULTI
	autolearn=unavailable autolearn_force=no version=3.4.4
X-Google-Thread: 103376,7b97e385047500eb
X-Google-Attributes: gid103376,public
X-Google-Language: ENGLISH,ASCII-7-bit
Path: 
 g2news1.google.com!news1.google.com!proxad.net!freenix!enst.fr!melchior!cuivre.fr.eu.org!melchior.frmug.org!not-for-mail
From: "Robert C. Leif" <rleif@rleif.com>
Newsgroups: comp.lang.ada
Subject: Experiences of XML parser generators for Ada?
Date: Sat, 4 Dec 2004 12:37:07 -0800
Organization: Newport Instruments
Message-ID: <mailman.171.1102192671.10401.comp.lang.ada@ada-france.org>
Reply-To: rleif@rleif.com
NNTP-Posting-Host: lovelace.ada-france.org
Mime-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Trace: melchior.cuivre.fr.eu.org 1102192672 8237 212.85.156.195 (4 Dec 2004
 20:37:52 GMT)
X-Complaints-To: usenet@melchior.cuivre.fr.eu.org
NNTP-Posting-Date: Sat, 4 Dec 2004 20:37:52 +0000 (UTC)
To: <comp.lang.ada@ada-france.org>
Return-Path: <rleif@rleif.com>
X-Authenticated-User: rleif.rleif.com
X-Mailer: Microsoft Office Outlook, Build 11.0.6353
Thread-Index: AcTaQPkbgHmmGgxaQpyJ6rO/h8V4vA==
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180
X-Virus-Scanned: by amavisd-new-20030616-p10 (Debian) at ada-france.org
X-BeenThere: comp.lang.ada@ada-france.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Gateway to the comp.lang.ada Usenet newsgroup"
	<comp.lang.ada.ada-france.org>
List-Unsubscribe: <http://www.ada-france.org/mailman/listinfo/comp.lang.ada>,
	<mailto:comp.lang.ada-request@ada-france.org?subject=unsubscribe>
List-Post: <mailto:comp.lang.ada@ada-france.org>
List-Help: <mailto:comp.lang.ada-request@ada-france.org?subject=help>
List-Subscribe: <http://www.ada-france.org/mailman/listinfo/comp.lang.ada>,
	<mailto:comp.lang.ada-request@ada-france.org?subject=subscribe>
Xref: g2news1.google.com comp.lang.ada:6761
Date: 2004-12-04T12:37:07-08:00

   The "HUGE overhead.  e.g.: <detectionTreshold>84</detectionTreshold>"
problem is being addressed by the W3C's "XML Binary Characterization
Properties", http://www.w3.org/TR/xbc-properties/.
   From Section 4.3.2, Description:
"Furthermore, a schema-based encoding of an XML document can achieve a
degree of compactness by using prior knowledge about the structure and
content of a document.  A serialization is schema-based if it uses
information from the document's schema to achieve a better degree of
compactness. This information could be used later as the document is
processed or reconstituted. It is worth pointing out that although not self
contained, a schema-based encoding is not inherently lossy given that, in
principle, a decoder can reproduce the data model using both the encoding
and the schema. Thus, as with other techniques, a schema-based encoding can
be lossy or loss-less."
   If the schema data-types are the same as the Ada data-types, the space
required by a schema-based encoding should be approximately the same as that
of the native Ada representation.  The real problem is that the Ada community
has not been involved in setting W3C standards.  Ada needs a complete set of
XML_IO packages, including the ability to create XHTML Strict.
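
The idea in the quoted passage can be illustrated with a minimal sketch. It is written in Python for brevity rather than Ada, and it is not the encoding defined by any W3C document. The schema, the field names other than detectionTreshold, and the type choices are assumptions made up for the example. Both sides share the schema, so only the values travel, and the decoder rebuilds the full record from the bytes plus the schema.

```python
import struct

# Hypothetical schema known to both encoder and decoder: field order and types.
# struct codes: B = 1-byte unsigned, H = 2-byte unsigned, f = 4-byte float.
SCHEMA = (("detectionTreshold", "B"), ("sampleCount", "H"), ("gain", "f"))

# "<" selects little-endian with no padding, so the size is the sum of the fields.
LAYOUT = struct.Struct("<" + "".join(code for _, code in SCHEMA))

def encode(record):
    """Serialize only the values, in schema order; no names are transmitted."""
    return LAYOUT.pack(*(record[name] for name, _ in SCHEMA))

def decode(payload):
    """Rebuild the named record from the bytes and the shared schema."""
    values = LAYOUT.unpack(payload)
    return {name: value for (name, _), value in zip(SCHEMA, values)}

record = {"detectionTreshold": 84, "sampleCount": 1200, "gain": 0.5}
payload = encode(record)
restored = decode(payload)

print(len(payload))        # 1 + 2 + 4 bytes
print(restored == record)  # the round trip loses nothing for these values
```

Here the encoded size matches the size of the typed fields themselves, which is the sense in which the space required tracks the native data types.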
   Bob Leif
   -------
   Adrien Plisson wrote:
   Message: 2
   Date: Sat, 04 Dec 2004 00:33:22 +0100
   From: Adrien Plisson <aplisson-news@stochastique.net>
   Subject: Re: Experiences of XML parser generators for Ada?
   To: comp.lang.ada@ada-france.org
   Message-ID: <41b0f749$0$25068$ba620e4c@news.skynet.be>
   Content-Type: text/plain; charset=us-ascii; format=flowed
   
   Daniel W wrote:
   > Thank you for your succinct clarification. More specifically I'm asking
   > for persons with experience of the parser generator. I actually have
   > XMLBooster downloaded, but as I said, I'm sort of short on
   > experience.... :-)
   
   well, i don't have any experience with parser generators (except with
   lex), but i would like to share my experience:
   
   i designed a piece of software composed of 2 parts. each part was written
   in a different language, and each part ran in its own context
   (think of 2 different computers). i chose XML as the format for
   marshaled data across the communication medium.
   
   i first downloaded a standard XML parser (Xerces) and tried it. it was
   so slow that i could not continue with it. since i was only using a
   subset of XML (no dtd, no validation, no entity references, only one
   encoding), i decided to write my own XML parser and XML generator. i got
   a 70x performance boost.
   
   now if i look back, i think it would have been better if i had defined 
   my own protocol and not used XML:
   - the xml fragments were all generated and then parsed by software under
   my control, with no user intervention. so there was no need for
   something human readable.
   - i was mostly transmitting numeric values. since xml is a text format,
   performance was dragged down by all the conversions from binary to
   string and back to binary.
   - since i was mostly transmitting numeric values, all my text nodes were
   shorter than the xml element type enclosing those values. this leads
   to HUGE overhead. e.g.: <detectionTreshold>84</detectionTreshold>
   encoded in Unicode is 84 bytes long, but the value expressed here is
   only 1 byte long.
   - the only thing xml gave me was extensibility at no cost, in a case
   where i did not really need it.
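
The overhead figure in the list above can be checked with a short sketch (Python, for illustration only). The 84-byte count is consistent with UTF-16 plus a 2-byte byte-order mark; that reading of "Unicode" is an assumption, since the post does not name the encoding. The sketch also shows the string-to-number conversion the receiver has to do.

```python
import struct

xml_text = "<detectionTreshold>84</detectionTreshold>"

# Size of the element under two Unicode encodings.
utf8_size = len(xml_text.encode("utf-8"))
# Python's "utf-16" codec prepends a 2-byte byte-order mark.
utf16_size = len(xml_text.encode("utf-16"))

# The same value as a single unsigned byte.
binary_size = len(struct.pack("B", 84))

# Text-to-binary conversion needed on the receiving side:
# strip the start and end tags, then parse the digits.
start_tag, end_tag = "<detectionTreshold>", "</detectionTreshold>"
value = int(xml_text[len(start_tag):-len(end_tag)])

print(utf8_size, utf16_size, binary_size)  # 41 84 1
print(value)                               # 84
```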
   
   so here comes my advice: think twice before using xml.
   xml is a very powerful tool for DYNAMICALLY STRUCTURED HUMAN READABLE 
   TEXT. for everything else, a basic binary protocol with some well 
   defined rules to follow (endianness, size of data) will really be more 
   efficient. plus, a basic binary protocol does not need complicated
   parsers...
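
As a rough sketch of what "well defined rules" might look like (Python, for illustration; this is not the protocol the poster used), the example below fixes the byte order and field sizes up front. The header layout, the message type constant, and the function names are all hypothetical.

```python
import struct

# Assumed wire rules: big-endian ("!"), 2-byte message type, 2-byte payload length.
HEADER = struct.Struct("!HH")

MSG_THRESHOLD = 1  # hypothetical message type for a 1-byte threshold value

def pack_message(msg_type, payload):
    """Prefix the payload with a fixed-size header stating its type and length."""
    return HEADER.pack(msg_type, len(payload)) + payload

def unpack_message(data):
    """The whole 'parser': read the header, then slice out the payload."""
    msg_type, length = HEADER.unpack_from(data, 0)
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated message")
    return msg_type, payload

wire = pack_message(MSG_THRESHOLD, struct.pack("!B", 84))
msg_type, payload = unpack_message(wire)
(threshold,) = struct.unpack("!B", payload)

print(len(wire), msg_type, threshold)  # 5 1 84
```

With the endianness and sizes agreed in advance, decoding needs no tokenizer or tree builder, only a header read and a slice.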
   
   that was my experience; i hope you find it useful.
   
   -- 
   rien