From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.4
X-Google-Thread: 103376,7b97e385047500eb
X-Google-Attributes: gid103376,public
X-Google-Language: ENGLISH,ASCII-7-bit
Path: 
 g2news1.google.com!news2.google.com!news.maxwell.syr.edu!newsfeed.icl.net!newsfeed.fjserv.net!news.tele.dk!news.tele.dk!small.news.tele.dk!news-fra1.dfn.de!news-ham1.dfn.de!news.uni-hamburg.de!cs.tu-berlin.de!uni-duisburg.de!not-for-mail
From: Georg Bauhaus <sb463ba@l1-hrz.uni-duisburg.de>
Newsgroups: comp.lang.ada
Subject: Re: Experiences of XML parser generators for Ada?
Date: Sat, 4 Dec 2004 13:59:47 +0000 (UTC)
Organization: GMUGHDU
Message-ID: <cosfsj$hkq$1@a1-hrz.uni-duisburg.de>
References: <41af8365@news.wineasy.se>
 <2426353.SD16GYvm6f@linux1.krischik.com>
 <41b02dfe$0$25046$ba620e4c@news.skynet.be> <41b0cfc3$1@news.wineasy.se>
 <41b0f749$0$25068$ba620e4c@news.skynet.be>
 <bk4wndvrxe16$.1mmj78jrlyqfp$.dlg@40tude.net>
 <mailman.170.1102160463.10401.comp.lang.ada@ada-france.org>
NNTP-Posting-Host: l1-hrz.uni-duisburg.de
X-Trace: a1-hrz.uni-duisburg.de 1102168787 18074 134.91.1.34 (4 Dec 2004
 13:59:47 GMT)
X-Complaints-To: usenet@news.uni-duisburg.de
NNTP-Posting-Date: Sat, 4 Dec 2004 13:59:47 +0000 (UTC)
User-Agent: tin/1.5.8-20010221 ("Blue Water") (UNIX) (HP-UX/B.11.00
 (9000/800))
Xref: g2news1.google.com comp.lang.ada:6754
Date: 2004-12-04T13:59:47+00:00
List-Id: <comp.lang.ada>

Marius Amado Alves <amado.alves@netcabo.pt> wrote:
:>>so here comes my advice: think twice before using xml.
:>>xml is a very powerful tool for DYNAMICALLY STRUCTURED HUMAN READABLE 
:>>TEXT.
:> 
:> Human readable? I have an impression that punched cards were more readable!
:> XML reminds me one implementation of Algol 60, in which all keyword were
:> upper case put in quotes, e.g. 'BEGIN'. Though, even that was more readable
:> than XML.
: 
: Indeed XML has failed its two main purposes, which were to be readable 
: by humans and efficiently processable by machines.

Complete nonsense.

First, as the creators of SGML demonstrably envisioned, the XML
subset is NOT designed to be readable in the sense of
"as easy to read as your well-typeset Sunday newspaper."

What is meant by "human readable text" is that structured values
can be represented as text, explicitly naming the parts and relations.
If a human looks at three bit patterns, he/she will need to
know the context, and have a bit of luck, in order to interpret
the bit patterns like a computer does.
If a human looks at

  <issue-date   year="2003"  month="4" day="31" />

the three bit patterns have been turned into structured, human-readable
text, and it is a lot clearer what is meant than with three bare
bit patterns.
Imagine you are hunting bugs, e.g. Year 2000 bugs in a four-company
distributed program, and all you get as diagnostic aid is decimal
representations of bare, unnamed numbers buried in a server program's
trace output.
Then you cannot tell the month value and the day value apart:
they have the same size, and their ranges overlap.
There might be ambiguities if there is no minimum markup, like a
comma separating values.
You can't tell without further research or luck whether these values
represent an issue date or a birth date.
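
A minimal sketch of the difference, assuming Python and its standard
xml.etree.ElementTree module (neither is used in this thread; the
function name read_issue_date and the bare trace string are
hypothetical, chosen only for illustration). The named form can be
read back part by part, in any attribute order, while the bare form
offers nothing but positions:

```python
import xml.etree.ElementTree as ET

# The element from the example above; the attribute names carry the meaning.
NAMED = '<issue-date year="2003" month="4" day="31" />'

# Hypothetical trace output: the same three numbers, bare and unnamed.
BARE = "4 31 2003"


def read_issue_date(text):
    """Return the named parts of an issue-date element as integers."""
    elem = ET.fromstring(text)
    if elem.tag != "issue-date":
        raise ValueError("not an issue date: " + elem.tag)
    return {name: int(value) for name, value in elem.attrib.items()}


named = read_issue_date(NAMED)

# Attribute order does not matter, because every part is named.
reordered = read_issue_date('<issue-date day="31" year="2003" month="4" />')

# The bare form yields positions only; which position is the month, and
# whether this is an issue date at all, must come from somewhere else.
bare = [int(field) for field in BARE.split()]
```

Here named["month"] is unambiguous, whereas bare[0] is a month only
if you already know the convention the writing program followed.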

If you want less intrusive markup, for example because you are an author
who wants to type the markup yourself but still wants document
instances almost as readable as plain text, say so. Write "SHORTTAG YES",
for example, into your SGML declaration. Use more than the XML subset
then.

Second, XML parsing can be a lot more efficient than SGML parsing with
SHORTREF YES, SHORTTAG YES, CONCUR etc.
XML validation, if needed, has to be compared to the checking that
your program does when it doesn't use XML for middleware data
communication.
Of course, as has been said in this thread, if you transform
values from their machine representation into text, and back, this
will take time.  But this time is not necessarily lost. What you
get is a program-independent representation of data.
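
The round trip can be sketched as follows, again assuming Python's
standard library (to_xml and from_xml are hypothetical helper names,
not part of any XML tool mentioned here). The value leaves one program
as text and is rebuilt by another; the sketch uses 30 April because
datetime.date itself rejects a day 31 in April, which is the kind of
checking a program has to do with or without XML:

```python
import xml.etree.ElementTree as ET
from datetime import date


def to_xml(d):
    """Turn a machine value (a date) into program-independent text."""
    elem = ET.Element(
        "issue-date", year=str(d.year), month=str(d.month), day=str(d.day)
    )
    return ET.tostring(elem, encoding="unicode")


def from_xml(text):
    """Turn the text back into a machine value, checking it on the way."""
    elem = ET.fromstring(text)
    if elem.tag != "issue-date":
        raise ValueError("not an issue date: " + elem.tag)
    # date() raises ValueError for impossible dates such as 31 April.
    return date(
        int(elem.get("year")), int(elem.get("month")), int(elem.get("day"))
    )


original = date(2003, 4, 30)
text = to_xml(original)      # costs time going out ...
restored = from_xml(text)    # ... and coming back in
```

The conversions cost time in both directions, but the text in between
does not depend on how either program represents a date internally.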

If you don't need all this, then I guess you can brew your own data
representation, and let all your programs communicate with themselves.


-- Georg