From: björn lundin
Newsgroups: comp.lang.ada
Subject: Re: xml/ada dropping data when pre-defined entities are separated by space?
Date: Tue, 1 Feb 2011 01:49:13 -0800 (PST)
Message-ID: <007966e9-4cf4-4f91-ac54-99a3a34ff7ca@k16g2000vbq.googlegroups.com>

On 31 Jan, 15:56, Emmanuel Briot wrote:
> > I was just going to ask what makes xml/ada decide why a textnode is
> > sometimes split into several nodes. I see the pattern now, of course,
> > 'split on predefined entity' but why?
>
> Because XML/Ada tries to be as efficient as possible, and normalizing
> the document takes time that a lot of application have no need for. If
> indeed your application is only able to deal with normalized document
> (it really shouldn't, the XML standard is quite clear that a document
> is not necessarily normalized), then indeed you should call Normalize.

Ok. A colleague of mine pointed me to
http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/core.html#ID-1312295772
which states:

"When a document is first made available via the DOM, there is only
one Text node for each block of text. Users may create adjacent Text
nodes that represent the contents of a given element without any
intervening markup, but should be aware that there is no way to
represent the separations between these nodes in XML or HTML, so they
will not (in general) persist between DOM editing sessions. The
normalize() method on Node merges any such adjacent Text objects into
a single node for each block of text."

Isn't my case the one described in the first sentence? I parse the
document, I do not edit it in any way, I traverse it, and there are
several child nodes instead of one.

Or how should 'When a document is first made available via the DOM' be
interpreted?

/Björn
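
To make the quoted DOM passage concrete, here is a minimal sketch using Python's xml.dom.minidom rather than XML/Ada. It says nothing about how XML/Ada's parser behaves. The adjacent Text nodes are built by hand, and the three-way split around '&' is only an assumed stand-in for the 'split on predefined entity' pattern discussed above. The element name `item` and the function name `merge_adjacent_text` are made up for illustration.

```python
from xml.dom.minidom import Document, Node


def merge_adjacent_text():
    """Build <item> with three adjacent Text nodes, then normalize it."""
    doc = Document()
    item = doc.createElement("item")  # hypothetical element name
    doc.appendChild(item)

    # Assumed split: text, the character from a predefined entity, text.
    for chunk in ("a ", "&", " b"):
        item.appendChild(doc.createTextNode(chunk))

    before = len(item.childNodes)

    # normalize() merges adjacent Text nodes into one node per block of text.
    item.normalize()

    after = len(item.childNodes)
    first = item.firstChild
    assert first.nodeType == Node.TEXT_NODE
    return before, after, first.data


before, after, data = merge_adjacent_text()
print(before, after, repr(data))
```

Here the child count drops from three to one after the call, and the single remaining Text node holds the concatenated text. Code that must work with or without normalization could instead concatenate the data of all Text children when reading an element's content.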