From: gautier_niouzes@hotmail.com (Gautier)
Newsgroups: comp.lang.ada
Subject: Re: HTML parser in Ada ?
Date: 16 Nov 2002 11:43:05 -0800
Message-ID: <17cd177c.0211161143.7f8d5842@posting.google.com>

Preben Randhol:

> If you are not making something that is aimed to read the web-pages on
> the net, please make something that reads XHTML only or that follows
> the HTML DTD strictly and rejects all faulty pages. Trying to make
> something that can read web-pages is very difficult and your
> application gets very error-prone. Most web-pages out there are broken
> and do not use proper HTML. So if you want to display the pages
> correctly then you have to make a lot of exceptions to the HTML DTD.

[x]

Yes, I'm aware of it. I would put an "HTML_DTD_Strict: Boolean;" somewhere,
since one aim is to filter HTML files "from the Web": remove the evil
Javascript, meta's, ... A more ambitious task would be to transform junk
HTML into compliant HTML - but I won't do it (mmmh... unless...).

[...]

> > http://join.msn.com/?page=features/virus
>
> Resistance is futile.

Didn't you know it ?
OK - I'll take a look at the proposed solutions: XML/Ada and OpenToken. Thanks!
________________________________________________________
Gautier  --  http://www.mysunrise.ch/users/gdm/gsoft.htm

NB: For a direct reply, use the address on the web site!