comp.lang.ada
From: gautier_niouzes@hotmail.com (Gautier)
Subject: Re: HTML parser in Ada ?
Date: 16 Nov 2002 11:43:05 -0800
Message-ID: <17cd177c.0211161143.7f8d5842@posting.google.com>
In-Reply-To: slrnata6rh.161.randhol+news@kiuk0152.chembio.ntnu.no

Preben Randhol:

> If you are not making something that is aimed to read the web-pages on
> the net, please make something that reads XHTML only or that it follows
> the HTML DTD strictly and rejects all faulty pages. Trying to make
> something that can read web-pages is very difficult and your
> application gets very error-prone. Most web-pages out there are broken
> and do not use proper HTML. So if you want to display the pages
> correctly then you have to make a lot of exceptions to the HTML DTD.

[x] Yes, I'm aware of it. I would put an "HTML_DTD_Strict: Boolean;"
somewhere, since one aim is to filter HTML files
"from the Web": remove the evil JavaScript, meta tags, ...
A more ambitious task would be to transform junk HTML into
compliant HTML - but I won't do it (mmmh... unless...).
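
For illustration only (this is not code from the thread), here is a
minimal Ada sketch of how such a flag might steer a filter. Only the
name HTML_DTD_Strict comes from the message above; Strip_Scripts,
Unclosed_Element and the sample page are hypothetical, and the naive
text search merely stands in for a real parser.

```ada
with Ada.Strings.Fixed;
with Ada.Strings.Maps.Constants;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Filter_Demo is

   --  Hypothetical exception, raised only in strict mode.
   Unclosed_Element : exception;

   --  Remove every <script ...> ... </script> span, ignoring letter case.
   --  Strict mode rejects a page whose script element is never closed;
   --  lenient mode silently drops the unterminated remainder.
   function Strip_Scripts
     (Source          : String;
      HTML_DTD_Strict : Boolean) return String
   is
      use Ada.Strings.Fixed;
      Lower : constant Ada.Strings.Maps.Character_Mapping :=
        Ada.Strings.Maps.Constants.Lower_Case_Map;
      End_Mark  : constant String := "</script>";
      Result    : Unbounded_String;
      Pos       : Positive := Source'First;
      Start_Tag : Natural;
      End_Tag   : Natural;
   begin
      while Pos <= Source'Last loop
         --  The mapping lower-cases Source before comparing, so the
         --  pattern itself must be written in lower case.
         Start_Tag :=
           Index (Source (Pos .. Source'Last), "<script", Mapping => Lower);
         exit when Start_Tag = 0;

         --  Keep the text that precedes the script element.
         Append (Result, Source (Pos .. Start_Tag - 1));

         End_Tag :=
           Index (Source (Start_Tag .. Source'Last), End_Mark,
                  Mapping => Lower);
         if End_Tag = 0 then
            if HTML_DTD_Strict then
               raise Unclosed_Element;
            end if;
            return To_String (Result);
         end if;

         --  Resume scanning just after the closing tag.
         Pos := End_Tag + End_Mark'Length;
      end loop;

      --  Copy whatever follows the last script element.
      if Pos <= Source'Last then
         Append (Result, Source (Pos .. Source'Last));
      end if;
      return To_String (Result);
   end Strip_Scripts;

   Page : constant String :=
     "<p>Hi</p><SCRIPT>alert(1)</SCRIPT><p>Bye</p>";

begin
   Ada.Text_IO.Put_Line (Strip_Scripts (Page, HTML_DTD_Strict => False));
end Filter_Demo;
```

A text search like this is only good enough to show the idea: it does
not understand comments, attributes or nesting, which is why a real
filter would sit on top of a proper parser.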

[...]
> > http://join.msn.com/?page=features/virus
> 
> Resistance is futile.

Didn't you know it?

OK - I'll take a look at proposed solutions:
XML/Ada and OpenToken.
Thanks!
________________________________________________________
Gautier  --  http://www.mysunrise.ch/users/gdm/gsoft.htm

NB: For a direct reply, use the address on the web site!




Thread overview: 11+ messages
2002-11-15 10:49 HTML parser in Ada ? Gautier direct_replies_not_read
2002-11-15 16:06 ` Preben Randhol
2002-11-15 17:00   ` Adrian Knoth
2002-11-16  4:11   ` Randy Brukardt
2002-11-16 19:43   ` Gautier [this message]
2002-11-17 12:00     ` Preben Randhol
2002-12-02 19:50       ` Nicolas Seriot
2002-11-18 14:17     ` Georg Bauhaus
  -- strict thread matches above, loose matches on Subject: below --
2002-11-15 11:08 Grein, Christoph
2002-11-15 14:24 ` Victor Porton
2002-11-18  6:38 Grein, Christoph