From: "Peter C. Chapin"
Subject: Re: Web browser in Ada
Newsgroups: comp.lang.ada
Date: Sun, 25 Apr 2010 12:29:10 -0400
Message-Id: <4bd47aba$0$2386$4d3efbfe@news.sover.net>

Georg Bauhaus wrote:

> What is an HTML 5 parser supposed to be?
>
> If it is to parse the SGML text defined by the HTML 5 grammar
> then you would, in effect, have to copy browsers' near natural language
> processing capabilities, since having only an SGML parser with little
> more than moderate error correction capabilities is by far not enough
> for HTML.
>
> Some browsers have a parser switch, IIRC.
> Switch to best effort
> mode for the important, but junk, HTML code that is out there,
> inevitably, new or old. Or be more optimistic and
> make an attempt at treating input text as if it was well formed
> XML text.

HTML 5 is intended to address (fix) the current horrible mess by
specifying, in a reasonably precise way, exactly how erroneous documents
are to be handled. That is, all HTML 5 implementations should handle bad
documents in a similar manner. Note that HTML 5 is *not* an SGML
application, nor is it intended to be.

A fully functioning web browser in today's world needs to handle "tag
soup" documents. Maybe someday that will no longer be necessary.

Still... a clean-room implementation of HTML 5, in Ada, might be a nice
contribution to the cause of creating a better web browser. I wonder if
there are any easily identifiable security-critical components that
could benefit from SPARK.

Peter