From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM
	autolearn=ham autolearn_force=no version=3.4.4
X-Google-Thread: a07f3367d7,8143b93889fe9472
X-Google-Attributes: gida07f3367d7,public,usenet
X-Google-NewGroupId: yes
X-Google-Language: ENGLISH,ASCII-7-bit
X-Received: by 10.224.206.195 with SMTP id fv3mr1049015qab.1.1359485197113;
        Tue, 29 Jan 2013 10:46:37 -0800 (PST)
X-Received: by 10.49.38.194 with SMTP id i2mr223905qek.30.1359485197083; Tue,
 29 Jan 2013 10:46:37 -0800 (PST)
Path: 
 k2ni3907qap.0!nntp.google.com!p13no6096930qai.0!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail
Newsgroups: comp.lang.ada
Date: Tue, 29 Jan 2013 10:46:36 -0800 (PST)
In-Reply-To: <8dfcf819-e1d0-4578-a795-a4bf724b5014@googlegroups.com>
Complaints-To: groups-abuse@google.com
Injection-Info: glegroupsg2000goo.googlegroups.com;
 posting-host=90.194.162.131;
 posting-account=L2-UcQkAAAAfd_BqbeNHs3XeM0jTXloS
NNTP-Posting-Host: 90.194.162.131
References: <8dfcf819-e1d0-4578-a795-a4bf724b5014@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <80a5c765-e5ff-4e7d-bc1b-e35f92a710a7@googlegroups.com>
Subject: Re: Ada standard and maximum line lengths
From: Lucretia <laguest9000@googlemail.com>
Injection-Date: Tue, 29 Jan 2013 18:46:37 +0000
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Date: 2013-01-29T10:46:36-08:00
List-Id: <comp.lang.ada>

OK, so this has gone off-topic; I just want to put it back on track...

I will be implementing a scanner for Ada 2012 and then a parser. I want
to create a subset based on the standard, rather than some educational
subset which doesn't show how a real compiler actually works.

I will read the entire file into a buffer and then scan the buffer. So I
can either break it into lines, which the scanner then breaks up into
tokens (that seems like extra work), or I can read it as I normally
would: scan the buffer a character at a time. It's a state machine, so I
will know when I'm in a comment, etc., and will expect the comment to
end at EOL. I can still check the length of an identifier and chuck out
a warning if it's too long for the implementation, say 250 characters or
whatever.
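
To make the second option concrete, here is a rough sketch of a
character-at-a-time scanner over a whole buffer. It is in Python only
for brevity, and it is nowhere near a real Ada lexer: the names (Token,
scan, MAX_IDENT_LEN) are made up for illustration, the limit of 250 is
just the figure mentioned above, only "\n" is treated as end of line,
and anything that is not whitespace, a comment or an identifier is
emitted as a one-character "other" token.

```python
from typing import NamedTuple

MAX_IDENT_LEN = 250  # hypothetical implementation limit, not from the RM


class Token(NamedTuple):
    kind: str   # "ident", "comment" or "other"
    start: int  # offset of the first character in the buffer
    end: int    # offset one past the last character
    line: int   # 1-based line number, kept only for diagnostics


def scan(buf: str):
    """Scan the whole buffer one character at a time; no line splitting."""
    tokens, warnings = [], []
    i, line, n = 0, 1, len(buf)
    while i < n:
        ch = buf[i]
        if ch == "\n":
            line += 1          # count lines as we go instead of pre-splitting
            i += 1
        elif ch.isspace():
            i += 1
        elif buf.startswith("--", i):
            start = i
            while i < n and buf[i] != "\n":
                i += 1         # a comment simply runs to end of line
            tokens.append(Token("comment", start, i, line))
        elif ch.isalpha():
            start = i
            while i < n and (buf[i].isalnum() or buf[i] == "_"):
                i += 1
            if i - start > MAX_IDENT_LEN:
                warnings.append(
                    f"line {line}: identifier longer than "
                    f"{MAX_IDENT_LEN} characters"
                )
            tokens.append(Token("ident", start, i, line))
        else:
            tokens.append(Token("other", i, i + 1, line))
            i += 1
    return tokens, warnings
```

The line counter is the only place where lines matter here, so a
maximum line length check could be added in the same spot (by
remembering the offset of the last "\n") without ever splitting the
buffer.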

Also, the method of breaking up the buffer into tokens which have start
and end characters is a good one until you have the AST. At that point
you should make copies of the lexemes you need, store them in the AST
and delete the buffer; otherwise the compiler will take up too much
memory keeping all that around. When you start to implement "with", you
will have to read in more specifications, and sometimes bodies in the
case of generics.
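
As a rough illustration of that hand-over, again in Python and with
made-up names (Span, Node, build_with_clause): the tokens are only
start/end offsets into the buffer, and the AST nodes take their own
copies of the text they need, after which the buffer can be dropped.
The spans are written out by hand here; in practice they would come
from the scanner.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    start: int  # offset of the first character of the lexeme
    end: int    # offset one past the last character


@dataclass
class Node:
    kind: str
    text: str = ""                       # the node's own copy of the lexeme
    children: list = field(default_factory=list)


def build_with_clause(buf: str, names: list) -> Node:
    """Build a node for a with clause, copying each unit name out of buf."""
    node = Node("with_clause")
    for span in names:
        # Slicing creates a new string, so the node no longer depends on buf.
        node.children.append(Node("name", buf[span.start:span.end]))
    return node


buf = "with Ada.Text_IO, Interfaces;"
tree = build_with_clause(buf, [Span(5, 16), Span(18, 28)])
del buf  # the source buffer is no longer needed once the AST owns its text
```

In a language without garbage collection the same idea applies, but the
copies and the release of the buffer have to be managed explicitly.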

So, in short, I'm wondering whether it really matters if I break the
input up into lines before scanning or not.

Thanks,
Luke.