From: Lucretia
Newsgroups: comp.lang.ada
Subject: Re: Ada standard and maximum line lengths
Date: Tue, 29 Jan 2013 10:46:36 -0800 (PST)
Message-ID: <80a5c765-e5ff-4e7d-bc1b-e35f92a710a7@googlegroups.com>
References: <8dfcf819-e1d0-4578-a795-a4bf724b5014@googlegroups.com>

Ok, so this has gone off-topic; I just want to put it back on...

I will be implementing a scanner for Ada 2012 and then a parser. I want to create a subset based on the standard rather than some educational subset which doesn't show how a real compiler actually works.

I will read the entire file into a buffer and then scan the buffer. So, I can either break it into lines, which the scanner then breaks up into tokens, which seems like extra work.
Or, I can read it as I normally would and scan the buffer a character at a time. It's a state machine, so I will know when I'm in a comment, etc., and will expect the comment to end at EOL. I can still check the length of an identifier and chuck out a warning if it's too long for the implementation, say 250 characters or whatever.

Also, the method of breaking up the buffer into tokens which have start and end characters is a good one until you have the AST. At that point you should make copies of the lexemes you need, store them in the AST and delete the buffer; otherwise the compiler will take up too much memory keeping all that around. When you start to implement "with", you will have to read in more specifications, and sometimes bodies because of generics.

So, in short, I'm wondering whether it really matters to parse by breaking up into lines or not.

Thanks,
Luke.
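
For illustration, the character-at-a-time option described above might look roughly like the sketch below. It is written in Python rather than Ada purely for brevity, and it is a simplified sketch, not a conforming Ada 2012 scanner: string and character literals, based numbers and compound delimiters such as ":=" are left out. The names Token, scan, lexeme and MAX_IDENT_LEN are hypothetical, and the limit of 250 is just the "say 250" figure from the post.

```python
from dataclasses import dataclass

MAX_IDENT_LEN = 250  # assumed implementation limit ("say 250 characters")


@dataclass
class Token:
    kind: str   # "identifier", "number", "comment" or "symbol"
    start: int  # offset of the first character in the buffer
    end: int    # offset one past the last character
    line: int   # 1-based line number, counted while scanning


def scan(buffer):
    """Scan the whole buffer in one pass; return (tokens, warnings)."""
    tokens, warnings = [], []
    pos, line, n = 0, 1, len(buffer)
    while pos < n:
        ch = buffer[pos]
        if ch == "\n":
            # Lines are only counted, never split out of the buffer.
            line += 1
            pos += 1
        elif ch.isspace():
            pos += 1
        elif buffer.startswith("--", pos):
            # Comment state: consume up to, but not including, the EOL.
            start = pos
            while pos < n and buffer[pos] != "\n":
                pos += 1
            tokens.append(Token("comment", start, pos, line))
        elif ch.isalpha():
            # Identifier state: letters, digits and underscores.
            start = pos
            while pos < n and (buffer[pos].isalnum() or buffer[pos] == "_"):
                pos += 1
            if pos - start > MAX_IDENT_LEN:
                warnings.append(
                    f"line {line}: identifier longer than "
                    f"{MAX_IDENT_LEN} characters"
                )
            tokens.append(Token("identifier", start, pos, line))
        elif ch.isdigit():
            # Simplified number state: digits and underscores only.
            start = pos
            while pos < n and (buffer[pos].isdigit() or buffer[pos] == "_"):
                pos += 1
            tokens.append(Token("number", start, pos, line))
        else:
            # Everything else is treated as a one-character symbol.
            tokens.append(Token("symbol", pos, pos + 1, line))
            pos += 1
    return tokens, warnings


def lexeme(buffer, tok):
    """Copy a token's text out of the buffer, e.g. to store in an AST node."""
    return buffer[tok.start:tok.end]
```

Because each token records only offsets and a line number, the buffer never needs to be split into lines. Once the AST holds copies made with lexeme(), the buffer itself can be released.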