Ada standard and maximum line lengths

comp.lang.ada

* Ada standard and maximum line lengths
@ 2013-01-28  5:02 Lucretia
  2013-01-28  6:01 ` J-P. Rosen
                   ` (4 more replies)
  0 siblings, 5 replies; 64+ messages in thread
From: Lucretia @ 2013-01-28  5:02 UTC (permalink / raw)


Hi,

I was reading the Ada 2012 standard and found this, I was just wondering why there is a maximum line length, it's not like we parse the language a line at a time. Why not just accept it as a stream of tokens and if there are line breaks, ignore?

Luke.




* Re: Ada standard and maximum line lengths
  2013-01-28  5:02 Ada standard and maximum line lengths Lucretia
@ 2013-01-28  6:01 ` J-P. Rosen
  2013-01-28  6:28 ` Jeffrey Carter
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 64+ messages in thread
From: J-P. Rosen @ 2013-01-28  6:01 UTC (permalink / raw)


On 28/01/2013 06:02, Lucretia wrote:
> I was reading the Ada 2012 standard and found this, I was just
> wondering why there is a maximum line length, it's not like we parse
> the language a line at a time. Why not just accept it as a stream of
> tokens and if there are line breaks, ignore?
> 
Since an identifier cannot cross a line boundary, limiting the line
length also limits the max length of an identifier, which can be of
importance for the compiler.

Note that it is a permission to limit the length, no obligation.

-- 
J-P. Rosen
Adalog
2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX
Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00
http://www.adalog.fr




* Re: Ada standard and maximum line lengths
  2013-01-28  5:02 Ada standard and maximum line lengths Lucretia
  2013-01-28  6:01 ` J-P. Rosen
@ 2013-01-28  6:28 ` Jeffrey Carter
  2013-01-28  8:05   ` Niklas Holsti
  2013-01-28  8:18 ` Dmitry A. Kazakov
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 64+ messages in thread
From: Jeffrey Carter @ 2013-01-28  6:28 UTC (permalink / raw)

On 01/27/2013 10:02 PM, Lucretia wrote:
>
> I was reading the Ada 2012 standard and found this, I was just wondering why
> there is a maximum line length, it's not like we parse the language a line at
> a time. Why not just accept it as a stream of tokens and if there are line
> breaks, ignore?

I presume that by "this", you're referring to ARM 2.2, where it says:

"An implementation shall support lines of at least 200 characters in length, not 
counting any characters used to signify the end of a line. An implementation 
shall support lexical elements of at least 200 characters in length. The maximum 
supported line length and lexical element length are implementation defined."

This does not impose a maximum line length; it imposes a minimum value for the 
maximum line length a compiler must accept. An implementation is free to accept 
lines of any length greater than this minimum length that it chooses, including 
no maximum line length. But since the maximum line length chosen by a compiler 
also defines the maximum identifier length accepted by the compiler, and few 
compilers are willing to accept identifiers of any length, most compilers will 
impose a maximum line length.

The importance of a minimum value for the longest line a compiler must accept in 
the standard is that it allows portable programs to be written. By not exceeding 
200-character lines, even if your compiler will accept them, you know that your 
program will be accepted by any compiler.
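
A minimal sketch of such a portability check, written in Python rather than Ada for brevity. The names `overlong_lines` and `PORTABLE_LINE_LIMIT` are made up for illustration, and `str.splitlines()` only approximates whatever a given compiler treats as an end of line:

```python
# Hypothetical portability check: flag lines longer than the 200 characters
# that ARM 2.2 requires every implementation to support.
PORTABLE_LINE_LIMIT = 200


def overlong_lines(source, limit=PORTABLE_LINE_LIMIT):
    """Return (line_number, length) pairs for lines longer than `limit`."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        # splitlines() drops the end-of-line characters, in the spirit of
        # "not counting any characters used to signify the end of a line".
        if len(line) > limit:
            findings.append((number, len(line)))
    return findings
```

A source text that yields an empty list stays within the guaranteed minimum; anything reported relies on an implementation-defined limit.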

-- 
Jeff Carter
"If a sperm is wasted, God gets quite irate."
Monty Python's the Meaning of Life
56


* Re: Ada standard and maximum line lengths
  2013-01-28  6:28 ` Jeffrey Carter
@ 2013-01-28  8:05   ` Niklas Holsti
  2013-01-28 16:42     ` Jeffrey Carter
  0 siblings, 1 reply; 64+ messages in thread
From: Niklas Holsti @ 2013-01-28  8:05 UTC (permalink / raw)


On 13-01-28 08:28 , Jeffrey Carter wrote:
> On 01/27/2013 10:02 PM, Lucretia wrote:
>>
>> I was reading the Ada 2012 standard and found this, I was just
>> wondering why
>> there is a maximum line length, it's not like we parse the language a
>> line at
>> a time. Why not just accept it as a stream of tokens and if there are
>> line
>> breaks, ignore?
> 
> I presume that by "this", you're referring to ARM 2.2, where it says:
> 
> "An implementation shall support lines of at least 200 characters in
> length, not counting any characters used to signify the end of a line.
> An implementation shall support lexical elements of at least 200
> characters in length. The maximum supported line length and lexical
> element length are implementation defined."
> 
> This does not impose a maximum line length; it imposes a minimum value
> for the maximum line length a compiler must accept. An implementation is
> free to accept lines of any length greater than this minimum length that
> it chooses, including no maximum line length.

Yes.

> But since the maximum line
> length chosen by a compiler also defines the maximum identifier length
> accepted by the compiler,

Why? I don't see anything in the ARM quote that requires this. The
limits on the line length and lexical element length are independent,
although they happen to have the same minimum value.

A compiler could have 200 characters as the maximum lexical element and
identifier length, but accept lines of any length, couldn't it?

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .




* Re: Ada standard and maximum line lengths
  2013-01-28  5:02 Ada standard and maximum line lengths Lucretia
  2013-01-28  6:01 ` J-P. Rosen
  2013-01-28  6:28 ` Jeffrey Carter
@ 2013-01-28  8:18 ` Dmitry A. Kazakov
  2013-01-28 10:02   ` Maciej Sobczak
  2013-01-28 13:49 ` Robert A Duff
  2013-01-29 18:46 ` Lucretia
  4 siblings, 1 reply; 64+ messages in thread
From: Dmitry A. Kazakov @ 2013-01-28  8:18 UTC (permalink / raw)


On Sun, 27 Jan 2013 21:02:09 -0800 (PST), Lucretia wrote:

> I was reading the Ada 2012 standard and found this, I was just wondering
> why there is a maximum line length, it's not like we parse the language a
> line at a time.

It is. I always accumulate a whole line before parsing it.

> Why not just accept it as a stream of tokens and if there
> are line breaks, ignore?

For multiple reasons. Specifically regarding Ada, the language has syntax
elements bound by the line end, e.g. comments etc.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de




* Re: Ada standard and maximum line lengths
  2013-01-28  8:18 ` Dmitry A. Kazakov
@ 2013-01-28 10:02   ` Maciej Sobczak
  2013-01-28 11:57     ` Georg Bauhaus
  2013-01-28 15:13     ` Dmitry A. Kazakov
  0 siblings, 2 replies; 64+ messages in thread
From: Maciej Sobczak @ 2013-01-28 10:02 UTC (permalink / raw)
  Cc: mailbox

On Monday, 28 January 2013 09:18:43 UTC+1, Dmitry A. Kazakov wrote:

> > I was reading the Ada 2012 standard and found this, I was just wondering
> > why there is a maximum line length, it's not like we parse the language a
> > line at a time.
> 
> It is. I always accumulate a whole line before parsing it.

But it is an implementation detail of your particular parser. There is nothing in the concept of parsing itself that would require it.

> > Why not just accept it as a stream of tokens and if there
> > are line breaks, ignore?
> 
> For multiple reasons. Specifically regarding Ada, the language has syntax
> elements bound by the line end, e.g. comments etc.

There is no problem reading the stream until some token (newline) is found. Especially if the whole purpose of reading is to ignore the input, as is the case with comments.

I do not see anything that would prevent an implementation that accepts potentially indefinite line lengths, while still putting arbitrary limits on identifiers. The implementation details of some existing parser (like yours) are not a satisfactory explanation.
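
A minimal sketch of that idea, in Python for brevity and under loud assumptions: the function name `stream_identifiers` and the default limit of 200 are invented here, and string, character, and based literals are ignored entirely. It reads one character at a time, never buffers a line, and still enforces an identifier limit:

```python
import io


def stream_identifiers(stream, max_identifier=200):
    """Yield identifier-like words read one character at a time.

    No line is ever buffered, so line length is unbounded, while words
    longer than `max_identifier` characters are rejected.
    """
    current = []
    ch = stream.read(1)
    while ch:
        if ch.isalnum() or ch == "_":
            if len(current) == max_identifier:
                raise ValueError("identifier longer than the supported maximum")
            current.append(ch)
            ch = stream.read(1)
            continue
        if current:
            word = "".join(current)
            current = []
            if word[0].isalpha():  # skip runs that start with a digit
                yield word
        if ch == "-":
            nxt = stream.read(1)
            if nxt == "-":
                # Comment: discard input up to the newline or end of input.
                while ch and ch != "\n":
                    ch = stream.read(1)
                continue
            ch = nxt
            continue
        ch = stream.read(1)
    if current:
        word = "".join(current)
        if word[0].isalpha():
            yield word
```

For example, `list(stream_identifiers(io.StringIO(text)))` copes with a 10,000-character comment line without holding more than one identifier in memory at a time.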

I would not be surprised if the permission to limit is a leftover from the ice age of line printers, without any real association with today's systems.

-- 
Maciej Sobczak * http://www.msobczak.com * http://www.inspirel.com


* Re: Ada standard and maximum line lengths
  2013-01-28 10:02   ` Maciej Sobczak
@ 2013-01-28 11:57     ` Georg Bauhaus
  2013-01-28 13:28       ` Niklas Holsti
                         ` (2 more replies)
  2013-01-28 15:13     ` Dmitry A. Kazakov
  1 sibling, 3 replies; 64+ messages in thread
From: Georg Bauhaus @ 2013-01-28 11:57 UTC (permalink / raw)


On 28.01.13 11:02, Maciej Sobczak wrote:
> I would not be surprised if the permission to limit is a leftover from the ice age of line printers, without any real association with today's systems.

Line lengths surely affect source code quality,
and hence the way programmers can address tasks
of ECRs etc. But how?  If automatic parsers can
easily handle short lines or long lines, there is
then more reason to emphasize why we do have lines
in the first place!

I wonder what will be the effect on working in the
programming profession of a general limit on line
lengths that is, say, <= 100 characters:

Will programmers take a different approach to adding
structure? Smaller subprograms?

Will editing programs improve?
If long lines are written in order to make the control
structure more easily apparent (since indentation
then serves only the purpose of indicating control
structure, no broken statements), would editors with
better folding support become more popular (so as to
exclude lines from the view that have not been indented
to indicate control structure)?





* Re: Ada standard and maximum line lengths
  2013-01-28 11:57     ` Georg Bauhaus
@ 2013-01-28 13:28       ` Niklas Holsti
  2013-01-28 15:14       ` J-P. Rosen
  2013-01-28 16:13       ` Dmitry A. Kazakov
  2 siblings, 0 replies; 64+ messages in thread
From: Niklas Holsti @ 2013-01-28 13:28 UTC (permalink / raw)


On 13-01-28 13:57 , Georg Bauhaus wrote:
> On 28.01.13 11:02, Maciej Sobczak wrote:
>> I would not be surprised if the permission to limit is a leftover
>> from the ice age of line printers, without any real association
>> with today's systems.
> 
> Line lengths surely affect source code quality, and hence the way
> programmers can address tasks of ECRs etc. But how?  If automatic
> parsers can easily handle short lines or long lines, there is then
> more reason to emphasize why we do have lines in the first place!
> 
> I wonder what will be the effect on working in the programming
> profession of a general limit on line lengths that is, say, <= 100
> characters:

None for me. I am perfectly happy to write programs with at most 80
characters per line. If pressed, perhaps I can go up to 100.

But 200-character lines? How does one sensibly do a side-by-side diff,
for example?

> Will editing programs improve? If long lines are written in order to
> make the control structure more easily apparent (since indentation 
> then serves only the purpose of indicating control structure, no
> broken statements),

But breaking long statements into separate lines *helps* readability,
for example by placing each parameter for a call on its own line. And it
helps diffs, too (diffs are a very important part of version and change
tracking, for me at least).

For prose, narrow columns are known to be easier to read than page-wide
columns. The normal width of a book page is about the upper limit for me
-- any wider and my eyes lose track when doing "carriage return", so to
speak :-)

All hail the 80-column line!

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .




* Re: Ada standard and maximum line lengths
  2013-01-28  5:02 Ada standard and maximum line lengths Lucretia
                   ` (2 preceding siblings ...)
  2013-01-28  8:18 ` Dmitry A. Kazakov
@ 2013-01-28 13:49 ` Robert A Duff
  2013-01-29  2:09   ` Randy Brukardt
  2013-01-29 18:46 ` Lucretia
  4 siblings, 1 reply; 64+ messages in thread
From: Robert A Duff @ 2013-01-28 13:49 UTC (permalink / raw)

Lucretia <laguest9000@googlemail.com> writes:

> I was reading the Ada 2012 standard and found this, I was just
> wondering why there is a maximum line length, ...

I don't think the RM should have that, and I think compilers
ought to support arbitrary line lengths and identifier lengths,
up to whatever limits are imposed by the hardware and operating
system, or at least high enough that nobody will run into them
(including in automatically-generated code).

>...it's not like we parse
> the language a line at a time. Why not just accept it as a stream of
> tokens and if there are line breaks, ignore?

Line breaks aren't ignored (they terminate comments, and they are
illegal in string literals), but yeah, there's no problem parsing
files with very long lines.

- Bob


* Re: Ada standard and maximum line lengths
  2013-01-28 10:02   ` Maciej Sobczak
  2013-01-28 11:57     ` Georg Bauhaus
@ 2013-01-28 15:13     ` Dmitry A. Kazakov
  1 sibling, 0 replies; 64+ messages in thread
From: Dmitry A. Kazakov @ 2013-01-28 15:13 UTC (permalink / raw)

On Mon, 28 Jan 2013 02:02:54 -0800 (PST), Maciej Sobczak wrote:

> On Monday, 28 January 2013 09:18:43 UTC+1, Dmitry A. Kazakov wrote:
> 
>>> I was reading the Ada 2012 standard and found this, I was just wondering
>>> why there is a maximum line length, it's not like we parse the language a
>>> line at a time.
>> 
>> It is. I always accumulate a whole line before parsing it.
> 
> But it is an implementation detail of your particular parser.

As much as the reverse would be. [I answered because the OP presumed that
no parser does it this way. At least one (mine) does.]

> There is nothing in the concept of parsing itself that would require it.

Actually there is something. It is the backtracking/look-ahead, which
determines how much of the source needs to be cached by the parser. One
stream element is usually not enough, at least not sufficient to generate
reasonable error messages. Lines are most natural to determine the
boundaries of backtracking/look-ahead.

>>> Why not just accept it as a stream of tokens and if there
>>> are line breaks, ignore?
>> 
>> For multiple reasons. Specifically regarding Ada, the language has syntax
>> elements bound by the line end, e.g. comments etc.
> 
> There is no problem reading the stream until some token (newline) is
> found. Especially if the whole purpose of reading is to ignore the input,
> as is the case with comments.

That would not ignore line breaks as the OP suggested.

> I do not see anything that would prevent an implementation that accepts
> potentially indefinite line lengths, while still putting arbitrary limits
> on identifiers.

+ limits on the string literal length.

Otherwise, yes, it would be possible to do, though useless. Because other
tools depend on sane line lengths even more than the parser does, e.g.
editors, debuggers, documentation tools, and, after all, addr2line.

> I would not be surprised if the permission to limit is a leftover from the
> ice age of line printers, without any real association with today's
> systems.

Rather the opposite. If lines are the ice age then unget is Permian.

[ A parser like mine ensures that each lexical element in the focus is
cached. This makes it possible to avoid secondary caches and related overhead. E.g. an
identifier or string literal is not accumulated during parsing. Rather its
starting and end positions are determined and passed further. It is all
in-place until stored. ]
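
A minimal sketch of this in-place approach, in Python for brevity. It is not the actual parser described above: the function name `identifier_spans` is invented, and string, character, and numeric literals are ignored. The whole line is buffered, and each identifier is reported only as a (start, end) pair into that buffer:

```python
def identifier_spans(line):
    """Return (start, end) index pairs for identifier-like runs in one line.

    The line is the only buffer; nothing is copied out of it. Scanning
    stops at "--", since an Ada comment runs to the end of the line.
    """
    spans = []
    i, n = 0, len(line)
    while i < n:
        if line.startswith("--", i):
            break
        if line[i].isalpha():
            start = i
            while i < n and (line[i].isalnum() or line[i] == "_"):
                i += 1
            spans.append((start, i))
        else:
            i += 1
    return spans
```

A caller can slice `line[start:end]` only at the point where the name actually has to be stored, which is the "in-place until stored" property.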

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: Ada standard and maximum line lengths
  2013-01-28 11:57     ` Georg Bauhaus
  2013-01-28 13:28       ` Niklas Holsti
@ 2013-01-28 15:14       ` J-P. Rosen
  2013-01-28 16:13       ` Dmitry A. Kazakov
  2 siblings, 0 replies; 64+ messages in thread
From: J-P. Rosen @ 2013-01-28 15:14 UTC (permalink / raw)


On 28/01/2013 12:57, Georg Bauhaus wrote:
> I wonder what will be the effect on working in the
> programming profession of a general limit on line
> lengths that is, say, <= 100 characters:
I have seen coding standards that limited the line length to 72
characters, for historical (punch cards!) reasons. The code was quite
horrible to read.

OTOH, a sensible limit is a good thing. My own personal taste is 120
characters.

-- 
J-P. Rosen
Adalog
2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX
Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00
http://www.adalog.fr




* Re: Ada standard and maximum line lengths
  2013-01-28 11:57     ` Georg Bauhaus
  2013-01-28 13:28       ` Niklas Holsti
  2013-01-28 15:14       ` J-P. Rosen
@ 2013-01-28 16:13       ` Dmitry A. Kazakov
  2 siblings, 0 replies; 64+ messages in thread
From: Dmitry A. Kazakov @ 2013-01-28 16:13 UTC (permalink / raw)


On Mon, 28 Jan 2013 12:57:29 +0100, Georg Bauhaus wrote:

> I wonder what will be the effect on working in the
> programming profession of a general limit on line
> lengths that is, say, <= 100 characters:

I have 72 - the project tree and two source files side by side under GPS.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de




* Re: Ada standard and maximum line lengths
  2013-01-28  8:05   ` Niklas Holsti
@ 2013-01-28 16:42     ` Jeffrey Carter
  2013-01-28 20:22       ` Niklas Holsti
  0 siblings, 1 reply; 64+ messages in thread
From: Jeffrey Carter @ 2013-01-28 16:42 UTC (permalink / raw)

On 01/28/2013 01:05 AM, Niklas Holsti wrote:
> On 13-01-28 08:28 , Jeffrey Carter wrote:
>
>> But since the maximum line
>> length chosen by a compiler also defines the maximum identifier length
>> accepted by the compiler,
>
> Why? I don't see anything in the ARM quote that requires this. The
> limits on the line length and lexical element length are independent,
> although they happen to have the same minimum value.

This isn't addressed directly in the quote. An identifier is a sequence of 
characters from certain classes, terminated by a delimiter; a line terminator is 
a delimiter. In the absence of another restriction on identifier length, clearly 
an identifier cannot be longer than the maximum line length.

> A compiler could have 200 characters as the maximum lexical element and
> identifier length, but accept lines of any length, couldn't it?

Presumably it could. I thought efficiency in compiling was the reason behind 
limiting identifier length (being able to build efficient compilers is one of 
the constraints on the ARG), but Duff's comments in this thread lead me to doubt 
that.

-- 
Jeff Carter
"What I wouldn't give for a large sock with horse manure in it."
Annie Hall
42


* Re: Ada standard and maximum line lengths
  2013-01-28 16:42     ` Jeffrey Carter
@ 2013-01-28 20:22       ` Niklas Holsti
  2013-01-28 20:46         ` J-P. Rosen
  0 siblings, 1 reply; 64+ messages in thread
From: Niklas Holsti @ 2013-01-28 20:22 UTC (permalink / raw)

On 13-01-28 18:42 , Jeffrey Carter wrote:
> On 01/28/2013 01:05 AM, Niklas Holsti wrote:
>> On 13-01-28 08:28 , Jeffrey Carter wrote:
>>
>>> But since the maximum line
>>> length chosen by a compiler also defines the maximum identifier length
>>> accepted by the compiler,
>>
>> Why? I don't see anything in the ARM quote that requires this. The
>> limits on the line length and lexical element length are independent,
>> although they happen to have the same minimum value.
> 
> This isn't addressed directly in the quote. An identifier is a sequence
> of characters from certain classes, terminated by a delimiter; a line
> terminator is a delimiter. In the absence of another restriction on
> identifier length, clearly an identifier cannot be longer than the
> maximum line length.

Agreed.

But the maximum length of an identifier can be *shorter* than the
maximum length of a line. You wrote, above, that the max line length
"defines" the max identifier length. Perhaps you really meant to say
that the max line length sets an upper bound on the max identifier
length, which is true.

>> A compiler could have 200 characters as the maximum lexical element and
>> identifier length, but accept lines of any length, couldn't it?
> 
> Presumably it could. I thought efficiency in compiling was the reason
> behind limiting identifier length

My point is that the compiler's max identifier length can be *less* than
its max line length. Even if a limit on identifier length makes a
compiler more efficient (which I doubt is a significant factor today, at
least if the majority of identifiers are of moderate length), the
compiler could still accept lines of any length, much longer than the
max identifier length.

Returning to the ARM limits, in principle it is necessary to have some
safe lower limits on what a compiler must accept, so that one can be
sure that some strange compiler will not reject one's Ada program just
because it has too long identifiers or lines.

In practice, today this portability could probably work as well based on
informal "usability" criteria -- clearly an Ada compiler with a maximum
identifier length of 10-15 characters (or whatever minimum maximum is
imposed by the predefined identifiers) would often annoy its users.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .


* Re: Ada standard and maximum line lengths
  2013-01-28 20:22       ` Niklas Holsti
@ 2013-01-28 20:46         ` J-P. Rosen
  2013-01-28 21:29           ` Niklas Holsti
  0 siblings, 1 reply; 64+ messages in thread
From: J-P. Rosen @ 2013-01-28 20:46 UTC (permalink / raw)


On 28/01/2013 21:22, Niklas Holsti wrote:
> My point is that the compiler's max identifier length can be *less* than
> its max line length.
Can you point to the RM verse that allows you to think that the compiler is
allowed to put a max to an identifier length, other than the one that
results "naturally" from the max line length?

Of course, you can appeal to the "exceed the capacity of the compiler"
argument, but I don't think the ACAA would buy this, since all current
compilers have no such limitation.

-- 
J-P. Rosen
Adalog
2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX
Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00
http://www.adalog.fr




* Re: Ada standard and maximum line lengths
  2013-01-28 20:46         ` J-P. Rosen
@ 2013-01-28 21:29           ` Niklas Holsti
  2013-01-29  1:42             ` Randy Brukardt
  2013-01-29  6:15             ` J-P. Rosen
  0 siblings, 2 replies; 64+ messages in thread
From: Niklas Holsti @ 2013-01-28 21:29 UTC (permalink / raw)

On 13-01-28 22:46 , J-P. Rosen wrote:
> On 28/01/2013 21:22, Niklas Holsti wrote:
>> My point is that the compiler's max identifier length can be *less* than
>> its max line length.
> Can you point to the RM verse that allows you to think that the compiler is
> allowed to put a max to an identifier length, other than the one that
> results "naturally" from the max line length?

ARM 2.2(14), the same part that Jeffrey quoted:

"An implementation shall support lines of at least 200 characters in
length, not counting any characters used to signify the end of a line.
An implementation shall support lexical elements of at least 200
characters in length. The maximum supported line length and lexical
element length are implementation defined."

This is followed by the note (maybe only in the AARM, which is what I
have in hand):

"Implementation defined: Maximum supported line length and lexical
element length."

As I understand ARM 2.2(14), it does not define a coupling between
maximum line length and maximum lexical-element length (i.e. maximum
identifier length). It requires both limits to be at least 200
characters, but does not require them to be equal. The limits are stated
in separate sentences -- the two first sentences in the quote.

Perhaps the third and last sentence in ARM 2.2(14), which mentions both
limits in one sentence, confuses the issue. Occasionally, people write
sentences of the form "The A and B are ..." and imply that A and B are
the same.

If the limits are meant to be coupled (equal), there should be some text
such as "An implementation shall accept lexical elements that are as
long as the maximum supported line length". But that is not what the ARM
says. And I don't think that one can deduce such a rule just from the
fact that the required minimum limits are both 200 characters.

In the same post in which Jeffrey quoted ARM 2.2(14), it seemed he was
reading it to mean that there is a coupling. I was asking how he deduced
that coupling. I'm asking the same question as you (J-P.), but from the
other side of the burden of proof.

So, can you show the ARM verse that says that a compiler is not allowed
to limit identifiers to 200 characters, if it supports lines of 10,000
characters?

I haven't scanned the whole ARM to see if there is some other text that
requires max line length and max identifier length to be coupled. But it
would be strange to have such text in some other distant part of the
ARM, and not in 2.2 where it would clearly belong.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .


* Re: Ada standard and maximum line lengths
  2013-01-28 21:29           ` Niklas Holsti
@ 2013-01-29  1:42             ` Randy Brukardt
  2013-01-29  6:15             ` J-P. Rosen
  1 sibling, 0 replies; 64+ messages in thread
From: Randy Brukardt @ 2013-01-29  1:42 UTC (permalink / raw)


"Niklas Holsti" <niklas.holsti@tidorum.invalid> wrote in message 
news:amo8u2Fh3ngU1@mid.individual.net...
> On 13-01-28 22:46 , J-P. Rosen wrote:
>> On 28/01/2013 21:22, Niklas Holsti wrote:
>>> My point is that the compiler's max identifier length can be *less* than
>>> its max line length.
>> Can you point to the RM verse that allows you to think that the compiler is
>> allowed to put a max to an identifier length, other than the one that
>> results "naturally" from the max line length?
>
> ARM 2.2(14), the same part that Jeffrey quoted:
>
> "An implementation shall support lines of at least 200 characters in
> length, not counting any characters used to signify the end of a line.
> An implementation shall support lexical elements of at least 200
> characters in length. The maximum supported line length and lexical
> element length are implementation defined."

Interestingly, the ACATS (and ACVC before it) has always assumed that the 
only limit is the one on the line length. The ACATS has tests parameterized 
by line length, and these include identifiers up to the line length -- an 
implementation would have to get a waiver to make the identifier length 
shorter than the line length. So far as I know, no one has ever asked.

Ada 83 had no explicit allowance for limitations, which is probably where 
the ACATS gets its requirements from. Ada 95 tried to add a minimum line 
length, but seems to have (unintentionally?) allowed identifiers shorter 
than the full line length. I tend to agree with your reading of the RM, but 
it doesn't match practice because of course implementations also have to 
follow the ACATS requirements.

                                   Randy.


* Re: Ada standard and maximum line lengths
  2013-01-28 13:49 ` Robert A Duff
@ 2013-01-29  2:09   ` Randy Brukardt
  0 siblings, 0 replies; 64+ messages in thread
From: Randy Brukardt @ 2013-01-29  2:09 UTC (permalink / raw)

"Robert A Duff" <bobduff@shell01.TheWorld.com> wrote in message 
news:wccwquxo1se.fsf@shell01.TheWorld.com...
> Lucretia <laguest9000@googlemail.com> writes:
>
>> I was reading the Ada 2012 standard and found this, I was just
>> wondering why there is a maximum line length, ...
>
> I don't think the RM should have that, and I think compilers
> ought to support arbitrary line lengths and identifier lengths,
> up to whatever limits are imposed by the hardware and operating
> system, or at least high enough that nobody will run into them
> (including in automatically-generated code).

FYI, Janus/Ada stores (all) identifiers in a single array of bytes with each 
one preceded by the length in bytes (stored as a byte), so there is a 
natural limit of 250ish characters for such an implementation. I'd be pretty 
annoyed to have to change the scanner/parser/etc. to eliminate that limit 
(it's not like there is a flood of complaints about it), especially as it 
would slow down scanning (tokenizing) by a fairly significant amount.

So there is a bit of value to allowing a limit. Probably a new compiler 
design could and perhaps should avoid it, but there are a lot of existing 
Ada compilers out there and it would be silly to force them to change this 
way (there are zillions of things more important to change in Janus/Ada).
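
A minimal sketch of that kind of storage, in Python for brevity. It only illustrates the idea of a one-byte length prefix; the class name `IdentifierTable`, the Latin-1 encoding, and the limit of exactly 255 are assumptions made here, not the actual Janus/Ada layout:

```python
class IdentifierTable:
    """All identifiers in one byte array, each preceded by a length byte."""

    MAX_LENGTH = 255  # largest value a single length byte can hold

    def __init__(self):
        self._bytes = bytearray()

    def add(self, name):
        """Append `name` and return the offset of its length byte."""
        encoded = name.encode("latin-1")  # assumption: one byte per character
        if len(encoded) > self.MAX_LENGTH:
            raise ValueError("identifier does not fit a one-byte length prefix")
        offset = len(self._bytes)
        self._bytes.append(len(encoded))
        self._bytes.extend(encoded)
        return offset

    def get(self, offset):
        """Return the identifier whose length byte is stored at `offset`."""
        length = self._bytes[offset]
        return self._bytes[offset + 1:offset + 1 + length].decode("latin-1")
```

Lifting the limit in such a layout would mean a wider or variable-width length field, and every piece of code that walks the array would have to change with it.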

                                            Randy.


* Re: Ada standard and maximum line lengths
  2013-01-28 21:29           ` Niklas Holsti
  2013-01-29  1:42             ` Randy Brukardt
@ 2013-01-29  6:15             ` J-P. Rosen
  2013-01-29 10:25               ` Niklas Holsti
  1 sibling, 1 reply; 64+ messages in thread
From: J-P. Rosen @ 2013-01-29  6:15 UTC (permalink / raw)


On 28/01/2013 22:29, Niklas Holsti wrote:
> In the same post in which Jeffrey quoted ARM 2.2(14), it seemed he was
> reading it to mean that there is a coupling. I was asking how he deduced
> that coupling. I'm asking the same question as you (J-P.), but from the
> other side of the burden of proof.
> 
> So, can you show the ARM verse that says that a compiler is not allowed
> to limit identifiers to 200 characters, if it supports lines of 10,000
> characters?
I don't have much to add to Randy's response. Yes, you can read the RM
this way, and no it's not the way it has been interpreted either by the
ACATS or implementers.

Since I see no benefit in having them different, I think a fix in the RM
is the best thing to do.

-- 
J-P. Rosen
Adalog
2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX
Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00
http://www.adalog.fr




* Re: Ada standard and maximum line lengths
  2013-01-29  6:15             ` J-P. Rosen
@ 2013-01-29 10:25               ` Niklas Holsti
  2013-01-29 11:31                 ` Georg Bauhaus
  2013-01-29 20:36                 ` Niklas Holsti
  0 siblings, 2 replies; 64+ messages in thread
From: Niklas Holsti @ 2013-01-29 10:25 UTC (permalink / raw)

On 13-01-29 08:15 , J-P. Rosen wrote:
> On 28/01/2013 22:29, Niklas Holsti wrote:
>> In the same post in which Jeffrey quoted ARM 2.2(14), it seemed he was
>> reading it to mean that there is a coupling. I was asking how he deduced
>> that coupling. I'm asking the same question as you (J-P.), but from the
>> other side of the burden of proof.
>>
>> So, can you show the ARM verse that says that a compiler is not allowed
>> to limit identifiers to 200 characters, if it supports lines of 10,000
>> characters?
> I don't have much to add to Randy's response. Yes, you can read the RM
> this way,

Good that we agree on that.

> and no it's not the way it has been interpreted either by the
> ACATS or implementers.

Seems to be legacy from Ada 83, which should have been changed (in my
reading of the ARM, which I still see as more correct) for Ada 95.

> Since I see no benefit in having them different, I think a fix in the RM
> is the best thing to do.

In many programming languages and implementations from the last century,
maximum identifier length was much less than maximum line length. BASIC
originally had 1-letter identifiers. IIRC Fortran had a 6-character limit.

I don't mind if the ARM is changed (or interpreted) to require a
compiler to support identifiers as long as the longest possible source
line. I don't think it is important one way or the other, as long as my
compiler supports reasonable lengths. 200 characters is ok as a max line
length. 200 for an identifier is overkill, IMO.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Ada standard and maximum line lengths
  2013-01-29 10:25               ` Niklas Holsti
@ 2013-01-29 11:31                 ` Georg Bauhaus
  2013-01-29 12:11                   ` Simon Wright
  2013-01-29 12:31                   ` Niklas Holsti
  2013-01-29 20:36                 ` Niklas Holsti
  1 sibling, 2 replies; 64+ messages in thread
From: Georg Bauhaus @ 2013-01-29 11:31 UTC (permalink / raw)


On 29.01.13 11:25, Niklas Holsti wrote:
> 200 characters is ok as a max line
> length. 200 for an identifier is overkill, IMO.

When the meaning of "meaning" is defined by a programming
environment, naming capabilities may need to tackle names that the
environment imposes on the source. The names may exist for, and be
shaped by, hysterical raisins. For example, the following is listed
as a symbol from the text section of a standard C++ library:

__ZNSt3__112basic_stringIwNS_11char_traitsIwEENS_9allocatorIwEEE6insertIPKwEENS_9enable_ifIXsrNS_21__is_forward_iteratorIT_EE5valueENS_11__wrap_iterIPwEEE4typeENSD_IS8_EESB_SB_






^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Ada standard and maximum line lengths
  2013-01-29 11:31                 ` Georg Bauhaus
@ 2013-01-29 12:11                   ` Simon Wright
  2013-01-29 12:31                   ` Niklas Holsti
  1 sibling, 0 replies; 64+ messages in thread
From: Simon Wright @ 2013-01-29 12:11 UTC (permalink / raw)


Georg Bauhaus <rm.dash-bauhaus@futureapps.de> writes:

> On 29.01.13 11:25, Niklas Holsti wrote:
>> 200 characters is ok as a max line
>> length. 200 for an identifier is overkill, IMO.
>
> When the meaning of "meaning" is defined by a programming
> environment, naming capabilities may need to tackle names that the
> environment imposes on the source. The names may exist for, and be
> shaped by, hysterical raisins. For example, the following is listed
> as a symbol from the text section of a standard C++ library:
>
> __ZNSt3__112basic_stringIwNS_11char_traitsIwEENS_9allocatorIwEEE6insertIPKwEENS_9enable_ifIXsrNS_21__is_forward_iteratorIT_EE5valueENS_11__wrap_iterIPwEEE4typeENSD_IS8_EESB_SB_

176 characters. Any advance?!



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Ada standard and maximum line lengths
  2013-01-29 11:31                 ` Georg Bauhaus
  2013-01-29 12:11                   ` Simon Wright
@ 2013-01-29 12:31                   ` Niklas Holsti
  2013-01-29 12:37                     ` Niklas Holsti
  2013-01-29 15:29                     ` Georg Bauhaus
  1 sibling, 2 replies; 64+ messages in thread
From: Niklas Holsti @ 2013-01-29 12:31 UTC (permalink / raw)

On 13-01-29 13:31 , Georg Bauhaus wrote:
> On 29.01.13 11:25, Niklas Holsti wrote:
>> 200 characters is ok as a max line
>> length. 200 for an identifier is overkill, IMO.
> 
> When the meaning of "meaning" is defined by a programming
> environment, naming capabilities may need to tackle names that the
> environment imposes on the source. The names may exist for, and be
> shaped by, hysterical raisins. For example, the following is listed
> as a symbol from the text section of a standard C++ library:
> 
> __ZNSt3__112basic_stringIwNS_11char_traitsIwEENS_9allocatorIwEEE6insertIPKwEENS_9enable_ifIXsrNS_21__is_forward_iteratorIT_EE5valueENS_11__wrap_iterIPwEEE4typeENSD_IS8_EESB_SB_

As you well know, that is a compiler-generated mangled "symbol", not a
user-written identifier. Its extreme length is an artefact of the
historical limitations on linkers and the linking process, in particular
the lack of name-space control and of overloaded symbols (resolved by
type or profile) and the requirement for separate compilation of modules
without any central compilation/information library. This forces the C++
compiler to encode many properties of the C++ object, named by some much
shorter source-code identifier, into the symbol used for linking.

Even for hypothetical SW of some sort that generates Ada source code
automatically, I think it very unlikely that it would generate
identifiers of that length, since Ada identifiers can be qualified and
sorted into packages etc. much more flexibly than is the case for linker
symbols.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Ada standard and maximum line lengths
  2013-01-29 12:31                   ` Niklas Holsti
@ 2013-01-29 12:37                     ` Niklas Holsti
  2013-01-29 15:29                     ` Georg Bauhaus
  1 sibling, 0 replies; 64+ messages in thread
From: Niklas Holsti @ 2013-01-29 12:37 UTC (permalink / raw)


On 13-01-29 14:31 , Niklas Holsti wrote:
> On 13-01-29 13:31 , Georg Bauhaus wrote:
>> On 29.01.13 11:25, Niklas Holsti wrote:
>>> 200 characters is ok as a max line
>>> length. 200 for an identifier is overkill, IMO.
>>
>> When the meaning of "meaning" is defined by a programming
>> environment, naming capabilities may need to tackle names that the
>> environment imposes on the source. The names may exist for, and be
>> shaped by, hysterical raisins. For example, the following is listed
>> as a symbol from the text section of a standard C++ library:
>>
>> __ZNSt3__112basic_stringIwNS_11char_traitsIwEENS_9allocatorIwEEE6insertIPKwEENS_9enable_ifIXsrNS_21__is_forward_iteratorIT_EE5valueENS_11__wrap_iterIPwEEE4typeENSD_IS8_EESB_SB_
> 
> As you well know, that is a compiler-generated mangled "symbol", not a
> user-written identifier.

One more point: even if you want to link Ada code to this C++ library,
that long symbol would be a string in a pragma, not an Ada identifier.
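
A minimal sketch of that point, assuming a hypothetical C++ free function
demo::insert(int) and the Itanium-style mangled name that a GNU toolchain
would typically give it. The package name, the subprogram name and the
symbol are all made up for illustration, and whether convention C is
adequate for calling a given C++ routine depends on the platform ABI:

```ada
with Interfaces.C;

package Demo_Binding is

   --  Hypothetical binding. The Ada identifier stays short; the mangled
   --  linker symbol appears only as a string literal, so the limit on
   --  identifier length never comes into play.
   procedure Insert (Value : Interfaces.C.int);
   pragma Import
     (Convention => C,
      Entity     => Insert,
      Link_Name  => "_ZN4demo6insertEi");

end Demo_Binding;
```

The string literal is still a lexical element, so the line-length and
lexical-element limits of ARM 2.2(14) can apply to it, but the rules for
identifiers do not.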

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Ada standard and maximum line lengths
  2013-01-29 12:31                   ` Niklas Holsti
  2013-01-29 12:37                     ` Niklas Holsti
@ 2013-01-29 15:29                     ` Georg Bauhaus
  2013-01-29 16:58                       ` Niklas Holsti
  1 sibling, 1 reply; 64+ messages in thread
From: Georg Bauhaus @ 2013-01-29 15:29 UTC (permalink / raw)

On 29.01.13 13:31, Niklas Holsti wrote:
> Even for hypothetical SW of some sort that generates Ada source code
> automatically, I think it very unlikely that it would generate
> identifiers of that length, since Ada identifiers can be qualified and
> sorted into packages etc. much more flexibly than is the case for linker
> symbols.

GNAT writes similar identifiers into object files, and
the same hysterical raisins spoil the cake. At least the
names---those to be exported to other compilers---seem shorter
if package hierarchies are not used extensively.

Alas, the problem of exceedingly long names in object files
can be worked around by allowing lengthy identifiers in source
text. There is a better alternative. It needs to address the
issue at some object interchange level, which is typed.
Something between function libraries from around 1970 and
{.NET, JVM, Corba, Zero-C Ice, ...}.

An incentive is needed that creates fruitful cooperation of
languages, if that is where long identifiers are a pain, and
a source of cumbersome, costly configuration.
For example, mathematicians and/or computer scientists might
like the subject if dressed like this:

  "On a denotational semantics of typed object descriptions.
   With an application to the Intel ABI."

(Anyone who could suggest this as a topic for a few dissertations?)

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Ada standard and maximum line lengths
  2013-01-29 15:29                     ` Georg Bauhaus
@ 2013-01-29 16:58                       ` Niklas Holsti
  2013-01-29 17:51                         ` Georg Bauhaus
  0 siblings, 1 reply; 64+ messages in thread
From: Niklas Holsti @ 2013-01-29 16:58 UTC (permalink / raw)

On 13-01-29 17:29 , Georg Bauhaus wrote:
> On 29.01.13 13:31, Niklas Holsti wrote:
>> Even for hypothetical SW of some sort that generates Ada source code
>> automatically, I think it very unlikely that it would generate
>> identifiers of that length, since Ada identifiers can be qualified and
>> sorted into packages etc. much more flexibly than is the case for linker
>> symbols.
> 
> GNAT writes similar identifiers into object files, and
> the same hysterical raisins spoil the cake. At least the
> names---those to be exported to other compilers---seem shorter
> if package hierarchies are not used extensively.

So what? These are *linker symbols*, not Ada (source) identifiers. I
don't understand why you bring them into a discussion of source-code
identifier lengths.

In many cases, for example for the C++ symbol that you gave, the symbols
are not even lexically legal Ada identifiers.

> Alas, the problem of exceedingly long names in object files
> can be worked around by allowing lengthy identifiers in source
> text.

Incomprehensible. Is there a "not" missing somewhere in that sentence?

While long linker symbols are ugly, and a poor work-around for linker
weaknesses, they are troublesome only for people who deal with low-level
debugging, or other tools on that level.

> There is a better alternative. It needs to address the
> issue at some object interchange level, which is typed.
> Something between function libraries from around 1970 and
> {.NET, JVM, Corba, Zero-C Ice, ...}.

Certainly the linker principles and operations could stand improvement
and modernisation. Not to speak of the unspeakably (as I said :-)) poor
documentation of symbolic information emitted by compilers and stored in
"standard" forms like DWARF (which is still vastly better than the older
standards for debug info).

> An incentive is needed that creates fruitful cooperation of
> languages, if that is where long identifiers are a pain, and
> a source of cumbersome, costly configuration.

As I've understood it, the GNAT Ada/C++ linkage does *not* involve
typing those long, mangled linker symbols into an Ada program as
identifiers. GNAT understands the GNU C++ compiler well enough to hide them.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Ada standard and maximum line lengths
  2013-01-29 16:58                       ` Niklas Holsti
@ 2013-01-29 17:51                         ` Georg Bauhaus
  2013-01-29 18:18                           ` Shark8
  2013-01-29 19:54                           ` Niklas Holsti
  0 siblings, 2 replies; 64+ messages in thread
From: Georg Bauhaus @ 2013-01-29 17:51 UTC (permalink / raw)

On 29.01.13 17:58, Niklas Holsti wrote:
> On 13-01-29 17:29 , Georg Bauhaus wrote:
>> On 29.01.13 13:31, Niklas Holsti wrote:
>>> Even for hypothetical SW of some sort that generates Ada source code
>>> automatically, I think it very unlikely that it would generate
>>> identifiers of that length, since Ada identifiers can be qualified and
>>> sorted into packages etc. much more flexibly than is the case for linker
>>> symbols.
>>
>> GNAT writes similar identifiers into object files, and
>> the same hysterical raisins spoil the cake. At least the
>> names---those to be exported to other compilers---seem shorter
>> if package hierarchies are not used extensively.
> 
> So what? These are *linker symbols*, not Ada (source) identifiers. I
> don't understand why you bring them into a discussion of source-code
> identifier lengths.

Linker symbols reappear in source text.

Off the island of GCC-only systems, for example. While it might
seem perfectly clear that language L defines a type of specific
objects that have primitive operations, there is no way to express
this in standard Ada. Fallback: linker symbols, function libraries,
and implementation-defined pragmas.

So, linker symbols are two-sided creatures: on the one side, they
are intended for use by linkers; on the other side, they reappear
in source text, even when "just in a pragma" or some such.
And not portably, at that, thereby destroying the value of
standardization of, for example, Ada and C++.

> In many cases, for example for the C++ symbol that you gave, the symbols
> are not even lexically legal Ada identifiers.
> 
>> Alas, the problem of exceedingly long names in object files
>> can be worked around by allowing lengthy identifiers in source
>> text.
> 
> Incomprehensible. Is there a "not" missing somewhere in that sentence?

By the response above, I need to accept long identifiers in
practice; it is a fact that compilers support the reality
of excessively long identifiers and so these reappear in source
text, because(!) language design is only infrequently responsive
to anything outside the fence; it seems a happy coincidence when
some of the respective teams join to talk about the effects
of the fence. I recall that, just recently, modernization, and
correction, of Ada's Fortran compatibility was considered because
someone took initiative. Just that.

So no, the above sentence is not missing a "not": the lack of
advancement in linking is worked around by allowing terribly
simplistic naming to reappear in source text:

   valid standard C++ program     valid standard Ada program
              |                             |
                              _
                              |  (non-standard pragmas)
                              _
                 /                       \
            compilers A+B            compilers C+E
                 |                       |
                OK                      NOK

The same valid programs, translated on the same platform, can
be rejected because the program needs to import the long,
non-standard, non-portable names.

So it seems obvious that there is an opportunity for
improvement and modernization of linking by addressing
naming in programs written in Ada, C++, and practically
related languages. We have type theory. We have objects.
What we don't have is types at the object code level
("object code. Ha!"). But could we?

Hence, what will .o/.obj files look like if both C++ and
Ada wanted to let programmers write "normal length" identifiers
for cross-language types, say?

Going back to my initial question. Suppose linker symbols
were structured. Won't the rather technical need for allowing
long identifiers in Ada source text just vanish?

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Ada standard and maximum line lengths
  2013-01-29 17:51                         ` Georg Bauhaus
@ 2013-01-29 18:18                           ` Shark8
  2013-01-29 19:54                           ` Niklas Holsti
  1 sibling, 0 replies; 64+ messages in thread
From: Shark8 @ 2013-01-29 18:18 UTC (permalink / raw)

On Tuesday, January 29, 2013 11:51:52 AM UTC-6, Georg Bauhaus wrote:
> On 29.01.13 17:58, Niklas Holsti wrote:
> 
> So it seems obvious that there is an opportunity for
> improvement and modernization of linking by addressing
> naming in programs written in Ada, C++, and practically
> related languages. We have type theory. We have objects.
> What we don't have is types at the object code level
> ("object code. Ha!"). But could we?

Yes, we could. It would require a completely different OBJ format than is currently used and would likely benefit from not using any existing OBJ format as a base, forcing a reworking of the linking as well.

> 
> Hence, what will .o/.obj files look like if both C++ and
> Ada wanted to let programmers write "normal length" identifiers
> for cross-language types, say?

Well IIUC, that *is* doable -- just look at OpenVMS. Of course this is achieved by enforcing (1) a common language runtime, ["Common Language Environment" in VMS parlance], and (2) the "OpenVMS Calling Standard".

> 
> Going back to my initial question. Suppose linker symbols
> were structured. Won't the rather technical need for allowing
> long identifiers in Ada source text just vanish?

I don't know. As someone mentioned up-thread there's also the 'need' to allow for tool/auto-generated code -- I can see that as more prone to producing long-names than the OBJ-situation, especially because the OBJ situation *IS* an instance of tool-generated names.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Ada standard and maximum line lengths
  2013-01-28  5:02 Ada standard and maximum line lengths Lucretia
                   ` (3 preceding siblings ...)
  2013-01-28 13:49 ` Robert A Duff
@ 2013-01-29 18:46 ` Lucretia
  2013-01-29 20:53   ` Robert A Duff
                     ` (3 more replies)
  4 siblings, 4 replies; 64+ messages in thread
From: Lucretia @ 2013-01-29 18:46 UTC (permalink / raw)


Ok, so this has gone off-topic, I just want to put it back on...

I will be implementing a scanner for Ada 2012 and then a parser. I want to create a subset based on the standard, rather than some educational subset which doesn't show how a real compiler actually works.

I will read the entire file into a buffer and then scan the buffer. So, I can either break it into lines, which the scanner then breaks up into tokens, which seems like extra work; or I can read it like I would normally and scan the buffer a character at a time. It's a state machine, so I will know when I'm in a comment, etc., and will expect the comment to end at EOL. I can still check the length of an identifier and chuck out a warning if it's too long for the implementation, say 250 characters or whatever.

Also, the method of breaking up the buffer into tokens which have start and end characters is a good one until you have the AST; at that point you should make copies of the lexemes you need, store them in the AST, and delete the buffer. Otherwise the compiler will take up too much memory keeping all that around once you start to implement "with": you will have to read in more specifications, and sometimes bodies for generics.

So, in short, I'm wondering whether it really matters to parse by breaking up into lines or not.

Thanks,
Luke.



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Ada standard and maximum line lengths
  2013-01-29 17:51                         ` Georg Bauhaus
  2013-01-29 18:18                           ` Shark8
@ 2013-01-29 19:54                           ` Niklas Holsti
  2013-01-29 23:12                             ` Georg Bauhaus
  2013-01-29 23:47                             ` Jeffrey Carter
  1 sibling, 2 replies; 64+ messages in thread
From: Niklas Holsti @ 2013-01-29 19:54 UTC (permalink / raw)


On 13-01-29 19:51 , Georg Bauhaus wrote:
> On 29.01.13 17:58, Niklas Holsti wrote:
>> On 13-01-29 17:29 , Georg Bauhaus wrote:
>>> On 29.01.13 13:31, Niklas Holsti wrote:
>>>> Even for hypothetical SW of some sort that generates Ada source code
>>>> automatically, I think it very unlikely that it would generate
>>>> identifiers of that length, since Ada identifiers can be qualified and
>>>> sorted into packages etc. much more flexibly than is the case for linker
>>>> symbols.
>>>
>>> GNAT writes similar identifiers into object files, and
>>> the same hysterical raisins spoil the cake. At least the
>>> names---those to be exported to other compilers---seem shorter
>>> if package hierarchies are not used extensively.
>>
>> So what? These are *linker symbols*, not Ada (source) identifiers. I
>> don't understand why you bring them into a discussion of source-code
>> identifier lengths.
> 
> Linker symbols reappear in source text.

Show an example, please, where it is necessary to write a linker symbol
verbatim as an Ada identifier.

> So, linker symbols are two-sided creatures: on the one side, they
> are intended for use by linkers; on the other side, they reappear
> in source text, even when "just in a pragma" or some such.

If they are written in a pragma as a string, they are not Ada
identifiers, and the ARM limit on identifier length does not apply.

> By the response above, I need to accept long identifiers in
> practice; it is a fact that compilers support the reality
> of excessively long identifiers and so these reappear in source
> text,

Not as identifiers. As strings, occasionally.

>    valid standard C++ program     valid standard Ada program
>               |                             |
>                               _
>                               |  (non-standard pragmas)
>                               _
>                  /                       \
>             compilers A+B            compilers C+E
>                  |                       |
>                 OK                      NOK
> 
> The same valid programs, translated on the same platform, can
> be rejected because the program needs to import the long,
> non-standard, non-portable names.

Sure, using link-name pragmas is unportable, since different compilers
mangle identifiers into symbols in different ways.

Sure, this is a poor mess and should be improved.

But it has nothing to do with limits on identifier length in Ada.

> Going back to my initial question. Suppose linker symbols
> were structured. Won't the rather technical need for allowing
> long identifiers in Ada source text just vanish?

There has never been, and is not now, such a need.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Ada standard and maximum line lengths
  2013-01-29 10:25               ` Niklas Holsti
  2013-01-29 11:31                 ` Georg Bauhaus
@ 2013-01-29 20:36                 ` Niklas Holsti
  2013-01-29 21:01                   ` Robert A Duff
  2013-01-29 21:14                   ` Dmitry A. Kazakov
  1 sibling, 2 replies; 64+ messages in thread
From: Niklas Holsti @ 2013-01-29 20:36 UTC (permalink / raw)

Adding a note to my own post:

On 13-01-29 12:25 , Niklas Holsti wrote:

> I don't mind if the ARM is changed (or interpreted) to require a
> compiler to support identifiers as long as the longest possible source
> line. I don't think it is important one way or the other, as long as my
> compiler supports reasonable lengths. 200 characters is ok as a max line
> length. 200 for an identifier is overkill, IMO.

The limits in ARM 2.2(14) apply to line length and (separately, IMO) to
lexical element length. While 200 characters is IMO overkill for an
identifier, which is a lexical element, it is not overkill for a string
literal, which is also a lexical element.

I could live with, say, a 32-character identifier length limit, but a
32-character limit on string literal would be uncomfortable, to say it
nicely. The 200-character requirement in ARM 2.2(14) feels about right
to me.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Ada standard and maximum line lengths
  2013-01-29 18:46 ` Lucretia
@ 2013-01-29 20:53   ` Robert A Duff
  2013-01-29 21:22   ` Dmitry A. Kazakov
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 64+ messages in thread
From: Robert A Duff @ 2013-01-29 20:53 UTC (permalink / raw)


Lucretia <laguest9000@googlemail.com> writes:

> So, in short, I'm wondering whether it really matters to parse by
> breaking up into lines or not.

There's no reason why breaking into lines should be separate from
lexical analysis (i.e. scanning).

- Bob



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Ada standard and maximum line lengths
  2013-01-29 20:36                 ` Niklas Holsti
@ 2013-01-29 21:01                   ` Robert A Duff
  2013-01-29 21:14                   ` Dmitry A. Kazakov
  1 sibling, 0 replies; 64+ messages in thread
From: Robert A Duff @ 2013-01-29 21:01 UTC (permalink / raw)

Niklas Holsti <niklas.holsti@tidorum.invalid> writes:

> The limits in ARM 2.2(14) apply to line length and (separately, IMO) to
> lexical element length. While 200 characters is IMO overkill for an
> identifier, which is a lexical element, it is not overkill for a string
> literal, which is also a lexical element.

The limit on string literals is different, though, because if you
need to interface to a 400-character mangled linker symbol,
you can concatenate several shorter string literals in the pragma.
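
A sketch of that workaround, assuming a hypothetical free function
demo::container::insert_range() and its Itanium-style mangled name. All
names here are invented for illustration; a real symbol would be far
longer and split into more pieces:

```ada
package Long_Symbol_Binding is

   --  Hypothetical routine bound to a long mangled symbol. The link name
   --  is a static string expression, so it can be built from several
   --  short literals, each well within any per-literal length limit.
   procedure Insert_Range;
   pragma Import
     (Convention => C,
      Entity     => Insert_Range,
      Link_Name  => "_ZN4demo9container"
                    & "12insert_rangeEv");

end Long_Symbol_Binding;
```

Each literal can also sit on its own source line, so neither the
line-length limit nor the lexical-element limit restricts the total
length of the symbol.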

> I could live with, say, a 32-character identifier length limit, but a
> 32-character limit on string literal would be uncomfortable, to say it
> nicely. The 200-character requirement in ARM 2.2(14) feels about right
> to me.

I'm opposed to such limitations, unless they really buy you something,
which I don't think is the case here.  It's fine to have an option
that lets users impose limits on themselves, but they shouldn't be
built into programs.

- Bob

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Ada standard and maximum line lengths
  2013-01-29 20:36                 ` Niklas Holsti
  2013-01-29 21:01                   ` Robert A Duff
@ 2013-01-29 21:14                   ` Dmitry A. Kazakov
  1 sibling, 0 replies; 64+ messages in thread
From: Dmitry A. Kazakov @ 2013-01-29 21:14 UTC (permalink / raw)


On Tue, 29 Jan 2013 22:36:54 +0200, Niklas Holsti wrote:

> While 200 characters is IMO overkill for an
> identifier, which is a lexical element, it is not overkill for a string
> literal, which is also a lexical element.

Surely you would rather write large constant strings using concatenation or
else array aggregates, for which no limit applies.

E.g. I am using array aggregates for embedded Gtk images. They could be of
many kilobytes size.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Ada standard and maximum line lengths
  2013-01-29 18:46 ` Lucretia
  2013-01-29 20:53   ` Robert A Duff
@ 2013-01-29 21:22   ` Dmitry A. Kazakov
  2013-01-30  3:22     ` Lucretia
  2013-01-29 21:29   ` Dmitry A. Kazakov
  2013-01-29 21:53   ` Adam Beneschan
  3 siblings, 1 reply; 64+ messages in thread
From: Dmitry A. Kazakov @ 2013-01-29 21:22 UTC (permalink / raw)


On Tue, 29 Jan 2013 10:46:36 -0800 (PST), Lucretia wrote:

> I will be implementing a scanner for Ada 2012 and then a parser,

You don't need a separate scanner. Ada (as well as any other language) can
be parsed in single pass, including reading the source.

> I will read the entire file into a buffer and then scan the buffer.

A better approach would be to have an abstract type representing the source
with multiple implementations, e.g. backed by a stream, by a file, by a
text buffer etc.
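
One way such an abstraction might look, as a rough Ada 2012 sketch. The
names Sources, Source, Buffer_Source, At_End and Next are assumptions
made for this example and do not come from any existing library; only
the in-memory implementation is shown:

```ada
with Ada.Text_IO;

procedure Source_Sketch is

   package Sources is
      --  Abstract view of "where the characters come from".
      type Source is limited interface;
      function At_End (S : Source) return Boolean is abstract;
      procedure Next (S : in out Source; C : out Character) is abstract;

      --  One possible implementation: an in-memory text buffer.
      type Buffer_Source (Length : Natural) is
        limited new Source with record
         Text : String (1 .. Length);
         Pos  : Positive := 1;
      end record;

      overriding function At_End (S : Buffer_Source) return Boolean;
      overriding procedure Next
        (S : in out Buffer_Source; C : out Character);
   end Sources;

   package body Sources is
      function At_End (S : Buffer_Source) return Boolean is
        (S.Pos > S.Length);

      procedure Next (S : in out Buffer_Source; C : out Character) is
      begin
         C     := S.Text (S.Pos);
         S.Pos := S.Pos + 1;
      end Next;
   end Sources;

   --  A consumer written against the class-wide type does not care
   --  whether the characters come from a buffer, a file or a stream.
   procedure Count_Characters (S : in out Sources.Source'Class) is
      C     : Character;
      Count : Natural := 0;
   begin
      while not S.At_End loop
         S.Next (C);
         Count := Count + 1;
      end loop;
      Ada.Text_IO.Put_Line (Natural'Image (Count));
   end Count_Characters;

   Input : Sources.Buffer_Source :=
     (Length => 5, Text => "begin", Pos => 1);
begin
   Count_Characters (Input);
end Source_Sketch;
```

A file-backed or stream-backed source would be a further extension of
the same interface, and the scanner itself would stay unchanged.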

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Ada standard and maximum line lengths
  2013-01-29 18:46 ` Lucretia
  2013-01-29 20:53   ` Robert A Duff
  2013-01-29 21:22   ` Dmitry A. Kazakov
@ 2013-01-29 21:29   ` Dmitry A. Kazakov
  2013-01-29 21:53   ` Adam Beneschan
  3 siblings, 0 replies; 64+ messages in thread
From: Dmitry A. Kazakov @ 2013-01-29 21:29 UTC (permalink / raw)


On Tue, 29 Jan 2013 10:46:36 -0800 (PST), Lucretia wrote:

> Also, the method of breaking up the buffer into tokens which have start
> and end characters is a good one until you have the AST; at that point you
> should make copies of the lexemes you need and store them in the AST

Copying would be wasting resources. Nodes of the AST keep operators,
identifiers and literals instead (+ their source locations); at this stage
lexemes are gone.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Ada standard and maximum line lengths
  2013-01-29 18:46 ` Lucretia
                     ` (2 preceding siblings ...)
  2013-01-29 21:29   ` Dmitry A. Kazakov
@ 2013-01-29 21:53   ` Adam Beneschan
  3 siblings, 0 replies; 64+ messages in thread
From: Adam Beneschan @ 2013-01-29 21:53 UTC (permalink / raw)


On Tuesday, January 29, 2013 10:46:36 AM UTC-8, Lucretia wrote:

> So, in short, I'm wondering whether it really matters to parse by breaking up into lines or not.

If every source input that your program will get is a valid Ada program, then there's probably no need to "break up" the source into lines.  But if there are errors, then it can be useful for the program to at least be aware of where lines begin and end--possibly for error reporting (so that you can display a line number and maybe the whole input line), and if you want to get really fancy, the program might be able to look at the indentation of a line to make an educated guess about what the programmer intended, when there's a missing "end" or something like that, which sometimes can help reduce the number of cascading error messages.  It depends on what you want your program to do.  But it's probably important to think about how your program will handle errors in the input; and that's something that should be considered right away in the design process, not as an afterthought.

I don't know if this answers your question.  Making sure the program is aware of where a line is isn't the same as "breaking up" into lines, but the term "breaking up" isn't really well defined.  However, if you're thinking about writing a lexical scanning or parsing subroutine that takes one line of input as a parameter, I don't see any advantage in that.
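
A rough sketch of a scanner that stays aware of lines without being fed
one line at a time. It walks a whole buffer character by character,
keeps a line and column counter for error reporting, and checks
identifier length along the way. The limit of 200 and all names are
assumptions for illustration; comments, literals and reserved words are
deliberately not handled:

```ada
with Ada.Characters.Latin_1;
with Ada.Text_IO;

procedure Scan_Sketch is

   LF : constant Character := Ada.Characters.Latin_1.LF;

   --  Assumed implementation limit, chosen only for this example.
   Max_Identifier_Length : constant := 200;

   --  Stand-in for a source file that was read into a buffer.
   Source : constant String :=
     "procedure Demo is" & LF & "   Counter : Integer;" & LF & "begin";

   Line   : Positive := 1;
   Column : Positive := 1;
   Pos    : Positive := Source'First;

   function Is_Letter (C : Character) return Boolean is
     (C in 'a' .. 'z' | 'A' .. 'Z');

   function Is_Identifier_Char (C : Character) return Boolean is
     (Is_Letter (C) or else C in '0' .. '9' | '_');

begin
   while Pos <= Source'Last loop
      if Source (Pos) = LF then
         --  End of line is just another state transition.
         Line   := Line + 1;
         Column := 1;
         Pos    := Pos + 1;

      elsif Is_Letter (Source (Pos)) then
         declare
            First : constant Positive := Pos;
         begin
            while Pos <= Source'Last
              and then Is_Identifier_Char (Source (Pos))
            loop
               Pos := Pos + 1;
            end loop;

            --  Report the word together with its line and column.
            Ada.Text_IO.Put_Line
              (Positive'Image (Line) & ":" & Positive'Image (Column)
               & " " & Source (First .. Pos - 1));

            if Pos - First > Max_Identifier_Length then
               Ada.Text_IO.Put_Line ("warning: identifier too long");
            end if;

            Column := Column + (Pos - First);
         end;

      else
         --  Anything else is skipped one character at a time.
         Pos    := Pos + 1;
         Column := Column + 1;
      end if;
   end loop;
end Scan_Sketch;
```

The line and column counters are the only line-related state, which is
usually enough for error messages of the "file:line:column" kind.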

                             -- Adam




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: Ada standard and maximum line lengths
  2013-01-29 19:54                           ` Niklas Holsti
@ 2013-01-29 23:12                             ` Georg Bauhaus
  2013-01-30  9:18                               ` Niklas Holsti
  2013-01-30  9:37                               ` Simon Wright
  2013-01-29 23:47                             ` Jeffrey Carter
  1 sibling, 2 replies; 64+ messages in thread
From: Georg Bauhaus @ 2013-01-29 23:12 UTC (permalink / raw)

On 29.01.13 20:54, Niklas Holsti wrote:

Upfront, my question was about the possible effects of
enforcing shorter identifiers on the programming
profession.  I venture to read Robert Duff's recent post
as: that is a tool/setting to put into programmers' hands
if they want it, and otherwise there should be no limits.

>> Linker symbols reappear in source text.
>
> Show an example, please, where it is necessary to write a linker symbol
> verbatim as an Ada identifier.

Sure. I'll assume that the subject of good lengths of identifiers
shouldn't be argued in a stick-to-the-letter (200) fashion.

You have mentioned old Fortran's 6 characters, so let's go crazy
and start with doubling this, twice, to 24 characters per identifier.

This is from Win32Ada:

   function CreateDialogIndirectParamW

Ouch. 26 characters. You have mentioned that 32 is uncomfortable,
so 24 certainly is more uncomfortable. Still, the Win32 bindings
show that most programmers want these foreign names to be just
like the originals. So, there is a need to not impose a lower
bound that excludes unchanged Win32 identifiers.

(I vaguely remember that there were two OS/2 bindings. One
copied the names of the C style object oriented OS/2 API.
The other binding offered packages, one for each name space
prefix found in the C style function identifiers. TTBOMK,
programmers preferred the former and found the latter confusing.)

Alternatively, if I assign an Ada identifier a foreign function
using pragma Import, and choose a shorter Ada identifier of
character content different from the string, I create an indirection
and that is always hard to trace. Understandably, there should
be as much of a 1:1 correspondence as possible. This is all the
more true the more compilers arrive at handling exceptions across
language boundaries. It is necessarily puzzling to see an exception
trace mention a name that cannot be found in the source text.
(Because a pragma Import has assigned a different, shorter name.)
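
A minimal sketch of the indirection meant here, assuming the C library function strlen as a stand-in for any foreign subprogram. Import_Demo and C_String_Length are made-up names; only the string "strlen" is what the linker sees.

```ada
with Ada.Text_IO;
with Interfaces.C;
with Interfaces.C.Strings;

procedure Import_Demo is
   --  The Ada identifier differs from the linker symbol: anyone reading
   --  a trace that mentions "strlen" has to find this pragma to learn
   --  which Ada name it corresponds to.
   function C_String_Length
     (S : Interfaces.C.Strings.chars_ptr) return Interfaces.C.size_t;
   pragma Import (C, C_String_Length, "strlen");

   Name : Interfaces.C.Strings.chars_ptr :=
     Interfaces.C.Strings.New_String ("CreateDialogIndirectParamW");
begin
   Ada.Text_IO.Put_Line
     ("length =" & Interfaces.C.size_t'Image (C_String_Length (Name)));
   Interfaces.C.Strings.Free (Name);
end Import_Demo;
```

The external name is an ordinary string, so its length is not bounded by any limit on Ada identifiers; the cost is the extra lookup step described above.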

The issue is present both ways. Say, I export this to C++:

    procedure A.B.C.Foo (Arg : T) is
       Flop : exception;
    begin
       ...
    end Foo;

And assume that compilers involved can handle exceptions across
language boundaries. With A, B, and C standing for identifiers
of average length, when Foo is mangled function library style,
what should the C++ side catch if Foo is "naturally" mapped to

  __adaE_A...__B...__C...__Foo?

Chances are that a limit on lengths of C++ identifiers will require
an assignment of a different exception identifier with the help
of #pragma or __attribute__. There is no standard way to address
this. *But*! There could be a standard way to address this, if
the profession is offered an incentive to move away from library
style name mangling.

Third, GNAT.

When GNAT started to support ISO/IEC 10646 ("universal characters"),
GNAT's upper bound on identifier lengths had to be lifted. Even
in the sense that line length implies identifier length: There
are names in the Unicode character database that have made the
source files exceed GNAT's upper bound on line lengths (79).

A Unicode character's name is not a linker symbol, but still it
is a name from a "library" of characters that is "imported".

> Sure, this is a poor mess and should be improved.

Yes.

> But it has nothing to do with limits on identifier length in Ada.

Yes, it does: practical naming enforces a lower bound on names,
as seen.

>> Going back to my initial question. Suppose linker symbols
>> were structured. Won't the rather technical need for allowing
>> long identifiers in Ada source text just vanish?
>
> There has never been, and is not now, such a need.

That's only formally true presuming
- the indirection tables of pragma Import do work well and
-  stick-to-the-letter interpretations of "200" and
-  "most programs".
But practically, names of imported subprograms import the risk of
hitting _any_ (project or compiler) limit on identifier lengths
when these foreign entities should not be renamed.


* Re: Ada standard and maximum line lengths
  2013-01-29 19:54                           ` Niklas Holsti
  2013-01-29 23:12                             ` Georg Bauhaus
@ 2013-01-29 23:47                             ` Jeffrey Carter
  2013-01-30  7:24                               ` Niklas Holsti
  1 sibling, 1 reply; 64+ messages in thread
From: Jeffrey Carter @ 2013-01-29 23:47 UTC (permalink / raw)


On 01/29/2013 12:54 PM, Niklas Holsti wrote:
>
> If they are written in a pragma as a string, they are not Ada
> identifiers, and the ARM limit on identifier length does not apply.

There is no ARM limit on identifier length.

-- 
Jeff Carter
"I'm a kike, a yid, a heebie, a hook nose! I'm Kosher,
Mum! I'm a Red Sea pedestrian, and proud of it!"
Monty Python's Life of Brian
77




* Re: Ada standard and maximum line lengths
  2013-01-29 21:22   ` Dmitry A. Kazakov
@ 2013-01-30  3:22     ` Lucretia
  2013-01-30  9:49       ` Dmitry A. Kazakov
  2013-02-01  1:48       ` Shark8
  0 siblings, 2 replies; 64+ messages in thread
From: Lucretia @ 2013-01-30  3:22 UTC (permalink / raw)
  Cc: mailbox

On Tuesday, 29 January 2013 21:22:56 UTC, Dmitry A. Kazakov  wrote:
> On Tue, 29 Jan 2013 10:46:36 -0800 (PST), Lucretia wrote:
> 
> > I will be implementing a scanner for Ada 2012 and then a parser,
> 
> You don't need a separate scanner. Ada (as well as any other language) can
> be parsed in single pass, including reading the source.

Well I never said separate passes, it will be a scanner controlled by the parser, which will be recursive descent, i.e. usual form of match(token) type scanning.

> > I will read the entire file into a buffer and then scan the buffer.
> 
> A better approach would be to have an abstract type representing the source
> with multiple implementations of, e.g. backed by a stream, by a file, by a
> text buffer etc.

I will have multiple representations, but that will be mainly due to character types, i.e. utf-8, wide, etc.

But if the tokens are not copied into the AST but a token is a record consisting of a start and end position within the buffer then it cannot be a file, the whole thing has to be read into memory at the start, i.e. before scanning.
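
Roughly what I have in mind, as a minimal sketch. All names (Token_Slice_Demo, Buffer, Token, Next_Identifier) are made up for illustration, and the scanning is reduced to finding one identifier with no error handling.

```ada
with Ada.Text_IO;             use Ada.Text_IO;
with Ada.Characters.Handling; use Ada.Characters.Handling;

procedure Token_Slice_Demo is
   --  Hypothetical buffer holding the whole source text.
   Buffer : constant String := "Total := Total + 1;";

   --  A token stores no text of its own, only positions in Buffer.
   type Token is record
      First : Positive;  --  index of the token's first character
      Last  : Natural;   --  index of its last character
   end record;

   T : Token;

   --  Return the first identifier at or after From; an empty range
   --  (Last < First) means none was found.
   function Next_Identifier (From : Positive) return Token is
      F : Positive := From;
      L : Natural;
   begin
      while F <= Buffer'Last and then not Is_Letter (Buffer (F)) loop
         F := F + 1;
      end loop;
      L := F - 1;
      while L < Buffer'Last
        and then (Is_Alphanumeric (Buffer (L + 1))
                  or else Buffer (L + 1) = '_')
      loop
         L := L + 1;
      end loop;
      return (First => F, Last => L);
   end Next_Identifier;
begin
   T := Next_Identifier (Buffer'First);
   --  The text is recovered by slicing, so Buffer must stay in memory.
   Put_Line (Buffer (T.First .. T.Last));
end Token_Slice_Demo;
```

The slice in the last statement is why the buffer has to outlive the tokens.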

Luke.


* Re: Ada standard and maximum line lengths
  2013-01-29 23:47                             ` Jeffrey Carter
@ 2013-01-30  7:24                               ` Niklas Holsti
  2013-01-30 10:09                                 ` J-P. Rosen
  0 siblings, 1 reply; 64+ messages in thread
From: Niklas Holsti @ 2013-01-30  7:24 UTC (permalink / raw)


On 13-01-30 01:47 , Jeffrey Carter wrote:
> On 01/29/2013 12:54 PM, Niklas Holsti wrote:
>>
>> If they are written in a pragma as a string, they are not Ada
>> identifiers, and the ARM limit on identifier length does not apply.
> 
> There is no ARM limit on identifier length.

Sorry. I was just abbreviating the "ARM lower bound on the length of
identifiers (lexical elements) that an Ada compiler must accept". But of
course abbreviation is not the Ada way. I have sinned, and I blush.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .




* Re: Ada standard and maximum line lengths
  2013-01-29 23:12                             ` Georg Bauhaus
@ 2013-01-30  9:18                               ` Niklas Holsti
  2013-01-30  9:51                                 ` Simon Wright
  2013-01-30 15:28                                 ` Robert A Duff
  2013-01-30  9:37                               ` Simon Wright
  1 sibling, 2 replies; 64+ messages in thread
From: Niklas Holsti @ 2013-01-30  9:18 UTC (permalink / raw)

On 13-01-30 01:12 , Georg Bauhaus wrote:
> On 29.01.13 20:54, Niklas Holsti wrote:
> 
> Upfront, my question was about the possible effects of
> enforcing shorter identifiers  on the programming
> profession.  I venture to read Robert Duff's recent post
> as: that is a tool/setting to put into programmers' hands
> if they want it, and otherwise there should be no limits.
> 
>>> Linker symbols reappear in source text.
>>
>> Show an example, please, where it is necessary to write a linker symbol
>> verbatim as an Ada identifier.
> 
> Sure. I'll assume that the subject of good lengths of identifiers
> shouldn't be argued in a stick-to-the-letter (200) fashion.
> 
> You have mentioned old Fortran's 6 characters, so lets go crazy
> and start with doubling this, twice, to 24 characters per identifier.
> 
> This is from Win32Ada:
> 
>   function CreateDialogIndirectParamW
> 
> Ouch. 26 characters. You have mentioned that 32 is uncomfortable,
> so 24 certainly is more uncomfortable. Still, the Win32 bindings
> show that most programmers want these foreign names to be just
> like the originals. So, there is a need to not impose a lower
> bound that excludes unchanged Win32 identifiers.

I agree that one often wants Ada bindings to other-language libraries to
use the same, or similar, identifiers.

But these are *identifiers*, not linker symbols.

Are there some 200-character *identifiers* in the Win32 API? I doubt it.

> The issue is present both ways. Say, I export this to C++:

Sure. But since languages are different, it is often impossible to use
the same identifier. For example, Cobol allows identifiers with hyphens,
Calculate-Average-Salary, IIRC.

C++ has namespaces, which can be used to emulate Ada package scopes. A
harsher example would be exporting Ada subprograms for calling from a C
program. Typically, large C programs use abbreviated prefixes in
identifiers to simulate Ada packages or C++ namespaces.

> Chances are that a limit on lengths of C++ identifiers will require
> an assignment of a different exception identifier with the help
> of #pragma or __attribute__. There is no standard way to address
> this. *But*! There could be a standard way to address this, if
> the profession is offered an incentive to move away from library
> style name mangling.

I agree, in principle. In practice, there is probably not enough
language mixing going on, at least not mixing that requires portability,
to justify the effort.

> Third, GNAT.
> 
> When GNAT started to support ISO/IEC 10646 ("universal characters"),
> GNAT's upper bound on identifier lengths had to be lifted. Even
> in the sense that line length implies identifier length: There
> are names in the Unicode character database that have made the
> source files exceeded GNAT's upper bound on line lengths (79).

What "upper bound" of 79? I've certainly compiled longer lines, even
with very early versions of GNAT. Is that limit in the GNAT style rules?

> A unicode character's name is not a linker symbol, but still it
> is a name from a "library" of characters that is "imported".

OK. Perhaps the Unicode character names should be organized into
packages rather than a flat namespace, but that can't be done if the
names must define an enumerated type.

Unicode is a good example of long identifiers, though. From
http://www.unicode.org/charts/charindex.html I compute that the longest
is/are 73 characters, counting blanks and commas. One of these
73-character names is "Bold Italic Greek Mathematical Symbols, Sans-serif".

>> But it has nothing to do with limits on identifier length in Ada.
> 
> Yes, it does: practical naming enforces a lower bound on names,
> as seen.

But only for convenience (to use similar names) and not close to 200
characters.

> But practically, names of imported subprograms import the risk of
> hitting _any_ (project or compiler) limit on identifier lengths
> when these foreign entities should not be renamed.

Of course it is better if an Ada compiler accepts lines of any length,
and identifiers of any length.

To summarize, I think this thread started by asking why the ARM mentions
maximum line length at all. The discussion has shown that:

- Ada 83 did not discuss line length or lexical-element length, so
compilers were in principle obliged to support any length, but perhaps
many compilers had limits motivated by "capacity limits".

- Ada 95 introduced the lower bounds of 200 characters on lines and
lexical elements. Perhaps they were not meant to be independent, and the
ACATS tests require the compiler to support identifiers that are as long
as the longest possible line. Therefore, in practice compilers use the
same limit (if any) for both, if I understood Randy correctly.

- Janus/Ada has a limit of around 250 characters on identifier length,
which Randy would find "annoying" to remove.

- The longest real-life identifier exhibited so far is from the Unicode
character names and is 73 characters.

The second question, posed by Georg, was:

> I wonder what will be the effect on working in the
> programming profession of a general limit on line
> lengths that is, say, <= 100 characters:

Two people (Dmitry and I) came out in defense of short lines, that is, a
limit of 100 characters would not have any effect on our work.

J-P. reported that code he saw, written with a line-length limit of 72
characters, was horrible to read. On the other hand, some code that I
have written with an 80-char limit has been praised as easy to read, so
there are other factors that affect readability. And different readers
have different preferences, too.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .


* Re: Ada standard and maximum line lengths
  2013-01-29 23:12                             ` Georg Bauhaus
  2013-01-30  9:18                               ` Niklas Holsti
@ 2013-01-30  9:37                               ` Simon Wright
  2013-01-30 12:02                                 ` Georg Bauhaus
  1 sibling, 1 reply; 64+ messages in thread
From: Simon Wright @ 2013-01-30  9:37 UTC (permalink / raw)


Georg Bauhaus <rm.dash-bauhaus@futureapps.de> writes:

> When GNAT started to support ISO/IEC 10646 ("universal characters"),
> GNAT's upper bound on identifier lengths had to be lifted. Even in the
> sense that line length implies identifier length: There are names in
> the Unicode character database that have made the source files
> exceeded GNAT's upper bound on line lengths (79).

That's GNAT's default _style_ limit on line length, above which you get
a warning if you ask for style checks (-gnaty). For its own sources
(maybe not for (all) the libraries?), GNAT treats warnings as errors.




* Re: Ada standard and maximum line lengths
  2013-01-30  3:22     ` Lucretia
@ 2013-01-30  9:49       ` Dmitry A. Kazakov
  2013-01-30 23:28         ` Randy Brukardt
  2013-02-01  1:48       ` Shark8
  1 sibling, 1 reply; 64+ messages in thread
From: Dmitry A. Kazakov @ 2013-01-30  9:49 UTC (permalink / raw)


On Tue, 29 Jan 2013 19:22:44 -0800 (PST), Lucretia wrote:

> I will have multiple representations, but that will be mainly due to
> character types, i.e. utf-8, wide, etc.

Make source-to-parser interface UTF-8. Wide-I/O backend would convert to
UTF-8.
 
> But if the tokens are not copied into the AST but a token is a record
> consisting of a start and end position within the buffer then it cannot be
> a file, the whole thing has to be read into memory at the start, i.e.
> before scanning.

Tokens here are operators/statements = enumeration type. In the AST you
need only three entities: some enumeration type (in the branches),
identifiers and literals (in the leaves).

Here is a sample of Ada 95 expressions parser (to AST):

http://www.dmitry-kazakov.de/ada/components.htm#12.9

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de




* Re: Ada standard and maximum line lengths
  2013-01-30  9:18                               ` Niklas Holsti
@ 2013-01-30  9:51                                 ` Simon Wright
  2013-01-30 15:28                                 ` Robert A Duff
  1 sibling, 0 replies; 64+ messages in thread
From: Simon Wright @ 2013-01-30  9:51 UTC (permalink / raw)


Niklas Holsti <niklas.holsti@tidorum.invalid> writes:

> On 13-01-30 01:12 , Georg Bauhaus wrote:

>> When GNAT started to support ISO/IEC 10646 ("universal characters"),
>> GNAT's upper bound on identifier lengths had to be lifted. Even
>> in the sense that line length implies identifier length: There
>> are names in the Unicode character database that have made the
>> source files exceeded GNAT's upper bound on line lengths (79).
>
> What "upper bound" of 79? I've certainly compiled longer lines, even
> with very early versions of GNAT. Is that limit in the GNAT style rules?

Yes.

>> I wonder what will be the effect on working in the
>> programming profession of a general limit on line
>> lengths that is, say, <= 100 characters:
>
> Two people (Dmitry and I) came out in defense of short lines, that is,
> a limit on 100 characters would not have any effect on our work.
>
> J-P. reported that code he saw, written with a line-length limit of 72
> characters, was horrible to read. On the other hand, some code that I
> have written with an 80-char limit has been praised as easy to read,
> so there are other factors that affect readability. And different
> readers have different preferences, too.

GNAT's default style limit of 79 characters has worked well for me in
major projects. I can have two Emacs frames open side-by-side.

I had one guy, who didn't get packages (I think he may have been a 'use'
advocate) who had subprograms named

   <meaningful-package-name>.<meaningful-package-name>_Do_Something

and kept running into the limit. He raised a defect 'allowed line length
too short', which I closed with 'use shorter identifiers'; his response,
I discovered afterwards, was to write 'pragma Style_Checks (Off);'. Grr.




* Re: Ada standard and maximum line lengths
  2013-01-30  7:24                               ` Niklas Holsti
@ 2013-01-30 10:09                                 ` J-P. Rosen
  0 siblings, 0 replies; 64+ messages in thread
From: J-P. Rosen @ 2013-01-30 10:09 UTC (permalink / raw)


On 30/01/2013 08:24, Niklas Holsti wrote:
> On 13-01-30 01:47 , Jeffrey Carter wrote:
>> On 01/29/2013 12:54 PM, Niklas Holsti wrote:
>>>
>>> If they are written in a pragma as a string, they are not Ada
>>> identifiers, and the ARM limit on identifier length does not apply.
>>
>> There is no ARM limit on identifier length.
> 
> Sorry. I was just abbreviating the "ARM lower bound on the length of
> identifiers (lexical elements) that an Ada compiler must accept". But of
> course abbreviation is not the Ada way. I have sinned, and I blush.
> 
Of course, there is such a limit (in practice): the longest identifier in
a language-defined package.

-- 
J-P. Rosen
Adalog
2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX
Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00
http://www.adalog.fr




* Re: Ada standard and maximum line lengths
  2013-01-30  9:37                               ` Simon Wright
@ 2013-01-30 12:02                                 ` Georg Bauhaus
  0 siblings, 0 replies; 64+ messages in thread
From: Georg Bauhaus @ 2013-01-30 12:02 UTC (permalink / raw)

On 30.01.13 10:37, Simon Wright wrote:
> That's GNAT's default _style_ limit on line length, above which you get
> a warning if you ask for style checks (-gnaty). For its own sources
> (maybe not for (all) the libraries?), GNAT treats warnings as errors.

Yes, the GNAT limit enforced by style rules is what I meant;
thanks for the correction.

I'll guess that these style rules do affect the way sources of GNAT
are structured. For example, deep nesting would need the rules to be
lifted temporarily. Subprogram names lengthened by prefixing package
names with '_' might not work. Thus an implied limit on the length
of identifiers leads to no less than a well known method of OO
analysis, the one that suggests finding the name of objects hidden
in the subprogram's names (Grady Booch expounds).

ada-format-paramlist certainly helps shape the little programs I
write. Happily, this feature, which blends well with limits on line
lengths, is working nicely in the latest Ada mode for Emacs.

Dmitry Kazakov's style of presenting subprogram specs has an
effect on the way I grasp the meaning, I think. And I can see
how it might be compatible with lines of at most 72 characters,
at least for flat structures (everything placed in packages
that can see parent packages, but not look into surrounding
nests, since these are off limits .-)


* Re: Ada standard and maximum line lengths
  2013-01-30  9:18                               ` Niklas Holsti
  2013-01-30  9:51                                 ` Simon Wright
@ 2013-01-30 15:28                                 ` Robert A Duff
  2013-01-30 23:24                                   ` Randy Brukardt
  2013-01-31  9:03                                   ` Dmitry A. Kazakov
  1 sibling, 2 replies; 64+ messages in thread
From: Robert A Duff @ 2013-01-30 15:28 UTC (permalink / raw)

Niklas Holsti <niklas.holsti@tidorum.invalid> writes:

> To summarize, I think this thread started by asking why the ARM mentions
> maximum line length at all. The discussion has shown that:
>
> - Ada 83 did not discuss line length or lexical-element length, so
> compilers were in principle obliged to support any length, but perhaps
> many compilers had limits motivated by "capacity limits".

Right.

> - Ada 95 introduced the lower bounds of 200 characters on lines and
> lexical elements. Perhaps they were not meant to be independent, ...

The wording is clear: the two requirements are independent.
That was the intent.  (I wrote those words, although I was opposed
to adding the 200-char requirements.)

>...and the
> ACATS test require the compiler to support identifiers that are as long
> as the longest possible line.

Those tests are wrong, and they were wrong even for Ada 83.
An Ada compiler is allowed to support unlimited line lengths.
The ACATS requires the compiler to have a limit; that's wrong.
It's also wrong to require that the two limits be the same.

By "support unlimited line lengths" I mean no specific limit built into
the compiler.  Of course, there is always SOME limit -- the size of your
disk limits how long a line can be, a (broken) OS might impose a limit,
and if you compile something huge, the compiler might run out of memory.

>...Therefore, in practice compilers use the
> same limit (if any) for both, if I understood Randy correctly.

Yes, they do.

It's not the only time compiler writers have obeyed the ACATS, rather
than disputing incorrect tests.  That could be because it's a big hassle
to dispute tests, and takes a long time, and in this case, it's trivial
to obey the tests.  It could also be because the compiler writer assumed
the test was correct.

> - Janus/Ada has a limit of around 250 characters on identifier length,
> which Randy would find "annoying" to remove.

I think the limit in GNAT is many thousands of characters.
But sources written at AdaCore are limited to 79-character lines.

> - The longest real-life identifier exhibited so far is from the Unicode
> character names and is 73 characters.

I'd prefer not to have such built-in limits.  The fact that I've
never seen an identifier longer than 73 characters doesn't change
my mind.  To argue for a built-in limit of 200 characters, I think
you have to not only argue that "nobody needs lines longer than that",
but also argue that there is some important advantage (efficiency?
simplicity?), which I don't see here.

- Bob


* Re: Ada standard and maximum line lengths
  2013-01-30 15:28                                 ` Robert A Duff
@ 2013-01-30 23:24                                   ` Randy Brukardt
  2013-01-31  2:16                                     ` Robert A Duff
  2013-01-31  9:03                                   ` Dmitry A. Kazakov
  1 sibling, 1 reply; 64+ messages in thread
From: Randy Brukardt @ 2013-01-30 23:24 UTC (permalink / raw)


"Robert A Duff" <bobduff@shell01.TheWorld.com> wrote in message 
news:wccfw1iheq9.fsf@shell01.TheWorld.com...
...
> I'd prefer not to have such built-in limits.  The fact that I've
> never seen an identifier longer than 73 characters doesn't change
> my mind.  To argue for a built-in limit of 200 characters, I think
> you have to not only argue that "nobody needs lines longer than that",
> but also argue that there is some important advantage (efficiency?
> simplicity?), which I don't see here.

There is an efficiency advantage, but whether it is important today I can't 
say. Specifically, the position value within a line can be limited to 8 bits 
if you limit lines to 250 characters. Having such a limit reduces the size of 
writes done by our compiler's first pass by roughly 15%, and given that the 
compiler was completely disk-bound, that effectively reduced the runtime of 
that pass by a corresponding amount (it also saved a similar amount of time 
in later passes reading the data back in). Whether that's still true on 
current machines I don't know (it seems likely that there is still some cost 
for reading/writing 15% more data that would 98% of the time carry no 
information). (I probably wouldn't architect a new compiler like Janus/Ada 
is, as memory savings is not an important criterion today, but I still 
wouldn't like wasting significant amounts of memory (line/position 
information being stored in almost every symbol table entry) to carry almost 
no information.)
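
To make the size argument concrete, here is a minimal sketch. The type names and the 16-bit line number are assumptions chosen for illustration, not the actual Janus/Ada layout, and the record sizes printed depend on the compiler.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Position_Size_Demo is
   --  All ranges below are assumptions picked for the illustration.
   type Line_Number  is range 0 .. 65_535;
   type Short_Column is range 0 .. 250;     --  fits in 8 bits
   type Long_Column  is range 0 .. 65_535;  --  needs 16 bits

   type Compact_Position is record
      Line   : Line_Number;
      Column : Short_Column;
   end record;
   pragma Pack (Compact_Position);

   type Wide_Position is record
      Line   : Line_Number;
      Column : Long_Column;
   end record;
   pragma Pack (Wide_Position);
begin
   Put_Line ("Short_Column'Size     =" & Integer'Image (Short_Column'Size));
   Put_Line ("Long_Column'Size      =" & Integer'Image (Long_Column'Size));
   Put_Line ("Compact_Position'Size =" & Integer'Image (Compact_Position'Size));
   Put_Line ("Wide_Position'Size    =" & Integer'Image (Wide_Position'Size));
end Position_Size_Demo;
```

The extra byte per position only matters because a position is attached to nearly every entry written out.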

                                                Randy.






* Re: Ada standard and maximum line lengths
  2013-01-30  9:49       ` Dmitry A. Kazakov
@ 2013-01-30 23:28         ` Randy Brukardt
  0 siblings, 0 replies; 64+ messages in thread
From: Randy Brukardt @ 2013-01-30 23:28 UTC (permalink / raw)

"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:115ytn2r5nvqf$.11fx46ufzkxhn.dlg@40tude.net...
> On Tue, 29 Jan 2013 19:22:44 -0800 (PST), Lucretia wrote:
>
>> I will have multiple representations, but that will be mainly due to
>> character types, i.e. utf-8, wide, etc.
>
> Make source-to-parser interface UTF-8. Wide-I/O backend would convert to
> UTF-8.

Agreed. Ada 2012 requires Ada compilers to accept UTF-8. I rather doubt that 
there is much value to accepting anything else (given that 7-bit ASCII is a 
straight subset of UTF-8). Straight 8-bit Latin-1 might be an exception for 
historical reasons, but for a new implementation, start with UTF-8 and worry 
about other formats later (and possibly never).

                                    Randy.


* Re: Ada standard and maximum line lengths
  2013-01-30 23:24                                   ` Randy Brukardt
@ 2013-01-31  2:16                                     ` Robert A Duff
  2013-01-31  9:10                                       ` Stefan.Lucks
  2013-01-31 23:54                                       ` Randy Brukardt
  0 siblings, 2 replies; 64+ messages in thread
From: Robert A Duff @ 2013-01-31  2:16 UTC (permalink / raw)


"Randy Brukardt" <randy@rrsoftware.com> writes:

>...(I probably wouldn't architect a new compiler like Janus/Ada 
> is, as memory savings is not an important criteria today, ...

Memory savings (we're talking about the host, not the target)
is not important today directly -- but it affects cache behavior,
which is still important for speed.

>...but I still 
> wouldn't like wasting significant amounts of memory (line/position 
> information being stored in almost every symbol table entry) to carry almost 
> no information.

There are various ways to store line/column numbers compactly without
limiting line lengths.
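
One possible scheme, sketched here as an assumption rather than as anything a particular compiler is known to do: keep a single offset into the source per entity, plus one table per file of where each line starts, and compute line and column on demand. Offset_Lookup_Demo, Line_Starts and Line_Of are made-up names, and the table contents are invented.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Offset_Lookup_Demo is
   --  Hypothetical table: buffer index at which each line starts.
   type Offset_Array is array (Positive range <>) of Positive;
   Line_Starts : constant Offset_Array := (1, 16, 22, 300);

   --  Binary search for the last line whose start is <= Offset.
   function Line_Of (Offset : Positive) return Positive is
      Lo  : Positive := Line_Starts'First;
      Hi  : Positive := Line_Starts'Last;
      Mid : Positive;
   begin
      while Lo < Hi loop
         Mid := (Lo + Hi + 1) / 2;
         if Line_Starts (Mid) <= Offset then
            Lo := Mid;
         else
            Hi := Mid - 1;
         end if;
      end loop;
      return Lo;
   end Line_Of;

   Offset : constant Positive := 25;
   Line   : constant Positive := Line_Of (Offset);
begin
   Put_Line ("line" & Positive'Image (Line)
             & ", column" & Positive'Image (Offset - Line_Starts (Line) + 1));
end Offset_Lookup_Demo;
```

Each entity then carries one number whatever the line length, and the column is only computed when a message is actually printed.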

- Bob




* Re: Ada standard and maximum line lengths
  2013-01-30 15:28                                 ` Robert A Duff
  2013-01-30 23:24                                   ` Randy Brukardt
@ 2013-01-31  9:03                                   ` Dmitry A. Kazakov
  1 sibling, 0 replies; 64+ messages in thread
From: Dmitry A. Kazakov @ 2013-01-31  9:03 UTC (permalink / raw)

On Wed, 30 Jan 2013 10:28:46 -0500, Robert A Duff wrote:

> Niklas Holsti <niklas.holsti@tidorum.invalid> writes:
> 
>> - The longest real-life identifier exhibited so far is from the Unicode
>> character names and is 73 characters.
> 
> I'd prefer not to have such built-in limits.  The fact that I've
> never seen an identifier longer than 73 characters doesn't change
> my mind.  To argue for a built-in limit of 200 characters, I think
> you have to not only argue that "nobody needs lines longer than that",
> but also argue that there is some important advantage (efficiency?
> simplicity?), which I don't see here.

An argument might be that doing so you enlarge the set of legal yet
non-compilable programs. Ideally, each legal program should be compilable.
Furthermore, a program successfully compiled by the compiler X on the
machine M, should also be compilable by the compiler Y on N.

Theoretically, limiting the identifier length (and everything else from
subprogram body size to record members number) makes sense. The actual
problem is IMO that any choice of such a limit would be arbitrary, and
there are far too many such limits to define.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: Ada standard and maximum line lengths
  2013-01-31  2:16                                     ` Robert A Duff
@ 2013-01-31  9:10                                       ` Stefan.Lucks
  2013-01-31  9:30                                         ` Niklas Holsti
  2013-01-31 18:02                                         ` Jeffrey Carter
  2013-01-31 23:54                                       ` Randy Brukardt
  1 sibling, 2 replies; 64+ messages in thread
From: Stefan.Lucks @ 2013-01-31  9:10 UTC (permalink / raw)

On Wed, 30 Jan 2013, Robert A Duff wrote:

> There are various ways to store line/column numbers compactly without
> limiting line lengths.

It certainly is possible to do that without *any* limit on line lengths. 
But in practice, wouldn't you limit it to some constant, say, 
Integer'Last?

In any case, nobody should need line lengths of more than about 200 
characters. This is not a technological limit, this is related to human 
vision. The typical line length (about 80 characters) has been chosen to 
be well-readable by humans. (Actually, most people consider this a bit too 
large ... but let us stick with "about 80" for the sake of the argument.)

Note that 80 is the limit for prosaic texts, not for programs.

Programmers may argue that indentation eats up a lot of columns at the left, 
so they need more on the right. But that has to end at some point -- if 
your program is too deeply nested, you should refactor it, rather than 
indent even more. So 120 characters should be plenty. Anything beyond that 
should not be considered readable for humans anymore.

You may push this further by writing in some two-column style (commands 
on the left, comments on the right). But even then, 200 characters are 
plenty (close to 120 for the commands, a few spaces and "--" to separate 
the comments from the commands, and another 80 for the comments).

The only reason to support longer lines would be to handle automatically 
generated code, that is *not* meant to be human-readable. In fact, writing 
tools to generate source code is (slightly) easier if one does not need to 
care about line ends at all, writing a source file into a single line ...

Stefan

--
------  I  love  the  taste  of  Cryptanalysis  in  the morning!  ------
     <http://www.uni-weimar.de/cms/medien/mediensicherheit/home.html>
--Stefan.Lucks (at) uni-weimar.de, Bauhaus-Universität Weimar, Germany--


* Re: Ada standard and maximum line lengths
  2013-01-31  9:10                                       ` Stefan.Lucks
@ 2013-01-31  9:30                                         ` Niklas Holsti
  2013-01-31  9:51                                           ` Simon Wright
  2013-01-31 10:56                                           ` Georg Bauhaus
  2013-01-31 18:02                                         ` Jeffrey Carter
  1 sibling, 2 replies; 64+ messages in thread
From: Niklas Holsti @ 2013-01-31  9:30 UTC (permalink / raw)

On 13-01-31 11:10 , Stefan.Lucks@uni-weimar.de wrote:

> In any case, nobody should need line lengths of more than about 200
> characters. This is not a technological limit, this is related to human
> vision. The typical line length (about 80 characters) has been chosen to
> be well-readable by humans. (Actually, most people consider this a bit
> too large ... but let us stick with "about 80" for the sake of the
> argument.)

 [snip]

> The only reason to support longer lines would be to handle automatically
> generated code, that is *not* meant to be human-readable. In fact,
> writing tools to generate source code is (slightly) easier if one does
> not need to care about line ends at all, writing a source file into a
> single line ...

"Slightly" is right. It is very simple to make the source-code generator
insert newlines between lexical elements to keep lines from growing
unreasonably long.
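
As a minimal sketch of that idea (in Python rather than Ada, and not taken
from any real generator): the emitter below joins lexical elements with
single spaces and starts a new line whenever the next element would push
the line past a limit. The name wrap_tokens and its parameters are
hypothetical.

```python
def wrap_tokens(tokens, max_len=80, indent=""):
    """Join lexical elements with single spaces, breaking lines at max_len.

    Hypothetical helper for illustration. A single token longer than
    max_len is still emitted, alone on its own (over-long) line.
    """
    lines = []
    current = indent
    for tok in tokens:
        # The first token on a line follows the indent directly;
        # later tokens are separated by one space.
        candidate = current + tok if current == indent else current + " " + tok
        if len(candidate) > max_len and current != indent:
            # The token does not fit: close this line, start the next one.
            lines.append(current)
            current = indent + tok
        else:
            current = candidate
    if current != indent:
        lines.append(current)
    return "\n".join(lines)


if __name__ == "__main__":
    print(wrap_tokens(["X", ":=", "A", "+", "B", ";"], max_len=8))
```

Breaking only between lexical elements keeps the output legal wherever the
language allows a separator between tokens, which is the case for Ada.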

In practice, when one is developing a program that generates source-code
(in Ada or other languages), it is often necessary to look at the output
to check that it is correct, or to understand why it is wrong, and this
is much easier if the output is not only broken into lines, but also
indented in the usual way.

A pretty-printer post-processor could be used to format single-long-line
output, but this raises the questions of how long an input line the
pretty-printer can handle, and whether it can handle syntactically or
lexically illegal or incomplete input. Also, if the source-code
generator itself does the indenting, incorrect indentation can be a good
clue for finding errors in the generator, in my experience.

So, developing a working automatic source-code generator is easier if
the generator produces code that is nicely broken into lines and nicely
indented. Generating a single long line is not the best way, unless the
output must be as short as possible (avoiding consecutive spaces or tabs
for indentation).

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .

* Re: Ada standard and maximum line lengths
  2013-01-31  9:30                                         ` Niklas Holsti
@ 2013-01-31  9:51                                           ` Simon Wright
  2013-01-31 10:56                                           ` Georg Bauhaus
  1 sibling, 0 replies; 64+ messages in thread
From: Simon Wright @ 2013-01-31  9:51 UTC (permalink / raw)


Niklas Holsti <niklas.holsti@tidorum.invalid> writes:

> In practice, when one is developing a program that generates
> source-code (in Ada or other languages), it is often necessary to look
> at the output to check that it is correct, or to understand why it is
> wrong, and this is much easier if the output is not only broken into
> lines, but also indented in the usual way.

I adopted the rule that, within reasonable bounds (eg, no user-defined
identifiers > 50 characters, say) the generated code shouldn't raise any
style check warnings. And should act as an example of the code style
desired for the hand-written code.



* Re: Ada standard and maximum line lengths
  2013-01-31  9:30                                         ` Niklas Holsti
  2013-01-31  9:51                                           ` Simon Wright
@ 2013-01-31 10:56                                           ` Georg Bauhaus
  1 sibling, 0 replies; 64+ messages in thread
From: Georg Bauhaus @ 2013-01-31 10:56 UTC (permalink / raw)

On 31.01.13 10:30, Niklas Holsti wrote:
> In practice, when one is developing a program that generates source-code
> (in Ada or other languages), it is often necessary to look at the output
> to check that it is correct, or to understand why it is wrong, and this
> is much easier if the output is not only broken into lines, but also
> indented in the usual way.

Google's Javascript -> Javascript compiler works this way.

It has options whose effects include the following instructions
obeyed by the compiler. (For reasons of style, the options
are named differently, and obfuscation is subsumed under "renaming
symbols" and "compressing output", referring to pretty printing
and optimizing.)

2) just obfuscate
3) obfuscate and remove most white space from lines
4) obfuscate and remove most white space including line endings

Option 4) is very popular.

Option 2) is the second most useful one for finding and reporting
errors, because finding locations will not require the assistance
of a "Javalisp system". (Google is working on a new "Inspector"
to be built into browsers, which features source code mapping.)

* Re: Ada standard and maximum line lengths
  2013-01-31  9:10                                       ` Stefan.Lucks
  2013-01-31  9:30                                         ` Niklas Holsti
@ 2013-01-31 18:02                                         ` Jeffrey Carter
  1 sibling, 0 replies; 64+ messages in thread
From: Jeffrey Carter @ 2013-01-31 18:02 UTC (permalink / raw)


On 01/31/2013 02:10 AM, Stefan.Lucks@uni-weimar.de wrote:
>
> Programmers may argue that indention eats up a lot of columns at the left, so
> they need more on the right. But that has to end at some point -- if your
> program is too deeply nested, you should refactorize it, rather than indent even
> more. So 120 characters should be plenty. Anything beyond that should not be
> considered as readable for humans, anymore.

For about 30 yrs, I have used a personal limit of 130 characters for program 
source code. (For work I adhere to the project's coding standard.) I find the 
resulting code quite readable, easier to read than code constrained to 
80-character lines. I have always been able to use an editor that handles entire 
lines easily, and have been able to print 130-character lines when needed.

-- 
Jeff Carter
"Strange women lying in ponds distributing swords
is no basis for a system of government."
Monty Python & the Holy Grail
66



* Re: Ada standard and maximum line lengths
  2013-01-31  2:16                                     ` Robert A Duff
  2013-01-31  9:10                                       ` Stefan.Lucks
@ 2013-01-31 23:54                                       ` Randy Brukardt
  2013-02-01  9:15                                         ` Niklas Holsti
  1 sibling, 1 reply; 64+ messages in thread
From: Randy Brukardt @ 2013-01-31 23:54 UTC (permalink / raw)


"Robert A Duff" <bobduff@shell01.TheWorld.com> wrote in message 
news:wccwquucd1x.fsf@shell01.TheWorld.com...
...
>>...but I still
>> wouldn't like wasting significant amounts of memory (line/position
>> information being stored in almost every symbol table entry) to carry 
>> almost
>> no information.
>
> There are various ways to store line/column numbers compactly without
> limiting line lengths.

Really? Do tell! I know of no such technique that still would be usable 
(meaning does not involve indirection, which would add complication and 
fragmentation issues, and can be used at any time without calculation, 
because you don't know ahead of time when and where you'll need to generate 
error messages, debugging information, and traces). If you're not willing to 
waste space, you have to limit the size to 24 bits (maybe 32 bits at the 
outside); you might be able to use a couple of bits to control the divide 
between line numbers and positions, but a really long line at the end of a 
really long file couldn't be handled (you'd run out of bits). For 24 bits, 
it's not worth it, and even using 32 you might as well just use a 10-bit 
position number and be done with it. If you only needed relative encoding, 
you could store the change from the last item each time (limited to a 
small number of bits); that might work in the token stream [not the way we 
use it, but that's a different issue], but it wouldn't work in the symbol 
table or debugging information.
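
A rough sketch of the relative encoding mentioned above, in Python and
purely illustrative (the function names are hypothetical and this is not
how Janus/Ada stores positions): each token records only the change from
the previous token's position.

```python
def encode_deltas(positions):
    """Turn (line, column) pairs in source order into small deltas.

    Same line as the previous token: (0, column change).
    Different line: (line change, absolute column on the new line).
    """
    deltas = []
    prev_line, prev_col = 1, 0
    for line, col in positions:
        if line == prev_line:
            deltas.append((0, col - prev_col))
        else:
            deltas.append((line - prev_line, col))
        prev_line, prev_col = line, col
    return deltas


def decode_deltas(deltas):
    """Rebuild absolute positions by replaying the deltas from the start."""
    positions = []
    line, col = 1, 0
    for dline, dcol in deltas:
        if dline == 0:
            col += dcol
        else:
            line += dline
            col = dcol
        positions.append((line, col))
    return positions
```

Decoding has to replay every delta from the beginning, so an entry picked
out at random carries no usable position on its own. That is why such a
scheme fits a sequential token stream but not a symbol table.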

Janus/Ada has line number and line position types declared in the Host 
package; these can be adjusted of course but we do the Ada thing and use 
strong typing so that someone doesn't use a line number as if it is a type 
handle. [Someone, ahem, has done that in the past.] So the storage options 
are limited.

                                                     Randy.





* Re: Ada standard and maximum line lengths
  2013-01-30  3:22     ` Lucretia
  2013-01-30  9:49       ` Dmitry A. Kazakov
@ 2013-02-01  1:48       ` Shark8
  1 sibling, 0 replies; 64+ messages in thread
From: Shark8 @ 2013-02-01  1:48 UTC (permalink / raw)
  Cc: mailbox

On Tuesday, January 29, 2013 9:22:44 PM UTC-6, Lucretia wrote:
> 
> I will have multiple representations, but that will be mainly due to character types, i.e. utf-8, wide, etc.

Hm, will you perhaps use something like:
 (1) define interface type in base package,
 (2) define generic [child] packages,
 (3) detect encoding, and
 (4) instantiate (with character-encoding) the generic, and
 (5) use the object implementing the interface to parse?

* Re: Ada standard and maximum line lengths
  2013-01-31 23:54                                       ` Randy Brukardt
@ 2013-02-01  9:15                                         ` Niklas Holsti
  2013-02-01 23:13                                           ` Randy Brukardt
  0 siblings, 1 reply; 64+ messages in thread
From: Niklas Holsti @ 2013-02-01  9:15 UTC (permalink / raw)

On 13-02-01 01:54 , Randy Brukardt wrote:
> "Robert A Duff" <bobduff@shell01.TheWorld.com> wrote in message 
> news:wccwquucd1x.fsf@shell01.TheWorld.com...
> ...
>>> ...but I still
>>> wouldn't like wasting significant amounts of memory (line/position
>>> information being stored in almost every symbol table entry) to carry 
>>> almost
>>> no information.
>>
>> There are various ways to store line/column numbers compactly without
>> limiting line lengths.
> 
> Really? Do tell! I know of no such technique that still would be usable 
> (meaning does not involve indirection, which would add complication and 
> fragmentation issues, and can be used at any time without calculation, 
> because you don't know ahead of time when and where you'll need to generate 
> error messages, debugging information, and traces).

I'm not sure that I understand your conditions on what is "usable", but
the LEB128 encoding (little-endian base 128) used in DWARF works pretty
well for storing unbounded but usually small integers without wasting a
lot of space. An integer is encoded as a sequence of octets, with one
end-marker bit and 7 data bits in each octet. Numbers from 0 to
127 take one octet, numbers from 128 to 2**14 - 1 take two octets, and so on.

All records of course become variable-length octet sequences, which
causes processing overhead when storing and loading data. But if the
bottleneck is the disk I/O, that may be tolerable.
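
For illustration, a minimal sketch of unsigned LEB128 in Python (the
function names are hypothetical; only the encoding itself follows the
description above). The high bit of each octet is set when more octets
follow, and the low 7 bits carry the value, least significant group first.

```python
def leb128_encode(n):
    """Encode a non-negative integer as unsigned LEB128 octets."""
    if n < 0:
        raise ValueError("unsigned LEB128 cannot encode negative numbers")
    out = bytearray()
    while True:
        low7 = n & 0x7F          # next 7 value bits, least significant first
        n >>= 7
        if n:
            out.append(low7 | 0x80)   # more octets follow
        else:
            out.append(low7)          # last octet: high bit clear
            return bytes(out)


def leb128_decode(data, pos=0):
    """Decode one value starting at data[pos]; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        octet = data[pos]
        pos += 1
        result |= (octet & 0x7F) << shift
        if not octet & 0x80:
            return result, pos
        shift += 7


if __name__ == "__main__":
    # A line number and a column stored back to back in one record.
    record = leb128_encode(1234) + leb128_encode(17)
    line, nxt = leb128_decode(record)
    col, _ = leb128_decode(record, nxt)
    print(line, col, len(record))
```

The decoder returns the position of the next octet, which is what lets
several variable-length fields sit back to back in a single record.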

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .

* Re: Ada standard and maximum line lengths
  2013-02-01  9:15                                         ` Niklas Holsti
@ 2013-02-01 23:13                                           ` Randy Brukardt
  2013-02-02  1:24                                             ` Lucretia
  0 siblings, 1 reply; 64+ messages in thread
From: Randy Brukardt @ 2013-02-01 23:13 UTC (permalink / raw)

"Niklas Holsti" <niklas.holsti@tidorum.invalid> wrote in message 
news:an1fdkFjimtU1@mid.individual.net...
> On 13-02-01 01:54 , Randy Brukardt wrote:
>> "Robert A Duff" <bobduff@shell01.TheWorld.com> wrote in message
>> news:wccwquucd1x.fsf@shell01.TheWorld.com...
...
>>> There are various ways to store line/column numbers compactly without
>>> limiting line lengths.
>>
>> Really? Do tell! I know of no such technique that still would be usable
>> (meaning does not involve indirection, which would add complication and
>> fragmentation issues, and can be used at any time without calculation,
>> because you don't know ahead of time when and where you'll need to 
>> generate
>> error messages, debugging information, and traces).
>
> I'm not sure that I understand your conditions on what is "usable", but
> the LEB128 encoding (little-endian base 128) used in DWARF works pretty
> well for storing unbounded but usually small integers without wasting a
> lot of space. An integer is encoded as a sequence of octets, with one
> end-marker bit and 7 significand bits in each octet. Numbers from 0 to
> 127 take one octet, number from 128 to 2**14 - 1 take two octets, and so 
> on.

Thanks, sort of. That would work in the stream of tokens representation, but 
it wouldn't work in the symbol table, debugging information, intermediate 
code, and the like. As I noted, we use strong typing by having a Line_Number 
and Line_Position type, and it's preferable to use those types everywhere. A 
variable-length format would work fine in a stream, but it wouldn't work in 
the various records (you'd have to allocate space for the largest 
representation, and then why have all of the complication?)

Also note that the symbol table records are written in and out (that's how 
we implement "with"), so increasing the size would have an impact on the 
compilation speed.

I could see reasons for giving more bits to these things (in particular, a 
representation using 22 bits for the line number and 10 bits for the 
position would add only a little bit more space but effectively remove the 
limits), but I don't see any way to truly eliminate the limits. (Making the 
line number 32 bits and position 16 bits would increase the I/O requirements 
of the compiler by around 5%; that would probably only be noticeable in large 
compilations or via a stopwatch but it surely is not "free".)
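
A small sketch of the 22/10 split mentioned above, in Python and only as an
illustration (the names are hypothetical; Janus/Ada declares its own line
number and line position types in Ada). The line number and the position
share one 32-bit word, so the limits become 4,194,303 lines and 1,023
positions per line.

```python
LINE_BITS = 22
COL_BITS = 10
MAX_LINE = (1 << LINE_BITS) - 1   # 4_194_303
MAX_COL = (1 << COL_BITS) - 1     # 1_023


def pack_position(line, col):
    """Pack a line number and a line position into one 32-bit word."""
    if not 0 <= line <= MAX_LINE:
        raise ValueError("line number does not fit in 22 bits")
    if not 0 <= col <= MAX_COL:
        raise ValueError("line position does not fit in 10 bits")
    return (line << COL_BITS) | col


def unpack_position(word):
    """Split a packed word back into (line, col)."""
    return word >> COL_BITS, word & MAX_COL
```

The record stays fixed-size and needs no indirection, but the limits are
still there: anything past position 1,023 on a line has to be rejected or
clamped.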

In any case, there is a disincentive to make any changes so long as the 
ACATS insists on the maximum line length and identifier length being the 
same and thus relatively short. And it would be hard to justify rewriting those 
ACATS tests (they're among the least important tests, but not so much that 
they have so little value that removing them outright could be considered). 
Dan Eilers had the best suggestion, which was to convert the tests to tests 
that 200 character identifiers are properly supported (all compilers have to 
support those), but that would also be the most work. Such work would be 
better spent on tests for Ada 2005 and Ada 2012 features, I think.

                                       Randy.

* Re: Ada standard and maximum line lengths
  2013-02-01 23:13                                           ` Randy Brukardt
@ 2013-02-02  1:24                                             ` Lucretia
  2013-02-02 14:12                                               ` Robert A Duff
  2013-02-05  2:09                                               ` Randy Brukardt
  0 siblings, 2 replies; 64+ messages in thread
From: Lucretia @ 2013-02-02  1:24 UTC (permalink / raw)

On Friday, 1 February 2013 23:13:48 UTC, Randy Brukardt  wrote:

> In any case, there is a disincentive to make any changes so long as the 
> ACATS insists on having a line length and identifier length being the same 

I thought someone (Bob Duff?) stated that:

1) some ACATS tests are wrong and people don't tell them they are
2) that there is no length that must be specified as a maximum (i.e. 200)
3) that an implementation can have unlimited values for these lengths

> and thus relatively short. And it would be hard to justify rewriting those 
> ACATS tests (they're among the least important tests, but not so much that 

But surely it's better to have tests that are correct?

Luke.

* Re: Ada standard and maximum line lengths
  2013-02-02  1:24                                             ` Lucretia
@ 2013-02-02 14:12                                               ` Robert A Duff
  2013-02-05  2:09                                               ` Randy Brukardt
  1 sibling, 0 replies; 64+ messages in thread
From: Robert A Duff @ 2013-02-02 14:12 UTC (permalink / raw)

Lucretia <laguest9000@googlemail.com> writes:

> On Friday, 1 February 2013 23:13:48 UTC, Randy Brukardt  wrote:
>
>> In any case, there is a disincentive to make any changes so long as the 
>> ACATS insists on having a line length and identifier length being the same 
>
> I thought someone (Bob Duff?) stated that:
>
> 1) some ACATS tests are wrong and people don't tell them they are
> 2) that there is no length that must be specified as a maximum (i.e. 200)
> 3) that an implementation can have unlimited values for these lengths

Yes, I said something like that (although 2 and 3 above seem
synonymous, so I'm not 100% sure what you mean).

>> and thus relatively short. And it would be hard to justify rewriting those 
>> ACATS tests (they're among the least important tests, but not so much that 
>
> But surely it's better to have tests that are correct?

I suppose so, but it's up to some implementer to dispute the tests.
In this case, the wrong test is pretty harmless.  It's not requiring
implementations to do something wrong.  It's requiring them to
do something they are allowed to do anyway.

I believe somebody has volunteered to fix the tests in question.
That's good, but it will have zero effect on any implementation.
That's because the "fix" is to weaken the tests to match the RM;
an implementation that passes the tests now will still pass the
weaker version, and will have no incentive to change.  It will
also have zero effect on any programmer -- programmers will
continue to obey whatever line/lexeme-length limitations
their compiler(s) require.  And if they want to be portable,
they will continue to avoid writing lines longer than 200
characters.

- Bob

* Re: Ada standard and maximum line lengths
  2013-02-02  1:24                                             ` Lucretia
  2013-02-02 14:12                                               ` Robert A Duff
@ 2013-02-05  2:09                                               ` Randy Brukardt
  1 sibling, 0 replies; 64+ messages in thread
From: Randy Brukardt @ 2013-02-05  2:09 UTC (permalink / raw)


"Lucretia" <laguest9000@googlemail.com> wrote in message 
news:9f1fb966-cc23-4499-b50c-571ffc0c7f01@googlegroups.com...
...
>> and thus relatively short. And it would be hard to justify rewriting 
>> those
>> ACATS tests (they're among the least important tests, but not so much 
>> that
>
> But surely it's better to have tests that are correct?

Sure, in an ideal world. But fixing tests is not free (especially in this 
case, where there are quite a few tests that would need to be changed). 
Would that time be better spent on adding tests for untested rules in the 
language? Or streamlining the testing process? Even if a volunteer submits 
revised tests, and those tests are perfect (and how likely is that?), there 
still is a significant (in terms of the overall budget) non-zero cost to 
adding them to the test suite. As Bob notes, the priority of this particular 
fix is rather low, because it really would not have much impact on users.

                                                              Randy.





Thread overview: 64+ messages
2013-01-28  5:02 Ada standard and maximum line lengths Lucretia
2013-01-28  6:01 ` J-P. Rosen
2013-01-28  6:28 ` Jeffrey Carter
2013-01-28  8:05   ` Niklas Holsti
2013-01-28 16:42     ` Jeffrey Carter
2013-01-28 20:22       ` Niklas Holsti
2013-01-28 20:46         ` J-P. Rosen
2013-01-28 21:29           ` Niklas Holsti
2013-01-29  1:42             ` Randy Brukardt
2013-01-29  6:15             ` J-P. Rosen
2013-01-29 10:25               ` Niklas Holsti
2013-01-29 11:31                 ` Georg Bauhaus
2013-01-29 12:11                   ` Simon Wright
2013-01-29 12:31                   ` Niklas Holsti
2013-01-29 12:37                     ` Niklas Holsti
2013-01-29 15:29                     ` Georg Bauhaus
2013-01-29 16:58                       ` Niklas Holsti
2013-01-29 17:51                         ` Georg Bauhaus
2013-01-29 18:18                           ` Shark8
2013-01-29 19:54                           ` Niklas Holsti
2013-01-29 23:12                             ` Georg Bauhaus
2013-01-30  9:18                               ` Niklas Holsti
2013-01-30  9:51                                 ` Simon Wright
2013-01-30 15:28                                 ` Robert A Duff
2013-01-30 23:24                                   ` Randy Brukardt
2013-01-31  2:16                                     ` Robert A Duff
2013-01-31  9:10                                       ` Stefan.Lucks
2013-01-31  9:30                                         ` Niklas Holsti
2013-01-31  9:51                                           ` Simon Wright
2013-01-31 10:56                                           ` Georg Bauhaus
2013-01-31 18:02                                         ` Jeffrey Carter
2013-01-31 23:54                                       ` Randy Brukardt
2013-02-01  9:15                                         ` Niklas Holsti
2013-02-01 23:13                                           ` Randy Brukardt
2013-02-02  1:24                                             ` Lucretia
2013-02-02 14:12                                               ` Robert A Duff
2013-02-05  2:09                                               ` Randy Brukardt
2013-01-31  9:03                                   ` Dmitry A. Kazakov
2013-01-30  9:37                               ` Simon Wright
2013-01-30 12:02                                 ` Georg Bauhaus
2013-01-29 23:47                             ` Jeffrey Carter
2013-01-30  7:24                               ` Niklas Holsti
2013-01-30 10:09                                 ` J-P. Rosen
2013-01-29 20:36                 ` Niklas Holsti
2013-01-29 21:01                   ` Robert A Duff
2013-01-29 21:14                   ` Dmitry A. Kazakov
2013-01-28  8:18 ` Dmitry A. Kazakov
2013-01-28 10:02   ` Maciej Sobczak
2013-01-28 11:57     ` Georg Bauhaus
2013-01-28 13:28       ` Niklas Holsti
2013-01-28 15:14       ` J-P. Rosen
2013-01-28 16:13       ` Dmitry A. Kazakov
2013-01-28 15:13     ` Dmitry A. Kazakov
2013-01-28 13:49 ` Robert A Duff
2013-01-29  2:09   ` Randy Brukardt
2013-01-29 18:46 ` Lucretia
2013-01-29 20:53   ` Robert A Duff
2013-01-29 21:22   ` Dmitry A. Kazakov
2013-01-30  3:22     ` Lucretia
2013-01-30  9:49       ` Dmitry A. Kazakov
2013-01-30 23:28         ` Randy Brukardt
2013-02-01  1:48       ` Shark8
2013-01-29 21:29   ` Dmitry A. Kazakov
2013-01-29 21:53   ` Adam Beneschan
