comp.lang.ada
* Help parsing the language manual on Get'ing integers from Strings
@ 2020-12-21  0:11 John Perry
  2020-12-21  7:44 ` Niklas Holsti
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: John Perry @ 2020-12-21  0:11 UTC (permalink / raw)


Hello all

Sorry if the subject is unclear. I recently tried to use

   Get(S, Value, Last);

...in a program where Value was a Natural and S has the value "29: 116 82 | 119 24". GNAT gave me a Data_Error.

I don't understand why. Here's what the language manual says:

"Reads an integer value from the beginning of the given string, following the same rules as the Get procedure that reads an integer value from a file, but treating the end of the string as a file terminator. ...The exception Data_Error is propagated if the sequence input does not have the required syntax or if the value obtained is not of the subtype Num."

The referenced Get procedure says, (some irrelevant (?) parts omitted)

"...skips any leading blanks, line terminators, or page terminators, then ...reads the longest possible sequence of characters matching the syntax of a numeric literal without a point."

I've used this procedure before, and as far as I can tell:

   - GNAT is fine with "29:"
   - GNAT is NOT fine with "29: " or any larger substring of S

So:

1) Apparently GNAT thinks the colon is a character that matches the syntax of a numeric literal; do I interpret this correctly?

2) Where does the language manual say this? I didn't see it in Section 3.5.4 ("Integer Types").

3) or is this a bug?

4) or do I misinterpret the language manual?

sincere thanks
john perry


* Re: Help parsing the language manual on Get'ing integers from Strings
  2020-12-21  0:11 Help parsing the language manual on Get'ing integers from Strings John Perry
@ 2020-12-21  7:44 ` Niklas Holsti
  2020-12-21  9:33   ` AdaMagica
  2020-12-21  7:57 ` Dmitry A. Kazakov
  2020-12-21 11:30 ` John Perry
  2 siblings, 1 reply; 11+ messages in thread
From: Niklas Holsti @ 2020-12-21  7:44 UTC (permalink / raw)


On 2020-12-21 2:11, John Perry wrote:
> Hello all
> 
> Sorry if the subject is unclear. I recently tried to use
> 
>     Get(S, Value, Last);
> 
> ...in a program where Value was a Natural and S has the value "29:
> 116 82 | 119 24". GNAT gave me a Data_Error.

I get the same.

I'm using GNATLS Community 2019 (20190517-83) on a Mac.


> I've used this procedure before, and as far as I can tell:
> 
>     - GNAT is fine with "29:"
>     - GNAT is NOT fine with "29: " or any larger substring of S


I get the same. However, I also see:

    "12:      44   "  works, Value = 12 and Last = 2
    "18:      44   "  fails with Data_Error.

Very weird.

    "12: 116 82 | 119 24" works, Value = 12 and Last = 2
    "18: 116 82 | 119 24" fails with Data_Error.

Also:

    "18:44:" fails with Data_Error
    "12:44:" works, Value = 52, Last = 6.

Note (44 base 12) = 4 * 12 + 4 = 52 decimal!


> 1) Apparently GNAT thinks the colon is a character that matches the
> syntax of a numeric literal; do I interpret this correctly?

It seems that the Get procedure understands ':' as a base indicator, as in

    "12#44#" works, Value = 52, Last = 6.
    "12#44"  fails with Data_Error.

"29:..." and "18:..." fail because 18 and 29 are too large to be bases; 
the max is 16.

"12:..." works because 12 is an acceptable base.


> 2) Where does the language manual say this? I didn't see it in
> Section 3.5.4 ("Integer Types").


I don't think the manual says this anywhere. RM 2.4.2 "Based Literals" 
shows only '#' as the base indicator.


> 3) or is this a bug?


I think it is.
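
A minimal sketch for reproducing the observations above, assuming GNAT and the predefined Ada.Integer_Text_IO. The names Try_Get and Try are made up for illustration. It prints whatever Get returns instead of asserting particular results, since those depend on the compiler and runtime in use.

```ada
with Ada.Text_IO;         use Ada.Text_IO;
with Ada.Integer_Text_IO;

procedure Try_Get is

   --  Hypothetical helper: call the string form of Get and report
   --  either the value read or the Data_Error it raised.
   procedure Try (S : String) is
      Quoted : constant String := '"' & S & '"';
      Value  : Integer;
      Last   : Positive;
   begin
      Ada.Integer_Text_IO.Get (S, Value, Last);
      Put_Line (Quoted & ": Value =" & Integer'Image (Value)
                & ", Last =" & Positive'Image (Last));
   exception
      when Data_Error =>
         Put_Line (Quoted & ": Data_Error");
   end Try;

begin
   --  Inputs taken from this thread.
   Try ("12#44#");
   Try ("12#44");
   Try ("12:44:");
   Try ("18:44:");
   Try ("12: 116 82 | 119 24");
   Try ("29: 116 82 | 119 24");

   --  The same based literal in source text: 4 * 12 + 4 = 52.
   Put_Line ("12#44# =" & Integer'Image (12#44#));
end Try_Get;
```

With GNAT this should build with "gnatmake try_get.adb". Only the last line is independent of the Text_IO implementation, because the compiler evaluates that literal itself.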


* Re: Help parsing the language manual on Get'ing integers from Strings
  2020-12-21  0:11 Help parsing the language manual on Get'ing integers from Strings John Perry
  2020-12-21  7:44 ` Niklas Holsti
@ 2020-12-21  7:57 ` Dmitry A. Kazakov
  2020-12-21  8:06   ` Niklas Holsti
  2020-12-21  8:16   ` Dmitry A. Kazakov
  2020-12-21 11:30 ` John Perry
  2 siblings, 2 replies; 11+ messages in thread
From: Dmitry A. Kazakov @ 2020-12-21  7:57 UTC (permalink / raw)


On 2020-12-21 01:11, John Perry wrote:

> Sorry if the subject is unclear. I recently tried to use
> 
>     Get(S, Value, Last);
> 
> ...in a program where Value was a Natural and S has the value "29: 116 82 | 119 24". GNAT gave me a Data_Error.
> 
> I don't understand why. Here's what the language manual says:
> 
> "Reads an integer value from the beginning of the given string, following the same rules as the Get procedure that reads an integer value from a file, but treating the end of the string as a file terminator. ...The exception Data_Error is propagated if the sequence input does not have the required syntax or if the value obtained is not of the subtype Num."
> 
> The referenced Get procedure says, (some irrelevant (?) parts omitted)
> 
> "...skips any leading blanks, line terminators, or page terminators, then ...reads the longest possible sequence of characters matching the syntax of a numeric literal without a point."
> 
> I've used this procedure before, and as far as I can tell:
> 
>     - GNAT is fine with "29:"
>     - GNAT is NOT fine with "29: " or any larger substring of S
> 
> So:
> 
> 1) Apparently GNAT thinks the colon is a character that matches the syntax of a numeric literal; do I interpret this correctly?
> 
> 2) Where does the language manual say this? I didn't see it in Section 3.5.4 ("Integer Types").
> 
> 3) or is this a bug?
> 
> 4) or do I misinterpret the language manual?

I think the problem is that the implementation tries to interpret

   29: 116 ...

as a based number. Colon : is a replacement character for # (see 
allowable replacements of characters). So it might think of 29: 116 as a 
malformed base-29 number with wrong base and missing closing :.

You could use this

    http://dmitry-kazakov.de/ada/strings_edit.htm

instead. Its Get takes the base as an argument and advances a string index 
(Pointer) rather than returning Last, which is inconvenient for parsing.

Parsing your string would look like:

    Pointer := S'First; -- Start here
    Get (S, Pointer);   -- Skip blanks
    Get (S, Pointer, Value_1);
    Get (S, Pointer);
    Get (S, Pointer, Value_2);
    Get (S, Pointer);
    Get (S, Pointer, Value_3);
    Get (S, Pointer);
    if not Is_Prefix ("|", S, Pointer) then
       raise Data_Error;
    else
       Pointer := Pointer + 1;
    end if;
    Get (S, Pointer);
    Get (S, Pointer, Value_4);
    Get (S, Pointer);
    Get (S, Pointer, Value_5);

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: Help parsing the language manual on Get'ing integers from Strings
  2020-12-21  7:57 ` Dmitry A. Kazakov
@ 2020-12-21  8:06   ` Niklas Holsti
  2020-12-21  9:40     ` Jeffrey R. Carter
  2020-12-21  8:16   ` Dmitry A. Kazakov
  1 sibling, 1 reply; 11+ messages in thread
From: Niklas Holsti @ 2020-12-21  8:06 UTC (permalink / raw)


On 2020-12-21 9:57, Dmitry A. Kazakov wrote:
> On 2020-12-21 01:11, John Perry wrote:
> 
>> Sorry if the subject is unclear. I recently tried to use
>>
>>     Get(S, Value, Last);
>>
>> ...in a program where Value was a Natural and S has the value "29: 116 
>> 82 | 119 24". GNAT gave me a Data_Error.
>>
...
> I think the problem is that the implementation tries to interpret
> 
>    29: 116 ...
> 
> as a based number. Colon : is a replacement character for # (see 
> allowable replacements of characters).


I see, an "obsolescent feature" in RM J.2. I learn something new every 
day (I hope).

Ok, so no bug.


* Re: Help parsing the language manual on Get'ing integers from Strings
  2020-12-21  7:57 ` Dmitry A. Kazakov
  2020-12-21  8:06   ` Niklas Holsti
@ 2020-12-21  8:16   ` Dmitry A. Kazakov
  1 sibling, 0 replies; 11+ messages in thread
From: Dmitry A. Kazakov @ 2020-12-21  8:16 UTC (permalink / raw)


On 2020-12-21 08:57, Dmitry A. Kazakov wrote:

> Parsing your string would look like:
> 
>     Pointer := S'First; -- Start here
>     Get (S, Pointer);   -- Skip blanks
>     Get (S, Pointer, Value_1);
>     Get (S, Pointer);

    -- Skip the ':' that follows the first number
    if not Is_Prefix (":", S, Pointer) then
       raise Data_Error;
    else
       Pointer := Pointer + 1;
    end if;
    Get (S, Pointer);

>     Get (S, Pointer, Value_2);
>     Get (S, Pointer);
>     Get (S, Pointer, Value_3);
>     Get (S, Pointer);
>     if not Is_Prefix ("|", S, Pointer) then
>        raise Data_Error;
>     else
>        Pointer := Pointer + 1;
>     end if;
>     Get (S, Pointer);
>     Get (S, Pointer, Value_4);
>     Get (S, Pointer);
>     Get (S, Pointer, Value_5);

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: Help parsing the language manual on Get'ing integers from Strings
  2020-12-21  7:44 ` Niklas Holsti
@ 2020-12-21  9:33   ` AdaMagica
  0 siblings, 0 replies; 11+ messages in thread
From: AdaMagica @ 2020-12-21  9:33 UTC (permalink / raw)


Niklas Holsti wrote on Monday, 21 December 2020 at 08:44:33 UTC+1:
> "12: 44 " works, Value = 12 and Last = 2 
> "18: 44 " fails with Data_Error. 
> "18:44:" fails with Data_Error 
> "12:44:" works, Value = 52, Last = 6. 
> > 1) Apparently GNAT thinks the colon is a character that matches the 
> > syntax of a numeric literal; do I interpret this correctly?
> It seems that the Get procedure understands ':' as a base indicator, as in 
> "12#44#" works, Value = 52, Last = 6. 
> "12#44" fails with Data_Error. 

RM J.2(3).

This looks like a GNAT bug. If it accepts "12: 44" as 12, leaving ": 44" in the input stream, it cannot be interpreting : as a replacement for #, since "12# 44" correctly raises Data_Error, and "12: 44" is likewise incorrect syntax for a base-12 literal.
But then it also has to accept "18: 44" as 18, leaving ": 44" in the input stream, although "18: 44" is incorrect syntax for a based literal.

The behaviour is inconsistent and should be reported.


* Re: Help parsing the language manual on Get'ing integers from Strings
  2020-12-21  8:06   ` Niklas Holsti
@ 2020-12-21  9:40     ` Jeffrey R. Carter
  2020-12-22  1:11       ` Randy Brukardt
  0 siblings, 1 reply; 11+ messages in thread
From: Jeffrey R. Carter @ 2020-12-21  9:40 UTC (permalink / raw)


On 12/21/20 9:06 AM, Niklas Holsti wrote:
> 
> I see, an "obsolescent feature" in RM J.2. I learn something new every day (I 
> hope).

Yes. I never worked with a system that required such substitutions, even in 1984 
when it was not an obsolescent feature, but as we can see, it's important to be 
aware of them.

These days they are sometimes used for obfuscation.

-- 
Jeff Carter
"It has been my great privilege, many years ago,
whilst traveling through the mountains of Paraguay,
to find the Yack'Wee Indians drinking the juice of
the cacti."
The Old Fashioned Way
152


* Re: Help parsing the language manual on Get'ing integers from Strings
  2020-12-21  0:11 Help parsing the language manual on Get'ing integers from Strings John Perry
  2020-12-21  7:44 ` Niklas Holsti
  2020-12-21  7:57 ` Dmitry A. Kazakov
@ 2020-12-21 11:30 ` John Perry
  2020-12-21 23:25   ` John Perry
  2 siblings, 1 reply; 11+ messages in thread
From: John Perry @ 2020-12-21 11:30 UTC (permalink / raw)


Hello everyone

Thanks to everyone for the very helpful replies.

I will take AdaMagica's advice and file this as a bug. Worst comes to worst, they'll have an excellent explanation for why it isn't a bug, which I'll report here.

john perry


* Re: Help parsing the language manual on Get'ing integers from Strings
  2020-12-21 11:30 ` John Perry
@ 2020-12-21 23:25   ` John Perry
  2020-12-22  1:19     ` Randy Brukardt
  0 siblings, 1 reply; 11+ messages in thread
From: John Perry @ 2020-12-21 23:25 UTC (permalink / raw)


AdaCore replies:

> Thanks for your report.  We believe the intent of the RM is that
> arbitrary lookahead/backtracking should not be required, so GNAT raises
> an exception when it encounters something wrong (like a base > 16).  We
> believe no Ada compilers are in strict conformance to the rules in this
> area, and ARG has agreed not to test this area strictly.  Perhaps
> RM-A.10.8(8) should be clarified/corrected.
>
> The rule about colons is hidden away in RM-J.2(3). But this isn't
> specific to colons -- you'd get the same behavior if you used "#"
> instead.

Sure enough, the string "29# " leads to a Data_Error. If I read this right, the problem is that Get takes the 29 as the base of a based literal (too large, since the maximum is 16), which Niklas Holsti had hinted at, but I didn't quite follow the implications.

So, not a bug!
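
Given that, one way to read the leading number with only the standard library is to hand Get just the slice before the colon, so it never sees a character it could take for a base indicator. This is a sketch, assuming the input always has the form "<number>: ..."; the names Parse_Label and Colon are made up for illustration.

```ada
with Ada.Text_IO;         use Ada.Text_IO;
with Ada.Integer_Text_IO;
with Ada.Strings.Fixed;

procedure Parse_Label is
   S     : constant String  := "29: 116 82 | 119 24";
   --  Index returns 0 when the pattern does not occur in S.
   Colon : constant Natural := Ada.Strings.Fixed.Index (S, ":");
   Value : Natural;
   Last  : Positive;
begin
   if Colon = 0 then
      raise Data_Error;  --  no ':' separator in the input
   end if;

   --  Get only sees "29", so the end of the slice acts as the
   --  terminator and the colon is never scanned.
   Ada.Integer_Text_IO.Get (S (S'First .. Colon - 1), Value, Last);
   Put_Line ("Label =" & Natural'Image (Value));  --  prints "Label = 29"

   --  The remaining numbers are in S (Colon + 1 .. S'Last).
end Parse_Label;
```

The same slicing trick works for the "|" separator. Each later Get call can start from the Last returned by the previous one.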

john perry


* Re: Help parsing the language manual on Get'ing integers from Strings
  2020-12-21  9:40     ` Jeffrey R. Carter
@ 2020-12-22  1:11       ` Randy Brukardt
  0 siblings, 0 replies; 11+ messages in thread
From: Randy Brukardt @ 2020-12-22  1:11 UTC (permalink / raw)


"Jeffrey R. Carter" <spam.jrcarter.not@spam.not.acm.org> wrote in message 
news:rrpqi1$kf5$1@dont-email.me...
> On 12/21/20 9:06 AM, Niklas Holsti wrote:
>>
>> I see, an "obsolescent feature" in RM J.2. I learn something new every 
>> day (I hope).
>
> Yes. I never worked with a system that required such substitutions, even 
> in 1984 when it was not an obsolescent feature, but as we can see, it's 
> important to be aware of them.

I believe that restriction had to do with certain keypunches. But hardly 
anyone used keypunches even in 1981. (The Unisaur computer that our CS 
compiler-construction class used still had a few keypunches, but they had 
mostly transitioned to terminals by that time. I think that was the last 
class to use the Unisaur; they had just installed some VAX 780s for research 
and they soon got some for student use as well. My first few programming 
classes at UW used the Unisaur's keypunches.) I think that requirement was 
obsolete by the time Ada was completed (it probably wasn't when the Ada 
design was started).

                                           Randy.



* Re: Help parsing the language manual on Get'ing integers from Strings
  2020-12-21 23:25   ` John Perry
@ 2020-12-22  1:19     ` Randy Brukardt
  0 siblings, 0 replies; 11+ messages in thread
From: Randy Brukardt @ 2020-12-22  1:19 UTC (permalink / raw)


"John Perry" <john.perry@usm.edu> wrote in message 
news:e9ed12c5-9254-4d44-823e-a2f3f8da16aan@googlegroups.com...
> AdaCore replies:
>
>> Thanks for your report.  We believe the intent of the RM is that
> arbitrary lookahead/backtracking should not be required, so GNAT raises
> an exception when it encounters something wrong (like a base > 16).  We
> believe no Ada compilers are in strict conformance to the rules in this
> area, and ARG has agreed not to test this area strictly.  Perhaps
> RM-A.10.8(8) should be clarified/corrected.

For what it's worth, we once tried to do that, but couldn't come to an 
agreement on precisely what to change the wording to. As a change is not 
critical, we didn't make one. The ACATS has long had tests in this area that 
require something subtly different than the wording requires, and it didn't 
make any sense to change them (since presumably all implementers are passing 
them, rather than strictly following the RM wording).

In any case, the ":" replacement trips up people from time-to-time, as 
pretty much no one remembers it. I recall we had to change some piece of new 
syntax because the possibility of a colon in a number made it ambiguous.

                        Randy.




