comp.lang.ada
  • * Re: Compiler error messages
           [not found] <01bd278c$bea48680$9dfc82c1@xhv46.dial.pipex.com>
           [not found] ` <En96AJ.JxL@world.std.com>
    @ 1998-01-23  0:00 ` Robert Dewar
      1998-01-23  0:00 ` Robert Dewar
      1998-01-23  0:00 ` Larry Kilgallen
      3 siblings, 0 replies; 7+ messages in thread
    From: Robert Dewar @ 1998-01-23  0:00 UTC
    
    
    
    Some other points that Nick makes
    
      <<I do think there are a lot of things that compilers can do to help users.
        Always making reference to the appropriate section(s) of the manual, for
        example (something that precious few compilers actually do -- why???).>>
    
    Let's assume "the manual" means the Ada RM in this case. Indeed many Ada
    compilers do make RM references. We very deliberately decided in GNAT not
    to, except in a few unusual cases. Our reasoning is that for expert users,
    this is unlikely to be necessary, since you know the language and you know
    what is right and wrong. For naive users, going and reading the RM tends
    to add confusion on top of confusion.
    
    Yes, we know all the arguments on both sides of this issue. No need to
    rehash them. If you want, go to DejaNews, there have been long threads on
    this issue before. Many people agree with us, some do not. Many people
    comment that they like the error messages in GNAT (it was certainly for
    example one of the reasons that the Air Force academy chose GNAT over other
    competing compilers for teaching Ada). We think this is at least partly a
    reflection of the fact that we are forced to try to come up with a
    clear error message without relying on the RM reference for an explanation
    (which is often inaccessible to beginners).
    
    As I say, the interesting thing here is not general discussions but
    particular examples. Interestingly, when we challenge people who think
    that RM references are a good thing to come up with specific examples
    where an RM reference would help, we have had virtually no input (which
    does not surprise us!).
    
      <<Other ideas are: it is occasionally helpful for the compiler to
        report how long it took to process each file, and usually this
        is very easy to do>>
    
    Does not seem very useful to me, though it would be easy to program. You
    can find out how long you spend in each phase of gcc, using the standard
    gcc option. (read the gcc manual, it has lots of useful stuff!)
    
      <<Something that most compiler writers could provide which would be extremely
        useful to their users -- [extremely odd ethnic reference removed] --
        is to provide a section in the manual which discusses each
        error (or those where it would be useful) in some detail.>>
    
    Actually we don't think a section of the manual as such is the right
    solution here. We have a design for this, it is a program called GNOME
    (Gnat On-line error Message Explanation). The idea is to bonk on an
    error message from your editor or IDE or whatever, and you get a menu
    pointing to a full explanation of the error, cross-linked via hypertext
    to the RM, Rationale, and whatever other useful reference materials
    are around.
    
    Isn't that a nice idea?
    
    Unfortunately all we have in place so far is the great name, and since
    error messages are pretty good in GNAT, it is not something that is on 
    the top of the priority list.
    
    Our users don't complain about error messages. In fact we would like them
    to complain more, or at least send in constructive suggestions. "I know
    this is wrong, but it seems the error message could have been more helpful"
    are useful reports for us.
    
    Robert Dewar
    Ada Core Technologies
    
    
      
      
    
    
    
    
    
    
    
  • * Re: Compiler error messages
           [not found] <01bd278c$bea48680$9dfc82c1@xhv46.dial.pipex.com>
           [not found] ` <En96AJ.JxL@world.std.com>
      1998-01-23  0:00 ` Robert Dewar
    @ 1998-01-23  0:00 ` Robert Dewar
      1998-01-23  0:00   ` Nick Roberts
      1998-01-23  0:00 ` Larry Kilgallen
      3 siblings, 1 reply; 7+ messages in thread
    From: Robert Dewar @ 1998-01-23  0:00 UTC
    
    
    
    Nick Roberts said
    
    <<My advice to compiler writers would be: make SURE that the compiler reports
    any error 100% accurately.  That means making NO assumptions about what
    caused the error ("oh, it was _probably_ because the user forgot to type a
    semicolon", etc...).  It means reporting everything that could possibly
    have caused the error (directly!), even if this means a humungous error
    message.  It means producing a technically precise message, even if you
    feel some users would prefer something more 'down to earth' (because 'down
    to earth' invariably means inaccurate/incomplete/vague/wrong).
    >>
    
    The trouble is that this reasonable prescription is meaningless.
    
    A program is either right or wrong from a formal point of view. Especially
    when it comes to syntax errors, the only possible syntax error that the
    above principle could permit is
    
    "The above program does not meet the syntax in the Ada RM"
    
    without any indication of where or what is wrong. To give *any* more 
    detailed indication of what is wrong requires that you make assumptions
    of the kind that you say you don't like.
    
    I don't know how much you know about compiler techniques, but a compiler
    never really knows anything about what is wrong in the absence of
    assumptions of some kind.
    
    The question always boils down to how to make these assumptions.
    It is of course huge and unuseful hyperbole to say that compilers
    that attempt to give a clear message "invariably [result in]
    inaccurate/incomplete/vague/wrong [messages]".
    
    Your mention of a "technically precise" message is not thought through
    carefully. It makes me think that you are a user and not a builder of
    compilers, since if you built them, you would be more aware of this
    obvious point.
    
    For example, in the discussion at hand
    
      a := b & + c;
    
    all the following messages are technically precise in the only possible
    sense that this can be meaningful
    
      Missing operand between & and +
      + c must be parenthesized
      Redundant + ignored
      
    These are relatively reasonable, the following are just as precise
    from a formal point of view
    
      identifier You_Did_Not_Want_This_Here missing between & and +
      above statement should have been "accept abc"
      & + replaced by minus operator
    
    etc. The only reason these "technically correct" messages are "wrong" is
    because they are making less likely assumptions than the first set.
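
    To make "less likely assumptions" concrete, here is a minimal sketch in
    Python. It is not GNAT code: the scoring rule, the weight of 3, and every
    name in it are assumptions chosen only for illustration. Each candidate
    message is paired with the repaired statement it implies, and a repair is
    penalised for the number of tokens it changes and for any program text it
    invents on the user's behalf.

```python
# Illustrative sketch only: the scoring rule and all names are assumptions,
# not taken from GNAT or any other compiler.

NEUTRAL = {"<operand>", "(", ")"}  # tokens that do not guess at user intent


def edit_distance(a, b):
    """Token-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # keep or substitute
        prev = cur
    return prev[-1]


def score(original, repaired):
    """Lower is better: few edits, and no invented program text."""
    invented = [t for t in repaired if t not in original and t not in NEUTRAL]
    return edit_distance(original, repaired) + 3 * len(invented)


def rank(original, candidates):
    """Sort (message, repaired_text) pairs from most to least plausible."""
    src = original.split()
    return sorted(candidates, key=lambda c: score(src, c[1].split()))


ORIGINAL = "a := b & + c ;"
CANDIDATES = [
    ('above statement should have been "accept abc"', "accept abc ;"),
    ("& + replaced by minus operator", "a := b - c ;"),
    ("identifier You_Did_Not_Want_This_Here missing between & and +",
     "a := b & You_Did_Not_Want_This_Here + c ;"),
    ("+ c must be parenthesized", "a := b & ( + c ) ;"),
    ("Redundant + ignored", "a := b & c ;"),
    ("Missing operand between & and +", "a := b & <operand> + c ;"),
]

for message, _ in rank(ORIGINAL, CANDIDATES):
    print(message)
```

    With these particular weights the three reasonable messages sort ahead
    of the three silly ones. Nothing in the formal definition of the language
    justifies the weights; they are exactly the kind of assumption under
    discussion.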
    
    Let's take an example where GNAT does a lot of work in trying to come
    up with a correct message (try this on various Ada compilers).
    
    Write a big package body that looks like
    
        package body XYZ is
          procedure A;
          procedure B;
          procedure C;
          ...
          procedure Z;
          
          procedure A is ...
          procedure B is ...
          ...
          procedure Z is ...
        end;
    
    that's fine, now change the semicolon after the procedure spec for M to
    an is:
    
          procedure L;
          procedure M is
          procedure N;
    
    that's an *easy* cut and paste error.
    
    GNAT will tell you that the is should be a semicolon.
    
    This is obvious to a human, but not at all obvious to a compiler.
    Why not?
    
    Well the text from procedure M is, up to and including the final end
    statement, is a valid procedure body. 
    
    OOOPS, slight mistake: for this to be 100% true, add a null statement
    part just before the final end of the package body, so that it ends with:
    
        begin
           null;
        end;
    
    
    The favorite Ada compiler that I used for years before GNAT simply said
    
    "unexpected end of file" pointing to the end of the program for this.
    
    Easy to see why, it scanned out what it thought was the body of M 
    successfully, and then planned on resuming the scan of the package
    body and was surprised to find an end of file.
    
    This was a truly horrid error. After a while you got to know it meant
    that somewhere you had an is in place of a semicolon, and sometimes I
    would have to do edits in a binary search to find the bad one.
    
    Note that the GNAT message and the other compiler's message are both
    technically valid error messages, but one is MUCH more helpful than the
    other.
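
    One way a tool might guess at this case is sketched below in Python. This
    is an assumed layout rule for illustration only; the post does not say how
    GNAT detects the error, and the function and pattern names are made up.
    The rule: a line of the form "procedure X is" whose next non-blank line
    starts another subprogram at the same indentation was probably meant to
    be a spec ending in a semicolon.

```python
# Illustrative sketch only: this layout rule is an assumption and is not
# taken from the GNAT sources.
import re

SUBPROGRAM_IS = re.compile(r"^(\s*)(procedure|function)\b.*\bis\s*$",
                           re.IGNORECASE)
SUBPROGRAM_START = re.compile(r"^(\s*)(procedure|function)\b", re.IGNORECASE)


def suspicious_is_lines(source):
    """Return 1-based numbers of lines whose trailing 'is' looks like a ';'."""
    lines = source.splitlines()
    flagged = []
    for i, line in enumerate(lines):
        head = SUBPROGRAM_IS.match(line)
        if not head:
            continue
        # Find the next non-blank line, if any.
        following = next((ln for ln in lines[i + 1:] if ln.strip()), None)
        if following is None:
            continue
        nxt = SUBPROGRAM_START.match(following)
        # A real body would continue with deeper-indented declarations or a
        # 'begin'; a sibling subprogram at the same indentation suggests that
        # this line was meant to be a spec.
        if nxt and len(nxt.group(1)) == len(head.group(1)):
            flagged.append(i + 1)
    return flagged


EXAMPLE = """\
package body XYZ is
   procedure L;
   procedure M is
   procedure N;

   procedure L is
   begin
      null;
   end L;
end XYZ;
"""

print(suspicious_is_lines(EXAMPLE))
```

    On this input only the line for procedure M is flagged; the genuine body
    of L is left alone because its next line is a begin. The rule is a guess
    based on layout, so badly indented code will defeat it.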
    
    My experience in error messages is that it is not something that can
    be addressed by simplistic principles of the type Nick is reaching
    for. On the contrary getting to the point of generating useful
    error messages is extremely difficult.
    
    Most people are pleasantly surprised at how well GNAT does in pinning
    down messages (one of the students in my compiler class last semester,
    where everyone was using Ada, sent some email asking how GNAT manages
    to give such accurate error messages).
    
    Now when that student was asking that question, what did he mean by
    accurate?
    
    Technically accurate?

    Not at all. He meant messages that corresponded to the error he had made.

    Now only the programmer knows the true fix for an error.
    
    An informative error message means guessing correctly at something that
    is close enough to this "real" reason to click.
    
    This is difficult. A huge amount of effort in the GNAT sources goes into
    this. Let's take another example.
    
    
    Suppose during parsing you encounter a junk end line, i.e. one that is
    not what is expected.
    
    There are three possibilities
    
      1. It is a piece of junk that should be ignored
    
      2. It is a corruption of the currently expected end line, and should
         be accepted as such
    
      3. There is a missing end line, and this one belongs to an outer scope
    
    It is absolutely crucial to make the "right" decision here, since an
    error will cause chaos in cascaded messages.
    
    Of course you can't always make the right decision, but you can try.
    GNAT uses all sorts of heuristics. It pays close attention to any
    tokens used, to help match up end lines, and it even looks at the
    indentation for a clue as to what was meant. If you are interested
    in pursuing this, have a look at unit par-endh.adb in the GNAT sources.
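
    The three-way decision can be sketched as follows in Python. This is not
    the logic of par-endh.adb; the Scope record, the function name, and the
    matching rule are assumptions for illustration. The sketch uses only the
    two clues mentioned above, the name on the end line and its indentation.

```python
# Illustrative sketch only: a simplified stand-in for the kind of heuristic
# described above, not a transcription of any GNAT unit.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Scope:
    name: str    # name of the construct that is still open
    column: int  # indentation of the line that opened it


def classify_end(open_scopes: List[Scope],
                 end_name: Optional[str],
                 end_column: int) -> str:
    """Classify an unexpected end line.

    Returns "accept" (treat it as the expected end of the innermost scope),
    "missing-end" (it belongs to an outer scope, so an end line is missing),
    or "ignore" (junk).
    """
    def matches(scope: Scope) -> bool:
        same_name = (end_name is not None
                     and end_name.lower() == scope.name.lower())
        return same_name or end_column == scope.column

    if not open_scopes:
        return "ignore"
    if matches(open_scopes[-1]):
        return "accept"
    if any(matches(s) for s in open_scopes[:-1]):
        return "missing-end"
    return "ignore"


scopes = [Scope("XYZ", 0), Scope("M", 3)]
print(classify_end(scopes, "N", 3))    # wrong name, right column
print(classify_end(scopes, "XYZ", 0))  # names the outer scope
print(classify_end(scopes, "Q", 9))    # matches nothing
```

    Checking the innermost scope first is a design choice: it prefers the
    reading that discards the least text, which tends to limit cascaded
    messages when the guess is wrong.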
    
    Of course GNAT does not do a perfect job in generating error messages.
    This is not possible, in the sense that it is not a well defined task.
    
    But it does pretty well, and we work on improving it all the time.
    
    It is much more instructive to look at specific examples than to
    speak in generalities here.
    
    I certainly agree with Nick that many compilers have incredibly appallingly
    bad error message generation. In particular, I have never seen a C compiler
    that I thought was even vaguely acceptable in this regard.
    
    Ada compilers have generally been better, partly because Gerry Fisher's
    interest in error detection meant that the original Ada Ed was pretty
    good, and as a result the ACVC tests came to expect pretty decent
    error recovery. Many of the Ada 83 compilers actually directly borrowed
    some of the NYU work here.
    
    We think GNAT takes the generation of good error messages to a stage
    that is a definite notch better than what has been there previously,
    but there is lots of room for improvement. 
    
    We are always happy to get error message suggestions, and examples where
    things did not work well. Sometimes the answer is "sorry, we can't be
    this telepathic", other times the answer is "this may surprise you,
    but actually this case is easy to fix!"
    
    Robert Dewar
    Ada Core Technologies
    
    
    
    
    
  • * Re: Compiler error messages
           [not found] <01bd278c$bea48680$9dfc82c1@xhv46.dial.pipex.com>
                       ` (2 preceding siblings ...)
      1998-01-23  0:00 ` Robert Dewar
    @ 1998-01-23  0:00 ` Larry Kilgallen
      1998-01-23  0:00   ` Robert Dewar
      3 siblings, 1 reply; 7+ messages in thread
    From: Larry Kilgallen @ 1998-01-23  0:00 UTC
    
    
    
    In article <01bd278c$bea48680$9dfc82c1@xhv46.dial.pipex.com>, "Nick Roberts" <Nick.Roberts@dial.pipex.com> writes:
    > I've been most interested in the thread about compiler error messages.
    > 
    > Having used many many compilers (BASIC, PASCAL, C, Ada, and all sorts of
    > others) for many many years, I've come to the conclusion that, almost
    > always, the cleverer the compiler tries to be about error messages, the
    > less helpful it ends up being, in reality.
    > 
    > Many is the time when a compiler has reported an error to me, most
    > elaborately and cleverly, and been completely and 100% wrong about the true
    > nature/source of the error.  And boy does it make me spit.  Hands up who
    > hasn't been infuriated by a 'smart' compiler producing reams of completely
    > spurious errors (after one legitimate one), presumably because the compiler
    > writer thought it would be really clever for the compiler to 'ignore' the
    > first error.  I always prefer compilers which simply stop at the first
    > error.  What a sad waste of effort.
    
    I have seen some compilers which do a horrid job (DEC Scan and Bliss-32)
    and some which do a wonderful job (DEC Ada).  I suppose this depends on
    what sort of errors one is making, but if I knew enough to categorize
    my errors, I wouldn't make them !
    
    Larry Kilgallen
    
    
    
    

  • end of thread, other threads:[~1998-01-24  0:00 UTC | newest]
    
    Thread overview: 7+ messages
         [not found] <01bd278c$bea48680$9dfc82c1@xhv46.dial.pipex.com>
         [not found] ` <En96AJ.JxL@world.std.com>
    1998-01-23  0:00   ` Compiler error messages Nick Roberts
         [not found]     ` <EnAqpo.2oJ@world.std.com>
    1998-01-24  0:00       ` Nick Roberts
    1998-01-23  0:00 ` Robert Dewar
    1998-01-23  0:00 ` Robert Dewar
    1998-01-23  0:00   ` Nick Roberts
    1998-01-23  0:00 ` Larry Kilgallen
    1998-01-23  0:00   ` Robert Dewar
    
