From: Natasha Kerensikova
Newsgroups: comp.lang.ada
Subject: My first compiler bug: work around or redesign?
Date: Fri, 23 Mar 2012 16:29:06 +0000 (UTC)

Hello,

I happen to have encountered my very first compiler bug, or at least something that claims to be one, in the following message:

+===========================GNAT BUG DETECTED==============================+
| 4.6.2 20111026 (release) -=> GNAT AUX [FreeBSD64] (x86_64-aux-freebsd9.0) GCC error:|
| in gnat_to_gnu_entity, at ada/gcc-interface/decl.c:4134                  |
| Error detected at dressup-parsers-markdown.adb:184:4 [dressup-parsers-markdown.adb:1651:7 [markdown.adb:32:4]]|
| Please submit a bug report; see http://gcc.gnu.org/bugs.html.            |
| Use a subject line meaningful to you and us to track the bug.            |
| Include the entire contents of this bug box in the report.               |
| Include the exact gcc or gnatmake command that you entered.              |
| Also include sources listed below in gnatchop format                     |
| (concatenated together with no headers between files).                   |
+==========================================================================+

So my first question is: would anyone be kind enough to try to reproduce that bug?

The files involved are available individually at
http://fossil.instinctive.eu/dressup/dir?ci=bf54531f8dcf7117
or as a tarball at
http://fossil.instinctive.eu/dressup/tarball/Dressup-bf54531f8dcf7117.tar.gz?uuid=bf54531f8dcf71174ccb486812b886701222c342

The reason I ask is that I'm using GNAT AUX, derived from GCC 4.6.2, so it's both unofficial and old (IIRC tasking in FreeBSD 9 is preventing the next one from being available). It's difficult to create a new build environment, so I would like to be sure it's really a bug before setting out to make a minimal test case and reporting it.

I guess the problem somehow involves generics: Dressup.Parsers is a generic package, so Dressup.Parsers.Markdown is generic too, despite adding no further formal parameter. markdown.adb:32:4 is the instantiation of Dressup.Parsers.Markdown (off an instance of Dressup.Parsers instantiated on the line before). dressup-parsers-markdown.adb:1651:7 is an instantiation of a subprogram whose specification is at dressup-parsers-markdown.adb:184:4.

Could the issue be caused by having a generic instance inside a generic instance inside a generic instance? Or is GNAT supposed to handle such a level of nesting well?

All this led me to question my approach, design and programming practices, so that if I have to rewrite something to work around the compiler bug, I can rewrite it better.

So my first "best practices" question is about using generic subprograms confined inside a package body.
Here is a brutally simplified version of what is reported in the compiler bug message:

   package Stuff is
      procedure Ordered_List;    --  parameter list omitted
      procedure Unordered_List;  --  parameter list omitted
   end Stuff;

   package body Stuff is
      --  Common implementation; only the prefix recognition is a formal.
      generic
         with function Prefix (Line : String) return Natural;
      procedure Generic_List;    --  parameter list omitted

      function Ordered_Prefix (Line : String) return Natural;
      function Unordered_Prefix (Line : String) return Natural;

      --  subprogram bodies here

      procedure Ordered_List_Instance is new Generic_List (Ordered_Prefix);
      procedure Ordered_List renames Ordered_List_Instance;

      procedure Unordered_List_Instance is new Generic_List (Unordered_Prefix);
      procedure Unordered_List renames Unordered_List_Instance;
   end Stuff;

The rationale here is that Ordered_List and Unordered_List are meant to be completely independent, so they are presented in the specification as being completely unrelated. However, at the implementation level, it turns out that they are very similar: only the prefix recognition changes, and further processing is perfectly identical. So instead of cut-and-pasting code, I would write a generic that handles all the common aspects, using a formal function for the prefix part.

Is there something wrong with that approach? Are there caveats that I missed? Are there advantages in avoiding generics in that situation, for example by using a non-generic common function that takes an extra access-to-subprogram parameter?

And as a tangential question, could anyone explain to me why the "renames" are required? How come a generic instantiation cannot provide a body for a publicly-specified subprogram?

And the last part of this message is about the general design of the library. I have ended up using a lot of generics and accesses to subprograms, but no tagged types (actually some types are tagged, but only for future expansion; none of the code written here uses any tagged type feature). I would understand anyone skipping that part of the discussion, but any constructive comment will be appreciated (though not necessarily acted upon).
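
Going back to the list question above, here is a minimal sketch of the non-generic alternative: a common procedure that takes the prefix recognizer as an access-to-subprogram parameter. It is only an illustration under assumptions: the Line parameter, the name Common_List, the toy prefix rules and the Put_Line output are all hypothetical and not taken from the Dressup sources.

```ada
with Ada.Text_IO;

procedure List_Demo is

   package Stuff is
      --  Presented as unrelated in the specification.
      --  The Line parameter is an assumption made for this sketch.
      procedure Ordered_List (Line : String);
      procedure Unordered_List (Line : String);
   end Stuff;

   package body Stuff is

      --  Toy rule: one or more digits followed by ". " (e.g. "12. ").
      --  Returns the prefix length, or 0 when there is no prefix.
      function Ordered_Prefix (Line : String) return Natural is
         Pos : Integer := Line'First;
      begin
         while Pos <= Line'Last and then Line (Pos) in '0' .. '9' loop
            Pos := Pos + 1;
         end loop;

         if Pos > Line'First
           and then Pos < Line'Last
           and then Line (Pos) = '.'
           and then Line (Pos + 1) = ' '
         then
            return Pos + 2 - Line'First;
         else
            return 0;
         end if;
      end Ordered_Prefix;

      --  Toy rule: "* " or "- " at the start of the line.
      function Unordered_Prefix (Line : String) return Natural is
      begin
         if Line'Length >= 2
           and then (Line (Line'First) = '*' or else Line (Line'First) = '-')
           and then Line (Line'First + 1) = ' '
         then
            return 2;
         else
            return 0;
         end if;
      end Unordered_Prefix;

      --  Hypothetical common code: everything except prefix recognition.
      procedure Common_List
        (Line   : String;
         Prefix : not null access function (Text : String) return Natural)
      is
         Skip : constant Natural := Prefix (Line);
      begin
         if Skip = 0 then
            Ada.Text_IO.Put_Line ("not a list item: " & Line);
         else
            Ada.Text_IO.Put_Line
              ("item: " & Line (Line'First + Skip .. Line'Last));
         end if;
      end Common_List;

      --  The public procedures get ordinary bodies, so no renaming is needed.
      procedure Ordered_List (Line : String) is
      begin
         Common_List (Line, Ordered_Prefix'Access);
      end Ordered_List;

      procedure Unordered_List (Line : String) is
      begin
         Common_List (Line, Unordered_Prefix'Access);
      end Unordered_List;

   end Stuff;

begin
   Stuff.Ordered_List ("1. first item");
   Stuff.Unordered_List ("* second item");
   Stuff.Unordered_List ("plain text");
end List_Demo;
```

Compared with the generic version, this trades the two instantiations and renamings for two small wrapper bodies and an indirect call; the anonymous access parameter requires Ada 2005 or later.
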
The initial problem I set out to solve was converting markdown into HTML, but with enough modularity that I can convert markdown into PDF without changing the "markdown" part (which I call the "parser"; I hope I got the word right), or convert creole into HTML without changing the "HTML" part (which I call the "renderer"). As an extra requirement, I want features of a parser to be easily and individually turned off (e.g. removing raw HTML inclusion in markdown for untrusted sources, or removing the "wiki link" feature of creole where it is used outside of a wiki).

In my previous iteration of markdown-to-HTML code (in C), I found that a usable description of a renderer is a bunch of callbacks that operate on the same shared state. So for my Ada library, I decided to describe a renderer as a state object and a set of accesses to procedures. The idea is that each procedure renders a particular element (e.g. an ordered list, for which the HTML callback would output "<ol>", the contents and "</ol>"). Language elements without a renderer callback are considered disabled.

That way, the renderer does not need to know anything about the parser, and the parser only handles callbacks and an opaque state object, so it is also independent from any particular renderer. Only the client has to care about both the particular renderer and the particular parser in use.

I went for access to subprogram rather than a tagged type for element renderers to ensure that all callbacks do share the same state. In the tagged type version, each element renderer object would have its own state (presumably referring to some shared state like the output string or stream), so there would be no compile-time guarantee that all element renderers indeed belong to the same renderer. A client who mistakenly mixes callbacks, and ends up with one set of callbacks referring to one state and another set referring to another, would have no indication of the mistake before seeing garbage at run time.

Moreover, using dynamic dispatching on a tagged type instead of access to subprogram would mean storing objects of a class-wide type, i.e. an indefinite type, somewhere. That would mean extra complications like holder objects, which make the program harder to read and understand. These drawbacks without any benefit (unless I'm missing something) were enough for me to rule out the option.

I then proceeded to write the (X)HTML renderer. While thinking about the implementation, I realized that I would only need to append string fragments. So I wrote it as a generic package, with an Accumulator formal type and an Append procedure. Again it looked much simpler than an approach based on tagged types (and interfaces), but this time on the client side rather than on the library side: Unbounded_String comes bundled with an Append procedure that fits perfectly, and streams might be usable out of the box if String'Write can be used directly for Append.
With interfaces, the client would have to maintain a wrapper around Unbounded_String or streams or whatever accumulator they reuse, and it feels to me like unnecessary clutter. Moreover, it seems possible and relatively simple to instantiate the generic markdown renderer with an interface type, while the advantages of the generic version seem out of reach of a version based on interfaces.

With this representation of renderers, I started shaping the parser with a generic ancestor package Dressup.Parsers, which only defines the type Element_Renderer used for the callbacks. The extra genericity and accesses to subprograms of Dressup.Parsers.Lexers follow the same rationale as for renderers.

I think that covers all the debatable choices, though if you feel like discussing another one, feel free to do so.

Thanks a lot in advance for your helpful insights,
Natasha