From: baldrick
Newsgroups: comp.lang.ada
Subject: Announcement: GNAT ported to LLVM
Date: Sun, 23 Mar 2008 15:05:59 -0700 (PDT)

Hi, this is to let people know that the recently released LLVM 2.2 compiler toolkit contains experimental support for Ada through the llvm-gcc-4.2 compiler. Currently the only platform it works on is Linux running on 32-bit Intel x86. This is because that's what I run, and I'm the only one who's been working on this. I would appreciate help from other Ada guys, both for porting to new platforms and adding support for missing features, not to mention testing and bug fixing!

LLVM (http://llvm.org/) is a set of compiler libraries and tools for optimization and static and just-in-time code generation.
Personally I find LLVM a lot of fun, and pleasant to work with due to the good design and clean implementation. I hope you will too!

llvm-gcc is gcc with the gcc optimizers replaced by LLVM's; llvm-gcc-4.2 is the version of llvm-gcc based on gcc-4.2. The way llvm-gcc works (this is transparent to users) is that the gcc-4.2 GNAT front-end converts Ada into "gimple", gcc's internal language-independent representation. The gimple is then turned into LLVM's internal form, referred to as IR. This is then run through LLVM's optimizers, followed by LLVM's code generators, which squirt it out as assembler or object code.

In practice you can use llvm-gcc as a drop-in replacement for gcc. However the use of LLVM opens up other possibilities too. For example, it is possible to have llvm-gcc squirt out LLVM IR rather than object code (by using -emit-llvm on the command line). It is possible to link the LLVM IR for different compilation units together and reoptimize them. In other words you can do link-time optimization. This is all language independent, so if part of your program is written in Ada and other parts in C/C++/Fortran etc, you can link them all together and mutually optimize them, resulting in C routines being inlined into Ada etc.

The compiler works quite well, but it is still experimental. All of the ACATS testsuite passes except for c380004 and c393010. Since c380004 also fails with gcc-4.2, that makes c393010 the only failure due to the use of the LLVM infrastructure (the problem comes from the GNAT front-end, which produces a bogus type declaration that the gimple -> LLVM converter rejects). On the other hand, many of the tests in the GNAT testsuite fail. The release notes give some more details of what works and what doesn't:
  http://llvm.org/releases/2.2/docs/ReleaseNotes.html

The precompiled llvm-gcc-4.2 shipped with the LLVM 2.2 release was built without support for Ada, so you will need to build the compiler yourself.
You can find instructions at
  http://llvm.org/docs/GCCFEBuildInstrs.html

Please report bugs and problems to the LLVM mailing lists, or using
  http://llvm.org/bugs/
One nice thing about LLVM is that people are responsive and quickly fix bugs (often by the next day).

The LLVM IR is easy to read (with a bit of practice), and since it contains the entire LLVM state you get to see exactly what has happened to your program. This might be useful for static analysis; it is certainly useful for understanding how the various Ada constructs are implemented. To give you a taste for what it looks like, here is an example showing what a simple Ada program gets turned into.

Here is the Ada:

    with Ada.Text_IO;
    procedure Hello is
    begin
       Ada.Text_IO.Put_Line ("Hello world!");
    end;

Here's the result of compiling it:

    $ gcc -S -O2 -emit-llvm -o - hello.adb
    ...
    %struct.string___XUB = type { i32, i32 }
    ...
    @.str = internal constant [12 x i8] c"Hello world!"    ; <[12 x i8]*> [#uses=1]
    @C.168.1155 = internal constant %struct.string___XUB { i32 1, i32 12 }    ; <%struct.string___XUB*> [#uses=1]

    define void @_ada_hello() {
    entry:
        tail call void @ada__text_io__put_line__2( i8* getelementptr ([12 x i8]* @.str, i32 0, i32 0), %struct.string___XUB* @C.168.1155 )
        ret void
    }

    declare void @ada__text_io__put_line__2(i8*, %struct.string___XUB*)

I've dropped the declarations of some uninteresting types and other info, thus the "...". Note that passing -S -emit-llvm results in LLVM assembler being output (the human-readable version of LLVM IR); using -c -emit-llvm would result in the compact binary form of LLVM IR, known as bitcode. Passing -o - causes the assembler to be dumped to the terminal.

Here you can see:

(1) The declaration of Ada.Text_IO.Put_Line:

    declare void @ada__text_io__put_line__2(i8*, %struct.string___XUB*)

The name ada__text_io__put_line__2 is that generated by GNAT for this routine.
The function returns no value ("void") and has two arguments: a pointer to an i8 (an i8 is an 8-bit integer, in this case a character) and a pointer to a %struct.string___XUB, which is a record type. The declaration of the type is

    %struct.string___XUB = type { i32, i32 }

which is a record containing two 32-bit integers. These are the lower and upper bounds for the string. Thus a call to Ada.Text_IO.Put_Line in fact passes two arguments, a pointer to the string contents and a pointer to the string bounds.

(2) The code defining Hello (_ada_hello). There is one basic block, the entry block marked "entry:". It contains two instructions: a call and a return instruction. The call

    tail call void @ada__text_io__put_line__2( i8* getelementptr ([12 x i8]* @.str, i32 0, i32 0), %struct.string___XUB* @C.168.1155 )

is marked as a "tail call". If you don't know what that means, don't worry about it. The call is to the function @ada__text_io__put_line__2, see (1) above. The first parameter is an i8*, a pointer to an 8-bit integer, and has the value

    getelementptr ([12 x i8]* @.str, i32 0, i32 0)

What is this? First off, @.str is the string constant

    @.str = internal constant [12 x i8] c"Hello world!"    ; <[12 x i8]*> [#uses=1]

This is an internal constant, meaning that it is not visible outside this compilation unit. It has type [12 x i8], which is an array of 12 i8's. It has the value "Hello world!", which is indeed 12 characters long. There is a comment on the end of the line (starting with ";") pointing out the type of @.str, which is [12 x i8]*, a pointer to an array of 12 characters, and the fact that @.str is only used in one place.

The getelementptr instruction is explained in the LLVM docs, see
  http://llvm.org/docs/LangRef.html
and also
  http://llvm.org/docs/GetElementPtr.html
Here it just converts @.str from a [12 x i8]* into an i8* before passing it to @ada__text_io__put_line__2. In short: a pointer to the H in "Hello world!" is passed as the first parameter of the call.
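
For readers more at home in C than in LLVM IR, here is a rough C analogy of the call described above. It is only a sketch: the names string_bounds, put_line, string_length and hello are made up for illustration, and this is not what GNAT or LLVM actually emits. It just mirrors the idea that the string contents and the string bounds travel as two separate pointers, and that getelementptr with two zero indices turns a pointer to the whole array into a pointer to its first element.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical C counterpart of %struct.string___XUB: two 32-bit bounds. */
struct string_bounds {
    int32_t first; /* lower bound */
    int32_t last;  /* upper bound */
};

/* Counterpart of @.str: exactly 12 bytes, no trailing NUL. */
static const char hello_str[12] = {
    'H', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '!'
};

/* Counterpart of @C.168.1155: bounds 1 .. 12. */
static const struct string_bounds hello_bounds = { 1, 12 };

/* Number of characters described by the bounds (0 for an empty range). */
int32_t string_length(const struct string_bounds *b)
{
    return b->last >= b->first ? b->last - b->first + 1 : 0;
}

/* Illustrative stand-in for ada__text_io__put_line__2: the contents and
   the bounds arrive as two separate pointers. */
void put_line(const char *data, const struct string_bounds *bounds)
{
    int32_t len = string_length(bounds);
    assert(data != NULL || len == 0);
    fwrite(data, 1, (size_t)len, stdout);
    fputc('\n', stdout);
}

/* Counterpart of @_ada_hello. */
void hello(void)
{
    /* Like [12 x i8]* @.str: a pointer to the whole array. */
    const char (*whole)[12] = &hello_str;
    /* Like getelementptr (..., i32 0, i32 0): a pointer to the 'H'. */
    const char *first_char = &(*whole)[0];
    put_line(first_char, &hello_bounds);
}
```

Calling hello() from a main function prints the 12 characters of the string followed by a newline. Note that the length is never stored with the characters themselves; the callee recovers it from the bounds record.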

The second parameter is a pointer to a %struct.string___XUB, a record holding the lower and upper bounds for the string. The value passed is @C.168.1155, which is the constant declared as:

    @C.168.1155 = internal constant %struct.string___XUB { i32 1, i32 12 }    ; <%struct.string___XUB*> [#uses=1]

This is a constant record containing the values 1 (the lower bound) and 12 (the upper bound).

The return instruction "ret void" completes execution of the function, and returns control to the caller. The "void" indicates that this routine does not actually return anything.

I hope you have fun playing with LLVM!

Duncan.