From: "Randy Brukardt"
Newsgroups: comp.lang.ada
Subject: Re: Languages don't matter. A mathematical refutation
Date: Thu, 9 Apr 2015 18:26:53 -0500
Organization: Jacob Sparre Andersen Research & Innovation

"Paul Rubin" wrote in message news:87sicadqvs.fsf@jester.gateway.sonic.net...
> "Randy Brukardt" writes:
>> Let me be clear here: Ada is used by programmers that want to get their
>> programs right *and* need native code performance (performance might mean
>> space or mean speed). That means that we're typically worrying about the
>> last 25%. GC sucks up that last 25%.
>
> Hmm, ok, but that's a very narrow niche of programs in my experience.
> Non real-time programs where the last 25% matters. If it's realtime, GC
> is off the table and there's no decision to make.

Right, and that's Ada's #1 market area. And the second group is the other
bunch where Ada really has something to offer. Otherwise, you're probably
better off using the dynamic language du jour, especially if you are trying
to get young programmers excited.

> For something like a compiler, people use GCC all day long despite other
> compilers being 10x faster, so I'd say the 25% isn't of consequence. So
> why wouldn't I use GC in a compiler if it was going to make my job easier?

Because it won't, at least in a compiler. You don't need to free anything in
a compiler on a modern large-memory machine, because it runs for a short
time and then quits. It's possible but very unlikely to run out of memory
(I've never seen our compiler use more than 16 megabytes [not that I check
that very often]; it would take a much larger program to reach 2 GB on a
32-bit machine, not to mention the larger sizes available on a 64-bit
machine).

And of course, GC covers up dangling-pointer bugs (of course, Ada does worse
with such bugs, so it's a wash at best; but that's not an argument in favor
of GC, just a problem with Ada). It's trivial to use a dead object that one
mistakenly still has a pointer to. Not corrupting memory in that case is
probably an improvement, but it means that such bugs are even less likely to
be found (they're symptom-free).

...

>> That's what I call "trained ape" programming.
>> It's so easy, anyone can do it, so anyone does.
>
> If GC gives those trained apes such a boost that their output competes
> with your experts, just imagine how much better your experts would be if
> they used it too.

Naw, experts don't need those crutches. Their apparent productivity will be
lower, because they're writing preconditions and constraints and assertions
and package specs rather than slinging code. And they'll probably get laid
off, because the buggy mess will be "done" well before the expert code that
actually works right reaches that point.

...

>>> You trace from each pointer just once, either by setting a mark bit...
>>
>> That makes no sense to me whatsoever. The whole point of tracing is to
>> determine what is reachable.
>
> I just mean you trace from an object to all the other objects reachable
> from it, then you set a bit in the object marking it as already having
> been traced. Then the next time you find a pointer to the object, you
> see that the bit is set, and you don't have to trace it again.

Sure, but it's the large mess of top-level pointers (not the ones within
objects) that are so expensive to trace. And there are at least as many of
those in my code (once you count parameters, local variables, and the like).
Plus, the pointers within objects can change (sometimes frequently), so
there has to be overhead to clear the tracing whenever something in the
object changes. That's a global, distributed overhead (it's in every
object).

...

>> Our compiler was barely usable 30 years ago.... (On the order of 5
>> minutes per compilation unit.) There were many things that we could not
>> do because of lack of CPU power. The current version is faster by a lot,
>> but it still can take minutes per compilation unit (although most units
>> are much faster).
>
> OK, so let's say a little under 2 minutes per CU on the average (single
> threaded compilation).
> Your customer's program has 2000 CUs, and he wants to compile it and run
> regression tests every night, so he wants the compilation to complete in
> 2 hours. His build farm uses commodity 4-core Intel CPUs, so he needs 8
> compilation servers. Your competitor's programmers are about equal to
> yours, and his compiler does about the same thing, but he chose to use
> GC, so his compiler is 25% slower, which means the customer needs 10
> servers instead of 8. So the customer will pay more for your
> compiler--but not much, since x86 servers are cheap. Your costs are
> mostly from programmer salaries developing the compiler. Did your
> competitor's productivity gains from GC let him underbid you by enough
> that the customer chooses him? If yes, maybe you ought to reconsider GC.

I've yet to find a customer that doesn't want their compiler faster. And
the point of the 25% isn't any particular 25%, but the fact that you need
to find a bunch of 5% improvements in order to make any sort of
significant difference.

There is also the cognitive part of compilation speed: the time taken to
do something is not perceived linearly by humans. There is a point at
which people tend to go off and do something else while waiting, and that
is a productivity drag that far outweighs anything a programming language
could offer. A two-hour build would be unacceptable on many projects (I
personally hate anything that runs over 10 minutes).

Anyway, the "productivity gains from GC" are an illusion. If you use local
variables in Ada (which can be dynamically sized, remember), the compiler
manages the memory, and surely GC is not easier than that. If you use
containers (including the map and tree containers for complex data
structures, whose elements can be class-wide so that they will hold any
member of a class), then the memory management is done by the container.
With the Ada 2012 syntax, they work like an array in many ways. Easy.
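As a minimal sketch of that last point (the unit and variable names here
are invented for illustration), an Ada 2012 ordered map can be read and
updated with array-like indexing while the container does all of the
allocation and deallocation itself:

```ada
with Ada.Containers.Indefinite_Ordered_Maps;
with Ada.Text_IO; use Ada.Text_IO;

procedure Container_Demo is
   --  A map from String keys to Integer values; "Indefinite" means the
   --  keys may have different lengths.
   package String_Maps is new Ada.Containers.Indefinite_Ordered_Maps
     (Key_Type => String, Element_Type => Integer);

   Counts : String_Maps.Map;
begin
   Counts.Insert ("apples", 3);
   Counts.Insert ("pears",  5);

   --  Ada 2012 indexing aspects let the map be used like an array:
   --  no "access", no allocators, no explicit frees anywhere.
   Counts ("apples") := Counts ("apples") + 1;
   Put_Line (Integer'Image (Counts ("apples")));
end Container_Demo;
--  All of the map's storage is reclaimed when Counts goes out of scope,
--  with no programmer-visible memory management.
```

Every element the map holds is owned by the container, so the "memory
management" the quoted discussion worries about simply never appears in
the client code.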
GC could only possibly have an advantage when one uses allocated objects,
but the use of "access" and allocated objects should be a last resort in
modern Ada -- to be used only in the rare case when the performance of the
built-in data structures is inadequate.

Anyway, GC is realistically incompatible with modern Ada. Since Ada 95,
finalization of objects has been defined to happen at a specified time
(depending on how and when they are declared). For allocated objects,
that's when the access type goes away (unless the object is explicitly
freed with Unchecked_Deallocation or Unchecked_Deallocate_Subpool). Since
most access types are declared at library level, any allocated object with
a controlled component (which should be true of most ADTs) cannot be
collected until the program ends. That pretty much makes GC useless (at a
minimum, it makes it of very restricted utility).

I've tried on several occasions to change those rules to allow
"unreachable" objects to be finalized sooner, but those proposals have
never gotten any traction. It's a chicken-and-egg problem: hardly anyone
wants to fix the language unless there is serious interest in GC, but
there cannot be serious interest in GC because the language doesn't really
allow it.

> The economics of computing have changed so that programmer time is more
> valuable than machine time by a far larger ratio than before.

Quite possibly you're right, in which case there is no need for me or for
Ada (at least not the Ada I know).

...

>> People who don't know better... are probably not trying to integrate
>> proof into the compiler. :-)
>
> What exactly is your compiler doing? The compilers I know of with
> serious proof automation use external SMT solvers. Admittedly these
> tend to be written in C++ ;-)

Not much yet. But I don't trust code that I can't fix; all of the foreign
code we've integrated over the years has caused trouble and ended up
needing to be replaced. I'd replace the whole OS with Ada if I could
afford to do it. :-)
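Returning to the finalization point above, here is a minimal sketch (the
type and unit names are invented) of why collection timing matters: a
controlled object reached through an access type is finalized at a
language-defined point, such as an explicit Unchecked_Deallocation, never
at some collector-chosen moment.

```ada
with Ada.Finalization;
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Unchecked_Deallocation;

procedure Finalize_Demo is
   --  A controlled type whose Finalize has a visible effect.
   type Tracked is new Ada.Finalization.Controlled with null record;

   overriding procedure Finalize (Obj : in out Tracked) is
   begin
      Put_Line ("Finalized");
   end Finalize;

   type Tracked_Access is access Tracked;
   procedure Free is new
     Ada.Unchecked_Deallocation (Tracked, Tracked_Access);

   P : Tracked_Access := new Tracked;
begin
   --  Even if the allocated object became unreachable here, the language
   --  pins its finalization to a defined point: an explicit Free, or the
   --  point where the access type itself goes away.  A collector that
   --  reclaimed it any earlier would run Finalize at the wrong time.
   Free (P);  --  "Finalized" is printed exactly here; P is set to null
end Finalize_Demo;
```

This is the rule that makes "unreachable, so collect it now" conflict
with the finalization semantics the rest of the language depends on.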
In any event, I think the proof stuff has to be an integral part of the
compiler, because it seriously affects the code that gets generated. (If,
after all, you can prove that F(X) = 10 is True, you can replace F(X) with
10 wherever that proof applies. That can be a huge win at runtime,
especially in things like the preconditions of Ada.)

                                Randy.