From: "Alejandro R. Mosteo"
Newsgroups: comp.lang.ada
Subject: Re: Killing software and certification
Date: Wed, 28 Mar 2018 19:06:14 +0200

On 28/03/18 16:23, Dmitry A. Kazakov wrote:
> On 28/03/2018 15:54, Alejandro R. Mosteo wrote:
>> On 27/03/18 21:25, Dmitry A. Kazakov wrote:
>>> On 2018-03-27 20:32, Alejandro R. Mosteo wrote:
>
>>>> I'm not in the industry, and I'd be surprised if unverified
>>>> software were allowed to run in civilian environments where
>>>> failures basically amount to a very dangerous situation.
>>>
>>> Why should it surprise you? How are you going to verify it?
>>> Black-box testing is impossible. White-box testing isn't possible
>>> either, assuming any NN is involved. There is nothing to prove.
>>
>> I can think of a spectrum of regulatory/practical positions between
>> "nothing can be done, so everyone brace" and "this won't ever fail,
>> under any [un]conceivable situation". It's the apparent nonchalance
>> of the general public that coexists with these testing cars, the
>> brashness/recklessness of those expecting to get rich with them, and
>> the apparent willingness of politicians that I find fascinating
>> (those are the first that come to mind).
>>
>> I can understand the appeal for politicians of being the first city
>> with a working fleet (or whatever contributions they're getting to
>> favor live testing). As a technophile, I want autonomous cars to
>> become reality, so I can understand that too. As a researcher
>> familiar with the algorithms involved, and with the kind of
>> C/C++/Python heaps that implement them, I get chills thinking that a
>> car can be on the highway with a semi-awake safety driver as the
>> only fallback in a split-second situation.
>
> As much as I hate C and its ilk, the problem is much deeper, I'm
> afraid. Even if it were 100% SPARK Ada, a self-teaching system [I
> don't consider here simple cases where one can prove convergence of
> the training method to a defined goal] stands in the same relation to
> it as a program does to the CPU.
>
> You can verify and certify the CPU as much as you wish. That would
> say nothing about the program running on it. The weights of a NN are
> a "program". The system core, written in whatever language, is a
> "CPU".
> I have not the slightest idea of an approach to defining correctness
> of the trained weights, even less of proving it. It is a fundamental
> challenge we will have to deal with if this kind of "emergent
> programming" is to take hold.
>
> A kind of biblical disaster where you, like God did, watch in dismay
> what your creatures do after you granted them "free will". Would you
> drown them all? (:-))

This line of thinking is often brought up by a colleague who works on
"classical" solutions to problems that are nowadays trendy in deep
learning circles. When feeling optimistic, I see it like the complexity
of the simplex method: linear on average, but exponential in the worst
case. I tend to think that, the same way you could have a watchdog for
a stray simplex run, you could have some fallback for a DNN behaving
badly (if you can detect it in the first place :P); see the toy sketch
at the end of this post. In other words: if DNNs prove to be as useful
as they promise, a way will be found to statistically cover the desired
degree of reliability. Maybe this is another field of research in the
making.

I'm also told that "we solve X with a DNN" is no longer acceptable at
the main conferences, and that you now also have to provide some
insight into the DNN's workings. But this is hearsay that I pass along.
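To make the watchdog idea a bit more concrete, here is a toy Ada sketch
(everything in it, names, bounds and checks alike, is invented for
illustration; a real monitor would need a sound definition of
"plausible", which is precisely the hard part):

with Ada.Text_IO; use Ada.Text_IO;

procedure Watchdog_Sketch is

   subtype Steering_Angle is Float range -30.0 .. 30.0;

   --  Hypothetical bound on how much the command may change per cycle.
   Max_Step : constant Float := 2.0;

   Last_Safe : Steering_Angle := 0.0;

   --  Stand-in for the learned controller's output. Returns a raw
   --  Float because we cannot trust it to respect the subtype.
   function DNN_Command (Cycle : Positive) return Float is
   begin
      case Cycle is
         when 1      => return 1.5;   --  plausible
         when 2      => return 45.0;  --  out of range: trips the watchdog
         when others => return 0.5;   --  plausible again
      end case;
   end DNN_Command;

   --  Plausibility envelope: a range check plus a rate-of-change check.
   function Plausible (Cmd : Float) return Boolean is
   begin
      return Cmd in Steering_Angle
        and then abs (Cmd - Last_Safe) <= Max_Step;
   end Plausible;

   Cmd : Float;

begin
   for Cycle in 1 .. 3 loop
      Cmd := DNN_Command (Cycle);
      if Plausible (Cmd) then
         Last_Safe := Cmd;
         Put_Line ("cycle" & Integer'Image (Cycle)
                   & ": accepted" & Float'Image (Cmd));
      else
         --  Fallback: hold the last known-safe command (a real system
         --  would rather hand over to a certified degraded-mode
         --  controller).
         Put_Line ("cycle" & Integer'Image (Cycle)
                   & ": rejected" & Float'Image (Cmd)
                   & ", holding" & Float'Image (Last_Safe));
      end if;
   end loop;
end Watchdog_Sketch;

Of course this only moves the problem into writing Plausible: for an
actuator command a physical envelope plus a rate limit may be checkable,
while for a perception stage it is anyone's guess.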