From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-0.3 required=5.0 tests=BAYES_00,
	REPLYTO_WITHOUT_TO_CC autolearn=no autolearn_force=no version=3.4.4
Path: 
 border2.nntp.dca1.giganews.com!nntp.giganews.com!news.glorb.com!aioe.org!.POSTED!not-for-mail
From: "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de>
Newsgroups: comp.lang.ada
Subject: Re: Languages don't  matter.  A mathematical refutation
Date: Fri, 27 Mar 2015 18:36:40 +0100
Organization: cbb software GmbH
Message-ID: <1ht5q4lxmtf3p.mntbczbpti5n.dlg@40tude.net>
References: <b3592526-729a-4198-a630-696542b3f3be@googlegroups.com>
 <59ac455c-72f6-43e2-8a79-efc0f3e16d9a@googlegroups.com>
 <19qfgu5pjszm5.s5y5u8r0zx8k.dlg@40tude.net>
 <161a69af-a392-4214-bd92-0e20e7522cca@googlegroups.com>
Reply-To: mailbox@dmitry-kazakov.de
NNTP-Posting-Host: w2sqUGEBZqsVBYNL7Ky3Kg.user.speranza.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
X-Complaints-To: abuse@aioe.org
User-Agent: 40tude_Dialog/2.0.15.1
X-Notice: Filtered by postfilter v. 0.8.2
Xref: number.nntp.giganews.com comp.lang.ada:192601
Date: 2015-03-27T18:36:40+01:00
List-Id: <comp.lang.ada>

On Fri, 27 Mar 2015 04:25:21 -0700 (PDT), Jean François Martinez wrote:

> On Thursday, March 26, 2015 at 4:21:45 PM UTC+1, Dmitry A. Kazakov wrote:
>> On Thu, 26 Mar 2015 06:43:01 -0700 (PDT), Maciej Sobczak wrote:
>> 
>>> Unless you prove that this had no influence on the results in question, I
>>> refute the whole of your mathematical proof.
>> 
>> Yep, when *mathematical* statistics is used, then the burden of showing it
>> applicable lies on the author's shoulders. In particular, the software
>> metric used for some SW design must be shown to be a random variable. The
>> elementary outcomes must be presented, their independence explained, etc.
>> 
>>> But more seriously, it would be interesting to flip languages every year
>>> and see whether the advantages of Ada hold in terms of better results. Or
>>> run two classes in parallel (with random assignments of students to each
>>> class) and compare results over the years.
>> 
>> I doubt the process of random selection of a development team from a pool of
>> "indistinguishable" teams would be a good basis for a statistical model of
>> software development.
>> 
>> The bottom line: statistics were applied incorrectly and the results
>> obtained carry no predictive power.
> 
> Really?  Textbooks on statistics are chock-full of similar examples and
> exercises, like comparing the proportion of defective products at
> manufacturing plants A and B, then checking the null hypothesis before
> deciding that A's products are better.

It is not similar. A production process is a physical one, and we know that
its behavior is stochastic because the underlying laws of physics are (e.g.
the laws of thermodynamics are statistical).
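
For reference, the textbook test mentioned in the quote can be sketched as a
pooled two-proportion z-test. This is only an illustration of that exercise,
assuming independent draws from each plant; the function name and the defect
counts are hypothetical and do not come from any study discussed here.

```python
import math

def two_proportion_z_test(defects_a, n_a, defects_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    Assumes each inspected item is an independent Bernoulli trial,
    which is exactly the assumption under dispute for SW teams.
    """
    p_a = defects_a / n_a
    p_b = defects_b / n_b
    # Pooled defect rate under the null hypothesis p_a == p_b
    pooled = (defects_a + defects_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Hypothetical counts: 30 defects in 1000 items at A, 50 in 1000 at B
z, p = two_proportion_z_test(30, 1000, 50, 1000)
print(f"z = {z:.3f}, p = {p:.4f}")
```

The arithmetic is trivial; the whole argument is about whether the
independence assumption in the docstring holds.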

A development team is not governed by these laws, because decision making is
not random and the system as a whole has memory (people learn in the process).

> In our case we have a random variable and that is group quality.

How is this random? A given group has a given quality. It is nowhere random:
if you measure the quality repeatedly you will get the same quality.

> And it is common in this field to assume that your population is a sample
> of an infinitely-sized conceptual population.

Which is a very different model. If you randomly select a team, the
randomness is not in the team but in the selection process. For this to be
correct you should show that the selected teams are equivalent in terms of
the given SW metrics. That would be quite difficult to do because students have
different exposure to programming, everything is changing, and you cannot
"recycle" teams.
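
The distinction between the two models can be shown with a toy sketch. The
team names and quality values below are made up purely for illustration, and
the model assumes what is argued above: each team's quality is a fixed number.

```python
import random
import statistics

# Hypothetical fixed quality per team (assumed values, not measured data)
team_quality = {"team_1": 0.62, "team_2": 0.71, "team_3": 0.55, "team_4": 0.80}

def measure(team):
    """Measuring a given team always yields its fixed quality."""
    return team_quality[team]

def measure_random_team(rng):
    """Pick a team at random first; the only randomness is this choice."""
    team = rng.choice(sorted(team_quality))
    return measure(team)

rng = random.Random(0)  # seeded so the sketch is repeatable

# Model 1: the same team measured repeatedly
repeated = [measure("team_1") for _ in range(1000)]
# Model 2: a randomly selected team measured each time
sampled = [measure_random_team(rng) for _ in range(1000)]

print("variance, fixed team:   ", statistics.pvariance(repeated))
print("variance, random choice:", statistics.pvariance(sampled))
```

In the first case the variance is zero; in the second all of it comes from
the selection step, not from the teams.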

Now, suppose you managed to do this. What would be the conclusion? You would
have shown that a randomly selected team of uneducated pupils produces
better SW metrics with Ada than with C when faced with a given educational
objective. So what? Why should this imply anything for SW processes as
performed by professionals?

Again, we know Ada is better for that. But this study would not show any
causality or statistically proven correlation between the two. It is nothing
more than a "by-the-way" argument from the statistical POV.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de