From mboxrd@z Thu Jan  1 00:00:00 1970
From: ok@goanna.cs.rmit.EDU.AU (Richard A. O'Keefe)
Subject: Re: Ada 95 Books for Undergraduate Teaching
Date: 1996/06/03
Message-ID: <4otsnk$qs2@goanna.cs.rmit.EDU.AU>
X-Deja-AN: 158175633
references: <00001a73+00002d54@msn.com>
organization: Comp Sci, RMIT, Melbourne, Australia
nntp-posting-user: ok
newsgroups: comp.lang.ada
List-Id: <comp.lang.ada>


KMays@msn.com (Kenneth Mays) writes:

>The BEST books for CS1 and CS2, 
>in my opinion, are these two books from Dr. Michael Feldman
>and one from Elliot Koffman (one of the best Computer Science 
>educators in America).

Having read "Problem Solving and Program Design in C", 2nd edition,
by Hanly and Koffman, I can't quite bring myself to swallow this
"Elliot Koffman is one of the best CS educators in America" biznai.

*Either*
    all of the bad stuff in that book is due to Hanly,
    in which case Koffman is culpable for lending his name to it
    without doing something to ensure the quality of the product
*or*
    Koffman is not a good educator, because a good educator is
    one who teaches people how to do things _well_.

I know this is taking us a bit far from Ada, but the issue of textbooks
is one which I feel extremely strongly about.

Here are some gems:

p639	telling people to use exit(1), when the C standard assigns no
	meaning to 1 in this context.
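
A minimal sketch of the portable alternative: the C standard gives a
defined meaning only to 0, EXIT_SUCCESS and EXIT_FAILURE as exit
statuses.  The helper name status_for is hypothetical, chosen for this
illustration only.

```c
#include <stdlib.h>

/* Hypothetical helper: map a success flag to a portable exit status.
 * Only 0, EXIT_SUCCESS and EXIT_FAILURE have standard-defined meanings;
 * what any other value (such as 1) means is up to the implementation. */
int status_for(int ok)
{
    return ok ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

A program would then finish with exit(status_for(ok)), or return the
same value from main().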

p647	telling people that argv[argc] is always an empty string,
	when the C standard says it is a null pointer
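
As a sketch of what that guarantee permits, the function below walks
an argument vector until it reaches the terminating null pointer.  The
name count_args is hypothetical; for the argv passed to main() the
result should equal argc.

```c
#include <stddef.h>

/* Count arguments by walking argv until the terminating null pointer.
 * Comparing argv[n] against "" instead would be wrong: the terminator
 * is a null pointer, not a pointer to an empty string. */
int count_args(char *const argv[])
{
    int n = 0;

    while (argv[n] != NULL)
        n++;
    return n;
}
```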

p537	telling people to use type names of the form [a-z][a-z_]*_t,
	when the C standard says that such identifiers are reserved.

p586	telling people that a C text file always has an <eof> character
	physically stored in it.

p587	telling people to use this loop to read integers:

	for  (status = scanf("%d", &num);
	      status != EOF;
	      status = scanf("%d", &num))
	    process(num);

	instead of the somewhat less buggy

	    while (1 == scanf("%d", &num))
		process(num);

	(Hint: given the input 'x' the first version will go into an
	infinite loop.  Note:  this 'for-with-duplicated-code' style
	is used all through the book.)  Oddly enough, p359 gets this
	right.
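
The same idea as a self-contained function, sketched with fscanf() on
an arbitrary stream so that it can be exercised without a terminal.
The name sum_ints and the choice to add the numbers up are assumptions
made for the illustration; the point is the loop test.

```c
#include <stdio.h>

/* Sum the integers at the front of a stream, stopping at the first
 * token that is not an integer or at end of input.  The loop tests for
 * the one result that means "a number was read" (1) rather than for
 * "not EOF", so stray input such as 'x' ends the loop instead of
 * leaving fscanf() returning 0 forever. */
long sum_ints(FILE *in)
{
    long total = 0;
    int num;

    while (fscanf(in, "%d", &num) == 1)
        total += num;
    return total;
}
```

Called as sum_ints(stdin), this behaves like the while loop above with
process() replaced by an addition.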

p591	telling people to use this loop for reading a file:

	char ch;		/* this is the blunder */
	for  (ch = getc(inp);  ch != EOF;  ch = getc(inp))
	    putc(ch, outp);
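
A sketch of the usual repair, assuming the goal is a byte-for-byte
copy: getc() returns an int holding either an unsigned char value or
EOF, so the variable receiving it must be an int.  With a char, a byte
of value 0xFF can be mistaken for EOF where char is signed, and EOF can
never match where char is unsigned.  The name copy_stream and the
returned byte count are additions for this illustration.

```c
#include <stdio.h>

/* Copy inp to outp one byte at a time; return the number of bytes
 * copied.  ch is an int so that it can represent every unsigned char
 * value as well as the distinct value EOF. */
long copy_stream(FILE *inp, FILE *outp)
{
    long n = 0;
    int ch;

    while ((ch = getc(inp)) != EOF) {
        putc(ch, outp);
        n++;
    }
    return n;
}
```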

p601	telling people that a "database is a vast electronic file of
	information that can be quickly searched using subject headings
	or keywords".  Databases need not be vast, electronic, files,
	or quickly searchable, nor need they be restricted to text that
	has subject headings or keywords.  What's more, this definition
	does not apply to the example it goes with!

ch12	This chapter is really quite muddled about the binary file concept.
	In a book at this level, it might be better to omit it entirely
	than to present it in a muddled way.

p494	telling people that "if a recursive function's implementation is
	flawed, tracing its execution is an essential part of identifying
	the error."  There is, of course, nothing special about recursive
	functions, nor is tracing the execution of any function *invariably*
	an essential part of identifying an error in it.  (The assert()
	macro is _listed_ on page AP3, but I can find no other mention of
	it in the index or the text, and this _is_ an extremely important
	debugging tool.  Come to think of it, how do you get away with
	claiming to talk about "Program DESIGN" without talking about
	assertions anywhere?)
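
For readers who have not met assert(), here is a minimal sketch of the
kind of use meant: a function whose preconditions are written down as
assertions rather than left implicit.  The function mean_of is a
hypothetical example, not taken from the book.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: mean of n values, with its preconditions
 * stated as assertions.  If a caller violates one, the program stops
 * at the violated condition instead of computing garbage. */
double mean_of(const double *x, size_t n)
{
    assert(x != NULL);
    assert(n > 0);          /* an empty sample has no mean */

    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum / (double)n;
}
```

Compiling with NDEBUG defined removes the checks, so they cost nothing
in a release build.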

AP2	I suppose listing the ASCII and EBCDIC character sets has some
	vague relevance to C (though the book fails to identify just
	_which_ of the many variants of EBCDIC it presents), but what is
	the point of listing the 6-bit CDC character set?  C requires
	two cases of letters, and the CDC character set has only one.
	This looks like a carry-over from a Pascal book, and carry-overs
	from Pascal books always give me that sinking feeling.

AP4	The description of clock() is wrong.

p365	As it happens, this page was the very first I saw when I opened
	the book.  I have a little theory that if you want to estimate
	the quality of a textbook, sampling will do very well, because
	a sloppy author is sloppy everywhere about everything.  Here is
	the relevant part of the code:

	#define MAX_ITEM ...
	double x[MAX_ITEM];
	double sum, mean, sum_sqr, st_dev;
	int i;
	...
	/* Computes the sum and the sum of the squares of all data */
	sum = 0;
	sum_sqr = 0;
	for  (i = 0; i < MAX_ITEM; ++i) {
	    sum += x[i];
	    sum_sqr += x[i]*x[i];
	}
	/* Computes and prints the mean and standard deviation */
	mean = sum/MAX_ITEM;
	st_dev = sqrt(sum_sqr / MAX_ITEM - mean * mean);
	...

	My first-year teachers would have rebuked me for code like that;
	my second-year teachers would have failed me.  For the benefit
	of people with weak or non-existent statistics or numerical
	analysis backgrounds:

	(a) There are two formulas for standard deviation.  In this
	    example, we are computing the standard deviation from an
	    *unknown* true mean, which has been *estimated* by
	    computing the sample mean.  That means that the right
	    formula is the one that divides by N-1, not the one that
	    divides by N.  This is the point my first-year statistics
	    teachers would have rebuked me for.

	(b) The correct formula is
	    mean = sum(x)/count(x)
	    variance = sum((x-mean)**2)/(count(x)-1)
	    std_dev = sqrt(variance)
	    It is true that sum((x-mean)**2) *can* be rearranged, and
	    for mathematical purposes it is often convenient to do so.
	    But the rearrangement has very nasty *floating-point*
	    properties (it can, for example, give you a negative variance).

	Here is the corrected code:

	sum = 0.0;
	for (i = 0; i < MAX_ITEM; i++)
	    sum += x[i];
	mean = sum/MAX_ITEM;

	sum_sqr = 0.0;
	for (i = 0; i < MAX_ITEM; i++)
	    sum_sqr += pow(x[i]-mean, 2.0);
	st_dev = sqrt(sum_sqr/(MAX_ITEM - 1));
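
To make point (b) concrete, here is a sketch that puts the two
approaches side by side as functions.  The function names are
hypothetical, both divide by n-1, and both return the variance so that
the comparison does not depend on sqrt(); st_dev is the square root of
the result.  Both assume n >= 2.

```c
#include <stddef.h>

/* Two-pass sample variance: compute the mean first, then sum the
 * squared deviations from it. */
double variance_two_pass(const double *x, size_t n)
{
    double sum = 0.0, sum_sqr = 0.0, mean, d;
    size_t i;

    for (i = 0; i < n; i++)
        sum += x[i];
    mean = sum / (double)n;

    for (i = 0; i < n; i++) {
        d = x[i] - mean;
        sum_sqr += d * d;
    }
    return sum_sqr / (double)(n - 1);
}

/* One-pass rearrangement, shown only for comparison.  It subtracts two
 * large, nearly equal numbers, so the result can be badly wrong, or
 * even negative, when the mean is large compared with the spread. */
double variance_one_pass(const double *x, size_t n)
{
    double sum = 0.0, sum_sqr = 0.0;
    size_t i;

    for (i = 0; i < n; i++) {
        sum += x[i];
        sum_sqr += x[i] * x[i];
    }
    return (sum_sqr - sum * sum / (double)n) / (double)(n - 1);
}
```

Try both on 1000000004, 1000000007, 1000000013, 1000000016.  The exact
sample variance is 30, and the two-pass version returns it.  In the
one-pass version the sum of squares is about 4e18, where adjacent IEEE
doubles are 512 apart, so an answer of size 30 is lost in the rounding.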

	Would using better code spoil the pedagogical points Figure 8.3
	is supposed to be making?  Not at all.  One is forced to the
	conclusion that the last author to touch this material didn't
	understand standard deviations or floating point arithmetic.
	There is no shame in not understanding these things; one cannot
	be expert in every area of computing.  But one *would* expect
	"one of the best Computer Science educators in America" to steer
	clear of things he doesn't thoroughly understand.

	For another example of floating point that left me shaken, look
	at p339.  (You will also notice a misplaced \n in the printf()
	on that page.) 

p197	Question 6 "Implement the following flow diagram using a nested
	if structure" (a) is initially confusing because the flaw chart
	in question is not visible at this point (it's on the next page,
	and it's confusing there, because it isn't captioned) and (b)
	drags in flaw charts, which at this late date we should not be
	using in a CS 1 book.  (And in fact flaw charts are introduced
	on p152 _only_ to describe if statements.  Why use them at all?)

p517	I am sick of the towers of Hanoi as an example of recursion.
	(Couldn't they at least use the Reve's puzzle?)  In fact, all
	of the examples of recursion I looked at in this book were
	unconvincing.

p524	"2.  Which is generally more efficient, recursion or iteration?"
	I wonder whether anyone checked the punctuation in this book?
	I note that this question also appears in Feldman & Koffman,
	and therefore suspect that it is a carry-over from Koffman's
	Pascal book.  It is a bad question, because it suggests that
	there is an answer.  (The authors evidently imagine that there
	is only one way to implement recursion.  That is not true.)
	It is *bad* education to try to close students' minds this way.
	The only right answer is "it all depends on the algorithm being
	implemented, the skill of the programmer, the quality of and
	selected optimisations for the compiler, the target CPU, the
	main memory and cache structure, &c &c &c."
	
Of course, the _first_ warning sign for the book was its heavy use of
colour.  Two colours, actually, black and a rather horrible dried-blood
reddish-brown.  And if you look carefully, the two colours are not always
in registration.  My experience with CS 1 textbooks so far has been that
heavy use of colour is a sure sign that there are problems with the
_content_.  Colour _is_ used to good effect in the listings, but given the
two-inch margins, sticking headings out an inch would have been a better
way to distinguish them than putting them into white-on-yuck ellipses.
Calling "{}" 'brackets' instead of 'braces' wasn't a good sign either (p516),
given that C uses _both_ brackets and braces and uses them differently.


So, here we have a book which
    - is consistently a bit sloppy
    - contains serious factual errors about C
    - does floating point calculations poorly
but it was written by
    - someone who "has taught software engineering seminars"
    - and someone who "is one of the country's foremost computer science
      educators"
and was *reviewed* by
    - EIGHT people from EIGHT US universities, including one I have
      reason to respect highly.


What's going on?
Why is the review process failing us?
How do you get a high reputation with books like this?

-- 
Fifty years of programming language research, and we end up with C++ ???
Richard A. O'Keefe; http://www.cs.rmit.edu.au/~ok; RMIT Comp.Sci.