From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.4
X-Google-Thread: 103376,e219d94b946dfc26
X-Google-Attributes: gid103376,public
X-Google-Language: ENGLISH,ASCII-7-bit
Path: 
 g2news2.google.com!news3.google.com!border1.nntp.dca.giganews.com!nntp.giganews.com!newsfeed00.sul.t-online.de!t-online.de!news.karotte.org!uucp.gnuu.de!newsfeed.arcor.de!newsspool4.arcor-online.net!news.arcor.de.POSTED!not-for-mail
Newsgroups: comp.lang.ada
Subject: Re: Ada.Command_Line and wildcards
From: Georg Bauhaus <bauhaus@futureapps.de>
In-Reply-To: <2wy7mn8pyd.fsf@hod.lan.m-e-leypold.de>
References: <45dcaed8_6@news.bluewin.ch>
	 <C2026327.9876A%yaldnif.w@blueyonder.co.uk>
	 <1172132169.423514.271890@s48g2000cws.googlegroups.com>
	 <545bgvF1ttrphU1@mid.individual.net>
	 <1495406.QZvfpqijrQ@linux1.krischik.com>
	 <6dy7mn3hhu.fsf@hod.lan.m-e-leypold.de>
	 <1172328891.5496.62.camel@localhost.localdomain>
	 <2wy7mn8pyd.fsf@hod.lan.m-e-leypold.de>
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Organization: #
Message-ID: <1172348782.20918.131.camel@localhost.localdomain>
Mime-Version: 1.0
X-Mailer: Evolution 2.8.1 
Date: Sat, 24 Feb 2007 21:26:22 +0100
NNTP-Posting-Date: 24 Feb 2007 21:25:34 CET
NNTP-Posting-Host: 4231999a.newsspool2.arcor-online.net
X-Trace: 
 DXC=R]AEP==B<^PAX0F2i><W:SA9EHlD;3YcR4Fo<]lROoRQ8kF<OcfhCO[=7VIJ:OSk\ZPCY\c7>ejVXiNn_:inmG@ZQ8_HkBMVlOY
X-Complaints-To: usenet-abuse@arcor.de
Xref: g2news2.google.com comp.lang.ada:9502
Date: 2007-02-24T21:25:34+01:00
List-Id: <comp.lang.ada>

On Sat, 2007-02-24 at 17:45 +0100, Markus E Leypold wrote:

This discussion _is_ (also) about the quirks of the Unix shells,
and about their consequences...

> > One criterion is the number of wildcard expansion surprises.
> > Not having to mark patterns as patterns (untyped, no syntax)
> > creates an immensely flexible and powerful ... mess.
> >
> > $ echo *.ads   # argument text is a pattern
> > *.ads          #   output text is a pattern
> >
> > $ echo *.adb   # argument text is a pattern
> > main.adb       #   output text is a file name
> >
> > $ ls *.ads
> > ls: *.ads: No such file or directory
> >
> > $ ls *.adb
> > main.adb
> >
> > (On the surface, ls(1) and echo(1) aren't consistently
> > interpreting *.ads, one is reporting an error, the other
> > is reproducing input. 
 [...]
>  Since this has nothing to do with " ls(1) and
> echo(1) aren't consistently interpreting" but with exactly one quirk
> the expansion (in the shell!) has:

It is important to understand the premise here. I wrote: "On the
surface, ls(1) and echo(1) aren't consistently..." *On the surface*,
i.e. at the visible user interface of the command! Understanding why
this apparent inconsistency is not a real one forces the Unix
user to learn about what is happening under the hood. It is
always good to know something about the internal workings
of your tools. But still, the same text (a pattern) can trigger
different behavior. The behavior *appears* to
depend on context (the command name, among other things).
It doesn't matter that, technically, Unix/libc/sh is always
consistent with its own "hidden" workings.

(BTW, if I write

cmd=echo
$cmd *.ad? | while read fname
 do case $fname in
   *.ads) echo "'"$fname"'" is spec;;
   *.adb) echo "'"$fname"'" is body;;
 esac
done

I create yet another quirky situation, and Unix can tell me, again,
"What makes you think that you can iterate over the results of
echo(1) as if each name were on a line of its own, as if you had
been using ls(1) (uh, without -x)?" Unix is right, technically.
But from a user's point of view, how is this all consistent?
In Makefiles, you list files side by side, and make(1) will deal
with them one by one (at least by the user-visible AS-IF rules).
When I write
cmd=ls
above, I get what I might have expected. Otherwise I have to know why
and how the results of "echo *.adb" all end up in $fname together as
one string---which then happens to match *.adb) . This _is_ all logical,
but I think there are better design choices. Here, it all depends
on things being listed top-down or left-right, and nothing obvious
tells you what to expect. Here, the programmer has to think about very
many aspects *without* being able to express them. They are all implicit
in the rules of sh, exec*, the assignment conventions of the built-in
read function, etc. ...)
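
For comparison, here is a minimal sketch that does not depend on how
echo(1) or ls(1) lay out their output, assuming a POSIX sh. It iterates
over the words produced by the pattern itself. The function name
classify_units is made up for illustration, and the explicit existence
test is my way of dealing with the unmatched-pattern case, not the only
one.

```sh
# classify_units: hypothetical helper, not part of any standard tool.
# Loops over the expansion of *.ad? directly instead of parsing the
# output of echo(1) or ls(1).
classify_units() {
    for fname in *.ad?; do
        # An unmatched pattern is passed through literally, so skip
        # any word that does not name an existing file.
        [ -e "$fname" ] || continue
        case $fname in
            *.ads) printf "'%s' is spec\n" "$fname" ;;
            *.adb) printf "'%s' is body\n" "$fname" ;;
        esac
    done
}
```

Each file name arrives as one word, even when it contains blanks, so
nothing hinges on top-down versus left-right listing. The programmer
still has to remember the existence test, though.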

Context dependence, echo vs. ls, may be technically interesting but
it sure makes proper sh scripting less uniform and more difficult
and error prone.


BTW, pertinent to the OP's problem, to find out what is actually
passed to some executable program on some OS running some shell,
a nice tool like Unix's strace(1) will do:

$ strace /bin/echo *.ads
execve("/bin/echo", ["/bin/echo", "*.ads"], [/* 34 vars */]) = 0
...

$ strace /bin/echo *.adb
execve("/bin/echo", ["/bin/echo", "main.adb"], [/* 34 vars */]) = 0
...
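
Where strace(1) is not at hand, a rough substitute is a small shell
function that prints what it was given. This is only a sketch, assuming
a POSIX sh; show_args is a made-up name. It shows the words after the
shell's expansion, which is what a program started from that shell
would find in argv, but it does not trace a real execve call.

```sh
# show_args: hypothetical helper. Prints the number of arguments and
# then each argument on its own line, bracketed so that embedded
# blanks stay visible.
show_args() {
    printf 'argc=%d\n' "$#"
    for arg in "$@"; do
        printf '<%s>\n' "$arg"
    done
}
```

In a directory holding only main.adb, "show_args *.ads" reports one
argument, <*.ads>, and "show_args *.adb" reports one argument,
<main.adb>, in line with the two traces above.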


>  2. If there is no file corresponding to the pattern, the result of the expansion
>     is not '', but rather the pattern itself.

This is what I have been trying to say. The one quirk you concede
is a crucial example. It is a design choice. It has consequences.
It is very much like a null pointer in a pointer-heavy language:
Pointing is flexible and powerful. Terminate arrays using nulls...
You can always try to make sure you don't accidentally
run into a null (pointer)... By analogy, you can always try to
make sure you are properly quoting on the command line for the
selected combination of shell, shell functions, and program so
that there won't be pattern surprises.


>  it might be advantageous to understand a thing before
> asserting "it's all done wrong".

It is equally advantageous to understand what is being criticized
before addressing a different subject. ---I agree that "all wrong"
is probably too much when comparing sh to JCL, say, which was
another design choice at the time, I think.


>  Point (2): If there is no file corresponding to the
> pattern, the result of the expansion is not '', but rather the pattern
> itself.

Exactly. Design choice. A compromise that is bound to have
consequences.


>  good shell programmers know how
> to handle that case).

See?

What if good shell programmers could spend their time working
on the problem instead of avoiding shell mishaps? (Reminds me of:
Good C programmers don't need an Ada style compiler. Not surprisingly,
sh and C and other languages with quirky syntax have roots in
Bell Labs :-). I had tried to avoid saying so. For example, SNOBOL4,
a jewel in the treasure trove of PLs, has the "white space operator":

* read lines until user types q(uit)

again   input Any('qQ')		:s(end)f(again)
end

Dewar et al. have replaced the "white space operator" with ?
(and extended matching opportunities) in their implementation, SPITBOL.
This was explicitly pointed out as an important change by Griswold at
a PL conference.

again   input ? Any('qQ')	:s(end)f(again)
end

Good SNOBOL4 programmers know how to handle normal pattern matching
without '?' ...?


> > $ foo=`echo .[a-z]*` ; some_cmd $foo ...
> 
> Well -- backtick is "out" anyway for the more complicated
> cases. Especially when writing scripts I suggest to use $(...) --
> which gets the nesting right.

Using $(...) in place of `...` (which I do, but we were talking about
the original Unix design choices) doesn't remove the effects outlined.
When there is no hidden file in the current directory that matches
the pattern, the next command will read ".[a-z]*" in argv. This is
to be expected in Unix which by design assumes that

- you know well how the Unix shell normally works,
- you have thought of this possibility,
- you know how to add a check that "${foo}" != ".[a-z]*",
- you know another way, as TIMTOWTDI
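
The check from the list can be sketched as follows, assuming a POSIX
sh. The function name list_hidden is made up for illustration. It
expands the pattern into the positional parameters and treats a single
word that still equals the pattern, and names no existing file, as
"nothing matched".

```sh
# list_hidden: hypothetical helper. Prints the hidden files matching
# .[a-z]* one per line, or returns 1 when the pattern matched nothing.
list_hidden() {
    # Expand the pattern into this function's positional parameters.
    set -- .[a-z]*
    # An unmatched pattern comes back as the pattern text itself.
    # The -e test covers the unlikely file literally named '.[a-z]*'.
    if [ "$#" -eq 1 ] && [ "$1" = '.[a-z]*' ] && [ ! -e "$1" ]; then
        return 1
    fi
    printf '%s\n' "$@"
}
```

A caller can then write "if list_hidden >/dev/null; then ... fi"
before handing anything to some_cmd. None of this is enforced by the
shell; it is one more thing the script writer has to know.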


> > Caveat scriptor, the computer isn't programmed to help us
> > here with type checking, OOD, etc..
> 
> Just use the scheme shell if you don't like Bourne shell. :-) 

Aren't we talking about the original design choices in Unix
shells, i.e. Bourne Shell and Csh? Not about what we could do to
improve the situation?  (Like, replace bash/auto*/fixincludes 
in GCC? :-)

> > That said, I'm looking forward to running Plan 9 again, on a
> > virtual machine :)
> 
> Last time I checked (rather a bit ago that was) it couldn't run on
> vmware or bochs: Has that changed?

Work has been done on Xen + Plan 9.