From mboxrd@z Thu Jan  1 00:00:00 1970
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 109fba,df854b5838c3e14
X-Google-Attributes: gid109fba,public
X-Google-Thread: 1014db,df854b5838c3e14
X-Google-Attributes: gid1014db,public
X-Google-Thread: 10db24,fec75f150a0d78f5
X-Google-Attributes: gid10db24,public
X-Google-Thread: 103376,df854b5838c3e14
X-Google-Attributes: gid103376,public
From: seebs@solutions.solon.com (Peter Seebach)
Subject: Re: ANSI C and POSIX (was Re: C/C++ knocks the crap out of Ada)
Date: 1996/04/09
Message-ID: <4kdlgm$10f@solutions.solon.com>
X-Deja-AN: 146532703
references: <JSA.96Feb16135027@organon.com> <dewar.828936837@schonberg>
 <4kb2j8$an0@solutions.solon.com> <dewar.829011320@schonberg>
organization: Usenet Fact Police (Undercover)
reply-to: seebs@solon.com
newsgroups: comp.lang.ada,comp.lang.c,comp.lang.c++,comp.edu
Date: 1996-04-09T00:00:00+00:00
List-Id: <comp.lang.ada>

In article <dewar.829011320@schonberg>, Robert Dewar <dewar@cs.nyu.edu> wrote:
>Peter said

>"How?  No offense meant, but any code which can be affected by this is flat
>out broken.  POSIX-style read is to be given a pointer to at least nbytes
>of space, for the information read.  Period."

>That's really confusing, the code in question DID give a buffer large
>enough to hold nbytes of data, where nbytes is the number of bytes 
>for "the information read". Maybe I don't understand, but reading the
>above sentence, it sounds like you would be surprised by the Linux
>behavior.

If you don't provide enough space for the number of bytes you request,
you are lying to the system.  I cannot imagine a reason to do this, though
I'm curious.

>Here is the exact case. We declare a buffer of 100 bytes. We read
>1000 bytes from a file whose total length is 68 bytes. On all systems
>that we had experience with other than Linux, this worked fine, the
>first 68 bytes of the buffer are filled, and the remaining 32 bytes
>are unused.

Why are you reading 1000 bytes if you *know* there aren't that many?

Also, how do you propose to *prove* that, between your last check, and
your read, no one has added to the file?  There's no sane strategy
here.
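
A minimal sketch of a read that does not depend on knowing the file's
length in advance. The helper name slurp_bounded and its truncated flag are
hypothetical, not taken from the code under discussion. It only ever asks
read() for the space that is actually left in the buffer, and reports
whether the file turned out to hold more than fits:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: fill buf with up to cap bytes from fd, stopping at
 * end of file. Sets *truncated to 1 if more data was available than fits.
 * Returns the number of bytes stored, or -1 on error. */
ssize_t slurp_bounded(int fd, char *buf, size_t cap, int *truncated)
{
    size_t used = 0;
    char extra;
    ssize_t n;

    assert(buf != NULL && truncated != NULL);
    *truncated = 0;

    while (used < cap) {
        /* Never request more than the space that remains. */
        n = read(fd, buf + used, cap - used);
        if (n == -1) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            return (ssize_t)used;   /* end of file, everything fit */
        used += (size_t)n;
    }

    /* Buffer is full: probe one byte to learn whether the file kept going. */
    do {
        n = read(fd, &extra, 1);
    } while (n == -1 && errno == EINTR);
    if (n == -1)
        return -1;
    if (n == 1)
        *truncated = 1;
    return (ssize_t)used;
}
```

If the file grows between a size check and the read, this version stores
only what fits and sets the flag; it never hands the kernel a length larger
than the buffer.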

But mostly, I can't imagine any reason to do this; if you know there are
no more than N bytes of data that you want, what possible reason is
there to read more than N?

>I am not claiming this is "correct" code in some abstract sense. I
>certainly can't tell that it is wrong from the definitions I have
>of the read function. What I am claiming is that this worked on
>all systems we tried it on, and then failed on Linux. I am not saying
>Linux is wrong here, just that its behavior was surprising!

I'm not surprised at all; I'd not be surprised by any syscall doing bounds
checking on arguments.

What's wrong is that you're lying; you are saying "here's a buffer to read
1000 bytes into, it's large enough" and it's not large enough for 1000
bytes.
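
For illustration, a sketch of the honest version of the call, assuming a
POSIX-style read(). The helper name read_upto is hypothetical. The point is
only that the count passed to read() is the real capacity of the buffer,
not a guess about how much data the file holds:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: read at most cap bytes into buf, retrying if a
 * signal interrupts the call. Returns what read() returns: the number of
 * bytes read, 0 at end of file, or -1 on error. */
ssize_t read_upto(int fd, void *buf, size_t cap)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        n = read(fd, buf, cap);   /* cap must be the true size of buf */
    } while (n == -1 && errno == EINTR);
    return n;
}
```

A caller with a 100-byte buffer would write something like
"char buf[100]; n = read_upto(fd, buf, sizeof buf);". Using sizeof buf
keeps the count and the declaration from drifting apart. A short file just
produces a short return value.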

>The code in question made 100% sure that the data read would never
>exceed the buffer size, and I would have been hard pressed to
>determine that the code was incorrect. 

I'd love to know how you're sure of this in a multitasking environment.

>I am not sure that POSIX is relevant here, almost none of the systems on
>which we ran claimed POSIX compliance. Peter, can you post the POSIX
>wording on read? I don't have it at hand. Does it in fact make it
>clear that the Linux behavior is correct and that the program was
>wrong?

I don't have it at hand either; I can say that the basic statement made
is that it reads at most nbytes bytes from the file into the buffer.  I
don't think the issue is explicitly addressed, because no one had ever
tried it.

>Let's suppose that the POSIX standard does in fact make it clear that
>the Linux behavior is correct. I still think the check is unwise
>(note that the check is not against the actual size of the buffer
>given, this is of course impossible in C, it is against the end
>of the address range of the data area). It's a good example of the
>kind of principle I mentioned before. Since almost all systems allow
>the program I described above to work correctly, and it is manifestly
>safe programming even if the check is not present, I think it would
>be a better choice for Linux not to do this extra check.

It's certainly *possible* for a C implementation to do full and rigorous
bounds checking, even if it's rare.

As for dropping the check, I disagree; I believe implementations must be
*especially* zealous about catching and crashing common mistakes.  I do
not believe conceptually invalid code should be allowed to run, if
there's any way to test for it.

I have only once in my life seen a compiler cause
	i = ++i;
to do anything but increment i.  This doesn't mean that compiler was wrong,
in *any* way.  The code is devoid of meaning, and it's merely bad luck that
so many implementations don't catch it.
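
As a small sketch of the distinction, assuming C rather than C++: the
commented-out line below modifies i twice with no sequence point in
between, so the language assigns it no meaning, while the plain increment
is well defined. The function name incremented is hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper showing a well-defined way to say "increment i". */
int incremented(int i)
{
    assert(i < INT_MAX);   /* signed overflow would also be undefined */

    /* i = ++i; */         /* undefined behavior in C: do not write this */
    ++i;                   /* well defined: i is modified exactly once */
    return i;
}
```

A compiler is free to do anything with the commented-out form, including
what most people expect, which is why it so often goes unnoticed.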

I don't think I agree with the claim that it's manifestly safe.  An
unexpected hard error could cause the disk to spew more data than you
just proved it had, and you should *NEVER* give a syscall license
to write past the space you want it to work with.

-s
-- 
Peter Seebach - seebs@solon.com - Copyright 1996 Peter Seebach.
C/Unix wizard -- C/Unix questions? Send mail for help.  No, really!
FUCK the communications decency act.  Goddamned government.  [literally.]
The *other* C FAQ - http://www.solon.com/~seebs/c/c-iaq.html