From: JohnM@hypatia.pec.co.nz (John Marshall)
Subject: Re: ANSI C and POSIX (was Re: C/C++ knocks the crap out of Ada)
Date: 1996/04/10
Message-ID: <4khevn$26a@janus.pec.co.nz>
references: <4k9qhe$65r@solutions.solon.com> <4kb2j8$an0@solutions.solon.com>
followup-to: comp.lang.ada,comp.lang.c,comp.lang.c++,comp.edu
organization: PEC (NZ) Ltd.
reply-to: johnm@pec.co.nz
newsgroups: comp.lang.ada,comp.lang.c,comp.lang.c++,comp.edu

Robert Dewar (dewar@cs.nyu.edu) wrote:
> Peter said
>> "How? No offense meant, but any code which can be affected by this is flat
>> out broken. POSIX-style read is to be given a pointer to at least nbytes
>> of space, for the information read. Period."
>
> That's really confusing, the code in question DID give a buffer large
> enough to hold nbytes of data, where nbytes is the number of bytes
> for "the information read". Maybe I don't understand, but reading the
> above sentence, it sounds like you would be surprised by the Linux
> behavior.

I think your confusion comes from misunderstanding what Peter is using
"nbytes" to mean: he is referring to the parameter in the read() call.

> Here is the exact case. We declare a buffer of 100 bytes. We read a
> 1000 bytes from a file whose total length is 68 bytes.

Can I try to demonstrate why this is unreasonable?

I don't have any definitive documentation here, but I believe the wording
of the *contract* between application writer and library writer is
something like:

    int read(int fd, char *buf, size_t nbytes)
        Reads up to nbytes from fd into the buffer starting at buf.

How's this for a fair library routine which fulfills this:

    /* "Pseudo-C" -- not real C, but you get the idea */
    int read(int fd, char *buf, size_t nbytes) {
        int avail = fd->bytes_remaining;
        if (avail >= nbytes) {
            copy nbytes bytes from fd into buf[0..nbytes-1];
            return nbytes;
        }
        else {
            copy avail bytes from fd into buf[0..avail-1];
            /* Try to protect the user a little bit from left over garbage: */
            memset(&buf[avail], 0, nbytes-avail);  /* zero out buf[avail..nbytes-1] */
            return avail;
        }
    }

(And in your case, of course, the memset hits memory in buf[68..999], and
probably trashes about a kilobyte of your other variables.)

It seems to me that the contract says "the library routine may always
expect to have nbytes of buffer space available at buf". (Perhaps it says
this only implicitly, which would be a shame.) I think conforming
implementations are free to use that buffer: maybe in a crazed attempt at
helpfulness as above, or perhaps it's a fabulously tricky implementation
and wants to use the buffer as temporary memory for its low-level file I/O.

> The code in question made 100% sure that the data read would never
> exceed the buffer size, and I would have been hard pressed to
> determine that the code was incorrect.

So? Does the specification say that read() will only touch what it
_really_ needs to, or all of what the application has told it is available?

Unfortunately, the answer seems to be "the specification is kinda vague",
so pragmatically surely the more conservative assumption (that read() may
touch all nbytes) is the more reasonable one?

-- 
John Marshall
PEC (NZ) Ltd, Marton, New Zealand.
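
For anyone who wants to try the pseudo-C above on a real system, here is a
minimal sketch in real C. It assumes a POSIX system; padded_read() and
demo_short_read() are hypothetical names invented for this illustration,
not part of any library. The wrapper zero-fills the unused tail of the
buffer just as the pseudo-C does, and the demo passes the buffer's true
capacity as nbytes, so the zero-fill stays inside the caller's own memory.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

enum { FILE_LEN = 68 };   /* length of the "file" in the case discussed */

/* Hypothetical wrapper (not a libc function): behaves like read(), but
 * after a short read it zero-fills buf[got..nbytes-1], as in the pseudo-C.
 * It relies on the caller really owning nbytes bytes at buf. */
static ssize_t padded_read(int fd, char *buf, size_t nbytes)
{
    ssize_t got;

    assert(buf != NULL);
    got = read(fd, buf, nbytes);
    if (got >= 0 && (size_t)got < nbytes)
        memset(buf + got, 0, nbytes - (size_t)got);
    return got;
}

/* Pushes FILE_LEN bytes of 'A' through a pipe, then reads them back into
 * the caller's buffer, asking for no more than bufsize bytes.
 * Returns the number of bytes read, or -1 on error. */
static ssize_t demo_short_read(char *buf, size_t bufsize)
{
    int fds[2];
    char data[FILE_LEN];
    ssize_t got;

    memset(data, 'A', sizeof data);
    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], data, sizeof data) != (ssize_t)sizeof data) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);   /* no more data will arrive after these 68 bytes */

    got = padded_read(fds[0], buf, bufsize);   /* nbytes == real capacity */
    close(fds[0]);
    return got;
}
```

Called as demo_short_read(buf, sizeof buf) with a 100-byte buf, this should
return 68 and leave buf[68..99] zeroed. Passing 1000 instead of sizeof buf
would have the memset run 900 bytes past the end of the buffer, which is
the trashing described above.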