From: c2a192@ugrad.cs.ubc.ca (Kazimir Kylheku)
Subject: Re: ANSI C and POSIX (was Re: C/C++ knocks the crap out of Ada)
Date: 1996/04/09
Message-ID: <4kdvh7INNdjb@keats.ugrad.cs.ubc.ca>
References: <4kcsnsINNgkb@keats.ugrad.cs.ubc.ca>
Organization: Computer Science, University of B.C., Vancouver, B.C., Canada
Newsgroups: comp.lang.ada,comp.lang.c,comp.lang.c++,comp.edu

In article , Robert Dewar wrote:
>Kazimir said:
>
>>This is poor coding. You are _advertizing_ a buffer of size 1000, but passing a
>>pointer to a 100 byte buffer. It wouldn't even occur to me to do this, and
>>until now I have been completely oblivious to this difference between Linux
>>and other systems.
>
> The spec of an interface does not depend on what "wouldn't even occur"
> to Kazimir, it must be independently defined.

I totally agree. But in the absence of the definition, we have to stick with
the safer thing.

>>Unfortunately, I could not find anything in POSIX.1 that would explicitly
>>disallow this. The document is not very assertive in defining undefined
>>behavior. I'm going to check it once again in case I missed something.
>
> This is not a matter of defining undefined, it is a matter of defining
> the requirement on the length of the read buffer, and it is truly
> amazing to me that none of the references at hand, not even the POSIX
> document, specifies this.

I might come up with something if I read the damn thing character by
character, cover to cover. But I did spend a fair bit of time chasing around
the document, in vain.

>>It's not surprising: you lied to the read() function. But you are right, you
>>can't tell this from the definition in POSIX.1 or from typical manual pages.
>
> Sorry, this is wrong, I lied to the *implementation* of the function as
> it occurred in Linux. Now it is true that the spec of the function is
> different in Linux than in other systems (you quoted the manual pages
> that showed this clearly). So of course we have a portability problem
> here. Read is different in different systems, not only at the
> implementation level, but at the spec level. The program in question
> was correct with respect to the spec on "other systems":
>
>>I checked the manual pages for read() on several systems. Linux documents
>>that EFAULT results if the buffer pointed at by buf is outside of the address
>>space of the process. On other systems, it is claimed that EFAULT results if
>>the buf pointer is directed outside of the address space.
>
> Kazimir, perhaps you don't understand the whole idea of specs, but that
> quote means that code that makes sure that the pointer is directed inside
> the address space is OK if the buffer is not overrun!

Right. The distinction is quite clear. The Linux doc talks about the whole
buffer object, whereas the SunOS and HP-UX man pages talk about the buffer
pointer.

>>There are certain unwritten rules, though!
>
> That's the totally unacceptable viewpoint that is at the center of
> the concerns in this thread (the details of read are uninteresting).
> The trouble is of course that Kazimir's unwritten rules are clearly
> different from other unwritten rules.

I believe that my unwritten rules agree with what other UNIX/POSIX programmers
also believe about the read() function, the same way that those Fortran 66
programmers held a consensus about the reversed DO loop or large array passing.

> I think one of the most essential things to understand in programming
> is the importance of abstracting specifications from implementation.
> Comments like the above (unwritten rules) one show that there is a
> long way to go!

My reasoning was not based on any implementation. I actually got the idea of
these unwritten rules from your posting about language implementations which
give a meaning to certain behaviors that are not standard simply to reflect
practice among programmers (like the Fortran 66 unwritten 'at least once'
semantics for a reversed DO loop that you mentioned). In this case, the
unwritten rule is not that you may misrepresent the buffer size, but rather
the opposite. Conduct a survey of UNIX programmers, and see. :)

This empirical notion about unwritten rules is thanks to you, not me! I
abstract to the safer alternative regardless of what anyone thinks.

Of course, this heuristic doesn't work all the time. It failed in the case of
select(). How was I to guess that the function will modify the timeval
structure, when the program worked properly on two other systems? There is
clearly another ``unwritten rule'' about the behavior of select(), but in this
case I unconsciously believed in it. After running into the problem, I no
longer believe in the ``unwritten rule'' but in the safest rule, and no longer
count on the contents of the timeval structure being preserved after a
select() call, regardless of the implementation.

In the case of read(), my intuition has always been to specify the actual
buffer size.
It just so happens that in this case the weaker assumption is in agreement
with the common belief, whereas in the case of select() the stronger assumption
is the common belief.

The difference between safest assumptions and unwritten rules is that the
latter are subjective, because they are determined by the consensus of a
community of programmers, whereas the former are not subjective because they
are based on rational reasoning.

Are you familiar with ``Pascal's Wager''? It is a way to decide between
alternate hypotheses. You draw a table like this:

                    call read() with        exaggerate the
                    correct buffer size     buffer size to read()

    lying about     You are OK              You are OK
    buffer is OK

    lying about     You are OK              You could be screwed
    buffer is
    *NOT* OK.

I'm not about to pick the lower right hand corner, just because I can
interpret the vague spec in a way that could justify a belief in the
corresponding hypothesis.

Pascal used this method of inference to justify a belief in God, incidentally,
hence the name. :)

It has nothing to do with my belief about what the implementations are like,
or what the unwritten rules are.
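
The two "safest assumption" habits described in this post (pass read() the
true size of the buffer, and never count on the timeval surviving a select()
call) can be sketched in C roughly as follows. This is only an illustration
under stated assumptions: the helper names read_into() and wait_readable() are
hypothetical and do not come from POSIX or from any program discussed in the
thread, and fd is assumed to be a valid descriptor below FD_SETSIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read at most cap bytes into buf, where cap is the
 * true size of the object buf points to. The count passed to read() is
 * never larger than the buffer, so the call is fine on a system that checks
 * only the pointer and on one that checks the whole buffer. */
ssize_t read_into(int fd, void *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);
    return read(fd, buf, cap);
}

/* Hypothetical helper: wait up to timeout_ms milliseconds for fd to become
 * readable. The timeval is rebuilt on every call, so nothing depends on
 * whether select() leaves it untouched or overwrites it.
 * Returns select()'s result: >0 if ready, 0 on timeout, -1 on error. */
int wait_readable(int fd, long timeout_ms)
{
    fd_set readfds;
    struct timeval tv;

    assert(fd >= 0 && fd < FD_SETSIZE && timeout_ms >= 0);

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    return select(fd + 1, &readfds, NULL, NULL, &tv);
}
```

A caller with an array buf would write read_into(fd, buf, sizeof buf), so the
advertised size and the real size cannot drift apart. A loop that polls with
wait_readable() gets the full timeout on every iteration, whatever the
implementation does to the timeval.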