From: Darren New
Newsgroups: comp.lang.ada
Subject: Re: Progress on AdaOS
Date: Wed, 29 Aug 2001 16:42:55 GMT
Message-ID: <3B8D1B8F.9BDEDC38@san.rr.com>

David Starner wrote:
>
> On Tue, 28 Aug 2001 16:13:42 GMT, Darren New wrote:
> > My point is to ask why not.
> > My point is to ask "why limit yourself to
> > array-of-byte files" and you're answering "because anything else is not
> > a file." That's not helpful.
>
> The way I feel, you keep asking "why limit yourself to citrus fruits
> as oranges" and I'm answering "because anything else isn't an orange."
> Until you give me a concrete definition for a file, this discussion
> isn't going anywhere.

A file would be a collection of data that can be shared between
programs and outlasts any given process. (Of course, if it's
represented as a process, it doesn't outlast the process that it is,
but that's no different than saying a file doesn't persist past the
execution of the "rm" program.) Note, for example, that a shared memory
segment falls into this model, even under UNIX.

Now imagine taking the UNIX shared-memory-segment model, and expanding
it to be a shared-tagged-object model. You do whatever manipulations
you want, and your allocations all come from a storage pool that deals
with the stuff as a file.

> No matter how you define it, most files are arrays of bytes. Look up
> how hard drives and RAM work if you don't believe me.

Actually, most files are *not* arrays of bytes. Most files are arrays
of records, excluding compressed files (records being 1 bit long) and
streaming media perhaps. Under CP/M, files were arrays of sectors. Line
printers aren't arrays of bytes. The mouse isn't an array of bytes.
/dev/audio isn't an array of bytes. Fonts aren't arrays of bytes. For
that matter, images aren't arrays of bytes either, they're 3D arrays of
pixels. You just serialize them into arrays of bytes.

Which is irrelevant, considering that we seem to have no problem
storing the kinds of complex pointer-based information I'm talking
about in RAM, which is an array of bytes, almost. (Note: no, RAM is
*not* an array of bytes either, on any hardware with memory
management.)

> > Why not?
> > Again, you're starting with a preconceived notion of "file"
> > being "an array of bytes", then claiming that the OS should be handling
> > that as the basic type.
>
> I don't even know what you mean by "the basic type" here.

The fundamental type of data handled by the OS. In UNIX, there are four
fundamental types of data in files: regular files, dirs, block-special,
and character-special.

> > For example, on the Amiga OS, files were tasks, as were device drivers,
> > etc. You could send a message to a file task to read and write portions
> > of the file into particular buffers, and when the message returned, the
> > I/O was complete. This allowed you to treat windows as files
> > (open("con:top/left/width/height/title")) and such. It allowed you to
> > write phonemes to the voice synthesizer and read lip positions back as
> > the sound was generated. Cool stuff like that.
>
> And, wow, open ("con:1/1/20/20/My Window / file") is so much clearer than
> Create_Window (Top => 1, Left => 1, Width => 20, Height => 20,
> Title => "My Window / file").

Uh huh. And /dev/fd0 is so much clearer than INT21 calls, yes? The
point is that windows were consistent with the file paradigm in ways
that they aren't consistent in X-Windows, and this was possible because
they were based on active processes with complex data rather than on
stuff built into the kernel.

> Oops, we need to escape the / somehow
> in your call, don't we . . .

No, because the title came last. Also irrelevant. I can see it's
pointless for me to give counterexamples to your points if you're going
to actively attempt to ignore why I might be giving such
counterexamples.

> You're welcome to open a pipe to the voice synthesizer. It'll work
> just as well.

No, it won't, for reasons that are irrelevant to Ada, so I won't pursue
them here.

> > Imagine an OS where "files" are actually protected objects, fully typed,
> > that outlive your program's execution.
> > Directories are simply persistent
> > protected objects that act as arrays mapping strings to these "files".
>
> So we have a space-inefficient tangle of data that's hard to transfer
> between systems running different OS's and probably even between different
> programs on the same system.

It would be easy to transfer them between programs on the same OS, I
would think. At least, if those programs were written in Ada or
something compatible with it. It's easy to move them to a different
system: you serialize them. But why make serialization a *required*
step *every* time you use a file, *including* the ones you have no
intention of ever moving to any other system? Doing so is like using
overlays instead of demand paging for your memory management. Why would
you, when you can build it into the OS and get everyone using it for
free?

> >> An OS could certainly flatten that into a file. It would probably
> >> take a lot of work to make sure it wasn't an OS/program/libada specific
> >> file, though.
> >
> > I don't see any particular need to flatten it as such. Again, you're
> > trying to map "files" onto "array of bytes in a UNIX-like file system".
> > That's not the point.
>
> Hard drives, CDs, floppies and tapes are flat. Hence to store it you're
> going to need to flatten it.

Since RAM is flat, I guess it's impossible to use non-flat structures
at all, right? Why is it OK to use "tangled" data structures in memory,
but not in a file? For that matter, why are files contiguous arrays of
bytes? Why do you insist that the OS handle the directories but you
handle what's inside the files? Why do you insist that the OS handle
allocation of space for the files, but not handle allocation of space
inside them? That's the point I'm trying to get at.

Anyway, no, hard drives, CDs, and floppies all have sectors. So they're
arrays of sectors, each of which is an array of bytes. And again, why
do you want directories built in? Why not just flatten them yourself?
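
To make the "RAM is flat, yet we keep tangled structures in it" point
concrete, here is a minimal sketch in Python (chosen only because it is
easy to run; it is not Ada and not anything from AdaOS). The node
layout, the END marker and the function names are all assumptions made
up for illustration. It keeps a linked list inside one flat byte array,
using byte offsets as the "pointers":

```python
import struct

# Each node is 8 bytes: a 4-byte signed value followed by the 4-byte
# offset of the next node. The layout and END marker are illustrative
# assumptions, not a real on-disk format.
NODE = struct.Struct("<iI")
END = 0xFFFFFFFF


def append_node(store: bytearray, value: int, next_offset: int) -> int:
    """Append one node to the flat store and return its byte offset."""
    offset = len(store)
    store += NODE.pack(value, next_offset)  # in-place extend of the bytearray
    return offset


def walk(store: bytes, head: int) -> list[int]:
    """Follow the offset 'pointers' from head and collect the values."""
    values = []
    offset = head
    while offset != END:
        value, offset = NODE.unpack_from(store, offset)
        values.append(value)
    return values


store = bytearray()
# Build 1 -> 2 -> 3 back to front, so each node can name its successor.
tail = append_node(store, 3, END)
middle = append_node(store, 2, tail)
head = append_node(store, 1, middle)

print(walk(store, head))  # prints [1, 2, 3]
```

The store stays an ordinary array of bytes the whole time, so the same
24 bytes could sit in memory or in a file; the structure lives in how
the offsets are interpreted, which is the job I'm suggesting the OS
could take on.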

> > There's a reason folks put names on files, on procedures, on users. The
> > same reason holds for individual pieces of data in a file. It's a win
> > whenever you store more than one piece of data in a file, like more than
> > one line in a text file, more than one user in a password file, etc. And
> > if you're only storing one atomic piece of data in a file, it's because
> > you're using the file's name as an index anyway. Have you never written
> > a program where the name of a file is calculated by the program?
> > "Image01.jpg", "Image02.jpg", ...?
>
> And how much of a win is storing that in one file over storing it in
> a directory and tarring / zipping it for transport?

Significant. My point is that you *do* use such things, you just don't
recognise it. You say "I don't need complex intra-file data, because I
can make complex inter-file data instead." You're kind of perverting
the directories to serve as keys in exactly the same way that
whatshisname complains that people perverted file names to include the
data type.

If you haven't used more sophisticated file systems, you don't see what
you're missing, just like if you've never used directories or
relational databases or OO databases or strong typing, you might have a
hard time seeing the value of it. Why is strong typing good in a
programming language and bad in persistent data?

> >> > The type for an "Active Server Page" with Ada code embedded in
> >> > HTML markup under control of CVS?
> >>
> >> text/x-asp.
> >
> > And how does CVS know it can handle this,
>
> text/*
> > and how does the web browser
> > know it needs to invoke the Ada compiler and not the Java compiler when
> > rebuilding this page?
>
> I would assume it reads the start of the file.

OK, so you're only solving half the problem here. The content-type on
the file isn't sufficient to figure out what you can do with the file
without reading the data. Why bother with it at all? Why is strong
typing good in a program and bad in a file?
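
The "half the problem" point can be sketched in a few lines of Python
(again just for runnability). Everything here is hypothetical: the
handles and embedded_language helpers are invented for the example, and
the Language directive syntax is only assumed to look roughly like what
an ASP-style page might carry. The wildcard match tells a generic text
tool it may touch the file, but picking the compiler still means
reading the contents:

```python
import re
from fnmatch import fnmatchcase
from typing import Optional


def handles(tool_pattern: str, content_type: str) -> bool:
    """True if a tool registered for tool_pattern accepts content_type."""
    return fnmatchcase(content_type, tool_pattern)


# Hypothetical directive naming the embedded language,
# e.g. <%@ Language="Ada" %> near the start of the page.
DIRECTIVE = re.compile(r'<%@\s*Language\s*=\s*"([^"]+)"\s*%>', re.IGNORECASE)


def embedded_language(page_text: str) -> Optional[str]:
    """Sniff the page contents for the language; the type tag cannot say."""
    match = DIRECTIVE.search(page_text)
    return match.group(1) if match else None


page_type = "text/x-asp"
page_text = '<%@ Language="Ada" %>\n<html><body>Hello</body></html>\n'

print(handles("text/*", page_type))   # prints True: the tag satisfies CVS
print(embedded_language(page_text))  # prints Ada: found only by reading data
```

So the passive tag answers the CVS question and nothing more; the
browser's question gets answered by parsing, which is exactly the step
the tag was supposed to save.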

> > The point is that no passive tagging system is going to be complete
> > enough that you can specify everything you need to know about the
> > program in the type tag.
>
> Sure. And?

And why bother? What's the benefit of putting big complex hierarchical
types on files that are neither sufficient nor enforced? What keeps me
from writing a JPEG image into an Ada source file? If I can do that,
why do I need the file type?

> I've used Ada. Every so often in Ada, you're juggling twenty different
> types and having to constantly convert between them.

Oh no. I've never had to juggle file types in a flat-file OS either.
Never had a problem with the wrong character set or line ending
translation, never had to convert from one image format to another, and
never had to change a file extension so some other program would accept
a text file and display it as HTML markup, and of course I've never
opened an image file with a text editor.

> I fail to see the
> advantage to that in an OS. I don't want each program having its own
> set of types that are incompatible with everything else. I also want
> my editor to work on any appropriate file, be it C, Ada, HTML, TeX,
> a diary, whatever.

Welcome to the wonderful world of object-oriented OSes. :-) As long as
the file type supports getting and setting lines of text, your editor
shouldn't have a problem with it. As long as it supports serialization
at all, your binary editor shouldn't have a problem with it. Why
wouldn't you want your editor to work with inappropriate file types?

> You ask how the web browser knows it needs to invoke the Ada compiler
> in my situation. How does the web browser know anything about the
> file in your situation?

The same way the web browser knows the types of the windows and buttons
it displays. It does a "with File.Text.Html" and then uses the routines
therein.

> How does this type get created?

The same way any other type gets created. You write a package for it.
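
A rough sketch of what "supports getting and setting lines of text"
could mean, in Python rather than Ada so it runs anywhere. The
LineEditable interface and the HtmlPage and Diary types are
hypothetical stand-ins (HtmlPage playing the role of something like
File.Text.Html); none of this is an actual AdaOS design. The editor
command is written once against the interface and never looks at how
either type stores its data:

```python
from typing import Protocol


class LineEditable(Protocol):
    """Hypothetical interface: a file type that can get and set lines of text."""

    def line_count(self) -> int: ...
    def get_line(self, index: int) -> str: ...
    def set_line(self, index: int, text: str) -> None: ...


class HtmlPage:
    """Hypothetical typed file, standing in for something like File.Text.Html."""

    def __init__(self, lines: list[str]) -> None:
        self._lines = list(lines)

    def line_count(self) -> int:
        return len(self._lines)

    def get_line(self, index: int) -> str:
        return self._lines[index]

    def set_line(self, index: int, text: str) -> None:
        self._lines[index] = text


class Diary:
    """A second hypothetical type with different internals: entries keyed by day."""

    def __init__(self, entries: dict[str, str]) -> None:
        self._days = sorted(entries)
        self._entries = dict(entries)

    def line_count(self) -> int:
        return len(self._days)

    def get_line(self, index: int) -> str:
        return self._entries[self._days[index]]

    def set_line(self, index: int, text: str) -> None:
        self._entries[self._days[index]] = text


def replace_everywhere(doc: LineEditable, old: str, new: str) -> int:
    """A generic editor command: it needs only the line interface, not the type."""
    changed = 0
    for i in range(doc.line_count()):
        line = doc.get_line(i)
        if old in line:
            doc.set_line(i, line.replace(old, new))
            changed += 1
    return changed


page = HtmlPage(["<h1>Ada</h1>", "<p>Ada is strongly typed.</p>", "<p>Bye</p>"])
diary = Diary({"2001-08-28": "Argued about files.", "2001-08-29": "Argued about Ada."})

print(replace_everywhere(page, "Ada", "Chill"))   # prints 2
print(replace_everywhere(diary, "Ada", "Chill"))  # prints 1
print(diary.get_line(1))                          # prints Argued about Chill.
```

The editor works on both because they share an interface, not because
they share a byte layout.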

> What happens
> if you change it from Ada to Chill?

Then you have to figure out how to serialize it or figure out how to
translate it without serializing it. Always coding every single program
to the worst-case conditions is going to lead to all kinds of problems.
What if the next HTML standard specifies that all HTML tags should be
in Hebrew? What happens if the next version of emacs uses FORTH for its
scripting language instead of elisp?

> How does the editor know what type
> to create?

You tell it. Same as now.

> > Errr... When talking about designing and writing an OS, saying "oh,
> > don't worry, the device drivers will take care of it" doesn't work.
>
> Sure it does. It's called stratification. When talking about user mode
> utilities, we can hand certain things off to the kernel, and let
> the kernel worry about it.

But to do that, you have to come up with some universal semantics for
the file system, which is what we're talking about. If your files have
keyed records, your copy program copies keys and records. If your files
have arrays of bytes, your copy program copies arrays of bytes. The
complexity of the file system doesn't preclude writing universal
utilities.

> > Why
> > not let the real data be the one with the pointers and strong types and
> > such, and only use streams when you want to communicate with a different
> > OS?
>
> Because a plain text file is readable by 4 million different programs.
> Because the format for PNG is set in stone and readable by 40,000 different
> programs.

Well, why don't you read what I said? Of course, the serialized data
formats are readable by any number of programs. That's why it's
important for programs that have data they want to port elsewhere to
serialize it. However, if the *only* way to store the data is to
serialize it, then every program has to go through the overhead of
parsing the data.

Look at it this way. To deal with GIF, you use a library.
To deal with PNG, you use a library.
To deal with XML, you use a library. This library's job is to convert
from one serialized format (the GIF or PNG or XML file) to another
serialized format (stored in the swap space). It also converts back
again. If you're going to have that internal format anyway, why not
allow it to get stored directly? If it's the representation of a type
that has a defined external serialization, you write a "serialize" and
"deserialize" routine for that type. But guess what? You have to do
that *anyway* in your model.

In addition, this library you use for GIF, why not make *that* the
content-type of the file, rather than some arbitrary string? Why not
say "this file is a GIF because the interface to it is libgif.so"? Or
"this file is a GIF because the interface to it is with
File.Image.GIF"? That's what I'm suggesting, in essence.

Of course, once you do that, files start acting more like persistent
variables than they do arrays of bytes. Indeed, you don't *really* have
arrays of bytes in a UNIX-like OS. You have calls like "read" and
"write" that transfer *something* from a file into your buffers. The
buffers are arrays of characters, but the files aren't. The OS does the
serialization for you there too. Surely writing to /dev/audio isn't
transferring an array of bytes anywhere. Writing to a fragmented file
isn't transferring an array of bytes - the OS is dealing with disk
blocks and channel programs and stuff like that.

Think about the Windows Registry. The number of times you actually want
to turn that into a text file is minute compared to the number of times
you want to access it as a structured data store. The registry is not
defined in terms of the bytes that are in the file. It's defined in
terms of the interfaces to it. This is a *far* superior way of working
things, upward-compatibility-wise and ease-of-use-wise. (Witness the
fact that Win3.1 "INI file" interfaces still work with Windows, but
Emacs can't keep its internal formats working.
:-)

> However, the internal format for Emacs has been known to change
> between versions. It's hard to keep internal structures the same; much
> easier to conform to a consistent external standard.

Errr, and the point is? Perhaps that emacs internal formats, not being
intended to be ported, don't need to be serialized except for the
primitive nature of the data the OS is willing to handle? And that
therefore when you upgrade emacs, you have to recompile the scripts the
first time you use them, in return for having far better performance
for the other 99.44% of the time you're not upgrading emacs?

> Also, I've worked with programs that used a database and those that use
> plain text files, and the latter seem more reliable.

Sure thing. Tell this to the folks running the airline reservation
programs.

-- 
Darren New
San Diego, CA, USA (PST).
Cryptokeys on demand.