From: ka@sorry.no.email (Kenneth Almquist)
Newsgroups: comp.lang.ada
Subject: Re: Ada OS Kernel features
Date: 7 Sep 2001 23:55:41 -0400
Message-ID: <9nc4rt$ske$1@shell.monmouth.com>
References: <9n4euv$t9m$1@slb6.atl.mindspring.net> <3B964C7A.BC04374E@icn.siemens.de> <9n5o9n$37a$1@slb7.atl.mindspring.net>

> My first reaction to this was "Not Possible". However, that isn't entirely
> true; it is just *VERY VERY* difficult. A driver runs in kernel mode, and
> has access to system data structures. If a driver corrupts a system data
> structure, how do you detect this, repair it, and continue? In such
> instances, it is much better to bugcheck (blue screen) the system than
> try to continue. Consider, if the system is slightly corrupted and
> continues to operate, there is the very real possibility that your data
> will be corrupted without your knowledge. This was Win98's philosophy,
> and it was a disaster. VMS and NT (and others) stop the system dead in
> its tracks to prevent hidden corruption.

There are several related risks here. One is system data structures being overwritten.
The Intel x86 architecture maps segment addresses to linear addresses and then uses the page table to map linear addresses to physical addresses, so it is possible to give device drivers their own address spaces without invalidating the page table cache (the TLB) every time a device driver runs. However, if the device drivers are written in Ada then there is little need for hardware memory protection.

Another risk is a resource leak when a device driver allocates a resource (e.g. memory) and then crashes. This can be dealt with by providing debugging wrappers for the kernel routines which allocate resources; the wrappers keep track of which device driver holds each resource. Then, when a device driver crashes, the resources held by that driver can be reclaimed.

I assume that the open routine for a device driver will return a tagged object which is used to perform device operations. Tracking down all the references to these objects may not be practical. One approach is to write a wrapper around the device driver. When you call the open routine for the wrapper, it calls the driver's open routine, and then allocates a wrapper object which points to the object returned by the driver's open routine. If the driver crashes, the wrapper switches to the backup driver. This is done by iterating through all the wrapper objects, freeing the objects they point to, and making them point to objects obtained by calling the open routine for the backup driver.

When I say the driver "crashes," I mean that one task or interrupt handler executing driver code raised an unhandled exception. There could be other tasks executing driver code at the same time. As long as these tasks do not block, they can be allowed to continue, but if they block then it is necessary to raise an exception in the blocked task. This requires an extension to the Ada run time. In GNAT, aborting a task throws a special exception that cannot be caught, so the basic logic required to raise an exception in another task is there.
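
The wrapper scheme can be sketched roughly as follows. This is only an illustration, written in Python rather than Ada for brevity, and it ignores tasking entirely. Every name in it (DriverWrapper, WrapperHandle, call, fail_over, and the two toy drivers) is made up for the example and does not correspond to any real kernel interface. It shows a failed operation being caught by the wrapper, every open wrapper object being repointed at the backup driver, and the operation being retried.

```python
class DriverWrapper:
    """Wraps a driver under test and falls back to a backup driver."""

    def __init__(self, primary, backup):
        self._active = primary
        self._backup = backup
        self._handles = []          # every wrapper object handed out so far

    def open(self, *open_args):
        # Call the real driver's open routine and wrap what it returns.
        handle = WrapperHandle(self, self._active.open(*open_args), open_args)
        self._handles.append(handle)
        return handle

    def fail_over(self):
        """Switch to the backup driver; return False if already on it."""
        if self._active is self._backup:
            return False
        self._active = self._backup
        for handle in self._handles:
            # Drop the crashed driver's object and reopen on the backup.
            handle.target = self._backup.open(*handle.open_args)
        return True


class WrapperHandle:
    """The object callers hold instead of the driver's own object."""

    def __init__(self, wrapper, target, open_args):
        self.wrapper = wrapper
        self.target = target
        self.open_args = open_args

    def call(self, operation, *args):
        try:
            return getattr(self.target, operation)(*args)
        except Exception:
            # An exception escaping driver code counts as a crash here.
            if not self.wrapper.fail_over():
                raise               # the backup failed too; give up
            return getattr(self.target, operation)(*args)


# Toy drivers, purely hypothetical.
class NewDevice:
    def read(self):
        raise RuntimeError("bug in the new driver")


class NewDriver:
    def open(self, unit):
        return NewDevice()


class OldDevice:
    def __init__(self, unit):
        self.unit = unit

    def read(self):
        return f"data from unit {self.unit}"


class OldDriver:
    def open(self, unit):
        return OldDevice(unit)


wrapper = DriverWrapper(NewDriver(), OldDriver())
first = wrapper.open(0)
second = wrapper.open(1)
print(first.call("read"))   # the new driver crashes; the retry uses the old one
```

After the first failure, the second handle has also been switched to the backup driver, even though no operation on it ever failed. Reclaiming the crashed driver's resources and dealing with tasks still running inside it are left out of the sketch.
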
The exception should be caught by the wrapper, which will then retry the operation using the backup version of the driver.

These ideas add up to a bit of work, but they should allow a new version of a device driver to be tested on a running system with only a small risk of disrupting system activity if the new version doesn't work. Whether this is worth doing or not is an open question. I wrote the initial implementation of modules for Linux, and didn't do any of this stuff. (Traps in loaded modules cause the system to crash, just like traps from the core kernel code.) But when Ada code throws an exception, you can be reasonably confident that it hasn't corrupted data managed by some unrelated piece of code, so there is less risk in keeping the system running than there is when C code goes awry.

I would say, though, that dynamic loading of code into a running kernel is the big win. If mistakes which are not caught by the Ada type system cause the system to crash, that is still a lot better than having to reboot every time you want to test a changed line of code.

Kenneth Almquist