From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.4
X-Google-Thread: 103376,b95a522100671708
X-Google-Attributes: gid103376,public
X-Google-Language: ENGLISH,ASCII-7-bit
Path: 
 g2news1.google.com!news1.google.com!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: Nick Roberts <nick.roberts@acm.org>
Newsgroups: comp.lang.ada
Subject: Re: For the AdaOS folks
Date: Tue, 4 Jan 2005 18:22:07 +0000
Message-ID: <gemini.i9t1ou00db7bb00dc.nick.roberts@acm.org>
References: <Gq2dnUK90vVhBVLcRVn-1w@gbronline.com>
 <gemini.i9ih0y000ys1c02s4.nick.roberts@acm.org>
 <1PTAd.1218$0y4.421@read1.cgocable.net>
 <gemini.i9lo7n001cmv901w4.nick.roberts@acm.org>
 <gKiBd.57678$Tn1.1935246@news20.bellglobal.com>
Content-Type: text/plain; charset=us-ascii
X-Trace: individual.net zgDSQQfBVH1xw9xxuJPsRwulBtoKQwgEqt/f4nzR2Rqap0cAg=
X-Orig-Path: not-for-mail
User-Agent: Gemini/1.45d (Qt/3.3.2) (Windows-XP)
Xref: g2news1.google.com comp.lang.ada:7435
Date: 2005-01-04T18:22:07+00:00
List-Id: <comp.lang.ada>

"Warren W. Gay VE3WWG" <ve3wwg@NoSPAM.cogeco.ca> wrote:

> By "process migration", are you referring to "thread-migration"?

No, I'm talking about moving an entire process (its memory, threads, and all
other resources held) from one workstation to another (within the network)
at /any/ arbitrary point during its execution (the execution of its
threads).

I'm assuming that the threads within a process will be closely-coupled --
they will share the same memory address space -- so as to support the
typical monoprocessor and SMP configurations of contemporary PCs.

As such, Mach's concept of thread migration seems inappropriate. Mach was
designed to accommodate experimentation with NUMA, but NUMA machines have not
come into the mainstream marketplace yet, so this is one way in which the
experimental nature of Mach's design is less than ideal, to me.

> > Of course, that support could be added on top, in an extra layer; but
> > obviously I felt there's no point in having that extra layer, I might
> > just as well design a microkernel that supports it directly. There are a
> > number of other minor problems with L4, too.
> 
> One of the remarks made in the above cited paper is that (paraphrasing)
> QNX implements this by default, because it doesn't support a queue
> (control must immediately pass from sender to receiver). I know L4 works
> the same way, but I also know that L4 talks of a Local-RPC. It would seem
> that Local-RPC supports what you're looking for in L4.

Since priorities must be enforced by the AdaOS kernel at all times (to
prevent a thread subverting a more privileged time-critical thread), there
must be a change of priority associated with /any/ transfer of control to
another thread. This implies queueing of some kind. But I have studied L4
carefully, and I am sure that I would need to put in a layer above it for
all processes that needed to be distributed (migratable).
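
To illustrate the kind of queueing meant here, a minimal sketch in Python
follows. It is purely illustrative: the names ReadyQueue and send are made
up, and nothing here is taken from Bachar or L4. It assumes a synchronous
send, so the sender blocks and only the receiver is made ready. The receiver
is queued at its own priority rather than being handed the CPU directly, so
a ready time-critical thread still runs first.

```python
import heapq
import itertools


class ReadyQueue:
    """Threads waiting to run, highest priority first (FIFO within a priority)."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order

    def make_ready(self, thread, priority):
        # heapq is a min-heap, so the priority is negated.
        heapq.heappush(self._heap, (-priority, next(self._order), thread))

    def next_to_run(self):
        _, _, thread = heapq.heappop(self._heap)
        return thread


def send(queue, receiver, receiver_priority):
    """Deliver a message: queue the receiver, then let the scheduler choose."""
    queue.make_ready(receiver, receiver_priority)
    return queue.next_to_run()
```

With a priority-10 thread already ready, send(queue, "server", 1) returns
that thread, and the server runs only afterwards.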

> > but since there is only a binary interface and a C interface (no
> > published Ada interface), this doesn't turn out to be such an advantage.
> > Similar comments apply to other kernels, including Mach.
> 
> I looked at their API and it doesn't look too bad actually. They map a
> number of things into MR (Message Registers), which would be real easy to
> do in Ada, using the for Object'Address use ...;  A tiny bit of _Asm would
> fix the rest. So I don't see that as much of a problem. Define a package
> (or few) and the rest is a slam dunk.

They take the attitude that the C interface must remain the same, regardless
of the machine. I think that's impractical. I've designed the Bachar
interface so that it will use real machine registers, rather than pretend
registers. It means the interface changes from machine to machine, but it
also means that the interface is as efficient as possible for each machine.
Note that Bachar's interface will differ, from machine to machine, only in
certain details (such as calling convention and parameter passing).

> > This issue is related to the problem of process migration. Mach supports
> > it by not tying communications to threads. But this creates all sorts of
> > communications routing complexity inside the kernel.
> 
> I am not entirely convinced of this. I am currently working with rtmk (a
> stripped down Mach clone of sorts), and from the kernel side it is not
> that difficult (I've had to mess with some of this ;-)
>
> What it _does_ of course require is the need for formatted messages. By
> this I mean that you cannot just send a stream of bytes from one task to
> another, and include a Port somewhere in the middle. The kernel must know
> when a port is being copied/moved, so that it can play its kernel magic
> before the receiving task gets the message (and thus inherit the port
> right). So from this point of view, I would agree with you. But otherwise,
> I don't see it as complex at all.

Assume we are considering one distributed process making an RPC to another
one. Every such call must be dynamically routed, since the target process
could be on any workstation in the network. Not only must the router store a
table of the location (which workstation) of every process, but there must
also be a protocol that allows a router on another workstation to forward
the call (because the process has just moved) and notify the originating
router (so it updates its locations table). It is also necessary to deal
with network partitions (a partial breakdown of the network), and it is
necessary in AdaOS for such RPCs to be made in the context of transactions,
so that if the call unrecoverably fails, the transaction can be aborted. I
don't think all that network routing complexity should be inside a
microkernel.
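
A rough Python sketch of just the location-table and forwarding part of
this may make the point clearer. It is a toy under stated assumptions: the
Router class, the migrate function and the dictionary standing in for the
network are all hypothetical, and partitions and transactions are not
modelled at all.

```python
class Router:
    """Per-workstation router holding a (possibly stale) locations table."""

    def __init__(self, workstation):
        self.workstation = workstation
        self.local = set()     # ids of processes currently hosted here
        self.locations = {}    # process id -> workstation name

    def call(self, network, pid, origin=None):
        """Route an RPC to process pid; return the workstation that served it."""
        if origin is None:
            origin = self
        if pid in self.local:
            # Notify the originating router so it updates its table.
            origin.locations[pid] = self.workstation
            return self.workstation
        # The process is not here (perhaps it has just moved): forward the call.
        next_hop = network[self.locations[pid]]
        return next_hop.call(network, pid, origin)


def migrate(network, pid, src, dst):
    """Move pid from src to dst, leaving a forwarding entry behind at src."""
    network[src].local.discard(pid)
    network[dst].local.add(pid)
    network[src].locations[pid] = dst
```

Even this toy needs forwarding entries and table updates on every call. The
real protocol would also have to cope with lost routers and aborted
transactions.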

> The thing is, depending upon design of course, you could easily use
> thousands of send-rights, with little or no overhead. If you must
> associate the message endpoint to a thread, barring lazy allocation
> techniques, you impose all of the overheads that a thread must impose
> (stacks, state etc.)  I just feel that the port paradigm is more flexible
> for the OS designer (it costs less). It is also conceptually cleaner to
> say you have two end-points (ports) of a message queue, than saying you
> have two-threads when threads aren't part of the abstraction.

I have taken the approach of local IPC being supported by the microkernel
(and lower levels of TCB software), and network IPC being supported by a
higher layer of TCB software (Avarus).

> > I think Bachar does much better: communications are thread-addressed,
> > but a mechanism permits proxy threads (which can be in separate
> > processes) to be transparently interposed when non-local communication
> > is required (and removed when it is not). This way, you get simplicity
> > and efficiency, but process migration is still properly supported.
> 
> But if you look at the Mach paper above, you can do the very same thing
> with ports. Even without thread-migration, you can interpose proxies. With
> thread-migration enabled RPC, you achieve both and provide the OS designer
> a nicer interface.

But if we did that, what would be the advantage?

> > Note that there are some other subtleties: all Bachar resources can be
> > addressed (identified) in a way that can be exactly reproduced in the
> > target machine when a process is migrated;
> 
> Actually, Mach does this as well. They will tell you, however,
> that you must beware:  When writing a pager shared among multiple hosts
> for example, you must be aware that the VM page size may vary according to
> the type of equipment it is run on. For Intel 4K, but another platform (I
> forget which), will use 8K.
> 
> Obviously, it will be difficult to migrate an Intel thread onto a PPC
> platform ;-)

I've made what I think is the reasonable assumption that the page size and
processor architecture remain the same throughout the network (or the
administrative division of the network within which process migration is to
occur).

I know that Mach supports migration (it was designed to), but not in the
form I want it, as I've described above.

> > most microkernels do not seem to support this functionality. Bachar
> > provides functionality that allows any process to be frozen, dissected,
> > reconstructed (on the target machine), and then reanimated. These
> > functions can be done by a higher layer, but it makes more sense for
> > them to be supported directly by the kernel.
> 
> I was planning to do this for my OS outside of the mk. The primary reason
> for this is perhaps two-fold:
> 
>   1. I want to choose my own form of networking for the purpose
>   2. It may require some other OS "coordination"
> 
> So I still favour a minimalist microkernel, but allowing for those nicer
> abstractions above it.

I don't see the point in making the microkernel /that/ minimalist, to be
honest. Whatever the upper layer did, it would have to be dependent upon
intimate details of the kernel. So why not just lump them together?
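
For illustration only, the freeze/dissect/reconstruct/reanimate sequence
quoted above might be sketched in Python like this. The Process record and
the four function names are invented for the example and are not Bachar's
actual interface. The sketch assumes a process is no more than an id, some
pages and some saved thread states.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    pid: int
    memory: dict = field(default_factory=dict)   # page number -> bytes
    threads: list = field(default_factory=list)  # saved register sets
    running: bool = True


def freeze(proc):
    """Stop all threads so the process state cannot change under us."""
    proc.running = False


def dissect(proc):
    """Turn a frozen process into a plain image that can be shipped."""
    if proc.running:
        raise RuntimeError("process must be frozen first")
    return {
        "pid": proc.pid,
        "memory": dict(proc.memory),
        "threads": [dict(t) for t in proc.threads],
    }


def reconstruct(image):
    """Rebuild the process on the target, keeping the same identifiers."""
    return Process(
        image["pid"],
        dict(image["memory"]),
        [dict(t) for t in image["threads"]],
        running=False,
    )


def reanimate(proc):
    """Let the rebuilt process continue from where it was frozen."""
    proc.running = True
```

Every step reaches into state that only the kernel really knows about,
which is the dependency described above.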

> >>   1. Clustered systems with all the same endianness and page size
> >>   2. "Networked Clusters" of #1 groupings
> >>
> > > In this way, it would be efficiently possible to cluster similar
> > > equipment, and yet still provide "one-ness" in the different computer
> > > cases, with the necessary overhead and complexity. This would also
> > > permit a tiered development approach.
> > 
> > Interesting idea. But I am assuming a network made up entirely of
> > workstations sufficiently similar in architecture that a process can be
> > migrated from any one to any other. In practice, to start with, that
> > means a set of PCs.
> 
> Yes. You obviously have to track the capabilities in mixed Intel
> environments for example, so you might have some host affinities for i686
> vs i386 etc. among the nodes. But yes, the general idea is to allow the
> pager to page-fault a process over to another cluster host, assuming that
> some sort of automatic load-balancing can work out who it should page
> fault to.

Yes. I intend to take the lowest common denominator approach for the IA-32.
My idea is to have an auxiliary program that Avarus uses to give it
recommendations for the migration of processes.
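
As a sketch of both ideas (with hypothetical names and data layout, and a
deliberately naive load rule that is only an assumption, not the planned
algorithm), in Python:

```python
def common_features(nodes):
    """CPU features supported by every node: the lowest common denominator.

    Assumes nodes is a non-empty dict: name -> {"features": set, "load": float}.
    """
    feature_sets = [set(info["features"]) for info in nodes.values()]
    return set.intersection(*feature_sets)


def recommend_migration(nodes, threshold=0.25):
    """Suggest (source, target) workstations, or None if load is balanced."""
    busiest = max(nodes, key=lambda name: nodes[name]["load"])
    idlest = min(nodes, key=lambda name: nodes[name]["load"])
    if nodes[busiest]["load"] - nodes[idlest]["load"] < threshold:
        return None
    return busiest, idlest
```

Programs would be built only against common_features(nodes), so a process
can land on any workstation. The recommender merely advises; the migration
itself stays with the layer that asked.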

> To make this work, as I envision it, requires that all participating nodes
> in a cluster see the file system the same way. IOW, there can only be one
> root file system, which obviously is only local to one node. To all other
> nodes, root will be accessed NFS-like (ie. over the network cable).

This is a fundamental precept of fully distributed networking.

> However, this is not to say that other nodes will not have their own local
> disk, but they will of course be logically mounted on top of root, or
> another node's filesystem(s).

AdaOS will support local processes, that are never migrated. These will be
processes which are closely related to one particular workstation (for
example, a process closely connected with a peripheral connected to the
workstation). Local processes will additionally have access to a local
filesystem (local to the workstation).

> This consistent file view is necessary to make tasks mobile to any node in
> the cluster.

Not only must the filesystem view be fully distributed (the same from every
workstation in the network), but also the temporary memory view must be.
Since AdaOS allows a process to access multiple memory spaces (called
'regions'), a distributed process must have a distributed way of accessing
all of its regions. This is provided by a program classed as a 'distributed
storage manager' (DSM); there will be a DSM program called Cortex, which
automatically shuffles pages around the network on demand.
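
A toy Python model of pages moving on demand is below. It assumes a single
owner per page and ignores caching, write sharing and failures. The Dsm
class and its methods are invented for the example and say nothing about
how Cortex itself will work.

```python
class Dsm:
    """Toy distributed storage manager: one owner per page, moved on demand."""

    def __init__(self):
        self.owner = {}      # (region, page) -> workstation holding the page
        self.contents = {}   # (region, page) -> page data
        self.transfers = 0   # number of pages moved across the network

    def store(self, workstation, region, page, data):
        self.owner[(region, page)] = workstation
        self.contents[(region, page)] = data

    def read(self, workstation, region, page):
        key = (region, page)
        if self.owner[key] != workstation:
            # "Page fault": pull the page over to the workstation reading it.
            self.owner[key] = workstation
            self.transfers += 1
        return self.contents[key]
```

A second read from the same workstation costs no transfer, so a migrated
process pulls its working set across once and then runs locally.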

> Everyone collects old Intel boxes these days. Wouldn't it be nice to be
> able to plug them all into a common hub, and boot a common image and say
> "go forth and cluster!"  The goal is obviously to do this with minimal system
> administration with no PhD required.

Exactly what I hope to achieve.

> > Exactly. The main problem with the Registry is that it is not a proper
> > database system.
> 
> Well, define "proper" and define what the "database system" solves that
> the present system doesn't?

The Windows Registry doesn't provide indexing, joining, field selection, or
filtering (record selection). These would all be useful capabilities. It
also doesn't provide the kind of administrative functionality (e.g. backup)
that a full database system usually does.

> SQL databases are designed to deal with large volumes of similar data, in
> SELECTs, joins etc. However, when you look at what goes into the registry,
> a lot of it is custom, hierarchically structured information.

No, the kind of data that goes into the Registry would be naturally
organised in many complex ways. For example, many different applications
store their own font information, but a font disinstaller may well wish to
be able to 'cut across' this font data, to make appropriate adjustments (for
all the applications). As another example, some applications may wish to
store data for several different users, but an administrative program may
wish to change all the data (across many applications) for a particular
user.
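
To show what 'cutting across' means, here is a small Python sketch over
made-up records. The field names, the sample data and the helper functions
are all assumptions for illustration. The same flat records answer both the
per-user question and the per-font one, which a purely hierarchical layout
keyed by application does not do easily.

```python
records = [
    {"app": "editor", "user": "alice", "kind": "font", "value": "Courier"},
    {"app": "editor", "user": "bob", "kind": "font", "value": "Arial"},
    {"app": "mailer", "user": "alice", "kind": "font", "value": "Arial"},
    {"app": "mailer", "user": "alice", "kind": "signature", "value": "-- A."},
]


def select(rows, **criteria):
    """Filtering: return the rows whose fields match every criterion."""
    return [r for r in rows if all(r[k] == v for k, v in criteria.items())]


def replace_font(rows, old, new):
    """What a font uninstaller might do: fix up every application at once."""
    changed = 0
    for row in select(rows, kind="font", value=old):
        row["value"] = new
        changed += 1
    return changed
```

Here select(records, user="alice") cuts across applications by user, and
replace_font(records, "Arial", "Helvetica") cuts across them by font.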

> So I don't dispute there may be a better paradigm for this information,
> but I haven't seen it yet.

Well, I'm not a big fan of SQL or typical relational databases. But I'm
intending to build a database engine for AdaOS, called Carrot, which
provides the essential facilities: multiple fields, and field selection; a
good coverage of data types (including BLOB), and some basic operations on
them; indexing (on most of the different types); joining; filtering. I might
base it upon the XML Database model.
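
As a hedged illustration of three of those facilities (indexing, joining
and field selection), here is a minimal Python sketch. The function names
are invented and it implies nothing about Carrot's eventual interface or
storage model.

```python
from collections import defaultdict


def build_index(rows, field):
    """Indexing: map each value of one field to the rows that carry it."""
    index = defaultdict(list)
    for row in rows:
        index[row[field]].append(row)
    return index


def join(left, right_index, field):
    """Joining: pair each left row with the indexed rows sharing its key."""
    return [
        {**l, **r}
        for l in left
        for r in right_index.get(l[field], [])
    ]


def project(rows, *fields):
    """Field selection: keep only the named fields of each row."""
    return [{f: row[f] for f in fields} for row in rows]
```

Rows with no match in the index simply drop out of the join, as in an
ordinary inner join.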

> Happy New Year!

Thank you, and you too.

-- 
Nick Roberts