From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham autolearn_force=no version=3.4.4 X-Google-Thread: 103376,5dc0152fc17e5f2c X-Google-Attributes: gid103376,public X-Google-Language: ENGLISH,ASCII-7-bit Path: g2news1.google.com!news2.google.com!fu-berlin.de!uni-berlin.de!not-for-mail From: "Nick Roberts" Newsgroups: comp.lang.ada Subject: Re: Deadlock resolution Date: Wed, 28 Jul 2004 14:53:02 +0100 Message-ID: References: Mime-Version: 1.0 Content-Type: text/plain; format=flowed; delsp=yes; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Trace: news.uni-berlin.de ltJILOU43EhnZ0BL06TjHgmvWCzRNrqBNR3XZwef+02VbHe5U= User-Agent: Opera M2/7.51 (Win32, build 3798) Xref: g2news1.google.com comp.lang.ada:2440 Date: 2004-07-28T14:53:02+01:00 List-Id: On 28 Jul 2004 11:34:42 +0200, Mark Lorenzen wrote: >> > Eliminate deadlocks in the first place is the best >> > solution. Everything else is just a hack trying to fix a >> > bad design. >> >> That is quite true, but deadlock handling is a bit like >> exception handling, in that it can form the braces of a belt- >> and-braces approach. Is deadlock elimination realistic for >> all kinds of software? > > Good question. Is it actually possible to detect a deadlock in > all situations? I'm sure the answer is 'yes' in theory, but 'no' in practice. >> >> I would appreciate brief descriptions of how deadlock >> >> detection and/or resolution is performed in real Ada >> >> programs (where it is performed in Ada). >> > >> > In some systems, a watchdog is monitoring the CPU to detect >> > if it halts and then resets the system. But again, avoiding >> > deadlocks is better than resetting the system. >> >> Is it necessary for the watchdog to reset the whole system, >> or just to kill one of the tasks/threads participating in the >> cycle of deadlock? > > I was thinking of small embedded systems and there it is often > feasible to just reset the complete board. But in large systems, > it can be necessary to handle such incidents more gracefully. > ... My current design for the AdaOS kernel is as follows. The kernel counts the time that each thread (task) is blocked waiting for another thread (in the same workstation). If the count reaches a certain threshold (possibly about 10 minutes) for a thread, the kernel performs a traceback on the chain of threads it is waiting on; this chain breaks if any thread in it is waiting on I/O or a timer. If the chain doesn't break before the original thread is reached again, deadlock is detected. The current 'resolution' strategy is to select the youngest (most recently created) thread in the chain, and kill it (it gets a special Deadlock exception). There are at least two weaknesses with this approach: (a) it takes at least 10 minutes (or whatever) to detect a deadlock; (b) any chain of deadlock that goes outside the workstation at all will not be detected by this mechanism. I think (a) is unlikely to be a serious problem, in practice. However, since AdaOS will be a fully distributed OS, (b) is very likely to happen, in practice. I believe that the kernel canot be expected to manage deadlocks in these cases, because it would be impractical to implement a mechanism that could be guaranteed immune to false intervention (to act only in cases of a genuine deadlock). I therefore believe that super-kernel software which might be in communication with software executing on another workstation (and in AdaOS, that means most software) must be programmed to either: (1) eliminate (within reason) the possibility of deadlock; (2) detect and resolve potential deadlocks. I think (1) will be impractical in the majority of AdaOS programs. When considering the likely AdaOS (actually CORBA) scenario of a menagerie of programs interacting in potentially deadlocking ways, I think considerations of deadlock management gain in importance. So, I'm pondering about the subject as it impacts the design of the AdaOS kernel (which is called 'Bachar'). I'd appreciate any comments on this design. -- Ta, Nick Roberts