From: Patrick Noffke
Newsgroups: comp.lang.ada
Subject: Re: Ravenscar and context switching for Cortex-M4
Date: Wed, 24 Jun 2015 08:20:58 -0700 (PDT)

On Thursday, February 19, 2015 at 4:45:00 PM UTC-6, Patrick Noffke wrote:
> On Thursday, February 19, 2015 at 4:13:51 PM UTC-6, Patrick Noffke wrote:
> > On Thursday, February 19, 2015 at 2:14:45 PM UTC-6, Patrick Noffke wrote:
> > > On Monday, February 16, 2015 at 3:28:08 PM UTC-6, Simon Wright wrote:
> > > > Patrick Noffke writes:
> > > >
> > > > > Here's what happens now (the order of the interrupts may change
> > > > > between runs, but this is for one capture):
> > > > >
> > > > > 1. UART interrupt triggers. 2. PO1's entry executes.
> > > >
> > > > because the entry body is executed in interrupt context. See
> > > > below.
> > > >
> > > > > 3. SPI interrupt triggers twice (see below). 4. PO2's entry
> > > > > executes. 5. T1 (UART task) executes. This is the first thing
> > > > > wrong. T2 is higher priority than T1 so T2 should run first.
> > > > > 6. T2 (SPI task) executes twice. Upon the second execution, I
> > > > > get a program error because Object.Entry_Queue is null. The
> > > > > exception is
> > > >
> > > > Entry_Queue is *not* null, as you said in the next post.
> > > >
> > > > > raised in s-tposen-raven.adb (line 167 in my copy) in
> > > > > Protected_Single_Entry_Call.
> > > > >
> > > > > This may be relevant -- the SPI interrupt triggers twice. This
> > > > > is because the interrupt is for a DMA completion, and it fires
> > > > > both when TX and RX complete (since it's SPI, they complete at
> > > > > the same time). I take care in my interrupt handler to release
> > > > > the entry from only one of the two interrupts. Perhaps with the
> > > > > interrupt firing twice, the runtime may get confused and
> > > > > activate the task twice (even though the entry only executes
> > > > > once).
> > > > > But for the above run, the entry was released during the
> > > > > second SPI interrupt.
> > > >
> > > > The RTS does this (I hope I have it right):
> > > >
> > > > The entry call (Protected_Single_Entry_Call):
> > > >
> > > >    locks the entry
> > > >    if the barrier is open then
> > > >       asserts that Call_In_Progress isn't set
> > > >       sets Call_In_Progress
> > > >       calls the entry body wrapper
> > > >       clears Call_In_Progress
> > > >       unlocks the entry
> > > >    else
> > > >       if the Entry_Queue isn't null then
> > > >          unlocks the entry
> > > >          raises PE
> > > >       end if
> > > >       sets the Entry_Queue
> > > >       unlocks the entry
> > > >       sleeps
> > > >    end if
> > > >
> > > > The handler wrapper:
> > > >
> > > >    locks the entry
> > > >    calls another wrapper for the handler itself
> > > >    calls Service_Entry
> > > >    exits
> > > >
> > > > Service_Entry:
> > > >    if the Entry_Queue is set and the barrier is open then
> > > >       clears the Entry_Queue
> > > >       asserts that Call_In_Progress isn't set
> > > >       sets Call_In_Progress
> > > >       calls the entry body wrapper
> > > >       clears Call_In_Progress
> > > >       saves the caller task_id
> > > >       unlocks the entry
> > > >       wakes the caller
> > > >    else
> > > >       unlocks the entry
> > > >    end if
> > > >
> > > > I really don't see how the sequence you describe happens!
> > > >
> > > > One thing that puzzles me is the locking/unlocking of the entry: this is
> > > > done (in that RTS) by raising the caller task's priority to the ceiling
> > > > priority of the task, if necessary. So what about interrupts? And when
> > > > the handler wrapper (you can see this by compiling the package with the
> > > > PO in with -gnatdg) locks the entry, it seems to raise the current
> > > > task's priority, where the current task has nothing to do with the PO at
> > > > all!
> > > >
> > >
> > > Thank you for all this! It helps a lot. I didn't know about -gnatdg -- very useful.
> > >
> > > I suspect the problem may stem from the fact that Leave_Kernel in
> > > s-bbprot.adb can insert a suspended task into the thread queue. I am
> > > guessing somehow this is tripping up the runtime when multiple tasks
> > > become runnable at the same time.
> > >
> > > I stepped through the debugger at startup, and I can see the suspended
> > > task going into the queue. Then another task is woken up and put at the
> > > front of the queue, so the first task is Runnable and the Next task is
> > > Suspended. Then when Leave_Kernel resumes running (it enables interrupts
> > > after inserting the suspended task) it may not call Extract (since the
> > > running thread state is Runnable) -- it never considers that the
> > > Suspended task it inserted might be later in the queue.
> > >
> > > I haven't yet been able to directly correlate the Leave_Kernel behavior
> > > with the task incorrectly waking up twice. I'll keep trying things, but
> > > I wanted to share what I found so far.
> > >
> > > What I have confirmed is that I can get the system into a state where
> > > there are two Runnable tasks in the thread queue, and the "Next" field
> > > of the last one points to the first one. That is:
> > >
> > >    First_Thread_Table (CPU_Id) = First_Thread_Table (CPU_Id).Next.Next
> > >
> > > Right before this happens is when the two tasks are woken up at the
> > > same time.
> > >
> > > It appears the task that runs twice (when I get the Program_Error I
> > > reported earlier) is the one that gets put in the queue in the
> > > suspended state at startup (an empirical data point after changing
> > > priorities of my tasks).
> > >
> >
> > I think I know what's going on. There is a problem with the Insert
> > procedure in s-bbthqu.adb. If (1) a thread is already in the queue,
> > (2) it is not at the head of the queue, and (3) the priority is greater
> > than the thread at the head of the queue, it can get inserted twice.
> > The error is in the "elsif" condition of this code:
> >
> >       if First_Thread_Table (CPU_Id) = Thread then
> >          null;
> >
> >       --  Insert at the head of queue if there is no other thread with a higher
> >       --  priority.
> >
> >       elsif First_Thread_Table (CPU_Id) = Null_Thread_Id
> >         or else
> >           Thread.Active_Priority > First_Thread_Table (CPU_Id).Active_Priority
> >       then
> >          Thread.Next := First_Thread_Table (CPU_Id);
> >          First_Thread_Table (CPU_Id) := Thread;
> >
> >       --  Middle or tail insertion
> >
> >       else
> >          --  Look for the Aux_Pointer to insert the thread just after it
> >          ...
> >
> > Here is what's happening in my case:
> >
> > (1) UART is at the head of the queue in the suspended state with
> >     priority 10. Leave_Kernel code is pending after calling
> >     Enable_Interrupts.
> > (2) SPI interrupt fires.
> > (3) SPI task priority is 252 (interrupt priority).
> > (4) SPI task gets inserted at head of queue in the Runnable state.
> > (5) SPI task priority gets adjusted to 15 when Unlock_Entry is called.
> >     It is left at the head of the queue. SPI_task.Next = UART_task.
> > (6) UART interrupt fires.
> > (7) UART task priority is 250, and task is set to Runnable.
> > (8) UART task gets inserted at head of queue since its priority is
> >     greater than SPI_task. Now UART_task.Next = SPI_task, and
> >     SPI_task.Next = UART_task.
> > (9) UART task priority is adjusted to 10 when Unlock_Entry is called.
> >     Now SPI_task is head of queue, and SPI_task.Next = UART_task.
> >     UART_task.Next = UART_task.
> > (10) SPI task runs.
> > (11) UART task runs twice. Boom.
> >
> > I think the Insert procedure needs to be modified to look through the
> > entire queue for the thread to be inserted. Might be simplest to just
> > remove it if it exists before entering the if statement.
> >
>
> This version of the entire Insert procedure works for me:
>
>    ------------
>    -- Insert --
>    ------------
>
>    procedure Insert (Thread : Thread_Id) is
>       Aux_Pointer : Thread_Id;
>       CPU_Id      : constant CPU := Get_CPU (Thread);
>
>    begin
>
>       --  ??? This pragma is disabled because the Tasks_Activated only
>       --  represents the end of activation for one package not all the
>       --  packages. We have to find a better milestone for the end of
>       --  tasks activation.
>
>       --  --  A CPU can only insert alarm in its own queue, except during
>       --  --  initialization.
>
>       --  pragma Assert (CPU_Id = Current_CPU or else not Tasks_Activated);
>
>       --  It may be the case that we try to insert a task that is already in
>       --  the queue. This can only happen if the task was not runnable and its
>       --  context was being used for handling an interrupt. Hence, if the task
>       --  is already in the queue and we try to insert it, we need to check
>       --  whether it is in the correct place.
>
>       --  No insertion if the task is already at the head of the queue
>
>       if First_Thread_Table (CPU_Id) = Thread then
>          null;
>
>       --  Insert at the head of queue if there is no other thread
>       --  with a higher priority.
>
>       elsif First_Thread_Table (CPU_Id) = Null_Thread_Id then
>          Thread.Next := First_Thread_Table (CPU_Id);
>          First_Thread_Table (CPU_Id) := Thread;
>
>       else
>          --  Middle or tail insertion
>
>          --  Remove the thread if it is already in the queue. We know
>          --  the first thread is not null.
>          Aux_Pointer := First_Thread_Table (CPU_Id);
>          while Aux_Pointer.Next /= Null_Thread_Id
>            and then Aux_Pointer.Next /= Thread
>          loop
>             Aux_Pointer := Aux_Pointer.Next;
>          end loop;
>
>          if Aux_Pointer.Next = Thread then
>             Aux_Pointer.Next := Thread.Next;
>          end if;
>
>          if
>            Thread.Active_Priority > First_Thread_Table (CPU_Id).Active_Priority
>          then
>             Thread.Next := First_Thread_Table (CPU_Id);
>             First_Thread_Table (CPU_Id) := Thread;
>          else
>             --  Look for the Aux_Pointer to insert the thread just after it
>
>             Aux_Pointer := First_Thread_Table (CPU_Id);
>             while Aux_Pointer.Next /= Null_Thread_Id
>               and then Aux_Pointer.Next.Active_Priority >=
>                          Thread.Active_Priority
>             loop
>                Aux_Pointer := Aux_Pointer.Next;
>             end loop;
>
>             --  Insert the thread after the Aux_Pointer
>
>             Thread.Next := Aux_Pointer.Next;
>             Aux_Pointer.Next := Thread;
>          end if;
>
>       end if;
>    end Insert;

FYI - This problem is not fixed in GNAT GPL 2015. I reported it to AdaCore back in February.
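
For anyone who wants to see the double insertion without a board, the scenario in steps (1) to (8) can be sketched as a small host-side program. This is a toy model under stated assumptions, not the runtime source: Thread_Descriptor, First (standing in for First_Thread_Table (CPU_Id)), the Unlink_First flag and Run are names invented for illustration, and the priority changes are assigned directly instead of going through the runtime's own priority-change code.

```ada
--  Toy model of the ready queue; NOT the s-bbthqu.adb source.
--  All names except Insert, Active_Priority and Next are hypothetical.
with Ada.Text_IO; use Ada.Text_IO;

procedure Queue_Demo is

   type Thread_Descriptor;
   type Thread_Id is access Thread_Descriptor;

   type Thread_Descriptor is record
      Active_Priority : Natural   := 0;
      Next            : Thread_Id := null;
   end record;

   --  Stands in for First_Thread_Table (CPU_Id) on a single CPU
   First : Thread_Id := null;

   --  Unlink_First = False models the head test quoted above;
   --  Unlink_First = True models the posted fix, which removes the
   --  thread from the queue before reinserting it.
   procedure Insert (Thread : Thread_Id; Unlink_First : Boolean) is
      Aux : Thread_Id;
   begin
      if First = Thread then
         return;
      end if;

      if Unlink_First and then First /= null then
         Aux := First;
         while Aux.Next /= null and then Aux.Next /= Thread loop
            Aux := Aux.Next;
         end loop;
         if Aux.Next = Thread then
            Aux.Next := Thread.Next;
         end if;
      end if;

      if First = null
        or else Thread.Active_Priority > First.Active_Priority
      then
         --  Head insertion
         Thread.Next := First;
         First := Thread;
      else
         --  Middle or tail insertion
         Aux := First;
         while Aux.Next /= null
           and then Aux.Next.Active_Priority >= Thread.Active_Priority
         loop
            Aux := Aux.Next;
         end loop;
         Thread.Next := Aux.Next;
         Aux.Next := Thread;
      end if;
   end Insert;

   --  Replays steps (1) to (8) and reports whether the queue ends up
   --  with First = First.Next.Next, the cyclic state seen on target.
   function Run (Unlink_First : Boolean) return Boolean is
      UART : constant Thread_Id := new Thread_Descriptor'(10, null);
      SPI  : constant Thread_Id := new Thread_Descriptor'(252, null);
   begin
      First := null;
      Insert (UART, Unlink_First);   --  (1) UART queued at priority 10
      Insert (SPI, Unlink_First);    --  (4) SPI inserted at the head
      SPI.Active_Priority := 15;     --  (5) SPI back to its base priority
      UART.Active_Priority := 250;   --  (7) UART at interrupt priority
      Insert (UART, Unlink_First);   --  (8) UART inserted a second time
      return First.Next.Next = First;
   end Run;

begin
   Put_Line ("head test only, cycle: " & Boolean'Image (Run (False)));
   Put_Line ("unlink first,   cycle: " & Boolean'Image (Run (True)));
end Queue_Demo;
```

In this model the first call leaves UART.Next = SPI and SPI.Next = UART, so it reports TRUE; with the unlink step SPI.Next is reset to null before the head insertion and it reports FALSE. Step (9) onward is left out because it depends on runtime code not quoted here.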