From mboxrd@z Thu Jan  1 00:00:00 1970
X-Received: by 10.129.117.70 with SMTP id q67mr868926ywc.57.1435159259031;
        Wed, 24 Jun 2015 08:20:59 -0700 (PDT)
X-Received: by 10.182.234.108 with SMTP id ud12mr173023obc.4.1435159258967;
 Wed, 24 Jun 2015 08:20:58 -0700 (PDT)
Path: 
 eternal-september.org!reader01.eternal-september.org!reader02.eternal-september.org!news.eternal-september.org!mx02.eternal-september.org!feeder.eternal-september.org!news.glorb.com!z60no3074795qgd.1!news-out.google.com!a16ni3487ign.0!nntp.google.com!h15no7974388igd.0!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail
Newsgroups: comp.lang.ada
Date: Wed, 24 Jun 2015 08:20:58 -0700 (PDT)
In-Reply-To: <e1ebe3e5-e200-4f9a-90fb-b15a626288b8@googlegroups.com>
Complaints-To: groups-abuse@google.com
Injection-Info: glegroupsg2000goo.googlegroups.com;
 posting-host=74.203.194.21;
 posting-account=bXcJoAoAAAAWI5APBG37o4XwnD4kTuQQ
NNTP-Posting-Host: 74.203.194.21
References: <ba36d86a-cfb6-475f-9cec-12f1cbb39087@googlegroups.com>
 <27492d6c-3bf8-4eb9-8ebb-4d9f621235eb@googlegroups.com>
 <ly8ufxczrt.fsf@pushface.org>
 <247a5033-337c-4bf9-8b37-c82759d8a2dd@googlegroups.com>
 <6befc07f-d4c4-42a2-8e36-265093ae9546@googlegroups.com>
 <e1ebe3e5-e200-4f9a-90fb-b15a626288b8@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <caff2eb8-71a9-4dc4-b0b0-dd7de4df9e11@googlegroups.com>
Subject: Re: Ravenscar and context switching for Cortex-M4
From: Patrick Noffke <patrick.noffke@gmail.com>
Injection-Date: Wed, 24 Jun 2015 15:20:59 +0000
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Xref: news.eternal-september.org comp.lang.ada:26455
List-Id: <comp.lang.ada>

On Thursday, February 19, 2015 at 4:45:00 PM UTC-6, Patrick Noffke wrote:
> On Thursday, February 19, 2015 at 4:13:51 PM UTC-6, Patrick Noffke wrote:
> > On Thursday, February 19, 2015 at 2:14:45 PM UTC-6, Patrick Noffke wrote:
> > > On Monday, February 16, 2015 at 3:28:08 PM UTC-6, Simon Wright wrote:
> > > > Patrick Noffke writes:
> > > >
> > > > > Here's what happens now (the order of the interrupts may change
> > > > > between runs, but this is for one capture):
> > > > >
> > > > > 1. UART interrupt triggers.  2. PO1's entry executes.
> > > >
> > > > because the entry body is executed in interrupt context. See
> > > > below.
> > > >
> > > > > 3. SPI interrupt triggers twice (see below).  4. PO2's entry
> > > > > executes.  5. T1 (UART task) executes.  This is the first thing
> > > > > wrong.  T2 is higher priority than T1 so T2 should run first.
> > > > > 6. T2 (SPI task) executes twice.  Upon the second execution, I
> > > > > get a program error because Object.Entry_Queue is null.  The
> > > > > exception is
> > > >
> > > > Entry_Queue is *not* null, as you said in the next post.
> > > >
> > > > > raised in s-tposen-raven.adb (line 167 in my copy) in
> > > > > Protected_Single_Entry_Call.
> > > > >
> > > > > This may be relevant -- the SPI interrupt triggers twice.  This
> > > > > is because the interrupt is for a DMA completion, and it fires
> > > > > both when TX and RX complete (since it's SPI, they complete at
> > > > > the same time).  I take care in my interrupt handler to release
> > > > > the entry from only one of the two interrupts.  Perhaps with the
> > > > > interrupt firing twice, the runtime may get confused and
> > > > > activate the task twice (even though the entry only executes
> > > > > once).  But for the above run, the entry was released during the
> > > > > second SPI interrupt.
> > > >
> > > > The RTS does this (I hope I have it right):
> > > >
> > > >    The entry call (Protected_Single_Entry_Call):
> > > >=20
> > > >      locks the entry
> > > >      if the barrier is open then
> > > >        asserts that Call_In_Progress isn't set
> > > >        sets Call_In_Progress
> > > >        calls the entry body wrapper
> > > >        clears Call_In_Progress
> > > >        unlocks the entry
> > > >      else
> > > >        if the Entry_Queue isn't null then
> > > >          unlocks the entry
> > > >          raises PE
> > > >        end if
> > > >        sets the Entry_Queue
> > > >        unlocks the entry
> > > >        sleeps
> > > >      end if
> > > >=20
> > > >    The handler wrapper:
> > > >=20
> > > >      locks the entry
> > > >      calls another wrapper for the handler itself
> > > >      calls Service_Entry
> > > >      exits
> > > >=20
> > > >    Service_Entry:
> > > >      if the Entry_Queue is set and the barrier is open then
> > > >        clears the Entry_Queue
> > > >        asserts that Call_In_Progress isn't set
> > > >        sets Call_In_Progress
> > > >        calls the entry body wrapper
> > > >        clears Call_In_Progress
> > > >        saves the caller task_id
> > > >        unlocks the entry
> > > >        wakes the caller
> > > >      else
> > > >        unlocks the entry
> > > >      end if
> > > >=20
> > > > I really don't see how the sequnece you describe happens!
> > > >=20
> > > > One thing that puzzles me is the locking/unlocking of the entry: th=
is is
> > > > done (in that RTS) by raising the caller task's priority to the cei=
ling
> > > > priority of the task, if necessary. So what about interrupts? And w=
hen
> > > > the handler wrapper (you can see this by compiling the package with=
 the
> > > > PO in with -gnatdg) locks the entry, it seems to raise the current
> > > > task's priority, where the current task has nothing to do with the =
PO at
> > > > all!
> > > >=20
> > >=20
> > > Thank you for all this!  It helps a lot.  I didn't know about -gnatdg -- very useful.
> > >
> > > I suspect the problem may stem from the fact that Leave_Kernel in s-bbprot.adb can insert a suspended task into the thread queue.  I am guessing somehow this is tripping up the runtime when multiple tasks become runnable at the same time.
> > >
> > > I stepped through the debugger at startup, and I can see the suspended task going into the queue.  Then another task is woken up and put at the front of the queue, so the first task is Runnable and the Next task is Suspended.  Then when Leave_Kernel resumes running (it enables interrupts after inserting the suspended task) it may not call Extract (since the running thread state is Runnable) -- it never considers that the Suspended task it inserted might be later in the queue.
> > >
> > > I haven't yet been able to directly correlate the Leave_Kernel behavior with the task incorrectly waking up twice.  I'll keep trying things, but I wanted to share what I found so far.
> > >
> > > What I have confirmed is that I can get the system into a state where there are two Runnable tasks in the thread queue, and the "Next" field of the last one points to the first one.  That is:
> > >
> > > First_Thread_Table (CPU_Id) = First_Thread_Table (CPU_Id).Next.Next
> > >
> > > Right before this happens is when the two tasks are woken up at the same time.
> > >
> > > It appears the task that runs twice (when I get the Program_Error I reported earlier) is the one that gets put in the queue in the suspended state at startup (an empirical data point after changing priorities of my tasks).
> > >
> >
> > I think I know what's going on.  There is a problem with the Insert procedure in s-bbthqu.adb.  If (1) a thread is already in the queue, (2) it is not at the head of the queue, and (3) its priority is greater than that of the thread at the head of the queue, it can get inserted twice.  The error is in the "elsif" condition of this code:
> >
> >       if First_Thread_Table (CPU_Id) = Thread then
> >          null;
> >
> >       --  Insert at the head of queue if there is no other thread with a higher
> >       --  priority.
> >
> >       elsif First_Thread_Table (CPU_Id) = Null_Thread_Id
> >         or else
> >           Thread.Active_Priority > First_Thread_Table (CPU_Id).Active_Priority
> >       then
> >          Thread.Next := First_Thread_Table (CPU_Id);
> >          First_Thread_Table (CPU_Id) := Thread;
> >
> >       --  Middle or tail insertion
> >
> >       else
> >          --  Look for the Aux_Pointer to insert the thread just after it
> >          ...
> >
> > Here is what's happening in my case:
> >
> > (1) UART is at the head of the queue in the suspended state with priority 10.  Leave_Kernel code is pending after calling Enable_Interrupts.
> > (2) SPI interrupt fires.
> > (3) SPI task priority is 252 (interrupt priority).
> > (4) SPI task gets inserted at head of queue in the Runnable state.
> > (5) SPI task priority gets adjusted to 15 when Unlock_Entry is called.  It is left at the head of the queue.  SPI_task.Next = UART_task.
> > (6) UART interrupt fires.
> > (7) UART task priority is 250, and task is set to Runnable.
> > (8) UART task gets inserted at head of queue since its priority is greater than SPI_task.  Now UART_task.Next = SPI_task, and SPI_task.Next = UART_task.
> > (9) UART task priority is adjusted to 10 when Unlock_Entry is called.  Now SPI_task is head of queue, and SPI_task.Next = UART_task.  UART_task.Next = UART_task.
> > (10) SPI task runs.
> > (11) UART task runs twice.  Boom.
> >
> > I think the Insert procedure needs to be modified to look through the entire queue for the thread to be inserted.  Might be simplest to just remove it if it exists before entering the if statement.
> >
>
> This version of the entire Insert procedure works for me:
>
>    ------------
>    -- Insert --
>    ------------
>
>    procedure Insert (Thread : Thread_Id) is
>       Aux_Pointer : Thread_Id;
>       CPU_Id      : constant CPU := Get_CPU (Thread);
>
>    begin
>
>       --  ??? This pragma is disabled because the Tasks_Activated only
>       --  represents the end of activation for one package not all the
>       --  packages. We have to find a better milestone for the end of
>       --  tasks activation.
>
>       --  --  A CPU can only insert alarm in its own queue, except during
>       --  --  initialization.
>
>       --  pragma Assert (CPU_Id = Current_CPU or else not Tasks_Activated);
>
>       --  It may be the case that we try to insert a task that is already in
>       --  the queue. This can only happen if the task was not runnable and its
>       --  context was being used for handling an interrupt. Hence, if the task
>       --  is already in the queue and we try to insert it, we need to check
>       --  whether it is in the correct place.
>
>       --  No insertion if the task is already at the head of the queue
>
>       if First_Thread_Table (CPU_Id) = Thread then
>          null;
>
>          --  Insert at the head if the queue is empty
>
>       elsif First_Thread_Table (CPU_Id) = Null_Thread_Id then
>          Thread.Next := First_Thread_Table (CPU_Id);
>          First_Thread_Table (CPU_Id) := Thread;
>
>       else
>          --  Head, middle or tail insertion into a non-empty queue
>
>          --  Remove the thread if it is already in the queue.  We know
>          --  the first thread is not null.
>          Aux_Pointer := First_Thread_Table (CPU_Id);
>          while Aux_Pointer.Next /= Null_Thread_Id
>            and then Aux_Pointer.Next /= Thread
>          loop
>             Aux_Pointer := Aux_Pointer.Next;
>          end loop;
>
>          if Aux_Pointer.Next = Thread then
>             Aux_Pointer.Next := Thread.Next;
>          end if;
>
>          if
>            Thread.Active_Priority > First_Thread_Table (CPU_Id).Active_Priority
>          then
>             Thread.Next := First_Thread_Table (CPU_Id);
>             First_Thread_Table (CPU_Id) := Thread;
>          else
>             --  Look for the Aux_Pointer to insert the thread just after it
>
>             Aux_Pointer := First_Thread_Table (CPU_Id);
>             while Aux_Pointer.Next /= Null_Thread_Id
>               and then Aux_Pointer.Next.Active_Priority >=
>               Thread.Active_Priority
>             loop
>                Aux_Pointer := Aux_Pointer.Next;
>             end loop;
>
>             --  Insert the thread after the Aux_Pointer
>
>             Thread.Next := Aux_Pointer.Next;
>             Aux_Pointer.Next := Thread;
>          end if;
>
>       end if;
>    end Insert;

FYI - This problem is not fixed in GNAT GPL 2015.  I reported it to AdaCore back in February.
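
For anyone who wants to see the double insertion without a board, here is a minimal host-side sketch of step (8) from the quoted analysis. It is not the runtime's code: Insert_Demo, Thread_Descriptor, First_Thread, Insert_Behind_Head, Insert_Original, Insert_Patched, Reset and Has_Cycle are names made up for illustration, a single CPU is assumed, and the elided "else" branch of the original is assumed to be the same middle/tail insertion as in the patched version.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Insert_Demo is
   --  Hypothetical, simplified stand-ins for the runtime's thread types;
   --  a single CPU is assumed, so there is only one queue head.
   type Thread_Descriptor;
   type Thread_Id is access all Thread_Descriptor;
   type Thread_Descriptor is record
      Active_Priority : Natural   := 0;
      Next            : Thread_Id := null;
   end record;

   Null_Thread_Id : constant Thread_Id := null;
   First_Thread   : Thread_Id := Null_Thread_Id;

   UART_Task, SPI_Task : aliased Thread_Descriptor;
   UART : constant Thread_Id := UART_Task'Access;
   SPI  : constant Thread_Id := SPI_Task'Access;

   --  Middle or tail insertion, following the quoted code
   procedure Insert_Behind_Head (Thread : Thread_Id) is
      Aux_Pointer : Thread_Id := First_Thread;
   begin
      while Aux_Pointer.Next /= Null_Thread_Id
        and then Aux_Pointer.Next.Active_Priority >= Thread.Active_Priority
      loop
         Aux_Pointer := Aux_Pointer.Next;
      end loop;
      Thread.Next      := Aux_Pointer.Next;
      Aux_Pointer.Next := Thread;
   end Insert_Behind_Head;

   --  Original logic: head insertion never unlinks an existing entry
   procedure Insert_Original (Thread : Thread_Id) is
   begin
      if First_Thread = Thread then
         null;
      elsif First_Thread = Null_Thread_Id
        or else Thread.Active_Priority > First_Thread.Active_Priority
      then
         Thread.Next  := First_Thread;
         First_Thread := Thread;
      else
         Insert_Behind_Head (Thread);
      end if;
   end Insert_Original;

   --  Patched logic: unlink the thread first, then insert by priority
   procedure Insert_Patched (Thread : Thread_Id) is
      Aux_Pointer : Thread_Id;
   begin
      if First_Thread = Thread then
         null;
      elsif First_Thread = Null_Thread_Id then
         Thread.Next  := First_Thread;
         First_Thread := Thread;
      else
         Aux_Pointer := First_Thread;
         while Aux_Pointer.Next /= Null_Thread_Id
           and then Aux_Pointer.Next /= Thread
         loop
            Aux_Pointer := Aux_Pointer.Next;
         end loop;
         if Aux_Pointer.Next = Thread then
            Aux_Pointer.Next := Thread.Next;
         end if;
         if Thread.Active_Priority > First_Thread.Active_Priority then
            Thread.Next  := First_Thread;
            First_Thread := Thread;
         else
            Insert_Behind_Head (Thread);
         end if;
      end if;
   end Insert_Patched;

   --  State just before step (8): SPI (priority 15) heads the queue, UART
   --  is linked behind it and has been raised to interrupt priority 250.
   procedure Reset is
   begin
      SPI_Task     := (Active_Priority => 15, Next => UART);
      UART_Task    := (Active_Priority => 250, Next => Null_Thread_Id);
      First_Thread := SPI;
   end Reset;

   --  Floyd's cycle check over the Next links
   function Has_Cycle return Boolean is
      Slow, Fast : Thread_Id := First_Thread;
   begin
      while Fast /= null and then Fast.Next /= null loop
         Slow := Slow.Next;
         Fast := Fast.Next.Next;
         if Slow = Fast then
            return True;
         end if;
      end loop;
      return False;
   end Has_Cycle;

begin
   Reset;
   Insert_Original (UART);
   Put_Line ("original: cycle = " & Boolean'Image (Has_Cycle));
   Reset;
   Insert_Patched (UART);
   Put_Line ("patched:  cycle = " & Boolean'Image (Has_Cycle));
end Insert_Demo;
```

Built with a hosted GNAT (for example gnatmake insert_demo.adb), this should report a cycle for the original logic (UART_task.Next = SPI_task and SPI_task.Next = UART_task, as in step 8) and none for the patched logic, where the queue ends up as UART, then SPI, then null. It only models the queue links; it says nothing about interrupt timing or Leave_Kernel.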