From: "Frédéric PRACA" <frederic.praca@free.fr>
Subject: Re: avoiding builtin memset
Date: Wed, 24 May 2017 08:08:54 -0700 (PDT)
Message-ID: <fcf8a87c-fc29-4a8d-ba56-953981fba8d2@googlegroups.com>
In-Reply-To: <c298b2db-ecfe-4597-8eec-7b69650dcc85@googlegroups.com>
On Monday, 24 April 2017 at 18:06:11 UTC+2, Jere wrote:
> GNAT GPL 2016
> Windows 10
> Cross compiled to arm cortex m0+
> Full Optimization
>
> With a small runtime that I am modifying, I am not linking in the standard C
> libraries. This means that whenever I initialize an array, GCC tries to link
> in a call to a nonexistent memset. I started working on an Ada version of
> memset, which I export. The problem comes when memset tries to
> recursively call itself. The compiler is too smart for me.
>
> At the end of my memset I have a simple loop:
>    while Current_Address < End_Address loop
>       Convert_8.To_Pointer (Current_Address).all := Uint8_Value;
>       Current_Address := Current_Address + 1;
>    end loop;
>
> It serves two purposes:
> 1. It finishes up any leftover bytes on an unaligned array
> 2. If the data set is small enough (<= 16), the function skips immediately
> to this loop and just does a byte copy on the whole thing rather
> than try and do all the extra logic for an aligned copy
>
> However GNAT tries to convert that to another memset call, which doesn't
> work well, since I am in memset.
>
> My immediate workaround is to make the loop more complex:
>    Count := Count * 2; -- To avoid recursive memset call
>    while Count > 0 loop
>       Convert_8.To_Pointer (Current_Address).all := Uint8_Value;
>       Current_Address := Current_Address + 1;
>       Count := Count - 2;
>    end loop;
>
> But I don't really like this as it adds unnecessary overhead.
>
> I looked around for gcc switches to inhibit calls to memset, but the only
> one I found ( -fno-builtin ) only works on C files. I really don't want to
> write it in C (or assembly for that matter).
>
> I think I can also do ASM statements, but I would hate to have to do ASM if
> there is an Ada solution. I also don't know if GCC would just optimize
> the ASM to a memset call anyways.
>
> Do I have any other options that would let me do it in Ada? I didn't see
> pragmas that jumped out at me.
In the past, I did the same thing for an x86 toy OS called Lovelace OS.
The spec:
-- Lovelace Operating System - An Unix Like Ada'Based Operating system
-- Copyright (C) 2013-2014 Xavier GRAVE, Frederic BOYER
-- This program is free software: you can redistribute it and/or modify
-- it under the terms of the GNU General Public License as published by
-- the Free Software Foundation, either version 3 of the License, or
-- (at your option) any later version.
-- This program is distributed in the hope that it will be useful,
-- but WITHOUT ANY WARRANTY; without even the implied warranty of
-- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-- GNU General Public License for more details.
-- You should have received a copy of the GNU General Public License
-- along with this program. If not, see <http://www.gnu.org/licenses/>.
pragma Suppress (All_Checks);

with System;
with System.Storage_Elements;

procedure Oasys.Memset (Destination : in System.Address;
                        Value       : in System.Storage_Elements.Storage_Element;
                        Count       : in System.Storage_Elements.Storage_Count);
pragma Pure (Memset);
pragma Export (C, Memset, "memset");
Then the body, for the x86-32 part:
with System.Storage_Elements; use System.Storage_Elements;
with Oasys.Debug;

procedure Oasys.Memset (Destination : in System.Address;
                        Value       : in Storage_Element;
                        Count       : in Storage_Count) is
   pragma Suppress (All_Checks);

   -- Byte view of the whole destination zone
   Byte_Zone_Destination : Storage_Array (1 .. Count);
   for Byte_Zone_Destination'Address use Destination;

   Offset                   : Storage_Offset := 0;
   Number_Of_Bytes_To_Write : Storage_Count  := Count;

   -- CPU width (i.e. 32 or 64 bits) in bytes
   Full_Width : constant := System.Word_Size / System.Storage_Unit;
begin
   -- see http://www.noxeos.com/2013/08/06/code-optimisations/
   -- Algo
   --   offset = 0
   --   while destination + offset is not aligned,
   --     set the byte to value and add 1 to offset
   --     decrease count of elements to set
   -- The count test stops the loop when fewer bytes were requested than
   -- are needed to reach the next aligned address.
   while Number_Of_Bytes_To_Write > 0
     and then (Destination + Offset) mod Full_Width /= 0
   loop
      -- Warning! We use Offset + 1 for indexing the array
      -- because Offset starts at 0 and the array at 1
      Byte_Zone_Destination (Offset + 1) := Value;
      Offset := Offset + 1;
      Number_Of_Bytes_To_Write := Number_Of_Bytes_To_Write - 1;
   end loop;

   declare
      -- Number of word zones
      Number_Of_Zones : constant Natural :=
        Natural (Number_Of_Bytes_To_Write) / Full_Width;

      type Word is mod 2**System.Word_Size;

      -- Word-long zones
      type Long_Word_Zone_Array is array (1 .. Number_Of_Zones) of Word;
      Zone_Destination : Long_Word_Zone_Array;
      for Zone_Destination'Address use (Destination + Offset);

      Remaining_Bytes : constant Natural :=
        Natural (Number_Of_Bytes_To_Write) rem Full_Width;
      Long_Value      : Word    := 0;
      Power           : Natural := 0;
   begin
      -- Creating the long value: Value copied into every byte of a word
      while Power < System.Word_Size loop
         Oasys.Debug.Put_String ("Long value " & Word'Image (Long_Value));
         Oasys.Debug.New_Line;
         Oasys.Debug.Put_String ("Power = " & Natural'Image (Power));
         Oasys.Debug.New_Line;
         Oasys.Debug.Put_String
           ("Value shifted = " & Word'Image (Word (Value) * 2**Power));
         Oasys.Debug.New_Line;
         Long_Value := Long_Value + Word (Value) * 2**Power;
         Power := Power + System.Storage_Unit;
      end loop;
      Oasys.Debug.Put_String ("Long value " & Word'Image (Long_Value));
      Oasys.Debug.New_Line;

      -- As we are aligned,
      -- find how many aligned words we have and
      -- build an array from (destination + last_offset) to last aligned count.
      -- This way, we use the full width of the CPU
      for Index in Zone_Destination'Range loop
         Zone_Destination (Index) := Long_Value;
         Offset := Offset + Full_Width;
      end loop;

      -- If there are still a few trailing bytes to set
      if Remaining_Bytes /= 0 then
         -- For the same reason as above, we add one to Offset.
         -- (Starting at Offset - 1 would index before the array
         -- when Offset is 0.)
         for Index in Offset + 1 .. Byte_Zone_Destination'Last loop
            Byte_Zone_Destination (Index) := Value;
         end loop;
      end if;
   end;
end Oasys.Memset;
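The two bits of arithmetic in that body (how many single bytes to write before the address is word-aligned, and how the byte is spread over a whole word) can be tried on a host in isolation. The following is only a sketch under assumptions: Memset_Steps_Demo, Head_Bytes and Replicate are hypothetical names that are not part of Lovelace OS, the width is fixed at 4 bytes rather than derived from System.Word_Size, and Replicate uses a single multiplication by 16#0101_0101# in place of the shift-and-add loop in the body (for a 32-bit word both build the same value).

```ada
with Ada.Text_IO;
with Interfaces;              use Interfaces;
with System.Storage_Elements; use System.Storage_Elements;

procedure Memset_Steps_Demo is
   --  Assumed word width in bytes for this demo (a 32-bit target).
   Width : constant := 4;

   --  Bytes to write one at a time before Addr reaches a word boundary.
   function Head_Bytes (Addr : System.Address) return Storage_Offset is
     ((Width - Addr mod Width) mod Width);

   --  Copy Value into each of the four byte lanes of a 32-bit word.
   function Replicate (Value : Unsigned_8) return Unsigned_32 is
     (Unsigned_32 (Value) * 16#0101_0101#);
begin
   --  The addresses are only used for arithmetic, never dereferenced.
   for Raw in Integer_Address range 16#1000# .. 16#1004# loop
      Ada.Text_IO.Put_Line
        ("head bytes:" & Storage_Offset'Image (Head_Bytes (To_Address (Raw))));
   end loop;

   --  16#AB# spread over four lanes should give 16#ABAB_ABAB#.
   if Replicate (16#AB#) = 16#ABAB_ABAB# then
      Ada.Text_IO.Put_Line ("replication ok");
   else
      Ada.Text_IO.Put_Line ("replication mismatch");
   end if;
end Memset_Steps_Demo;
```

For the five addresses 16#1000# to 16#1004# the head counts come out as 0, 3, 2, 1, 0. A memset has to cap that count at the number of bytes actually requested, which is what the count test in the alignment loop is for.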
For what it's worth ;)