comp.lang.ada
* Array and memory usage
@ 2003-03-09 11:19 Preben Randhol
  2003-03-09 11:52 ` John R. Strohm
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Preben Randhol @ 2003-03-09 11:19 UTC (permalink / raw)


I was a bit puzzled to see that if I do:

   type Str_Array_Type is array (1 .. 1000) of String (1 .. 800);

   type User_Data is
      record
         Name     : Str_Array_Type;
         Hostname : Str_Array_Type;
      end record;

my program uses 800 bytes of memory, while if I only change the program
like this:

   type User_Data is
      record
         Name     : Str_Array_Type := (others => (others => ' '));
         Hostname : Str_Array_Type := (others => (others => ' '));
      end record;

then the program uses 2356 bytes.

I thought all the memory needed for an array would be claimed when the
program started no matter if the array was filled or not.

Is this correct or am I just being tricked by the program that measures
the memory used by my program? 

If it is correct then it doesn't matter if I define my array as (1 ..
100) or (1 .. 10_000) as long as I don't initialize the strings.
I can just add to User_Data a:

   type Used_Type is array (1 .. 1000) of Boolean;

   Used : Used_Type := (others => False);

And then I don't need to add any linked list etc... to my small server.

I use GNAT 3.14p on i386 Linux (Debian).

Thanks in advance

Preben Randhol
-- 
 ()   Join the worldwide campaign to protect fundamental human rights.
'||}
{||'                                           http://www.amnesty.org/




* Re: Array and memory usage
  2003-03-09 11:19 Array and memory usage Preben Randhol
@ 2003-03-09 11:52 ` John R. Strohm
  2003-03-09 12:11   ` Preben Randhol
  2003-03-09 17:04 ` Jeffrey Creem
  2003-03-09 18:10 ` Robert A Duff
  2 siblings, 1 reply; 21+ messages in thread
From: John R. Strohm @ 2003-03-09 11:52 UTC (permalink / raw)


"Preben Randhol" <randhol+news@pvv.org> wrote in message
news:slrnb6m8mg.15o.randhol+news@kiuk0152.chembio.ntnu.no...
> I was a bit puzzled to see that if I do:
>
>    type Str_Array_Type is array (1 .. 1000) of String (1 .. 800);
>
>    type User_Data is
>       record
>          Name     : Str_Array_Type;
>          Hostname : Str_Array_Type;
>       end record;

This declares two types and NO instances of those types.

> my program uses 800 bytes of memory, while if I only change the program
> like this:
>
>    type User_Data is
>       record
>          Name     : Str_Array_Type := (others => (others => ' '));
>          Hostname : Str_Array_Type := (others => (others => ' '));
>       end record;
>
> then the program uses 2356 bytes.

This declares the same two types and TWO (2) implicit instances of
Str_Array_Type, namely, the aggregate constants.  That's where the
difference comes from.
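
One way to picture it (a sketch only; the code GNAT actually emits may
differ) is that the second version behaves as if you had written:

   Blanks : constant Str_Array_Type := (others => (others => ' '));

   type User_Data is
      record
         Name     : Str_Array_Type := Blanks;
         Hostname : Str_Array_Type := Blanks;
      end record;

and every default-initialized User_Data object then gets its own copy
of Blanks when it is created.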







* Re: Array and memory usage
  2003-03-09 11:52 ` John R. Strohm
@ 2003-03-09 12:11   ` Preben Randhol
  2003-03-09 13:34     ` Marin David Condic
  2003-03-09 17:49     ` Robert A Duff
  0 siblings, 2 replies; 21+ messages in thread
From: Preben Randhol @ 2003-03-09 12:11 UTC (permalink / raw)


John R. Strohm wrote:
> "Preben Randhol" <randhol+news@pvv.org> wrote in message
> news:slrnb6m8mg.15o.randhol+news@kiuk0152.chembio.ntnu.no...
>> I was a bit puzzled to see that if I do:
>>
>>    type Str_Array_Type is array (1 .. 1000) of String (1 .. 800);
>>
>>    type User_Data is
>>       record
>>          Name     : Str_Array_Type;
>>          Hostname : Str_Array_Type;
>>       end record;
> 
> This declares two types and NO instances of those types.
> 
>> my program uses 800 bytes of memory, while if I only change the program
>> like this:
>>
>>    type User_Data is
>>       record
>>          Name     : Str_Array_Type := (others => (others => ' '));
>>          Hostname : Str_Array_Type := (others => (others => ' '));
>>       end record;
>>
>> then the program uses 2356 bytes.
> 
> This declares the same two types and TWO (2) implicit instances of
> Str_Array_Type, namely, the aggregate constants.  That's where the
> difference comes from.
> 

OK, but I also have a:

   User_Db : User_Data;

And if I use the first User_Data type then the memory usage only
increases as I add Names and Hostnames. I mean it doesn't jump from 800
to 2356 when I add one name, but grows slowly until all Names are added.
This puzzles me.

-- 
 ()   Join the worldwide campaign to protect fundamental human rights.
'||}
{||'                                           http://www.amnesty.org/




* Re: Array and memory usage
  2003-03-09 12:11   ` Preben Randhol
@ 2003-03-09 13:34     ` Marin David Condic
  2003-03-09 15:05       ` Preben Randhol
  2003-03-10  1:45       ` Mark Biggar
  2003-03-09 17:49     ` Robert A Duff
  1 sibling, 2 replies; 21+ messages in thread
From: Marin David Condic @ 2003-03-09 13:34 UTC (permalink / raw)


Note that it is *probably* generating code that will initialize the objects
as they are created. It is probably not the most efficient way the
initialization could be done, either. I've seen major problems with embedded
code where we've wanted to initialize global RAM data and couldn't get
static initialization of it using the usual Ada mechanisms. I've also seen
where initialization of the type would generate code to initialize each
field of the record with function calls but initialization of the object
with an aggregate would result in a simple memory move of the whole thing.
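
To make the contrast concrete, here is a hedged sketch (Init is a
hypothetical function; what any given compiler emits for each case will
vary):

   function Init return Integer;        -- some initialization function

   type Rec is
      record
         A : Integer := Init;           -- per-component default
         B : Integer := Init;
      end record;

   R1 : Rec;                            -- may generate a call per field
   R2 : Rec := (A => 0, B => 0);        -- may reduce to a block move or
                                        -- a statically initialized image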

If you care about code size and efficiency, you can find a number of
"surprises" and "gotchas" with Ada.

MDC
--
======================================================================
Marin David Condic
I work for: http://www.belcan.com/
My project is: http://www.jsf.mil/

Send Replies To: m c o n d i c @ a c m . o r g

    "Going cold turkey isn't as delicious as it sounds."
        -- H. Simpson
======================================================================

Preben Randhol <randhol+news@pvv.org> wrote in message
news:slrnb6mbnv.8a1.randhol+news@kiuk0152.chembio.ntnu.no...
>
> OK, but I also have a:
>
>    User_Db : User_Data;
>
> And if I use the first User_Data type then the memory usage only
> increases as I add Names and Hostnames. I mean it doesn't jump from
> 800 to 2356 when I add one name, but grows slowly until all Names are
> added. This puzzles me.







* Re: Array and memory usage
  2003-03-09 13:34     ` Marin David Condic
@ 2003-03-09 15:05       ` Preben Randhol
  2003-03-10  1:45       ` Mark Biggar
  1 sibling, 0 replies; 21+ messages in thread
From: Preben Randhol @ 2003-03-09 15:05 UTC (permalink / raw)


Marin David Condic wrote:
> Note that it is *probably* generating code that will initialize the objects
> as they are created. It is probably not the most efficient way the
> initialization could be done, either. I've seen major problems with embedded
> code where we've wanted to initialize global RAM data and couldn't get
> static initialization of it using the usual Ada mechanisms. I've also seen
> where initialization of the type would generate code to initialize each
> field of the record with function calls but initialization of the object
> with an aggregate would result in a simple memory move of the whole thing.

OK, so this is something the compiler supplies, and it could be
different with another compiler?

> If you care about code size and efficiency, you can find a number of
> "surprises" and "gotchas" with Ada.

What do you mean?

-- 
 ()   Join the worldwide campaign to protect fundamental human rights.
'||}
{||'                                           http://www.amnesty.org/




* Re: Array and memory usage
  2003-03-09 11:19 Array and memory usage Preben Randhol
  2003-03-09 11:52 ` John R. Strohm
@ 2003-03-09 17:04 ` Jeffrey Creem
  2003-03-09 17:09   ` Preben Randhol
  2003-03-09 18:10 ` Robert A Duff
  2 siblings, 1 reply; 21+ messages in thread
From: Jeffrey Creem @ 2003-03-09 17:04 UTC (permalink / raw)



"Preben Randhol" <randhol+news@pvv.org> wrote in message
news:slrnb6m8mg.15o.randhol+news@kiuk0152.chembio.ntnu.no...
> I was a bit puzzled to see that if I do:
>
>    type Str_Array_Type is array (1 .. 1000) of String (1 .. 800);
>
>    type User_Data is
>       record
>          Name     : Str_Array_Type;
>          Hostname : Str_Array_Type;
>       end record;
>
> my program uses 800 bytes of memory, while if I only change the program
> like this:
>
>    type User_Data is
>       record
>          Name     : Str_Array_Type := (others => (others => ' '));
>          Hostname : Str_Array_Type := (others => (others => ' '));
>       end record;
>
> then the program uses 2356 bytes.
>
> I thought all the memory needed for an array would be claimed when the
> program started no matter if the array was filled or not.
>
> Is this correct or am I just being tricked by the program that measures
> the memory used by my program?
>
> If it is correct then it doesn't matter if I define my array as (1 ..
> 100) or (1 .. 10_000) as long as I don't initialize the strings.
> I can just add to User_Data a:
>
>    type Used_Type is array (1 .. 1000) of Boolean;
>
>    Used : Used_Type := (others => False);


I believe what you are seeing is not something that will happen under all
OS's.

Linux uses a commit-on-use type memory manager. Even though you initialized
your data, it was probably done in such a way that the memory was initialized
within the ELF initialized data segment.

Under Linux, even something like

   aa = malloc(A_BIG_NUMBER);

will not show all of the data as actually allocated until the process
accesses each page of memory.
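
An Ada analogue of the same effect (a sketch only -- residency behavior
is a property of the OS, not of the language):

   procedure Touch_Pages is
      type Big_Array is array (1 .. 10_000_000) of Integer;
      type Big_Array_Access is access Big_Array;
      P : constant Big_Array_Access := new Big_Array;
   begin
      --  On a commit-on-use OS, resident memory grows only as each
      --  page is first written.
      for I in Big_Array'Range loop
         P (I) := 0;
      end loop;
   end Touch_Pages;

Watching the process in top while the loop runs should show the resident
size climbing, page by page.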

What you are seeing is partially a trick and partially real. It is not a
direct effect of the language or compiler.

Whether or not you should write your programs to rely on it will depend on
many factors.

Certainly on an embedded OS like vxWorks, your memory would be allocated and
consumed immediately.






* Re: Array and memory usage
  2003-03-09 17:04 ` Jeffrey Creem
@ 2003-03-09 17:09   ` Preben Randhol
  0 siblings, 0 replies; 21+ messages in thread
From: Preben Randhol @ 2003-03-09 17:09 UTC (permalink / raw)


Jeffrey Creem wrote:

> 
> I believe what you are seeing is not something that will happen under all
> OS's.
> 
> Linux uses a commit-on-use type memory manager. Even though you initialized
> your data, it was probably done in such a way that the memory was initialized
> within the ELF initialized data segment.
> 
> Under Linux, even something like
> 
>    aa = malloc(A_BIG_NUMBER);
> 
> will not show all of the data as actually allocated until the process
> accesses each page of memory.
> 
> What you are seeing is partially a trick and partially real. It is not a
> direct effect of the language or compiler.
> 
> Whether or not you should write your programs to rely on it will depend on
> many factors.
> 
> Certainly on an embedded OS like vxWorks, your memory would be allocated and
> consumed immediately.

OK, thanks for clearing it up. I saw the effect last night, so I decided
it was time for sleep, as I thought I had done something else wrong. But
when I saw the same effect today I started wondering what the effect
was.

At any rate I think I'll stick with an array and set it to some number
that can be changed if needed. We are not talking about a lot of bytes
anyway :-)

-- 
 ()   Join the worldwide campaign to protect fundamental human rights.
'||}
{||'                                           http://www.amnesty.org/




* Re: Array and memory usage
  2003-03-09 12:11   ` Preben Randhol
  2003-03-09 13:34     ` Marin David Condic
@ 2003-03-09 17:49     ` Robert A Duff
  1 sibling, 0 replies; 21+ messages in thread
From: Robert A Duff @ 2003-03-09 17:49 UTC (permalink / raw)


Preben Randhol <randhol+news@pvv.org> writes:

> John R. Strohm wrote:
> > "Preben Randhol" <randhol+news@pvv.org> wrote in message
> > news:slrnb6m8mg.15o.randhol+news@kiuk0152.chembio.ntnu.no...
> >> I was a bit puzzled to see that if I do:
> >>
> >>    type Str_Array_Type is array (1 .. 1000) of String (1 .. 800);
> >>
> >>    type User_Data is
> >>       record
> >>          Name     : Str_Array_Type;
> >>          Hostname : Str_Array_Type;
> >>       end record;
> > 
> > This declares two types and NO instances of those types.
> > 
> >> my program uses 800 bytes of memory, while if I only change the program
> >> like this:
> >>
> >>    type User_Data is
> >>       record
> >>          Name     : Str_Array_Type := (others => (others => ' '));
> >>          Hostname : Str_Array_Type := (others => (others => ' '));
> >>       end record;
> >>
> >> then the program uses 2356 bytes.
> > 
> > This declares the same two types and TWO (2) implicit instances of
> > Str_Array_Type, namely, the aggregate constants.  That's where the
> > difference comes from.

A good compiler will not do that, especially if the object is statically
allocated.

- Bob




* Re: Array and memory usage
  2003-03-09 11:19 Array and memory usage Preben Randhol
  2003-03-09 11:52 ` John R. Strohm
  2003-03-09 17:04 ` Jeffrey Creem
@ 2003-03-09 18:10 ` Robert A Duff
  2003-03-10 15:13   ` Matthew Heaney
  2 siblings, 1 reply; 21+ messages in thread
From: Robert A Duff @ 2003-03-09 18:10 UTC (permalink / raw)


Preben Randhol <randhol+news@pvv.org> writes:

> I was a bit puzzled to see that if I do:
> 
>    type Str_Array_Type is array (1 .. 1000) of String (1 .. 800);
> 
>    type User_Data is
>       record
>          Name     : Str_Array_Type;
>          Hostname : Str_Array_Type;
>       end record;
> 
> my program uses 800 bytes of memory, while if I only change the program
> like this:
> 
>    type User_Data is
>       record
>          Name     : Str_Array_Type := (others => (others => ' '));
>          Hostname : Str_Array_Type := (others => (others => ' '));
>       end record;
> 
> then the program uses 2356 bytes.
> 
> I thought all the memory needed for an array would be claimed when the
> program started no matter if the array was filled or not.

I presume you mean you declared one object of type User_Data at library
package level?

A good operating system will allocate *virtual* memory for that object
(among other things), but will not allocate physical memory until it is
needed.  If you initialize it to all-blanks, physical memory is needed
to store those blanks (on typical operating systems).  If you don't
initialize it, that physical memory is not needed (until later,
presumably).

You don't say what tool you're using to measure memory usage, or what it
claims to measure.  It may well be that it is measuring physical memory,
which may be quite low at first, but might grow as you assign data into
that object.

All of this applies only if you have a proper operating system running
on hardware with memory protection.  Many embedded systems do not, and
therefore would allocate as much physical memory as they need for that
variable up front.

> Is this correct or am I just being tricked by the program that measures
> the memory used by my program? 
> 
> If it is correct then it doesn't matter if I define my array as (1 ..
> 100) or (1 .. 10_000) as long as I don't initialize the strings.
> I can just add to User_Data a:
> 
>    type Used_Type is array (1 .. 1000) of Boolean;
> 
>    Used : Used_Type := (others => False);
> 
> And then I don't need to add any linked list etc... to my small server.

Well, if you're allocating randomly all over the place, this won't help,
because physical memory is allocated in pages.

But one thing you can do, if you're allocating "used slots" at the
beginning of an array, is keep a count of the "last used" slot.  On
a good operating system you can safely allocate an enormous array (say,
hundreds of megabytes), trusting the OS to allocate physical memory near
the beginning of the array as needed.  This is useful in cases where you
don't know ahead of time how much data you need -- it might be huge, but
probably is quite small, and you don't want to waste physical memory
when it is small.
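
A sketch of that scheme (the names are made up, and the bound is
deliberately extravagant):

   Max_Items : constant := 100_000_000;   -- enormous, but only virtual

   type Item_Array is array (1 .. Max_Items) of Integer;

   Items : Item_Array;       -- uninitialized: no pages touched yet
   Last  : Natural := 0;     -- count of "used slots"

   procedure Append (X : Integer) is
   begin
      Last := Last + 1;
      Items (Last) := X;     -- physical pages appear here, as needed
   end Append;

As long as Last stays small, only the pages near the front of Items
ever become resident.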

If you use this trick on Windows, it will allocate backing store for all
that virtual memory.  I believe most Unix systems are smarter than that.

This trick is sometimes more efficient than linked lists for growable
data structures.  The links waste space, plus if you're not careful, the
list gets scattered all over the place, damaging cache behavior.  The
nice thing about arrays is that they are contiguous.

If you try to use this trick for an array of pointers, Ada will defeat
you by implicitly initializing to all-nulls.  It is possible to do
better, but that requires some cooperation between OS and compiler.

As a rule of thumb, virtual memory is perhaps 1000 times cheaper than
physical memory, depending on hardware and OS.

> I use GNAT 3.14p on i386 Linux (Debian).

I believe a page on that system is 2**12 bytes.  So if my explanation is
correct, you should be seeing your memory usage grow in chunks of that
size.  It doesn't match the numbers you mentioned above...

- Bob




* Re: Array and memory usage
  2003-03-09 13:34     ` Marin David Condic
  2003-03-09 15:05       ` Preben Randhol
@ 2003-03-10  1:45       ` Mark Biggar
  2003-03-10 11:57         ` Marin David Condic
  1 sibling, 1 reply; 21+ messages in thread
From: Mark Biggar @ 2003-03-10  1:45 UTC (permalink / raw)


Marin David Condic wrote:
> Note that it is *probably* generating code that will initialize the objects
> as they are created. It is probably not the most efficient way the
> initialization could be done, either. I've seen major problems with embedded
> code where we've wanted to initialize global RAM data and couldn't get
> static initialization of it using the usual Ada mechanisms. I've also seen
> where initialization of the type would generate code to initialize each
> field of the record with function calls but initialization of the object
> with an aggregate would result in a simple memory move of the whole thing.
> 
> If you care about code size and efficiency, you can find a number of
> "surprises" and "gotchas" with Ada.

Shouldn't pragma Preelaborate help with this problem?  Otherwise
this is obviously a "quality of implementation" issue.


-- 
Mark Biggar
mark.a.biggar@attbi.com





* Re: Array and memory usage
  2003-03-10  1:45       ` Mark Biggar
@ 2003-03-10 11:57         ` Marin David Condic
  2003-03-10 12:06           ` Jeffrey Creem
  0 siblings, 1 reply; 21+ messages in thread
From: Marin David Condic @ 2003-03-10 11:57 UTC (permalink / raw)


They're all "Quality Of Implementation" issues :-)

Here's the thing: In Ada, it is *theoretically* possible for compilers to do
a wonderful job with all sorts of things. However, when one starts examining
the output of the available compilers in given circumstances and discovers
that the output is *universally bad* in some respect, what difference does
it make that the compiler writers might actually be able to do better? The
fact is they didn't do better, and unless you have truckloads of money to
give them and months/years to wait for the end result, you're not going to
get the output you want.

In fairness to the compiler writers, they often start from some general
solution that works (albeit inefficiently) for all cases, then they start
looking at optimizations. In the case of a record type with default initial
values, one can see that the initialization could be problematic in
different contexts. If you declare the record globally, you'd like for that
global memory to be statically initialized and your load image just has the
bits set the way you want them. But what about when the record is allocated
on the stack? Or if it comes from the heap? Or if one of the initializations
is a function call? So they write an initializer that works for all the
cases and the global memory variant suffers. There are probably lots of
similar examples. In scenarios like that, you've got to fool the compiler
into doing what you want or fall back on various other techniques of
getting the job done. (One can always resort to C and get the
desired output. :-)

So I guess the gripe is that while Ada may allow for efficient
implementation, it often doesn't get there because it's just too hard to
do it all. Sometimes you have to avoid the spiffy features because they
aren't implemented well. Mostly, this is a realtime/embedded complaint;
I have not had these problems building workstation apps. Oddly, Ada
sometimes works better in an adopted domain than in the domain it was
intended for. :-)

MDC
--
======================================================================
Marin David Condic
I work for: http://www.belcan.com/
My project is: http://www.jsf.mil/

Send Replies To: m c o n d i c @ a c m . o r g

    "Going cold turkey isn't as delicious as it sounds."
        -- H. Simpson
======================================================================

Mark Biggar <mark.a.biggar@attbi.com> wrote in message
news:x6Saa.25310$eG2.4454@sccrnsc03...
> >
> > If you care about code size and efficiency, you can find a number of
> > "surprises" and "gotchas" with Ada.
>
> Shouldn't pragma Preelaborate help with this problem?  Otherwise
> this is obviously a "quality of implementation" issue.
>







* Re: Array and memory usage
  2003-03-10 11:57         ` Marin David Condic
@ 2003-03-10 12:06           ` Jeffrey Creem
  0 siblings, 0 replies; 21+ messages in thread
From: Jeffrey Creem @ 2003-03-10 12:06 UTC (permalink / raw)



"Marin David Condic" <mcondic.auntie.spam@acm.org> wrote in message
news:b4huit$crl$1@slb6.atl.mindspring.net...
> They're all "Quality Of Implementation" issues :-)
>
stuff deleted
> different contexts. If you declare the record globally, you'd like for
> that global memory to be statically initialized and your load image
> just has the bits set the way you want them. But what about when the
> record is allocated
more stuff deleted...


Note that again in this particular case, no one has shown that the compiler
did not statically initialize this memory. I suspect this data is indeed
initialized within the initialized data segment of the ELF file. The OS is
not showing the memory as used since the process has yet to refer to that
section of address space.

Everything that is happening sounds consistent with an OS with commit-on-use
semantics.

Nothing that has been presented looked like any sort of dynamic elaboration
(not mentioned in the e-mail I am replying to, but I believe it was part of
this thread, perhaps from a different poster) or any such strange concept.







* Re: Array and memory usage
  2003-03-09 18:10 ` Robert A Duff
@ 2003-03-10 15:13   ` Matthew Heaney
  2003-03-10 23:33     ` Robert A Duff
  0 siblings, 1 reply; 21+ messages in thread
From: Matthew Heaney @ 2003-03-10 15:13 UTC (permalink / raw)


Robert A Duff <bobduff@shell01.TheWorld.com> wrote in message news:<wccr89ge7hw.fsf@shell01.TheWorld.com>...
> 
> If you use this trick on Windows, it will allocate backing store for all
> that virtual memory.  I believe most Unix systems are smarter than that.
> 
> This trick is sometimes more efficient than linked lists for growable
> data structures.  The links waste space, plus if you're not careful, the
> list gets scattered all over the place, damaging cache behavior.  The
> nice thing about arrays is that they are contiguous.

One thing you can do on Windows is simply reserve the virtual memory for
the array, without committing it to physical memory, and then use SEH
("structured exception handling") to handle the error.

The SEH mechanism supports resumption semantics, so you can handle the
exception by committing the page you need, and then resuming from the
spot from where the exception was raised.

This is documented in Jeff Richter's Programming Applications book.




* Re: Array and memory usage
  2003-03-10 15:13   ` Matthew Heaney
@ 2003-03-10 23:33     ` Robert A Duff
  2003-03-11  0:40       ` Randy Brukardt
                         ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Robert A Duff @ 2003-03-10 23:33 UTC (permalink / raw)


mheaney@on2.com (Matthew Heaney) writes:

> Robert A Duff <bobduff@shell01.TheWorld.com> wrote in message news:<wccr89ge7hw.fsf@shell01.TheWorld.com>...
> > 
> > If you use this trick on Windows, it will allocate backing store for all
> > that virtual memory.  I believe most Unix systems are smarter than that.
> > 
> > This trick is sometimes more efficient than linked lists for growable
> > data structures.  The links waste space, plus if you're not careful, the
> > list gets scattered all over the place, damaging cache behavior.  The
> > nice thing about arrays is that they are contiguous.
> 
> One thing you can do on Windows is simply reserve the virtual memory for
> the array, without committing it to physical memory, and then use SEH
> ("structured exception handling") to handle the error.
> 
> The SEH mechanism supports resumption semantics, so you can handle the
> exception by committing the page you need, and then resuming from the
> spot from where the exception was raised.
> 
> This is documented in Jeff Richter's Programming Applications book.

I thought it happened automatically.  That is, if I say:

    X: array (1..100_000_000) of Integer;

it would allocate 400,000,000 bytes of virtual memory, but only allocate
physical pages on demand.  I believe that much works on Windows (without
calling windows-specific stuff like SEH).  (Windows 2000, Windows XP.)
The problem I found was that it allocated 400,000,000 bytes of backing
store, which was annoying.

Am I misinformed?

- Bob




* Re: Array and memory usage
  2003-03-10 23:33     ` Robert A Duff
@ 2003-03-11  0:40       ` Randy Brukardt
  2003-03-11 23:37         ` Robert A Duff
  2003-03-11 16:01       ` Matthew Heaney
  2003-03-11 17:38       ` Warren W. Gay VE3WWG
  2 siblings, 1 reply; 21+ messages in thread
From: Randy Brukardt @ 2003-03-11  0:40 UTC (permalink / raw)


Robert A Duff wrote in message ...
>mheaney@on2.com (Matthew Heaney) writes:
>
>> Robert A Duff <bobduff@shell01.TheWorld.com> wrote in message
>> news:<wccr89ge7hw.fsf@shell01.TheWorld.com>...
>> >
>> > If you use this trick on Windows, it will allocate backing store for
>> > all that virtual memory.  I believe most Unix systems are smarter
>> > than that.
>> >
>> > This trick is sometimes more efficient than linked lists for
>> > growable data structures.  The links waste space, plus if you're not
>> > careful, the list gets scattered all over the place, damaging cache
>> > behavior.  The nice thing about arrays is that they are contiguous.
>>
>> One thing you can do on Windows is simply reserve the virtual memory
>> for the array, without committing it to physical memory, and then use
>> SEH ("structured exception handling") to handle the error.
>>
>> The SEH mechanism supports resumption semantics, so you can handle the
>> exception by committing the page you need, and then resuming from the
>> spot from where the exception was raised.
>>
>> This is documented in Jeff Richter's Programming Applications book.
>
>I thought it happened automatically.  That is, if I say:
>
>    X: array (1..100_000_000) of Integer;
>
>it would allocate 400,000,000 bytes of virtual memory, but only
>allocate physical pages on demand.  I believe that much works on
>Windows (without calling windows-specific stuff like SEH).  (Windows
>2000, Windows XP.)  The problem I found was that it allocated
>400,000,000 bytes of backing store, which was annoying.
>
>Am I misinformed?


Sort of. Windows has three stages of memory allocation: address space
allocation, virtual page allocation, and physical page allocation. The
first two are (partially) under control of the programmer.

Swap space is allocated when virtual pages are allocated (called
"commit"). If the pages aren't committed, an attempt to access them
causes a page fault. (That's the idea Matt was talking about.)

Memory that is allocated in a load image (even as address space) is
always committed, so that it can be accessed without a fault -- but that
requires swap space to be allocated (even if no physical memory will
ever be used).

The proper way to deal with this is to avoid allocating the address
space at compile-time; rather use dynamic allocation to do it. That is
easy to do in Ada by writing a storage pool that allocates the address
space immediately, then commits additional memory only as needed.

In my application, I did it by using accesses to unconstrained arrays,
and having an "Expand" routine. When one became full, I called the
Expand routine, which determined if uncommitted address space followed
the access, and if it did, committed the needed memory and adjusted the
array descriptor for the access type. Thus, all of the expansion was
done in place. Of course, adjusting the array descriptor is not a safe
or portable operation, and it might not even be possible on some
compilers. If the memory couldn't be expanded, then I would allocate the
needed larger array from the pool, copy it, and free the old array --
that is, the standard way of expanding dynamic arrays.

One warning: address space has to be allocated in 64K chunks. If you try
to use smaller chunks than that, it gets rounded up to 64K. (I found
that out the hard way.)
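
For concreteness, the reserve and commit steps look roughly like this
from Ada (a hand-written sketch, not my actual pool; the constants are
the usual Win32 values, and all error handling is omitted):

   with System;
   with Interfaces.C; use Interfaces.C;

   package Reserve_Commit is
      MEM_COMMIT     : constant := 16#1000#;
      MEM_RESERVE    : constant := 16#2000#;
      PAGE_READWRITE : constant := 16#04#;

      function VirtualAlloc
        (Address    : System.Address;
         Size       : size_t;
         Alloc_Type : unsigned_long;
         Protect    : unsigned_long) return System.Address;
      pragma Import (Stdcall, VirtualAlloc, "VirtualAlloc");

      --  Reserve a large range of address space (no swap charged yet):
      Base : constant System.Address :=
        VirtualAlloc (System.Null_Address, 16#0100_0000#,
                      MEM_RESERVE, PAGE_READWRITE);

      --  Later, commit the first 64K chunk only when it is needed:
      First_Chunk : constant System.Address :=
        VirtualAlloc (Base, 16#1_0000#, MEM_COMMIT, PAGE_READWRITE);
   end Reserve_Commit;

An expanding pool keeps Base and a high-water mark, committing a
further chunk each time an allocation crosses the current boundary.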

              Randy.






* Re: Array and memory usage
  2003-03-10 23:33     ` Robert A Duff
  2003-03-11  0:40       ` Randy Brukardt
@ 2003-03-11 16:01       ` Matthew Heaney
  2003-03-11 17:38       ` Warren W. Gay VE3WWG
  2 siblings, 0 replies; 21+ messages in thread
From: Matthew Heaney @ 2003-03-11 16:01 UTC (permalink / raw)


Robert A Duff <bobduff@shell01.TheWorld.com> wrote in message news:<wccfzpuajbg.fsf@shell01.TheWorld.com>...
> 
> I thought it happened automatically.  That is, if I say:
> 
>     X: array (1..100_000_000) of Integer;
> 
> it would allocate 400,000,000 bytes of virtual memory, but only allocate
> physical pages on demand.  I believe that much works on Windows (without
> calling windows-specific stuff like SEH).  (Windows 2000, Windows XP.)
> The problem I found was that it allocated 400,000,000 bytes of backing
> store, which was annoying.
> 
> Am I misinformed?

Windows is going to commit all that memory up front, which (as Randy
pointed out) will indeed allocate backing store.

The solution is to "reserve" that range of virtual memory, without
actually "committing" the pages (to physical memory).  Then use SEH to
"commit" the pages as necessary, when a page fault occurs.




* Re: Array and memory usage
  2003-03-10 23:33     ` Robert A Duff
  2003-03-11  0:40       ` Randy Brukardt
  2003-03-11 16:01       ` Matthew Heaney
@ 2003-03-11 17:38       ` Warren W. Gay VE3WWG
  2 siblings, 0 replies; 21+ messages in thread
From: Warren W. Gay VE3WWG @ 2003-03-11 17:38 UTC (permalink / raw)


Robert A Duff wrote:
> mheaney@on2.com (Matthew Heaney) writes:
>>Robert A Duff <bobduff@shell01.TheWorld.com> wrote in message news:<wccr89ge7hw.fsf@shell01.TheWorld.com>...
>>
>>>If you use this trick on Windows, it will allocate backing store for all
>>>that virtual memory.  I believe most Unix systems are smarter than that.
>>>
>>>This trick is sometimes more efficient than linked lists for growable
>>>data structures.  The links waste space, plus if you're not careful, the
>>>list gets scattered all over the place, damaging cache behavior.  The
>>>nice thing about arrays is that they are contiguous.
>>
>>One thing you can do on Windows is simply reserve the virtual memory for
>>the array, without committing it to physical memory, and then use SEH
>>("structured exception handling") to handle the error.
>>
>>The SEH mechanism supports resumption semantics, so you can handle the
>>exception by committing the page you need, and then resuming from the
>>spot from where the exception was raised.
>>
>>This is documented in Jeff Richter's Programming Applications book.
> 
> I thought it happened automatically.  That is, if I say:
> 
>     X: array (1..100_000_000) of Integer;
> 
> it would allocate 400,000,000 bytes of virtual memory, but only allocate
> physical pages on demand.  I believe that much works on Windows (without
> calling windows-specific stuff like SEH).  (Windows 2000, Windows XP.)
> The problem I found was that it allocated 400,000,000 bytes of backing
> store, which was annoying.
> 
> Am I misinformed?
> 
> - Bob

Where it is defined also makes a difference. If you do:

procedure Foo is
    X: array (1..100_000_000) of Integer;
begin
    ...

then this will use the _stack_ (considerably, at that). If your stack is
too small (or effectively too small because it has been split up between
many tasks), this has no hope of succeeding without blowing up, or of
corrupting memory beyond the end of the stack if the stack overflow goes
undetected.

AFAIK, the stack's pages are committed as they are used on most
platforms.
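
If the object really is that big, one workaround (a sketch) is to put it
on the heap, so the stack only holds an access value:

   procedure Foo is
      type Int_Array is array (1 .. 100_000_000) of Integer;
      type Int_Array_Access is access Int_Array;
      X : constant Int_Array_Access := new Int_Array;
   begin
      X (1) := 42;   -- pages are committed on use, as discussed above
   end Foo;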

-- 
Warren W. Gay VE3WWG
http://home.cogeco.ca/~ve3wwg





* Re: Array and memory usage
  2003-03-11  0:40       ` Randy Brukardt
@ 2003-03-11 23:37         ` Robert A Duff
  2003-03-12 19:18           ` Randy Brukardt
  0 siblings, 1 reply; 21+ messages in thread
From: Robert A Duff @ 2003-03-11 23:37 UTC (permalink / raw)


"Randy Brukardt" <randy@rrsoftware.com> writes:

> Sort of. Windows has three stages of memory allocation: address space
> allocation, virtual page allocation, and physical page allocation. The
> first two are (partially) under control of the programmer.
> 
> Swap space is allocated when virtual pages are allocated (called
> "commit"). If the pages aren't committed, an attempt to access them
> causes a page fault. (That's the idea Matt was talking about.)
> 
> Memory that is allocated in a load image (even as address space) is
> always committed, so that it can be accessed without a fault -- but that
> requires swap space to be allocated (even if no physical memory will
> ever be used).

Thank you (and Matthew) *very* much for explaining this to me.
Very helpful.

I still don't see the *point* of this design, though.  (Not the first
time I've disagreed with a design "aspect" of Windows...)

The hardware is going to fault if there is no *physical* page allocated
for a given virtual address.  That requires switching to supervisor mode
(slow).  Whether or not the user-mode process sees this fault is largely
irrelevant (to efficiency).

I see no reason why swap space should be allocated (on the disk) until
it is needed -- when the OS wants to swap out a (dirty) page.  Or if you
don't buy that, then why not allocate swap space when *physical* memory
is first allocated for a given virtual address.  Clearly, if a given
virtual page frame has never been touched, so no physical memory has
ever been allocated for it, it can't be swapped out, so no swap space is
needed.

In other words, I see no reason for this weird "committed" state, where
swap space is allocated on the disk but no physical page is allocated.

Running out of memory is pretty unpredictable anyway (on virtual systems
like we're talking about), so who cares exactly when it occurs?

Also, I see no reason why user-mode code needs to be involved at all,
for this trick.  There are reasons why user-mode code might want to set
the protection bits of parts of its address space, and handle
traps/signals/whatever caused by accessing those parts, and that
requires separate system calls.  I've done that (on both Unix and
Windows).  But surely *this* trick (relying on "allocate on first use")
ought to be completely invisible to user-mode code.

Am I confused, or was the designer of Windows confused?

> The proper way to deal with this is to avoid allocating the address
> space at compile-time; rather use dynamic allocation to do it. That is
> easy to do in Ada by writing a storage pool that allocates the address
> space immediately, then commits additional memory only as needed.
> 
> In my application, I did it by using accesses to unconstrained arrays,
> and having an "Expand" routine. When one became full, I called the
> Expand routine, which determined if uncommitted address space followed
> the acces, and if it did, commited the needed memory and adjusted the
> array descriptor for the access type. Thus, all of the expansion was
> done in place. Of course, adjusting the array descriptor is not a safe
> or portable operation, and it might not even be possible on some
> compilers. If the memory couldn't be expanded, then I would allocate the
> needed larger array from the pool, copy it, and free the old array --
> that is, the standard way of expanding dynamic arrays.
> 
> One warning: address space has to be allocated in 64K chunks. If you try
> to use smaller chunks than that, it gets rounded up to 64K. (I found
> that out the hard way.)

Why 64K?  Why not the page size supported by the hardware (which is
2**12 bytes on the 386 family)?

>               Randy.

Sorry if this post is slightly off topic.  At least I'm not ranting about
Adam Smith.  ;-)

- Bob




* Re: Array and memory usage
  2003-03-11 23:37         ` Robert A Duff
@ 2003-03-12 19:18           ` Randy Brukardt
  0 siblings, 0 replies; 21+ messages in thread
From: Randy Brukardt @ 2003-03-12 19:18 UTC (permalink / raw)


Robert A Duff wrote in message ...
>"Randy Brukardt" <randy@rrsoftware.com> writes:
...
>In other words, I see no reason for this weird "committed" state, where
>swap space is allocated on the disk but no physical page is allocated.
>
>Running out of memory is pretty unpredictable anyway (on virtual
>systems like we're talking about), so who cares exactly when it occurs?
>
>Also, I see no reason why user-mode code needs to be involved at all,
>for this trick.  There are reasons why user-mode code might want to set
>the protection bits of parts of its address space, and handle
>traps/signals/whatever caused by accessing those parts, and that
>requires separate system calls.  I've done that (on both Unix and
>Windows).  But surely *this* trick (relying on "allocate on first use")
>ought to be completely invisible to user-mode code.
>
>Am I confused, or was the designer of Windows confused?

You want me to explain why Windows is designed the way it is? I think it
would be easier for me to find Osama bin Laden....

I can see some benefit to allocating address space which will fault (the
uncommitted state), and certainly you have to be able to allocate address
space which does not fault (the committed state). But why the committed
state needs space in the swap file is a mystery to me.

...
>> One warning: address space has to be allocated in 64K chunks. If you
>> try to use smaller chunks than that, it gets rounded up to 64K. (I
>> found that out the hard way.)
>
>Why 64K?  Why not the page size supported by the hardware (which is
>2**12 bytes on the 386 family)?


You want me to explain Windows? (Oh, I said that already)

Allocations are done in terms of the page size (4K). But, apparently the
map of allocated address space has a granularity of 64K. So you rapidly
run out of address space if you allocate individual pages. (I know,
'cause that's the way I originally wrote my Storage Pool. The
"expanding" stuff was a later addition.) I presume the reason for that
is to manage the overhead determine free/in use address space sections.
If they use a bitmap, 64K chunks means a 8K bitmap, while 4K pages would
a 128K bitmap.

Allocating address space is much like calling malloc; you pass in a size
and you get back an address. (I think this similarity was intentional,
so malloc could be replaced by VirtualAlloc without much change in the
program.) And, of course, there is a matching VirtualFree. Thus, it's
necessary for Windows to keep some data structure of free/allocated
address space.

>Sorry if this post slightly off topic.  At least I'm not ranting about
>Adam Smith.  ;-)


Nah, we're talking about how to design an Ada Storage_Pool for Windows.
That's not too off-topic.

In any case, it's possible to use this stuff in a Storage_Pool, so you
can have your very large array and not require a giant swap file (unless
you actually use the array). Since Ada makes access-to-arrays quite
painless, it's rather easy to drop such an implementation into existing
code that uses a large array. These days, I often just use the array,
and switch to a heap/pool implementation if it turns out that it needs
to be too large to allocate statically.
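
The hookup on the Ada side is just the 'Storage_Pool attribute (a
sketch; Big_Pools and its pool object are hypothetical stand-ins for
such an implementation):

   with Big_Pools;   --  hypothetical package providing a reserving,
                     --  commit-on-demand storage pool

   package Demo is
      type Int_Array is array (Positive range <>) of Integer;
      type Int_Array_Access is access Int_Array;
      for Int_Array_Access'Storage_Pool use Big_Pools.Pool;

      --  The allocation is served by Big_Pools.Pool, which can reserve
      --  the whole range up front and commit pages as needed:
      Data : Int_Array_Access := new Int_Array (1 .. 100_000_000);
   end Demo;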

                 Randy.






* Re: Array and memory usage
@ 2003-03-12 22:04 Alexandre E. Kopilovitch
  2003-03-12 22:20 ` Larry Kilgallen
  0 siblings, 1 reply; 21+ messages in thread
From: Alexandre E. Kopilovitch @ 2003-03-12 22:04 UTC (permalink / raw)
  To: comp.lang.ada

>>Am I confused, or was the designer of Windows confused?
>
>You want me to explain why Windows is designed the way it is?

It seems that convoluted virtual memory policies appeared in VAX VMS,
and Windows NT (not surprisingly) inherited them from there.





* Re: Array and memory usage
  2003-03-12 22:04 Alexandre E. Kopilovitch
@ 2003-03-12 22:20 ` Larry Kilgallen
  0 siblings, 0 replies; 21+ messages in thread
From: Larry Kilgallen @ 2003-03-12 22:20 UTC (permalink / raw)


In article <mailman.14.1047506652.308.comp.lang.ada@ada.eu.org>, "Alexandre E. Kopilovitch" <aek@vib.usr.pu.ru> writes:
>>>Am I confused, or was the designer of Windows confused?
>>
>>You want me to explain why Windows is designed the way it is?
> 
> It seems that convoluted virtual memory policies appeared in VAX VMS,
> and Windows NT (not surprisingly) inherited them from there.

The VMS memory management system supports "copy on demand" and pages
not supported by backing store.  Of course it was more limited back
in the days of VMS V1, when Dave Cutler left the project.



