comp.lang.ada
* Memory Usage
@ 2007-06-08 20:38 mhamel_98
  2007-06-09  0:43 ` Adam Beneschan
  2007-06-09  5:25 ` Niklas Holsti
  0 siblings, 2 replies; 5+ messages in thread
From: mhamel_98 @ 2007-06-08 20:38 UTC (permalink / raw)


Hello c.l.a.  Another question, I have a program that stores data on
the disk using sequential_io.  When I later read that data into an
array, the memory growth after ingesting a file is much much larger
than the disk footprint.  A file that takes 26.8MB on disk (over 134k
records) causes the program to swell by over 600MB!  Holy bloatware.
A short overview of what I'm trying to do: each sequential_io data
file has an associated header file with metadata such as the number
of records.  The header is read, and an array is then created based
on how many records the header says are in the data file.  The data
file is then read record by record, and each node is stored into the
array.  Some abbreviated code below; first the spec:

generic
  type Node_Type is private;
package Node_Manager is

  package Seq is new Sequential_Io (Node_Type);

  type Node_Array is array (positive range <>) of Node_Type;
  type Node_Ptr is access Node_Array;

  type Data_Rec is
    record
      Hdr : Node_Hdr;
      Data : Node_Ptr;
    end record;

Body stuff:

  procedure Free is new Unchecked_Deallocation (Node_Array, Node_Ptr);
  procedure Open (File : in out Data_Rec;
                  Name : in     String) is
    Dat_File : Seq.File_Type;  -- file handle (implied but not declared in the original post)
    Curr     : Positive := 1;
    Node     : Node_Type;
  begin
    Read_Hdr (Name, File.Hdr);
    File.Data := new Node_Array (1 .. File.Hdr.Size);

    Seq.Open (Dat_File, Seq.In_File, Name & ".dat");
    while not Seq.End_Of_File (Dat_File) loop
      Seq.Read (Dat_File, Node);
      File.Data.all (Curr) := Node;
      Curr := Curr + 1;
    end loop;
    Seq.Close (Dat_File);
    ...
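
(The Free instantiation suggests the elided part includes a teardown;
a hypothetical sketch, not from the original post:)

  procedure Close (File : in out Data_Rec) is
  begin
    Free (File.Data);  -- releases the whole Node_Array at once
  end Close;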

The program works as I've wanted, though up until recently I've only
dealt with very small data sets, which is why I never noticed undue
memory growth.  Now that I'm working with some "large" data sets, the
bloat is unbearable.  Any suggestions?  (Besides looking for another
line of work ;) )
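
(Doing the math: 26.8 MB over ~134,000 records is roughly 200 bytes
per record on disk, while 600 MB over the same count is about 4.5 KB
per record in memory, a factor of ~22.)
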
Platform is ObjectAda 7.2 on WinNT.





* Re: Memory Usage
  2007-06-08 20:38 Memory Usage mhamel_98
@ 2007-06-09  0:43 ` Adam Beneschan
  2007-06-09  3:09   ` mhamel_98
  2007-06-09  5:25 ` Niklas Holsti
  1 sibling, 1 reply; 5+ messages in thread
From: Adam Beneschan @ 2007-06-09  0:43 UTC (permalink / raw)


On Jun 8, 1:38 pm, mhamel...@yahoo.com wrote:
> Hello c.l.a.  Another question, I have a program that stores data on
> the disk using sequential_io.  When I later read that data into an
> array, the memory growth after ingesting a file is much much larger
> than the disk footprint.  A file that takes 26.8MB on disk (over 134k
> records) causes the program to swell by over 600MB!
> [...]
> Platform is ObjectAda 7.2 on WinNT.


You sure File.Hdr.Size is correct?  (I.e. is it the same as the number
of records in the file?)
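
A quick way to check would be a one-off counting pass over the .dat
file (a minimal sketch written against the posted Node_Manager
declarations; Count_Records is a hypothetical helper, not part of the
original code):

  function Count_Records (Name : String) return Natural is
    Dat_File : Seq.File_Type;
    Node     : Node_Type;
    Count    : Natural := 0;
  begin
    Seq.Open (Dat_File, Seq.In_File, Name & ".dat");
    while not Seq.End_Of_File (Dat_File) loop
      Seq.Read (Dat_File, Node);  -- read and discard; we only count
      Count := Count + 1;
    end loop;
    Seq.Close (Dat_File);
    return Count;
  end Count_Records;

If that count disagrees with File.Hdr.Size, the header is the problem.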

                   -- Adam





* Re: Memory Usage
  2007-06-09  0:43 ` Adam Beneschan
@ 2007-06-09  3:09   ` mhamel_98
  0 siblings, 0 replies; 5+ messages in thread
From: mhamel_98 @ 2007-06-09  3:09 UTC (permalink / raw)


On Jun 8, 5:43 pm, Adam Beneschan <a...@irvine.com> wrote:
> On Jun 8, 1:38 pm, mhamel...@yahoo.com wrote:
>
> > [...]
>
> You sure File.Hdr.Size is correct?  (I.e. is it the same as the number
> of records in the file?)
>
>                    -- Adam


Yep, pretty sure the size is correct.  There is an internal confidence
test I developed some time ago; I'll run it on this data set on Monday.
In any case, things further down the road in the program break if the
size is over- or under-stated in the header.





* Re: Memory Usage
  2007-06-08 20:38 Memory Usage mhamel_98
  2007-06-09  0:43 ` Adam Beneschan
@ 2007-06-09  5:25 ` Niklas Holsti
  2007-06-11 15:28   ` mhamel_98
  1 sibling, 1 reply; 5+ messages in thread
From: Niklas Holsti @ 2007-06-09  5:25 UTC (permalink / raw)


mhamel_98@yahoo.com wrote:
> Hello c.l.a.  Another question, I have a program that stores data on
> the disk using sequential_io.  When I later read that data into an
> array, the memory growth after ingesting a file is much much larger
> than the disk footprint.  A file that takes 26.8MB on disk (over 134k
> records) causes the program to swell by over 600MB!  Holy bloatware.
> [...]
>
>   type Node_Array is array (positive range <>) of Node_Type;
>   type Node_Ptr is access Node_Array;

Is the actual type for Node_Type a record type with variants? If so, is 
the size of the largest variant much larger than the size of the most 
common variants? I don't know about ObjectAda, but in GNAT the 
Node_Array would have a size that lets you store the largest variant in 
every array element, while the Node_Type objects stored in the 
sequential_io file probably use only as much file-space as the actual 
variant of each object requires.

The solution in GNAT would be to allocate storage for each Node_Type 
object separately and have an array of accesses:

    type Node_Ptr is access Node_Type;
    type Node_Array is array (positive range <>) of Node_Ptr;
    type Node_Array_Ptr is access Node_Array;

This increases memory overhead by allocating more blocks from the heap, 
but it may reduce the overall memory requirement if the largest variant 
of Node_Type is much larger than the average variant.
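
To make the size effect concrete, here is an illustrative sketch (the
Kind discriminant, field names and sizes are invented for the example,
not taken from the original program):

    type Kind is (Small, Huge);

    type Node_Type (K : Kind := Small) is
      record
        case K is
          when Small =>
            X : Integer;                 -- a few bytes
          when Huge =>
            Buf : String (1 .. 4_000);   -- about 4 KB
        end case;
      end record;

An array of this Node_Type must reserve roughly 4 KB for every element,
whatever variant it holds.  With the array-of-accesses layout above, the
read loop would allocate each object as it goes:

    File.Data (Curr) := new Node_Type'(Node);

and since an object created by an allocator is constrained by its
initial value, each allocation can be sized to the variant actually
read.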

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .




* Re: Memory Usage
  2007-06-09  5:25 ` Niklas Holsti
@ 2007-06-11 15:28   ` mhamel_98
  0 siblings, 0 replies; 5+ messages in thread
From: mhamel_98 @ 2007-06-11 15:28 UTC (permalink / raw)


On Jun 9, 1:25 am, Niklas Holsti <niklas.hol...@nospam.please> wrote:
> mhamel...@yahoo.com wrote:
> > [...]
>
> Is the actual type for Node_Type a record type with variants? If so, is
> the size of the largest variant much larger than the size of the most
> common variants?
> [...]

Hi Niklas, no variant records; that was the first question a few other
people asked as well!

Anyhoo, good news and bad news.  The good news is that this bit of
code isn't the source of the bloat.  The bad news is that the culprit
is an instantiation (several instantiations, actually) of a generic
doubly-linked-list package.  Well, back to the drawing board...
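
(For a sense of where that kind of bloat comes from, a rough sketch of
a textbook doubly-linked-list node; this is illustrative, not the
actual package from the program:)

    generic
      type Element_Type is private;
    package Simple_List is
      type Node;
      type Node_Ptr is access Node;
      type Node is
        record
          Prev, Next : Node_Ptr;   -- two access values of overhead per element
          Item       : Element_Type;
        end record;
    end Simple_List;

Each element carries two pointers and its own heap allocation, and many
allocators add a per-block header and round sizes up, so 134k+ elements
spread across several such lists can easily multiply the raw data size.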




