* A little trouble with very large arrays. @ 2018-10-04 21:38 Shark8 2018-10-05 6:17 ` Jacob Sparre Andersen ` (2 more replies) 0 siblings, 3 replies; 16+ messages in thread From: Shark8 @ 2018-10-04 21:38 UTC (permalink / raw) I'm trying to implement a FITS library for work -- see https://fits.gsfc.nasa.gov/standard40/fits_standard40aa.pdf -- and have come across some rather interesting problems implementing it. The main-problem right now is the "Primary Data Array" which can have a dimensionality in 1..999, each itself with some non-zero range. (In the files these are specified by keywords in the file like NAXIS = n, NAXIS1 = n_1, NAXIS2 = n_2, and so on until the NAXISn = n_n keyword/value pair is encountered.) Relatively straightforward, no? Well, I'd thought I could handle everything with a dimensionality-array and generic like: Type Axis_Count is range 0..999 with Size => 10; Type Axis_Dimensions is Array (Axis_Count range <>) of Positive with Default_Component_Value => 1; ... Generic Type Element is (<>); Dim : Axis_Dimensions:= (1..999 => 1); Package FITS.Data with Pure is Type Data_Array is not null access Array( 1..Dim( 1),1..Dim( 2),1..Dim( 3),1..Dim( 4), 1..Dim( 5),1..Dim( 6),1..Dim( 7),1..Dim( 8), --... 1..Dim( 997),1..Dim( 998),1..Dim( 999) ) of Element with Convention => Fortran; End FITS.Data; But no dice. GNAT won't even compile an array like this [999 indexes]. What's the proper way to go about doing this? (As another interesting constraint, the file-format mandates a sort of block-structure of 2880 bytes [23040 bits], and while I don't anticipate this being an issue, something that might be relevant.) ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: A little trouble with very large arrays. 2018-10-04 21:38 A little trouble with very large arrays Shark8 @ 2018-10-05 6:17 ` Jacob Sparre Andersen 2018-10-05 6:20 ` Niklas Holsti 2018-10-05 6:36 ` Dmitry A. Kazakov 2 siblings, 0 replies; 16+ messages in thread From: Jacob Sparre Andersen @ 2018-10-05 6:17 UTC (permalink / raw) Shark8 <onewingedshark@gmail.com> writes: > The main-problem right now is the "Primary Data Array" which can have > a dimensionality in 1..999, each itself with some non-zero range. (In > the files these are specified by keywords in the file like NAXIS = n, > NAXIS1 = n_1, NAXIS2 = n_2, and so on until the NAXISn = n_n > keyword/value pair is encountered.) Ouch. :-( Something like this will work, but it doesn't look nice: package Variable_Dimensionality is type Raw is array (Positive range <>, Positive range <>, Positive range <>) of Boolean; type Nice (Dim_1, Dim_2, Dim_3 : Positive) is record Data : Raw (1 .. Dim_1, 1 .. Dim_2, 1 .. Dim_3); end record; end Variable_Dimensionality; > (As another interesting constraint, the file-format mandates a sort of > block-structure of 2880 bytes [23040 bits], and while I don't > anticipate this being an issue, something that might be relevant.) Ada.Sequential_IO and Ada.Direct_IO can both be instantiated with types of any size, so you could simply use a 2880 character String, or a 2880 element Storage_Element_Array (remember to assert that Storage_Element'Size = 8). Greetings, Jacob -- »Verbing weirds language.« -- Calvin ^ permalink raw reply [flat|nested] 16+ messages in thread
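A minimal sketch of the block-I/O idea from the reply above, using Ada.Direct_IO instantiated with a 2880-byte block type (the names FITS_Block, Block_IO, and "test.fits" are illustrative assumptions, not from any existing library):

```ada
with Ada.Direct_IO;

procedure Block_Demo is
   --  One FITS block: 2880 bytes, per the standard's block size.
   type FITS_Block is array (1 .. 2880) of Character;
   package Block_IO is new Ada.Direct_IO (FITS_Block);

   File  : Block_IO.File_Type;
   Block : FITS_Block := (others => ' ');  --  header padding is blanks
begin
   Block_IO.Create (File, Block_IO.Out_File, "test.fits");
   Block_IO.Write (File, Block);
   Block_IO.Close (File);
end Block_Demo;
```

Since Direct_IO indexes the file in whole elements, reading or writing block N of the file is a single `Read`/`Write` call with `From => N`.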
* Re: A little trouble with very large arrays. 2018-10-04 21:38 A little trouble with very large arrays Shark8 2018-10-05 6:17 ` Jacob Sparre Andersen @ 2018-10-05 6:20 ` Niklas Holsti 2018-10-05 16:47 ` Shark8 2018-10-05 6:36 ` Dmitry A. Kazakov 2 siblings, 1 reply; 16+ messages in thread From: Niklas Holsti @ 2018-10-05 6:20 UTC (permalink / raw) On 18-10-05 00:38 , Shark8 wrote: > I'm trying to implement a FITS library for work -- see > https://fits.gsfc.nasa.gov/standard40/fits_standard40aa.pdf -- > and have come across some rather interesting problems implementing it. > > The main-problem right now is the "Primary Data Array" which can > have a dimensionality in 1..999, each itself with some non-zero range. > (In the files these are specified by keywords in the file like > NAXIS = n, NAXIS1 = n_1, NAXIS2 = n_2, and so on until the NAXISn = n_n > keyword/value pair is encountered.) > > Relatively straightforward, no? No. Handling arrays with a variable number of dimensions is not simple. > Well, I'd thought I could handle everything with a dimensionality-array > and generic like: > > Type Axis_Count is range 0..999 with Size => 10; > Type Axis_Dimensions is Array (Axis_Count range <>) of Positive > with Default_Component_Value => 1; > ... > Generic > Type Element is (<>); > Dim : Axis_Dimensions:= (1..999 => 1); > Package FITS.Data with Pure is > > Type Data_Array is not null access Array( > 1..Dim( 1),1..Dim( 2),1..Dim( 3),1..Dim( 4), > 1..Dim( 5),1..Dim( 6),1..Dim( 7),1..Dim( 8), > --... > 1..Dim( 997),1..Dim( 998),1..Dim( 999) > ) of Element Give it some thought. Even if each dimension would have the smallest sensible length, which is two index values, the total number of elements in that array would be 2**999, somewhat larger than the memories of current computers. > What's the proper way to go about doing this? 
If you really want to support up to 999 dimensions (though I doubt that any real FITS file will be close to that number), your program has to manage the data in blocks of some practical size. > (As another interesting constraint, the file-format mandates a sort > of block-structure of 2880 bytes [23040 bits], and while I don't > anticipate this being an issue, something that might be relevant.) Perhaps that is the solution, not a new problem. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ . ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: A little trouble with very large arrays. 2018-10-05 6:20 ` Niklas Holsti @ 2018-10-05 16:47 ` Shark8 2018-10-05 17:39 ` Niklas Holsti 0 siblings, 1 reply; 16+ messages in thread From: Shark8 @ 2018-10-05 16:47 UTC (permalink / raw) On Friday, October 5, 2018 at 12:20:34 AM UTC-6, Niklas Holsti wrote: > On 18-10-05 00:38 , Shark8 wrote: > > I'm trying to implement a FITS library for work -- see > > https://fits.gsfc.nasa.gov/standard40/fits_standard40aa.pdf -- > > and have come across some rather interesting problems implementing it. > > > > The main-problem right now is the "Primary Data Array" which can > > have a dimensionality in 1..999, each itself with some non-zero range. > > (In the files these are specified by keywords in the file like > > NAXIS = n, NAXIS1 = n_1, NAXIS2 = n_2, and so on until the NAXISn = n_n > > keyword/value pair is encountered.) > > > > Relatively straightforward, no? > > No. Handling arrays with a variable number of dimensions is not simple. > > > Well, I'd thought I could handle everything with a dimensionality-array > > and generic like: > > > > Type Axis_Count is range 0..999 with Size => 10; > > Type Axis_Dimensions is Array (Axis_Count range <>) of Positive > > with Default_Component_Value => 1; > > ... > > Generic > > Type Element is (<>); > > Dim : Axis_Dimensions:= (1..999 => 1); > > Package FITS.Data with Pure is > > > > Type Data_Array is not null access Array( > > 1..Dim( 1),1..Dim( 2),1..Dim( 3),1..Dim( 4), > > 1..Dim( 5),1..Dim( 6),1..Dim( 7),1..Dim( 8), > > --... > > 1..Dim( 997),1..Dim( 998),1..Dim( 999) > > ) of Element > > Give it some thought. Even if each dimension would have the smallest > sensible length, which is two index values, the total number of elements > in that array would be 2**999, somewhat larger than the memories of > current computers. 
No, the smallest sensible number of indices is 1, for everything except maybe the first two or three dimensions: eg Image data from a camera, or perhaps topological data from a map (longitude, latitude, elevation). FITS was developed for handling "image" transport by the astronomy world, back when there were 9-bit bytes and such. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: A little trouble with very large arrays. 2018-10-05 16:47 ` Shark8 @ 2018-10-05 17:39 ` Niklas Holsti 2018-10-05 19:49 ` Shark8 2018-10-06 6:40 ` Jacob Sparre Andersen 0 siblings, 2 replies; 16+ messages in thread From: Niklas Holsti @ 2018-10-05 17:39 UTC (permalink / raw) On 18-10-05 19:47 , Shark8 wrote: > On Friday, October 5, 2018 at 12:20:34 AM UTC-6, Niklas Holsti wrote: >> On 18-10-05 00:38 , Shark8 wrote: >>> I'm trying to implement a FITS library for work -- see >>> https://fits.gsfc.nasa.gov/standard40/fits_standard40aa.pdf -- >>> and have come across some rather interesting problems implementing it. >>> >>> The main-problem right now is the "Primary Data Array" which can >>> have a dimensionality in 1..999, each itself with some non-zero range. >>> (In the files these are specified by keywords in the file like >>> NAXIS = n, NAXIS1 = n_1, NAXIS2 = n_2, and so on until the NAXISn = n_n >>> keyword/value pair is encountered.) >>> >>> Relatively straightforward, no? >> >> No. Handling arrays with a variable number of dimensions is not simple. >> >>> Well, I'd thought I could handle everything with a dimensionality-array >>> and generic like: >>> >>> Type Axis_Count is range 0..999 with Size => 10; >>> Type Axis_Dimensions is Array (Axis_Count range <>) of Positive >>> with Default_Component_Value => 1; >>> ... >>> Generic >>> Type Element is (<>); >>> Dim : Axis_Dimensions:= (1..999 => 1); >>> Package FITS.Data with Pure is >>> >>> Type Data_Array is not null access Array( >>> 1..Dim( 1),1..Dim( 2),1..Dim( 3),1..Dim( 4), >>> 1..Dim( 5),1..Dim( 6),1..Dim( 7),1..Dim( 8), >>> --... >>> 1..Dim( 997),1..Dim( 998),1..Dim( 999) >>> ) of Element >> >> Give it some thought. Even if each dimension would have the smallest >> sensible length, which is two index values, the total number of elements >> in that array would be 2**999, somewhat larger than the memories of >> current computers. 
> > No, the smallest sensible number of indices is 1, for everything except > maybe the first two or three dimensions: eg Image data from a camera, or > perhaps topological data from a map (longitude, latitude, elevation). FITS images can have more dimensions than that. Further dimensions might be the frequency of the light (spectral imaging); polarisation; time when image was taken; and perhaps a couple more that don't come to mind immediately. I understand what you tried to do, including having length-one dimensions, but I don't think that it is a sensible approach to handling up to 999 dimensions. I agree with the flattening approach that Dmitry suggested. If your FITS files are not much larger than your RAM, the fastest approach is probably to "mmap" the file into your virtual address space and then compute the address of any given image pixel with the flattening method. If your FITS files are larger than your RAM, your program should process the file as a stream, which may or may not be practical, depending on what the program should output. > FITS was developed for handling "image" transport by the astronomy > world, back when there were 9-bit bytes and such. I know, I used to work in astronomy. What's your point about 9-bit bytes? FITS standard version 4.0 defines "byte" as 8 bits, and allows only 8, 16, 32 and 64-bit pixels. No 9-bit pixels. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ . ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: A little trouble with very large arrays. 2018-10-05 17:39 ` Niklas Holsti @ 2018-10-05 19:49 ` Shark8 2018-10-05 20:31 ` Dmitry A. Kazakov 2018-10-06 16:04 ` Jeffrey R. Carter 1 sibling, 2 replies; 16+ messages in thread From: Shark8 @ 2018-10-05 19:49 UTC (permalink / raw) > > No, the smallest sensible number of indices is 1, for everything except > > maybe the first two or three dimensions: eg Image data from a camera, or > > perhaps topological data from a map (longitude, latitude, elevation). > > FITS images can have more dimensions than that. Further dimensions might > be the frequency of the light (spectral imaging); polarisation; time > when image was taken; and perhaps a couple more that don't come to mind > immediately. Sure; but even those amount to a far smaller dimensionality than what the standard allows. > I understand what you tried to do, including having length-one > dimensions, but I don't think that it is a sensible approach to handling > up to 999 dimensions. I agree with the flattening approach that Dmitry > suggested. That would be rather unfortunate, to be honest. I'd much rather rely on the compiler translating the indexes than have to do so manually. I trust the compiler more than myself; plus letting it take care of keeping track of the mapping (i.e. the FORTRAN convention) is nice. My ultimate goal was to have some FITS_OBJECT type that has the appropriate data-members and is able to simply "write itself to a stream" to output the proper FITS format file. > > If your FITS files are not much larger than your RAM, the fastest > approach is probably to "mmap" the file into your virtual address space > and then compute the address of any given image pixel with the > flattening method. If your FITS files are larger than your RAM, your > program should process the file as a stream, which may or may not be > practical, depending on what the program should output. 
Most of the anticipated usage for where I am right now would be producing FITS files, likely in something that would boil down to a coupling like this: Count : Positive := 1; Today : Ada.Calendar.Time renames Ada.Calendar.Clock; New_Image : Camera_Image renames Normalize( Get_Camera_Image ); New_Object : FITS.Object := FITS.Create_w_Defaults( New_Image ); --.. -- Writes data out to "Observation(YYYY-MM-DD)_00X.FITS". New_Object.Write( Base => "Observation", Date => Today, Count => X ); I'd rather not tie things to a memory-mapped file at a high level, but it may be that my ideal abstraction is non-tenable. > > FITS was developed for handling "image" transport by the astronomy > > world, back when there were 9-bit bytes and such. > > I know, I used to work in astronomy. What's your point about 9-bit > bytes? FITS standard version 4.0 defines "byte" as 8 bits, and allows > only 8, 16, 32 and 64-bit pixels. No 9-bit pixels. Sorry, that was more about Dmitry's suggestion to pretend representation-clauses don't exist; I haven't done anything at a bit-level at all. (And I don't think I need to, except perhaps to mark the Primary-Data array elements as Big-endian [IIRC].) ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: A little trouble with very large arrays. 2018-10-05 19:49 ` Shark8 @ 2018-10-05 20:31 ` Dmitry A. Kazakov 2018-10-06 16:04 ` Jeffrey R. Carter 1 sibling, 0 replies; 16+ messages in thread From: Dmitry A. Kazakov @ 2018-10-05 20:31 UTC (permalink / raw) On 2018-10-05 21:49, Shark8 wrote: > Most of the anticipated usage for where I am right now would be producing FITS files, likely in something that would boil down to a coupling like this: > > Count : Positive := 1; > Today : Ada.Calendar.Time renames Ada.Calendar.Clock; > New_Image : Camera_Image renames Normalize( Get_Camera_Image ); > New_Object : FITS.Object := FITS.Create_w_Defaults( New_Image ); > --.. > -- Writes data out to "Observation(YYYY-MM-DD)_00X.FITS". > New_Object.Write( Base => "Observation", Date => Today, Count => X ); > > I'd rather not tie things to a memory-mapped file at a high level, but it may be that my ideal abstraction is non-tenable. You still can do this. The object can have any representation, the stream attribute will encode/decode it as required by FITS: Object : FITS.Image := Create ( Base => "Observation", Date => Clock, Image => Get_Camera_Image ); begin FITS.Image'Write (Stream, Object); Or without any intermediate objects: FITS.Store ( File => Stream, Base => "Observation", Date => Clock, Image => Get_Camera_Image ); The problem with intermediate objects is copying bulky data like images unless you deploy some complex reference-counting scheme. Good bindings provide support for in-place I/O operations. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 16+ messages in thread
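The wiring Dmitry's fragments imply can be sketched as an attribute definition clause tying the FITS encoding to 'Write on the type; the package and subprogram names here are assumptions, and the private completion is only a placeholder:

```ada
with Ada.Streams;

package FITS is
   type Image is private;

   --  Encodes Item in FITS on-disk format (header blocks plus data
   --  blocks); the body is elided in this sketch.
   procedure Write_FITS
     (Stream : not null access Ada.Streams.Root_Stream_Type'Class;
      Item   : Image);

   --  Make FITS.Image'Write produce the external format directly.
   for Image'Write use Write_FITS;

private
   type Image is null record;  --  placeholder internal representation
end FITS;
```

With this in place, `FITS.Image'Write (Stream, Object)` emits the external format regardless of how Image is represented internally, which is exactly the representation/protocol separation being advocated.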
* Re: A little trouble with very large arrays. 2018-10-05 19:49 ` Shark8 2018-10-05 20:31 ` Dmitry A. Kazakov @ 2018-10-06 16:04 ` Jeffrey R. Carter 2018-10-06 18:49 ` Shark8 1 sibling, 1 reply; 16+ messages in thread From: Jeffrey R. Carter @ 2018-10-06 16:04 UTC (permalink / raw) On 10/05/2018 09:49 PM, Shark8 wrote: > > Most of the anticipated usage for where I am right now would be producing FITS files, likely in something that would boil down to a coupling like this: For that you can probably get by with something that translates your image into a sequence of FITS "blocks" and writes them to a file: FITS.Write (Image => Image, File_Name => "George"); There doesn't seem to be any reason to store a FITS object. -- Jeff Carter "We burst our pimples at you." Monty Python & the Holy Grail 16 ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: A little trouble with very large arrays. 2018-10-06 16:04 ` Jeffrey R. Carter @ 2018-10-06 18:49 ` Shark8 2018-10-06 21:40 ` Jeffrey R. Carter 0 siblings, 1 reply; 16+ messages in thread From: Shark8 @ 2018-10-06 18:49 UTC (permalink / raw) On Saturday, October 6, 2018 at 10:04:59 AM UTC-6, Jeffrey R. Carter wrote: > On 10/05/2018 09:49 PM, Shark8 wrote: > > > > Most of the anticipated usage for where I am right now would be producing FITS files, likely in something that would boil down to a coupling like this: > > For that you can probably get by with something that translates your image into > a sequence of FITS "blocks" and writes them to a file: > > FITS.Write (Image => Image, File_Name => "George"); > > There doesn't seem to be any reason to store a FITS object. For our specific usage *RIGHT NOW*, sure. All that's *REALLY* required, for the Telescope's production-side is writing out those blocks, this is true... but doing it this way would be kneecapping myself in the sense of maintenance & usability. (Like global-variables/states.) [WRT software:] The Astronomy field is pretty fragmented and ripe for solid, reliable libraries. Getting a good FITS library is only one of several things I'd like to produce: (1) A good ISO 8601 library, to include periods and intervals; --(a) This would include a secondary scheduling library. (2) A stellar-coordinate library; (3) A good abstraction for telescope-control. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: A little trouble with very large arrays. 2018-10-06 18:49 ` Shark8 @ 2018-10-06 21:40 ` Jeffrey R. Carter 0 siblings, 0 replies; 16+ messages in thread From: Jeffrey R. Carter @ 2018-10-06 21:40 UTC (permalink / raw) On 10/06/2018 08:49 PM, Shark8 wrote: > > For our specific usage *RIGHT NOW*, sure. > All that's *REALLY* required, for the Telescope's production-side is writing out those blocks, this is true... but doing it this way would be kneecapping myself in the sense of maintenance & usability. (Like global-variables/states.) Of course you'd also implement Image : FITS.Image_Handle := FITS.Read ("George"); But there still doesn't seem to be a reason to store a FITS object. -- Jeff Carter "We burst our pimples at you." Monty Python & the Holy Grail 16 ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: A little trouble with very large arrays. 2018-10-05 17:39 ` Niklas Holsti 2018-10-05 19:49 ` Shark8 @ 2018-10-06 6:40 ` Jacob Sparre Andersen 2018-10-06 9:35 ` Niklas Holsti 1 sibling, 1 reply; 16+ messages in thread From: Jacob Sparre Andersen @ 2018-10-06 6:40 UTC (permalink / raw) Niklas Holsti wrote: > If your FITS files are not much larger than your RAM, the fastest > approach is probably to "mmap" the file into your virtual address > space and then compute the address of any given image pixel with the > flattening method. If your FITS files are larger than your RAM, your > program should process the file as a stream, which may or may not be > practical, depending on what the program should output. Why not leave the transport between disk and RAM to the operating system, and use memory mapping even if the file is larger than the RAM of the system? Greetings, Jacob -- "When we cite authors we cite their demonstrations, not their names" -- Pascal ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: A little trouble with very large arrays. 2018-10-06 6:40 ` Jacob Sparre Andersen @ 2018-10-06 9:35 ` Niklas Holsti 0 siblings, 0 replies; 16+ messages in thread From: Niklas Holsti @ 2018-10-06 9:35 UTC (permalink / raw) On 18-10-06 09:40 , Jacob Sparre Andersen wrote: > Niklas Holsti wrote: > >> If your FITS files are not much larger than your RAM, the fastest >> approach is probably to "mmap" the file into your virtual address >> space and then compute the address of any given image pixel with the >> flattening method. If your FITS files are larger than your RAM, your >> program should process the file as a stream, which may or may not be >> practical, depending on what the program should output. > > Why not leave the transport between disk and RAM to the operating > system, and use memory mapping even if the file is larger than the RAM > of the system? One could use mmap even for very large files, I guess, but on a 32-bit system the virtual address space could run out. On a 64-bit system, probably not. This was advice based on my feeling of what would work best. If the file is processed as a stream, the OS is likely to use read-ahead to speed things up. If the file is mmap'ed to use the virtual-memory paging system, I'm not sure if the OS will do read-ahead, but perhaps modern OS's have some such adaptive optimisations even for mmap'ed files. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ . ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: A little trouble with very large arrays. 2018-10-04 21:38 A little trouble with very large arrays Shark8 2018-10-05 6:17 ` Jacob Sparre Andersen 2018-10-05 6:20 ` Niklas Holsti @ 2018-10-05 6:36 ` Dmitry A. Kazakov 2018-10-05 16:56 ` Shark8 2 siblings, 1 reply; 16+ messages in thread From: Dmitry A. Kazakov @ 2018-10-05 6:36 UTC (permalink / raw) On 2018-10-04 23:38, Shark8 wrote: > I'm trying to implement a FITS library for work -- see https://fits.gsfc.nasa.gov/standard40/fits_standard40aa.pdf -- and have come across some rather interesting problems implementing it. > > The main-problem right now is the "Primary Data Array" which can have a dimensionality in 1..999, each itself with some non-zero range. (In the files these are specified by keywords in the file like NAXIS = n, NAXIS1 = n_1, NAXIS2 = n_2, and so on until the NAXISn = n_n keyword/value pair is encountered.) > > Relatively straightforward, no? Well, I'd thought I could handle everything with a dimensionality-array and generic like: > > > Type Axis_Count is range 0..999 with Size => 10; > Type Axis_Dimensions is Array (Axis_Count range <>) of Positive > with Default_Component_Value => 1; > ... > Generic > Type Element is (<>); > Dim : Axis_Dimensions:= (1..999 => 1); > Package FITS.Data with Pure is > > Type Data_Array is not null access Array( > 1..Dim( 1),1..Dim( 2),1..Dim( 3),1..Dim( 4), > 1..Dim( 5),1..Dim( 6),1..Dim( 7),1..Dim( 8), > --... > 1..Dim( 997),1..Dim( 998),1..Dim( 999) > ) of Element > with Convention => Fortran; > End FITS.Data; > > But no dice. > GNAT won't even compile an array like this [999 indexes]. > > What's the proper way to go about doing this? A wrong way dealing with protocols is attempting to define an Ada type having the exact representation of the data as defined by the protocol. It is both useless and difficult to impossible, especially if bits are involved. 
As a starting point consider representation clauses non-existent and simply provide operations to construct reasonably defined Ada objects from raw protocol data and conversely. Nobody would ever program anything using 999-D arrays. Nobody would ever instantiate n**1000 instances. You could use a flat array internally and provide operations for image serialization/deserialization in whatever format, e.g. by Get_Pixel/Set_Pixel. The hardest problem would be controlling bit representations. If they really mean that. Modern hardware usually handles octets atomically and simply does not allow accessing individual bits. There is basically no way to tell the bit order programmatically or even define "order". -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 16+ messages in thread
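A spec-level sketch of the flat-storage approach described above: the array is stored one-dimensional, the logical shape is kept alongside it, and Get_Pixel/Set_Pixel do the index mapping. Every name here is illustrative, not from any existing library, and the bodies are elided:

```ada
--  Sketch: flat internal storage with accessor subprograms.
generic
   type Element is (<>);
package Flat_Image is
   type Axis_Count is range 0 .. 999;
   type Extents is array (Axis_Count range <>) of Positive;
   type Element_Array is array (Positive range <>) of Element;

   --  Dim records the logical shape (NAXIS1 .. NAXISn); Flat holds
   --  the pixels in Fortran (first-axis-fastest) order.  The caller
   --  is responsible for Length matching the product of Dim.
   type Data (Axes : Axis_Count; Length : Natural) is record
      Dim  : Extents (1 .. Axes);
      Flat : Element_Array (1 .. Length);
   end record;

   function Get_Pixel (D : Data; Index : Extents) return Element;
   procedure Set_Pixel
     (D : in out Data; Index : Extents; Value : Element);
end Flat_Image;
```

Serialization then works on Flat directly, block by block, without the type system ever needing a 999-dimensional array.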
* Re: A little trouble with very large arrays. 2018-10-05 6:36 ` Dmitry A. Kazakov @ 2018-10-05 16:56 ` Shark8 2018-10-05 18:07 ` Niklas Holsti 2018-10-05 19:06 ` Dmitry A. Kazakov 0 siblings, 2 replies; 16+ messages in thread From: Shark8 @ 2018-10-05 16:56 UTC (permalink / raw) On Friday, October 5, 2018 at 12:36:47 AM UTC-6, Dmitry A. Kazakov wrote: > On 2018-10-04 23:38, Shark8 wrote: > > GNAT won't even compile an array like this [999 indexes]. > > > > What's the proper way to go about doing this? > > A wrong way dealing with protocols is attempting to define an Ada type > having the exact representation of the data as defined by the protocol. > It is both useless and difficult to impossible, especially if bits are > involved. Protocol? FITS is a file-format. The only reason bits are involved at all in the spec is because it was developed back when some machines had 9-bit bytes. It's all defined based on 2880 byte blocks at the very lowest level; atop that there are headers (key-value pairs) and data-arrays/-structure (indicated by data within the header). > As a starting point consider representation clauses non-existent and > simply provide operations to construct reasonably defined Ada objects > from raw protocol data and conversely. Nobody would ever program > anything using 999-D arrays. Nobody would ever instantiate n**1000 > instances. I still need a way to conform to the standard, that means if the standard says that it's possible to have a 999-dimension array, I need to have some way to represent this... even if it is never in actuality used. > > You could use a flat array internally and provide operations for image > serialization/deserialization in whatever format, e.g. by > Get_Pixel/Set_Pixel. I tried this, it doesn't quite work though. (Stack overflow, oddly enough.) 
Function Flatten( Item : Axis_Dimensions ) return Natural is (case Item'Length is when 0 => 1, when 1 => Item( Item'First ), when 2 => Item( Item'First ) * Item( Item'Last ), when others => Flatten( Item(Item'First..Item'Last/2) ) * Flatten( Item(Axis_Count'Succ(Item'Last/2)..Item'Last) ) ); > > The hardest problem would be controlling bit representations. If they > really mean that. Modern hardware usually handles octets atomically and > simply does not allow accessing individual bits. There is basically no > way to tell the bit order programmatically or even define "order". > > -- > Regards, > Dmitry A. Kazakov > http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: A little trouble with very large arrays. 2018-10-05 16:56 ` Shark8 @ 2018-10-05 18:07 ` Niklas Holsti 2018-10-05 19:06 ` Dmitry A. Kazakov 1 sibling, 0 replies; 16+ messages in thread From: Niklas Holsti @ 2018-10-05 18:07 UTC (permalink / raw) On 18-10-05 19:56 , Shark8 wrote: > On Friday, October 5, 2018 at 12:36:47 AM UTC-6, Dmitry A. Kazakov wrote: >> On 2018-10-04 23:38, Shark8 wrote: >>> GNAT won't even compile an array like this [999 indexes]. >>> >>> What's the proper way to go about doing this? >> >> A wrong way dealing with protocols is attempting to define an Ada type >> having the exact representation of the data as defined by the protocol. >> It is both useless and difficult to impossible, especially if bits are >> involved. > > Protocol? > FITS is a file-format. Which is defined as a sequence of bytes, so a FITS file is equivalent to a message in a protocol such as SMTP etc. Usually a very _long_ message, of course. > The only reason bits are involved at all in the > spec is because it was developed back when some machines had 9-bit > bytes. FITS version 4.0 defines everything with 8-bit bytes, as far as I could see with a glance at the standard. Do you need to process some older FITS files with a different byte-size? Yes, the FITS block size (2880 octets) was chosen to be divisible by 9, and other ancient word-sizes and byte-sizes, but so what? >> You could use a flat array internally and provide operations for image >> serialization/deserialization in whatever format, e.g. by >> Get_Pixel/Set_Pixel. > > I tried this, it doesn't quite work though. (Stack overflow, oddly enough.) > Function Flatten( Item : Axis_Dimensions ) return Natural is > (case Item'Length is > when 0 => 1, > when 1 => Item( Item'First ), > when 2 => Item( Item'First ) * Item( Item'Last ), > when others => > Flatten( Item(Item'First..Item'Last/2) ) * That Item'Last/2 does not seem right. If you want the middle index, it should be (Item'First + Item'Last) / 2. 
Perhaps this error leads to an unending recursion, explaining the stack overflow. > Flatten( Item(Axis_Count'Succ(Item'Last/2)..Item'Last) ) > ); But what is this function supposed to do? Is it meant to compute the length (number of elements) in the flattened array? That is just the product of the Axis_Dimension values, isn't it? function Product (Item : Axis_Dimensions) return Natural is Result : Natural := 1; begin for I in Item'Range loop Result := Result * Item(I); end loop; return Result; end Product; For computing the position (flattened index) of an element in a flattened multi-dimensional array, you need a function that takes two arguments: - a vector giving the length of each axis - a vector giving the index (of the element) for each axis. Coding that function as a double recursion gives no benefit IMO. A simple loop is better, as in the function above. Also remember that the FITS array is in Fortran order, so the index of the first axis varies most rapidly in the flattened sequence of array elements. This can be done by a "loop .. in reverse ...". -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ . ^ permalink raw reply [flat|nested] 16+ messages in thread
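The Fortran-order position computation described above can be sketched as a Horner-style loop over the axes in reverse, using the Axis_Dimensions type from the original post (the name Flat_Index is an assumption; Lengths and Indices are assumed to cover the same axis range, with each index in 1 .. its axis length):

```ada
--  Compute the 1-based position of an element in the flattened
--  array, with the first axis varying fastest (Fortran order).
function Flat_Index
  (Lengths : Axis_Dimensions;   --  NAXIS1 .. NAXISn
   Indices : Axis_Dimensions)   --  chosen index along each axis
   return Positive
is
   Result : Positive := 1;
begin
   for I in reverse Lengths'Range loop
      Result := (Result - 1) * Lengths (I) + Indices (I);
   end loop;
   return Result;
end Flat_Index;
```

For example, with Lengths => (3, 4) and Indices => (2, 3) this yields 8, agreeing with the direct formula 1 + (2 - 1) + (3 - 1) * 3 for column-major order.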
* Re: A little trouble with very large arrays. 2018-10-05 16:56 ` Shark8 2018-10-05 18:07 ` Niklas Holsti @ 2018-10-05 19:06 ` Dmitry A. Kazakov 1 sibling, 0 replies; 16+ messages in thread From: Dmitry A. Kazakov @ 2018-10-05 19:06 UTC (permalink / raw) On 2018-10-05 18:56, Shark8 wrote: > I still need a way to conform to the standard, that means if the standard says that it's possible to have a 999-dimension array, I need to have some way to represent this... even if it is never in actuality used. No. You only need to support applications reading/writing 999-D arrays in the defined format. Nothing in the standard orders any application to actually have 999-D arrays or any arrays at all. This is why it is so important to distinguish objects and their representations as defined by the protocol from the objects and their representations in the application. The problems you face arise from an attempt to equate them. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2018-10-06 21:40 UTC | newest] Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-10-04 21:38 A little trouble with very large arrays Shark8 2018-10-05 6:17 ` Jacob Sparre Andersen 2018-10-05 6:20 ` Niklas Holsti 2018-10-05 16:47 ` Shark8 2018-10-05 17:39 ` Niklas Holsti 2018-10-05 19:49 ` Shark8 2018-10-05 20:31 ` Dmitry A. Kazakov 2018-10-06 16:04 ` Jeffrey R. Carter 2018-10-06 18:49 ` Shark8 2018-10-06 21:40 ` Jeffrey R. Carter 2018-10-06 6:40 ` Jacob Sparre Andersen 2018-10-06 9:35 ` Niklas Holsti 2018-10-05 6:36 ` Dmitry A. Kazakov 2018-10-05 16:56 ` Shark8 2018-10-05 18:07 ` Niklas Holsti 2018-10-05 19:06 ` Dmitry A. Kazakov