From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham autolearn_force=no version=3.4.4 X-Google-Language: ENGLISH,ASCII-7-bit X-Google-Thread: 103376,13ab88b30e0f779d X-Google-Attributes: gid103376,public X-Google-ArrivalTime: 2002-12-27 17:44:54 PST Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!headwall.stanford.edu!fu-berlin.de!uni-berlin.de!pc-62-30-113-45-cr.blueyonder.co.UK!not-for-mail From: "Nick Roberts" Newsgroups: comp.lang.ada Subject: Re: Efficient Matrix? Date: Sat, 28 Dec 2002 01:44:50 -0000 Message-ID: References: <3e0b2a66_4@news.bluewin.ch> NNTP-Posting-Host: pc-62-30-113-45-cr.blueyonder.co.uk (62.30.113.45) X-Trace: fu-berlin.de 1041039892 8231923 62.30.113.45 (16 [25716]) X-Priority: 3 X-MSMail-Priority: Normal X-Newsreader: Microsoft Outlook Express 5.50.4522.1200 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200 Xref: archiver1.google.com comp.lang.ada:32349 Date: 2002-12-28T01:44:50+00:00 List-Id: [Jonas Gasser asked about how to efficiently model a large boolean matrix in Ada.] It may be worth pointing out that on a computer with typical virtual memory management, if patterns of actual access to your large boolean array will tend to be 'sparse' (i.e. your program will tend to concentrate on one or a few small parts of the matrix at a time during computation), only a small part of the matrix will be part of the 'working set' (the set of memory pages which are actually being used at a certain 'moment', e.g. each second, during the program's execution). In this case, the fact that the matrix takes up many megabytes (or even gigabytes) of memory may recede in importance compared to the fact that only a small part (perhaps a few 100 kilobytes) needs to be in RAM at any one time during most of the program's execution. Or to put it more simply, if you are running your program on a virtual memory machine, try the big array declaration and see if it works! If it seems intolerably slow, try another more sophisticated strategy, such as a hash table, and see if this increases speed (it may decrease speed, so be wary). One technique that may be worth investigating (again, on a virtual memory machine) is that of using a small hash table which simply uses the full array to resolve collisions. type Matrix is array (Positive range <>, Positive range <>) of Boolean; type Hash_Index is range 0..999; -- whatever range is appropriate function Hash (i, j: Positive) return Hash_Index; type Hashtable_Entry is record Value: Boolean := False; -- or True if most values will be true Collision: Boolean := False; end record; type Matrix_by_Hash (Hash_Index) of Hashtable_Entry; M1: Matrix(1..10000,1..5000) := (others => (others => False)); -- M1 may need to be dynamically allocated. -- The initialisation may need to be different or omitted. H1: Matrix_by_Hash; function Get_M1 (i, j: in Positive) return Boolean is h: constant Hash_Index := Hash(i,j); E: constant Hashtable_Entry := H1(h); begin if E.Collision then return M1(i,j); else return E.Value; end if; end; procedure Set_M1 (i, j: in Positive; b: in Boolean) is h: constant Hash_Index := Hash(i,j); begin M1(i,j) := b; if H1(h).Value /= b then H1(h).Collision := True; end if; end; It is one of the basic premises of the hashing technique that the data written into the full matrix will tend, in practice, to be sparse (i.e. that it will contain mainly falses (or maybe that it will have mainly trues)). The computation of the hashing index (the Hash function) always needs to be fast, but particularly so in this kind of situation, I think. I'll leave the actual choice of the Hash function and Hash_Index range to you. Also, for the above specific technique, since Set_M1 always accesses the big M1 array, it is supposed that the number of reads will greatly outnumber the number of writes. You may well wish to apply pragma Inline to Hash, Get_M1 and Set_M1. You will almost certainly wish to apply pragma Pack to M1 (or the (sub)type Matrix), and possibly also to H1 (or Matrix_by_Hash) and Hashtable_Entry. If this technique works, you ought to 'pot' the matrix types and functions in a package; possibly a generic one which takes the full matrix array type, the hash index type, and the hashing function as generic parameters. Hope this is of some help (or interest). Good luck! -- Nick Roberts