* reading a text file into a string @ 2004-07-15 17:27 zork 2004-07-15 17:49 ` Marius Amado Alves ` (5 more replies) 0 siblings, 6 replies; 44+ messages in thread
From: zork @ 2004-07-15 17:27 UTC (permalink / raw)

hi, i would like to read a whole text file into a string. I thought of using
an unbounded_string for this:

----------
c, char: character;
text : unbounded_string;
...
-- read in text file
while not end_of_file ( File ) loop
   Get ( File, c );
   append ( text, c );
end loop;
...
put ( to_string ( text ) ); -- display content of unbounded_string

-- process text
for i in 1 .. length ( text ) loop
   char := element ( text, i );
   ...
end loop;
-----------

... is this the general way of going about it? or is there a more preferred
method of reading in a whole text file (into whatever format) for
processing?

Thanks again!

cheers,
zork

^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-15 17:27 reading a text file into a string zork @ 2004-07-15 17:49 ` Marius Amado Alves 2004-07-15 19:57 ` Nick Roberts 2004-07-15 17:59 ` Marius Amado Alves ` (4 subsequent siblings) 5 siblings, 1 reply; 44+ messages in thread
From: Marius Amado Alves @ 2004-07-15 17:49 UTC (permalink / raw)
Cc: comp.lang.ada

Unbounded_String is the right container for this. But note Get for
characters skips over newlines and relatives (I think!) Do you really
want to lose that information? If not then consider using Get_Immediate
or Get_Line. This assuming you are using Ada.Text_IO.

^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-15 17:49 ` Marius Amado Alves @ 2004-07-15 19:57 ` Nick Roberts 0 siblings, 0 replies; 44+ messages in thread
From: Nick Roberts @ 2004-07-15 19:57 UTC (permalink / raw)

On Thu, 15 Jul 2004 18:49:17 +0100, Marius Amado Alves
<amado.alves@netcabo.pt> wrote:

> Unbounded_String is the right container for this. But note Get
> for characters skips over newlines and relatives (I think!) Do
> you really want to lose that information? If not then consider
> using Get_Immediate or Get_Line. This assuming you are using
> Ada.Text_IO.

I don't suggest using Get_Immediate for this purpose.

You could use: (a) a combination of Get and the End_of_Line function; or
(b) Get_Line.

Option (a) is likely to be slower, but this might not worry you. If you
use option (b), you must either not care about lines which are too long
(or know that none are) or you must write special code to deal with
overlong lines.

Example of option (a):

   c, char: character;
   text : unbounded_string;
   Line_Break: constant Character := Ada.Characters.Latin_1.NUL;
   ...
   -- read in text file
   while not End_of_File(File) loop
      if End_of_Line(File) then
         Append( text, Line_Break );
         Skip_Line(File);
      else
         -- only Get when a character is pending on this line; otherwise
         -- Get would skip further terminators or run into end of file
         Get( File, c );
         Append( text, c );
      end if;
   end loop;
   ...
   -- display content of unbounded_string:
   for i in 1..Length(text) loop
      if Element(text,i) = Line_Break then
         New_Line;
      else
         Put( Element(text,i) );
      end if;
   end loop;

   -- process text
   for i in 1 .. length ( text ) loop
      char := element ( text, i );
      ...
   end loop;

This will work, but it may not be very efficient.

Example of option (b):

   with AI302.Containers.Vectors;
   with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
   ...
   package Line_Vectors is
      new AI302.Containers.Vectors(Positive,Unbounded_String);
   use Line_Vectors;
   ...
   Text: Line_Vectors.Vector_Type;
   Line: String(1..100);
   LL: Natural;
   ...
   -- Read in text file:
   while not End_of_File(File) loop
      Get_Line( File, Line, LL ); -- read line or line segment
      Append( Text, To_Unbounded_String(Line(1..LL)) );
   end loop;

   -- Display content of the vector:
   for i in 1..Natural(Length(Text)) loop
      Put_Line( To_String( Element(Text,i) ) );
   end loop;
   ...

If a line is in the file which is longer than 100 characters, it will be
broken into two or more lines in the Text variable. Try this yourself.

You can download the AI-302 sample implementation packages from:

   http://home.earthlink.net/~matthewjheaney/charles/ai302-20040227.zip

courtesy of Matthew Heaney (thanks Matt).

--
Nick Roberts

^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-15 17:27 reading a text file into a string zork 2004-07-15 17:49 ` Marius Amado Alves @ 2004-07-15 17:59 ` Marius Amado Alves 2004-07-15 19:18 ` Nick Roberts 2004-07-15 19:18 ` Nick Roberts ` (3 subsequent siblings) 5 siblings, 1 reply; 44+ messages in thread
From: Marius Amado Alves @ 2004-07-15 17:59 UTC (permalink / raw)
To: comp.lang.ada

"...is there a more preferred method of reading in a whole text file?"

The preference depends. The methods are:

- declare a string S of size equal to the file size and then call a
  standard string reading procedure with S as the out parameter

- use stream attributes 'Read, 'Input of type String and facilities in
  Ada.Streams.Stream_IO

^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-15 17:59 ` Marius Amado Alves @ 2004-07-15 19:18 ` Nick Roberts 0 siblings, 0 replies; 44+ messages in thread
From: Nick Roberts @ 2004-07-15 19:18 UTC (permalink / raw)

On Thu, 15 Jul 2004 18:59:06 +0100, Marius Amado Alves
<amado.alves@netcabo.pt> wrote:

> "...is there a more preferred method of reading in a whole text file?"
>
> The preference depends.

That's true, there are many ways to skin a cat, and how you read and
process the file will very much depend on what you are trying to do.

> The methods are:
>
> - declare a string S of size equal to the file size and
>   then call a standard string reading procedure with S as
>   the out parameter

The problem with this idea is that there is no standard way to determine
the size of a text file. For some kinds of text 'file' (such as a device
or pipe), it may not be possible to tell in advance by any means.

> - use stream attributes 'Read, 'Input of type String and
>   facilities in Ada.Streams.Stream_IO

I think this is a terrible idea, unless you know what the character
encoding is, and wish to do the decoding yourself!

--
Nick Roberts

^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-15 17:27 reading a text file into a string zork 2004-07-15 17:49 ` Marius Amado Alves 2004-07-15 17:59 ` Marius Amado Alves @ 2004-07-15 19:18 ` Nick Roberts 2004-07-15 20:02 ` Nick Roberts 2004-07-16 1:23 ` Jeffrey Carter ` (2 subsequent siblings) 5 siblings, 1 reply; 44+ messages in thread
From: Nick Roberts @ 2004-07-15 19:18 UTC (permalink / raw)

On Fri, 16 Jul 2004 03:27:57 +1000, zork <zork@nospam.com> wrote:

> hi, i would like to read a whole text file into a string. I thought of
> using an unbounded_string for this:
>
> ----------
> c, char: character;
> text : unbounded_string;
> ...
> -- read in text file
> while not end_of_file ( File ) loop
>    Get ( File, c );
>    append ( text, c );
> end loop;
> ...
> put ( to_string ( text ) ); -- display content of unbounded_string
>
> -- process text
> for i in 1 .. length ( text ) loop
>    char := element ( text, i );
>    ...
> end loop;
> -----------
>
> ... is this the general way of going about it? or is there a more
> preferred method of reading in a whole text file (into whatever format)
> for processing?

It is usual to read and process information from files a piece at a time.
Quite often a file is read a piece at a time, each piece is interpreted
in some way, and then some structure is built up in memory from the
interpreted pieces. Then, typically, further processing is done using the
whole structure.

It is unusual to read an entire text file into a string in memory.
However, sometimes this may be a quick and convenient technique for
achieving results in a hurry. An unbounded string will generally be the
appropriate data structure to use for this purpose.

The problem you do not address with the code you suggest above -- as
another poster has pointed out -- is that of line breaks. One easy
possibility might be to insert:

   if End_of_Line(File) then
      Append( text, Ada.Characters.Latin_1.NUL );
      Skip_Line(File);
   end if;

between the Get and the append. Line breaks are then indicated by the NUL
character, and could be processed as such. This should work provided the
file itself does not contain any NULs.

--
Nick Roberts

^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-15 19:18 ` Nick Roberts @ 2004-07-15 20:02 ` Nick Roberts 0 siblings, 0 replies; 44+ messages in thread
From: Nick Roberts @ 2004-07-15 20:02 UTC (permalink / raw)

On Thu, 15 Jul 2004 20:18:35 +0100, Nick Roberts <nick.roberts@acm.org>
wrote:

> insert:
>
>    if End_of_Line(File) then
>       Append( text, Ada.Characters.Latin_1.NUL );
>       Skip_Line(File);
>    end if;
>
> between the Get and the append.

Sorry, I should have said /before/ the Get.

--
Nick Roberts

^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-15 17:27 reading a text file into a string zork ` (2 preceding siblings ...) 2004-07-15 19:18 ` Nick Roberts @ 2004-07-16 1:23 ` Jeffrey Carter 2004-07-16 2:20 ` Steve 2004-07-16 2:26 ` Steve 5 siblings, 0 replies; 44+ messages in thread
From: Jeffrey Carter @ 2004-07-16 1:23 UTC (permalink / raw)

zork wrote:

> while not end_of_file ( File ) loop
>    Get ( File, c );
>    append ( text, c );
> end loop;

This will work. As others have pointed out, Get skips line terminators.
I'll assume you're not interested in them.

A "better" way to do this is to use Get_Line:

   Line : String (1 .. Max);
   Last : Natural;
   ...
   Read : loop
      exit Read when End_Of_File (File);

      Get_Line (File => File, Item => Line, Last => Last);
      Append (Text, Line (1 .. Last) );
   end loop Read;

Get_Line returns when the String (Line) is filled (in which case
Last = Max) or a line terminator is encountered (in which case
Last < Max), whichever comes first; if a line terminator is encountered,
it is skipped.

You can also use a function such as PragmARC.Get_Line, which reads an
entire line and skips the line terminator:

   Read : loop
      exit Read when End_Of_File (File);

      Append (Text, PragmARC.Get_Line (File) );
   end loop Read;

This is especially convenient if you want to add a special Character to
indicate line terminators:

   Append (Text, PragmARC.Get_Line (File) & EOT);

--
Jeff Carter
"The time has come to act, and act fast. I'm leaving."
Blazing Saddles
36

^ permalink raw reply [flat|nested] 44+ messages in thread
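For readers who do want to keep the line breaks, the Get_Line loop described above could be completed roughly as follows. This is only a sketch, not code from the thread: the file name "demo.txt", the chunk size of 256 and the choice of LF as the in-memory line marker are all assumptions made for illustration.

```ada
with Ada.Text_IO;
with Ada.Strings.Unbounded;
with Ada.Characters.Latin_1;

procedure Read_Lines_Demo is
   use Ada.Text_IO;
   use Ada.Strings.Unbounded;

   Max   : constant := 256;            -- chunk size (arbitrary assumption)
   File  : File_Type;
   Chunk : String (1 .. Max);
   Last  : Natural;
   Text  : Unbounded_String;
begin
   Open (File, In_File, "demo.txt");   -- hypothetical file name

   Read : loop
      exit Read when End_Of_File (File);

      Get_Line (File => File, Item => Chunk, Last => Last);
      Append (Text, Chunk (1 .. Last));

      --  Last < Max means Get_Line stopped at a line terminator and
      --  skipped it, so record one marker character in its place.
      --  A full chunk (Last = Max) is a line segment; the terminator,
      --  if any, is picked up by the next call.
      if Last < Max then
         Append (Text, Ada.Characters.Latin_1.LF);
      end if;
   end loop Read;

   Close (File);
   Put (To_String (Text));
end Read_Lines_Demo;
```

One known gap in this sketch: if the final line is exactly Max characters long, the loop ends before a marker is appended for it.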
* Re: reading a text file into a string 2004-07-15 17:27 reading a text file into a string zork ` (3 preceding siblings ...) 2004-07-16 1:23 ` Jeffrey Carter @ 2004-07-16 2:20 ` Steve 2004-07-16 2:26 ` Steve 5 siblings, 0 replies; 44+ messages in thread
From: Steve @ 2004-07-16 2:20 UTC (permalink / raw)

^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-15 17:27 reading a text file into a string zork ` (4 preceding siblings ...) 2004-07-16 2:20 ` Steve @ 2004-07-16 2:26 ` Steve 2004-07-16 16:16 ` Jeffrey Carter 2004-07-16 21:19 ` Randy Brukardt 5 siblings, 2 replies; 44+ messages in thread
From: Steve @ 2004-07-16 2:26 UTC (permalink / raw)

Sorry for the blank reply (obese finger)

The easiest way to read an entire file into a string, if you're looking
for speed, is to use Ada.Direct_Io;

Here is a small working example:

with Ada.Direct_Io;
with Ada.Text_Io;

procedure Demo is

   function Get_File_Size( file_name : String ) return Natural is
      package Direct_Io_Char_File is new Ada.Direct_IO( Character );
      use Direct_Io_Char_File;
      size_file : File_Type;
      result : Natural;
   begin
      Open( size_file, In_File, file_name );
      result := Natural( Size( size_file ) );
      Close( size_file );
      return result;
   end Get_File_Size;

begin
   declare
      File_Size : Natural := Get_File_Size( "demo.txt" );
      subtype File_String is String( 1 .. File_Size );
      package File_Reader is new Ada.Direct_Io( File_String );
      in_file : File_Reader.File_Type;
      file_data : File_String;
   begin
      File_Reader.Open( in_file, File_Reader.In_File, "demo.txt" );
      File_Reader.Read( in_file, file_data );
      File_Reader.Close( in_file );
      -- Do what you will with file_data
      Ada.Text_Io.Put( file_data );
   end;
end Demo;

The basic idea is to first create an instance of direct_io for
characters, just to use the "size" function to find out how many
characters are in the file. Then create an instance of direct_io for a
string of the same length as the file, and do one read.

This gives you the content of the file as one raw string. Which may or
may not be what you want, but it is what you asked for.

Steve
(The Duck)

"zork" <zork@nospam.com> wrote in message
news:40f6bf21@dnews.tpgi.com.au...
> hi, i would like to read a whole text file into a string. I thought of
> using an unbounded_string for this:
>
> ----------
> c, char: character;
> text : unbounded_string;
> ...
> -- read in text file
> while not end_of_file ( File ) loop
>    Get ( File, c );
>    append ( text, c );
> end loop;
> ...
> put ( to_string ( text ) ); -- display content of unbounded_string
>
> -- process text
> for i in 1 .. length ( text ) loop
>    char := element ( text, i );
>    ...
> end loop;
> -----------
>
> ... is this the general way of going about it? or is there a more
> preferred method of reading in a whole text file (into whatever format)
> for processing?
>
> Thanks again!
>
> cheers,
> zork

^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-16 2:26 ` Steve @ 2004-07-16 16:16 ` Jeffrey Carter 2004-07-16 17:45 ` Nick Roberts 2004-07-16 21:19 ` Randy Brukardt 1 sibling, 1 reply; 44+ messages in thread
From: Jeffrey Carter @ 2004-07-16 16:16 UTC (permalink / raw)

Steve wrote:

> This gives you the content of the file as one raw string. Which may
> or may not be what you want, but it is what you asked for.

Right, including line terminators, page terminators, and file
terminators, if they exist, and which vary from system to system (under
UNIX, lines are terminated by LF; under DOS/Windows, by CR LF; perhaps
someone can comment on what happens with this approach under VMS).
Therefore, this approach is rarely used for production programs, which
usually want a platform-independent representation of the file.

--
Jeff Carter
"If you think you got a nasty taunting this time, you ain't heard
nothing yet!"
Monty Python and the Holy Grail
23

^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-16 16:16 ` Jeffrey Carter @ 2004-07-16 17:45 ` Nick Roberts 0 siblings, 0 replies; 44+ messages in thread
From: Nick Roberts @ 2004-07-16 17:45 UTC (permalink / raw)

On Fri, 16 Jul 2004 16:16:33 GMT, Jeffrey Carter <spam@spam.com> wrote:

> Steve wrote:
>
>> This gives you the content of the file as one raw string. Which
>> may or may not be what you want, but it is what you asked for.
>
> Right, including line terminators, page terminators, and file
> terminators, if they exist, and which vary from system to system
> (under UNIX, lines are terminated by LF; under DOS/Windows, by CR
> LF; perhaps someone can comment on what happens with this approach
> under VMS). Therefore, this approach is rarely used for
> production programs, which usually want a platform-independent
> representation of the file.

Not to mention other differences in basic file format and character
encoding. On some systems, trying this will produce complete and utter
rubbish, and on some systems it will fail (with an exception) when you
try to open the file (because text and binary files are immiscible). I
think the latter is the case on VMS with DEC Ada.

--
Nick Roberts

^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-16 2:26 ` Steve 2004-07-16 16:16 ` Jeffrey Carter @ 2004-07-16 21:19 ` Randy Brukardt 2004-07-17 2:27 ` Robert I. Eachus 1 sibling, 1 reply; 44+ messages in thread
From: Randy Brukardt @ 2004-07-16 21:19 UTC (permalink / raw)

"Steve" <nospam_steved94@comcast.net> wrote in message
news:E1HJc.101277$Oq2.96646@attbi_s52...
> Sorry for the blank reply (obese finger)
>
> The easiest way to read an entire file into a string, if you're looking
> for speed, is to use Ada.Direct_Io;
...

That's how you'd do that in Ada 83, but in Ada 95 you could do the same
with Stream_IO, without needing the weird instantiations. (Yes, you'd be
assuming that Stream_Element'Size = Character'Size, but since this
technique really only works on Windows and Unix anyway [as noted by
others], that's not an issue.)

Randy.

^ permalink raw reply [flat|nested] 44+ messages in thread
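The Stream_IO variant mentioned above is not spelled out in the thread. A minimal sketch of what it might look like is given here; the names Slurp_Demo and Read_Whole_File and the file name "demo.txt" are made up for illustration, and the code relies on the same assumption Randy states (Stream_Element'Size = Character'Size, one stream element per character).

```ada
with Ada.Streams.Stream_IO;
with Ada.Text_IO;

procedure Slurp_Demo is
   use Ada.Streams.Stream_IO;

   --  Return the raw content of the named file as one String.
   --  Assumes one stream element per Character.
   function Read_Whole_File (Name : String) return String is
      File : File_Type;
   begin
      Open (File, In_File, Name);
      declare
         --  Size gives the file length in stream elements.
         Contents : String (1 .. Natural (Size (File)));
      begin
         --  String'Read fills the constrained string element by element.
         String'Read (Stream (File), Contents);
         Close (File);
         return Contents;
      end;
   end Read_Whole_File;

begin
   Ada.Text_IO.Put (Read_Whole_File ("demo.txt"));  -- hypothetical file
end Slurp_Demo;
```

As with the Direct_IO version, the result is the raw file content, including whatever line terminators the platform uses.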
* Re: reading a text file into a string 2004-07-16 21:19 ` Randy Brukardt @ 2004-07-17 2:27 ` Robert I. Eachus 2004-07-17 11:31 ` Mats Weber ` (2 more replies) 0 siblings, 3 replies; 44+ messages in thread
From: Robert I. Eachus @ 2004-07-17 2:27 UTC (permalink / raw)

There have been a lot of useful tips in this thread on how to accomplish
the stated goal. But what is really missing is a discussion of HOW a
newbie should decide what he actually wants to do.

What you have to do is refine your requirements, and that can be the most
important, and most time consuming step when programming in Ada. I
usually state it as Ada is much better at doing what you tell it to do
than other languages. But it is like a four-year old child, always
asking, "Why?"

So before you decide whether to represent line-breaks with nulls,
linefeeds, or copy the existing characters exactly, you have to know the
answer to the "Why?" question. In this case, "Why are you reading the
file?" Once you know whether you need a bitwise copy of the file, to
parse the text and reformat it, or merely to scan through the contents of
the file, then you can decide the right way to read the file.

I usually find that when I have thought about it enough, I want to do
line at a time processing, rather than character at a time, or reading
the entire file in one gulp.

For this reason, I find myself constructing or using a Get_Line FUNCTION
inside a loop and a declare block:

while not End_of_File(Somefile) loop
   declare
      Buffer: String := Get_Line(Somefile);
   begin
      -- process buffer
   exception
      ...
   end;
end loop;

Each iteration of the loop, the Buffer contains a CONSTANT String, but it
is potentially different in length and content every time through.

Incidentally, GNAT has a special package to allow you to do a Get_Line
into an Unbounded_String, no matter how long. I think I posted a "clever
example" of how to do it here, and if you need it I can find it again.
(The code is an elegant example of the use of recursion. Using the GNAT
equivalent is better performance-wise if you really are reading
multi-megabyte lines.)

--
Robert I. Eachus

"The flames kindled on the Fourth of July, 1776, have spread over too
much of the globe to be extinguished by the feeble engines of despotism;
on the contrary, they will consume these engines and all who work them."
-- Thomas Jefferson, 1821

^ permalink raw reply [flat|nested] 44+ messages in thread
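The recursive Get_Line function alluded to above is not reproduced in the thread. One common way to write such a function is sketched below; this is not necessarily the code Robert posted. The name Get_Whole_Line, the 80-character chunk size and the file name "demo.txt" are assumptions for illustration.

```ada
with Ada.Text_IO;

procedure Line_Lengths is
   use Ada.Text_IO;

   --  Return the rest of the current line, however long it is, by
   --  reading one fixed-size chunk and recursing while chunks come
   --  back full.
   function Get_Whole_Line (File : File_Type) return String is
      Buffer : String (1 .. 80);   -- chunk size (arbitrary assumption)
      Last   : Natural;
   begin
      Get_Line (File, Buffer, Last);
      if Last < Buffer'Last then
         --  A line terminator was reached (and skipped).
         return Buffer (1 .. Last);
      elsif End_Of_File (File) then
         --  Full chunk and nothing but terminators left: stop here so
         --  the recursive call cannot run past the end of the file.
         return Buffer;
      else
         return Buffer & Get_Whole_Line (File);
      end if;
   end Get_Whole_Line;

   File : File_Type;
begin
   Open (File, In_File, "demo.txt");   -- hypothetical file name
   while not End_Of_File (File) loop
      declare
         Line : constant String := Get_Whole_Line (File);
      begin
         --  Line is exactly as long as the line that was read.
         Put_Line (Natural'Image (Line'Length));
      end;
   end loop;
   Close (File);
end Line_Lengths;
```

Each level of recursion keeps its own 80-character buffer on the stack and the pieces are concatenated on the way back, which is why a library routine that reads straight into an Unbounded_String is likely to behave better for very long lines.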
* Re: reading a text file into a string 2004-07-17 2:27 ` Robert I. Eachus @ 2004-07-17 11:31 ` Mats Weber 2004-07-17 15:52 ` Robert I. Eachus 2004-07-19 8:07 ` Dale Stanbrough 2004-07-19 11:51 ` Ada2005 (was " Peter Hermann 2 siblings, 1 reply; 44+ messages in thread
From: Mats Weber @ 2004-07-17 11:31 UTC (permalink / raw)

In article <fOednXzORfHlE2Xd4p2dnA@comcast.com>,
"Robert I. Eachus" <rieachus@comcast.net> wrote:

>while not End_of_File(Somefile) loop
>   declare
>      Buffer: String := Get_Line(Somefile);
>   begin
>      -- process buffer
>   exception
>      ...
>   end;
>end loop;
>
>Each iteration of the loop, the Buffer contains a CONSTANT String, but

It's constant only if you declare it constant, as in

   Buffer: constant String := Get_Line(Somefile);

>it is potentially different in length and content every time through.

^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-17 11:31 ` Mats Weber @ 2004-07-17 15:52 ` Robert I. Eachus 2004-07-17 22:38 ` Jeffrey Carter 0 siblings, 1 reply; 44+ messages in thread
From: Robert I. Eachus @ 2004-07-17 15:52 UTC (permalink / raw)

Mats Weber wrote:

>>Each iteration of the loop, the Buffer contains a CONSTANT String, but
>
> It's constant only if you declare it constant, as in
>
>    Buffer: constant String := Get_Line(Somefile);
>
>>it is potentially different in length and content every time through.

When I woke up this morning my mind told me I'd goofed. What it really
said was something like, "You IDIOT, you left out the word length, and
worse you emphasized the wrong word." My brain is pretty nasty before it
gets its morning fix of caffeine. ;-)

I meant to say "the Buffer contains a constant LENGTH String..." In Ada
83 you had to declare the String a constant for this idiom to work, but
that wasn't what I was trying to say. The magic is that each time through
the loop the buffer is exactly the right size to hold the line. If you
need to be able to change the length of the buffer though, you have to
use Unbounded_String.

--
Robert I. Eachus

"The flames kindled on the Fourth of July, 1776, have spread over too
much of the globe to be extinguished by the feeble engines of despotism;
on the contrary, they will consume these engines and all who work them."
-- Thomas Jefferson, 1821

^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-17 15:52 ` Robert I. Eachus @ 2004-07-17 22:38 ` Jeffrey Carter 2004-07-18 13:44 ` zork 0 siblings, 1 reply; 44+ messages in thread
From: Jeffrey Carter @ 2004-07-17 22:38 UTC (permalink / raw)

Robert I. Eachus wrote:

> I meant to say "the Buffer contains a constant LENGTH String..." In Ada
> 83 you had to declare the String a constant for this idiom to work, but
> that wasn't what I was trying to say. The magic is that each time
> through the loop the buffer is exactly the right size to hold the line.
> If you need to be able to change the length of the buffer though, you
> have to use Unbounded_String.

In Ada 83, I often did something like

   Buffer_C : constant String := Some_Function;
   Buffer   : String (1 .. Buffer_C'Length) := Buffer_C;

so I could modify Buffer, and hoped that the compiler would be smart
enough to only keep one copy of the string.

--
Jeff Carter
"Nobody expects the Spanish Inquisition!"
Monty Python's Flying Circus
22

^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-17 22:38 ` Jeffrey Carter @ 2004-07-18 13:44 ` zork 0 siblings, 0 replies; 44+ messages in thread From: zork @ 2004-07-18 13:44 UTC (permalink / raw) Thanks everyone for your help on this one. I wasn't too worried about not being able to read in newlines. Everyone has been really helpful. This is a great board! Really appreciate the responses :) cheers zork ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-17 2:27 ` Robert I. Eachus 2004-07-17 11:31 ` Mats Weber @ 2004-07-19 8:07 ` Dale Stanbrough 2004-07-19 8:58 ` Martin Dowie 2004-07-19 11:51 ` Ada2005 (was " Peter Hermann 2 siblings, 1 reply; 44+ messages in thread
From: Dale Stanbrough @ 2004-07-19 8:07 UTC (permalink / raw)

Robert I. Eachus wrote:

> For this reason, I find myself constructing or using a Get_Line FUNCTION
> inside a loop and a declare block:
>
> while not End_of_File(Somefile) loop
>    declare
>       Buffer: String := Get_Line(Somefile);
>    begin
>       -- process buffer
>    exception
>       ...
>    end;
> end loop;

I use a generic procedure that has a process procedure as a parameter.
It gets called with each line of the file...

-- Apply the procedure Process to each line of the file
-- This allows for very simple file processing, with all of the
-- control bits (not much really) hidden away.
--
-- Each line is read from the file, and then passed to the
-- procedure
-- The maximum line size for the file is 1000 chars.
--
-- Typical use is
--
--    with Ada.Text_IO; use Ada.Text_IO;
--    with Ada.Integer_Text_IO; use Ada.Integer_Text_IO;
--
--    with Process_File;
--
--    procedure Count_Chars is
--
--       Count : Natural := 0;
--
--       procedure Count_Letters (Item : String) is
--       begin
--          Count := Count + Item'Length;
--       end;
--
--       procedure Count_Em is
--          new Process_File (Process => Count_Letters);
--    begin
--       Count_Em (<Somefilename>);
--       Put ("There are ..."); Put (Count); Put (" characters");
--    end;
--

generic
   with procedure Process (Line : String);
   Max_Line_Size : Positive := 1000;
   -- The maximum number of characters on any one line

procedure Process_File (Filename : String);

-----------------------------------------------------

it presumes a maximum line length, which is not so great, but is
otherwise a very convenient generic.

Dale

--
dstanbro@spam.o.matic.bigpond.net.au

^ permalink raw reply [flat|nested] 44+ messages in thread
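Only the specification of Process_File is shown in the post. A body matching that specification might look roughly like the sketch below. The body is an assumption made for illustration, not Dale's actual code; in particular, the way it treats lines longer than Max_Line_Size (handing them to Process in several pieces) is a guess.

```ada
--  process_file.ads: the specification as given in the post
generic
   with procedure Process (Line : String);
   Max_Line_Size : Positive := 1000;
   --  The maximum number of characters on any one line
procedure Process_File (Filename : String);

--  process_file.adb: an assumed body, not taken from the thread
with Ada.Text_IO;

procedure Process_File (Filename : String) is
   use Ada.Text_IO;

   File   : File_Type;
   Buffer : String (1 .. Max_Line_Size);
   Last   : Natural;
begin
   Open (File, In_File, Filename);

   while not End_Of_File (File) loop
      --  Reads up to Max_Line_Size characters; a longer line arrives
      --  as several consecutive calls to Process.
      Get_Line (File, Buffer, Last);
      Process (Buffer (1 .. Last));
   end loop;

   Close (File);
exception
   when others =>
      --  Do not leave the file open if Process (or the I/O) fails.
      if Is_Open (File) then
         Close (File);
      end if;
      raise;
end Process_File;
```

With a body along these lines, the Count_Chars example in the comment block would instantiate the generic with Count_Letters and call the instance once per file.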
* Re: reading a text file into a string 2004-07-19 8:07 ` Dale Stanbrough @ 2004-07-19 8:58 ` Martin Dowie 2004-07-21 0:17 ` Robert I. Eachus 0 siblings, 1 reply; 44+ messages in thread
From: Martin Dowie @ 2004-07-19 8:58 UTC (permalink / raw)

Dale Stanbrough wrote:

> generic
>    with procedure Process (Line : String);
>    Max_Line_Size : Positive := 1000;
>    -- The maximum number of characters on any one line
>
> procedure Process_File (Filename : String);
>
> -----------------------------------------------------
>
> it presumes a maximum line length, which is not so great, but
> is otherwise a very convenient generic.

But you could always override the default of 1000 with a more appropriate
value if you find it necessary.

Isn't there an argument for defaulting to 250 characters/line? Can't
remember what it was off the top of my head but it is ringing a bell...

^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-19 8:58 ` Martin Dowie @ 2004-07-21 0:17 ` Robert I. Eachus 2004-07-21 21:39 ` Randy Brukardt 0 siblings, 1 reply; 44+ messages in thread
From: Robert I. Eachus @ 2004-07-21 0:17 UTC (permalink / raw)

Martin Dowie wrote:

> Isn't there an argument for defaulting to 250 characters/line? Can't
> remember what it was off the top of my head but it is ringing a bell...

There is, and it is probably worth spelling out for programmers...

If you are buffering lines and want to avoid unintentional cache
pressure, you should try to force buffers into a single cache line if
possible. But what size to use to do that? Well first of all, Intel CPUs
tend to have a 256-byte cache line. If a cache read is in progress when
another read occurs, the CPU may cut the read off at 128 bytes. AMD
processors (Athlon, Opteron, etc.) have 64-byte cache lines, but will
normally read two lines on any memory read. So Intel asks for 256 bytes,
but may take 128, and AMD asks for 128 but may take 64. So 128 or
256 bytes is a good size for objects to be fit in a single cache line.

However, in Ada a String will have a descriptor associated with it. Also
when a line is stored in a file, it will usually have some sort of
decoration, either a length field, a line terminator, or an appended
null. So what you would like to do is:

   subtype Index is Integer range 0..250;
   type Buffer (Length: Index := 0) is record
      Contents: String(1..Length);
   end record;
   for Buffer'Alignment use 256; -- or 128 on Athlons. ;-)

Language lawyer note: RM 13.3(32) says: "An implementation need not
support specified Alignments that are greater than the maximum Alignment
the implementation ever returns by default." This applies to subtypes,
for stand-alone objects RM 13.3(35) says: "For stand-alone library-level
objects of statically constrained subtypes, the implementation should
support all Alignments supported by the target linker. For example, page
alignment is likely to be supported for such objects, but not for
subtypes." So in theory you may need to put the 'Alignment clause on
buffer objects instead.

GNAT however, won't even recognize those alignment clauses, which IMHO is
a shame:

------------------------------------------------------
package Test_Align is

   subtype Index is Integer range 0..250;
   type Buffer (Length: Index := 0) is record
      Contents: String(1..Length);
   end record;

   Buff: Buffer;

   for Buff'Alignment use 256;

end Test_Align;
-------------------------------------------------------

gnatmake test_align
gcc -c test_align.ads
test_align.ads:10:27: largest supported alignment for "Buff" is 4
gnatmake: "test_align.ads" compilation error

When the programmer can do something simple like this to improve program
performance, it should be supported by all compilers. (Notice that this
is an error, not a warning.) I can see not supporting the subtype case
due to the (potential) requirement for either large stack frames or
varying size stack frames. But for an object that can be allocated at
link time, I don't see why it shouldn't be supported.

--
Robert I. Eachus

"The flames kindled on the Fourth of July, 1776, have spread over too
much of the globe to be extinguished by the feeble engines of despotism;
on the contrary, they will consume these engines and all who work them."
-- Thomas Jefferson, 1821

^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-21 0:17 ` Robert I. Eachus @ 2004-07-21 21:39 ` Randy Brukardt 2004-07-22 22:34 ` Robert I. Eachus 0 siblings, 1 reply; 44+ messages in thread
From: Randy Brukardt @ 2004-07-21 21:39 UTC (permalink / raw)

"Robert I. Eachus" <rieachus@comcast.net> wrote in message
news:nM-dnegXLbmdK2DdRVn-hQ@comcast.com...
...
> When the programmer can do something simple like this to improve program
> performance, it should be supported by all compilers. (Notice that this
> is an error, not a warning.) I can see not supporting the subtype case
> due to the (potential) requirement for either large stack frames or
> varying size stack frames. But for an object that can be allocated at
> link time, I don't see why it shouldn't be supported.

What's an object allocated at link time? I don't know of any such thing
(you can allocate segments at link time, but the number of those is quite
limited).

Similarly, do you know of *any* compiler for *any* language that supports
256 byte alignment? I don't, at least on Windows. As far as I know, the
largest alignment on Windows is paragraph. (There may be choices for
larger alignments in the linker structures, but I would guess that if
they aren't used by the C compiler, they don't work. That's certainly
been our experience with linkers on Windows, SunOS, SCO Unix, the U2000,
etc. Virtually nothing we tried would work until we duplicated precisely
what the local C compiler generated. Then all is fine...)

Randy.

^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-21 21:39 ` Randy Brukardt @ 2004-07-22 22:34 ` Robert I. Eachus 2004-07-23 0:49 ` Randy Brukardt 0 siblings, 1 reply; 44+ messages in thread From: Robert I. Eachus @ 2004-07-22 22:34 UTC (permalink / raw) Randy Brukardt wrote: > "Robert I. Eachus" <rieachus@comcast.net> wrote in message > news:nM-dnegXLbmdK2DdRVn-hQ@comcast.com... > ... > >>When the programmer can do something simple like this to improve program >>performance, it should be supported by all compilers. (Notice that this >>is an error, not a warning.) I can see not supporting the subtype case >>due to the (potential) requirement for either large stack frames or >>varying size stack frames. But for an object that can be allocated at >>link time, I don't see why it shouldn't be supported. > > > What's an object allocated at link time? I don't know of any such thing (you > can allocate segments at link time, but the number of those is quite > limited). Um, a compiler can (but most Ada compilers don't) allocate objects in library packages on the heap, in static storage, or even in code segments. But you are right that this is very definitely not the usual in x86 compilers and environments. > Similarly, do you know of *any* compiler for *any* language that supports > 256 byte alignment? I don't, at least on Windows. You are probably correct with regards to Windows. I do know of compilers that do support such alignments, but only for supercomputers. With the rate at which x86 chips are taking over the supercomputer market though, I'll have to check. But the real reason I posted all this is that Ada compilers for x86, including x86 Windows SHOULD support this alignment, even if it is relatively painful to do so. (Painful in terms of gaps in the stack, or doing the extra effort required when allocating space on the heap.) The case of a String buffer when reading files is a perfect case in point. 
If the buffer is 128 (AMD) or 256 (Intel) byte aligned when reading from a memory-mapped file, you will reduce the number of cache line misses during the execution of the program. (If the buffer is in a single cache line, then that line will stay resident in L1 cache. If the buffer is distributed over two (Intel) or more (AMD) cache lines, the lines that are not referenced every line may get paged out. This is much more likely on an Intel CPU, and is potentially much more painful when it happens. The 'exclusive' cache feature on AMD processors means that if the line gets replaced in L1, it will be copied back to L2. So it takes two cache line replacements to push the line out of cache entirely. With Intel the line can get moved to the much smaller L1 cache, then overwritten in L2 cache. When it is overwritten in L1, then it will have to be pulled in from main memory next time around. We may only be talking say, a 2 or 3% slowdown from such a misaligned text buffer. Not really worth going to all the trouble to align buffers for casual programming. But when I am working on a linear algebra code, I do go to the effort where it does pay off. (For example, if you represent the basis in a linear programming subroutine as a matrix and a series of pivots, it makes for a significant improvement in performance to start the pivot rows on the correct cache line boundary.) -- Robert I. Eachus "The flames kindled on the Fourth of July, 1776, have spread over too much of the globe to be extinguished by the feeble engines of despotism; on the contrary, they will consume these engines and all who work them." -- Thomas Jefferson, 1821 ^ permalink raw reply [flat|nested] 44+ messages in thread
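The effect described above, a buffer that either fits in one cache line or straddles several, comes down to simple address arithmetic. The following is only an illustrative sketch: the 64-byte line size is an assumption (the thread itself mentions other figures), and `Cache_Line_Span` and `Lines_Touched` are made-up names.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Cache_Line_Span is
   --  Assumption: 64-byte cache lines. Change this constant to match
   --  the target CPU.
   Line_Size : constant Positive := 64;

   --  Number of cache lines touched by Length bytes that begin at byte
   --  address Start (the address is treated as a plain integer here).
   function Lines_Touched
     (Start : Natural; Length : Positive) return Positive
   is
      First_Line : constant Natural := Start / Line_Size;
      Last_Line  : constant Natural := (Start + Length - 1) / Line_Size;
   begin
      return Last_Line - First_Line + 1;
   end Lines_Touched;
begin
   --  A 64-byte buffer starting on a line boundary.
   Put_Line ("aligned   :"
             & Positive'Image (Lines_Touched (Start => 128, Length => 64)));
   --  The same buffer starting 4 bytes later.
   Put_Line ("misaligned:"
             & Positive'Image (Lines_Touched (Start => 132, Length => 64)));
end Cache_Line_Span;
```

With these inputs the aligned buffer touches one line and the shifted one touches two, which is the situation the alignment request is meant to avoid.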
* Re: reading a text file into a string 2004-07-22 22:34 ` Robert I. Eachus @ 2004-07-23 0:49 ` Randy Brukardt 2004-07-23 21:56 ` Nick Roberts 2004-07-24 2:56 ` Robert I. Eachus 0 siblings, 2 replies; 44+ messages in thread From: Randy Brukardt @ 2004-07-23 0:49 UTC (permalink / raw) "Robert I. Eachus" <rieachus@comcast.net> wrote in message news:OrednWv2_cdw3Z3cRVn-uQ@comcast.com... > Randy Brukardt wrote: ... > > Similarly, do you know of *any* compiler for *any* language that supports > > 256 byte alignment? I don't, at least on Windows. > > You are probably correct with regards to Windows. I do know of > compilers that do support such alignments, but only for supercomputers. > With the rate at which x86 chips are taking over the supercomputer > market though, I'll have to check. > > But the real reason I posted all this is that Ada compilers for x86, > including x86 Windows SHOULD support this alignment, even if it is > relatively painful to do so. (Painful in terms of gaps in the stack, or > doing the extra effort required when allocating space on the heap.) The > case of a String buffer when reading files is a perfect case in point. > If the buffer is 128 (AMD) or 256 (Intel) byte aligned when reading from > a memory-mapped file, you will reduce the number of cache line misses > during the execution of the program. (If the buffer is in a single > cache line, then that line will stay resident in L1 cache. If the > buffer is distributed over two (Intel) or more (AMD) cache lines, the > lines that are not referenced every line may get paged out. That could only be done at run-time, as you couldn't ensure anything about the alignment of the stack at compile-time. (That's probably why GNAT will support only 4 byte alignment, which is about all you can guarantee.) 
So you're asking to make subprogram linkage more expensive, to make heap allocation more expensive, and probably to use indirect access to statically allocated objects (in order to align the starting address). I don't doubt that there are cases where you might gain a tiny bit of performance from doing so, but it seems a large burden on all of the users to insist on it. Indeed, it would make the most sense to allocate such objects from a storage pool (with enough extra memory to support the alignment); align the resulting address, and use an address clause to force the object to use that memory. That would get the performance benefit in the rare case where it would help without costing anything to implementors or to users of programs that don't need the alignment. Randy. ^ permalink raw reply [flat|nested] 44+ messages in thread
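The approach suggested above (over-allocate, align the resulting address, then use an address clause) might look roughly like this in Ada. This is a minimal sketch rather than anyone's production code: a plain over-sized `Storage_Array` stands in for the storage pool, and the 64-byte alignment, the 256-character buffer and the names `Aligned_Overlay`, `Raw` and `Align_Up` are all assumptions made for illustration. It also assumes one `Character` per storage element.

```ada
with Ada.Text_IO;             use Ada.Text_IO;
with System;                  use System;
with System.Storage_Elements; use System.Storage_Elements;

procedure Aligned_Overlay is
   --  Assumptions: 64-byte target alignment, 256-character buffer.
   Alignment  : constant Storage_Offset := 64;
   Buffer_Len : constant := 256;

   --  Raw storage with enough slack that an aligned start always fits.
   Raw : aliased Storage_Array (1 .. Buffer_Len + Alignment - 1);

   --  Round an address up to the next multiple of Align.
   function Align_Up
     (A : Address; Align : Storage_Offset) return Address
   is
      Excess : constant Storage_Offset := A mod Align;
   begin
      if Excess = 0 then
         return A;
      else
         return A + (Align - Excess);
      end if;
   end Align_Up;

   Start : constant Address := Align_Up (Raw'Address, Alignment);

   --  Overlay the text buffer on the aligned part of Raw.
   Buffer : String (1 .. Buffer_Len);
   for Buffer'Address use Start;
begin
   Buffer := (others => ' ');
   Put_Line ("offset into Raw:"
             & Storage_Offset'Image (Start - Raw'Address));
   Put_Line ("start mod 64   :"
             & Storage_Offset'Image (Start mod Alignment));
end Aligned_Overlay;
```

The cost is paid only where the aligned object is declared: up to `Alignment - 1` wasted storage elements and one address computation at elaboration.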
* Re: reading a text file into a string 2004-07-23 0:49 ` Randy Brukardt @ 2004-07-23 21:56 ` Nick Roberts 2004-07-24 0:34 ` tmoran 2004-07-24 1:42 ` Randy Brukardt 2004-07-24 2:56 ` Robert I. Eachus 1 sibling, 2 replies; 44+ messages in thread From: Nick Roberts @ 2004-07-23 21:56 UTC (permalink / raw) On Thu, 22 Jul 2004 19:49:34 -0500, Randy Brukardt <randy@rrsoftware.com> wrote: > ... > That could only be done at run-time, as you couldn't ensure anything > about the alignment of the stack at compile-time. (That's probably > why GNAT will support only 4 byte alignment, which is about all you > can guarantee.) So you're asking to make subprogram linkage more > expensive, to make heap allocation more expensive, and probably to > use indirect access to statically allocated objects (in order to align > the starting address). I don't doubt that there are cases where you > might gain a tiny bit of performance from doing so, but it seems a > large burden on all of the users to insist on it. Randy, this is weird. It is a well established technique for highly optimising compilers to align things for cache efficiency. Good grief there are whole books on the subject. Not only do they advocate the possibility of aligning both basic blocks (code) and data objects on cache-line boundaries, but they advocate that the compiler do it automatically wherever possible. It may be that there aren't any highly optimising Ada compilers (yet ;-) but Robert is suggesting compilers /should/ support this kind of alignment, so how can you disagree? Do you think all those computer scientists have got it terribly wrong? If you think having big 'gaps' is an efficiency concern, I think the idea is that you fill in the gaps with smaller objects (or basic blocks). 
If you are worried about the fact that all stacks and heaps/ pools must be cache-line aligned (32, 64 bytes?), you have missed the RAM revolution that has been going on for the last two decades ;-) My cheapo off-the-back-of-a-lorry PC has 1/2 GiB of RAM. -- Nick Roberts ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-23 21:56 ` Nick Roberts @ 2004-07-24 0:34 ` tmoran 2004-07-24 1:16 ` Nick Roberts 2004-07-24 1:42 ` Randy Brukardt 1 sibling, 1 reply; 44+ messages in thread From: tmoran @ 2004-07-24 0:34 UTC (permalink / raw) >RAM revolution that has been going on for the last two decades ;-) > >My cheapo off-the-back-of-a-lorry PC has 1/2 GiB of RAM. And how much fast memory (ie, cache) does it have? ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-24 0:34 ` tmoran @ 2004-07-24 1:16 ` Nick Roberts 0 siblings, 0 replies; 44+ messages in thread From: Nick Roberts @ 2004-07-24 1:16 UTC (permalink / raw) On Sat, 24 Jul 2004 00:34:29 GMT, <tmoran@acm.org> wrote: >> RAM revolution that has been going on for the last two >> decades ;-) >> >> My cheapo off-the-back-of-a-lorry PC has 1/2 GiB of RAM. > And how much fast memory (ie, cache) does it have? I'm not sure if you meant it, Tom, but that's precisely the point. Every machine has far less cache than main memory, but the cache memory is much faster (and on an SMP machine, private to each processor). So it can be a big advantage for the compiler to generate code and object placements that make optimum use of the cache (or to permit such placements). -- Nick Roberts ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-23 21:56 ` Nick Roberts 2004-07-24 0:34 ` tmoran @ 2004-07-24 1:42 ` Randy Brukardt 2004-07-24 15:14 ` Nick Roberts 1 sibling, 1 reply; 44+ messages in thread From: Randy Brukardt @ 2004-07-24 1:42 UTC (permalink / raw) "Nick Roberts" <nick.roberts@acm.org> wrote in message news:opsbl1vsgsp4pfvb@bram-2... > On Thu, 22 Jul 2004 19:49:34 -0500, Randy Brukardt <randy@rrsoftware.com> > wrote: > > > ... > > That could only be done at run-time, as you couldn't ensure anything > > about the alignment of the stack at compile-time. (That's probably > > why GNAT will support only 4 byte alignment, which is about all you > > can guarantee.) So you're asking to make subprogram linkage more > > expensive, to make heap allocation more expensive, and probably to > > use indirect access to statically allocated objects (in order to align > > the starting address). I don't doubt that there are cases where you > > might gain a tiny bit of performance from doing so, but it seems a > > large burden on all of the users to insist on it. > > Randy, this is weird. It is a well established technique for highly > optimising compilers to align things for cache efficiency. Good grief > there are whole books on the subject. Not only do they advocate the > possibility of aligning both basic blocks (code) and data objects on > cache-line boundaries, but they advocate that the compiler do it > automatically wherever possible. Well, first of all, books don't necessarily equal practice. If aligning things causes a program to use more pages, it can make it run slower, because it makes it load code from disk more frequently. (And if you think that everything is always in main memory, you forget one of the primary rules of computing: programs and data always expand to fill - and overfill - available resources). Anyway, I wasn't arguing that alignment per se is a bad idea. 
We do it on integers, for instance, and I think that virtually all compilers do that. I was arguing that on the x86, stack alignments beyond 4 can only be done at run-time. (Unless *all* software in the system is under your control, and there are no interrupts/signals on your stack -- never true in practice.) That's a distributed penalty that gets paid everywhere. Similarly, existing Windows linkers don't support alignments beyond 16 to my knowledge -- so again you would have to do something at runtime with a penalty. In both cases, the penalty might very well cost more than the time savings possible. Given there is a penalty, doing alignments automatically is a bad idea. > If you think having big 'gaps' is an efficiency concern, I think the > idea is that you fill in the gaps with smaller objects (or basic > blocks). Last time I checked, Intel was recommending that labels in code not be aligned further than 4 byte boundaries. I don't know precisely why they recommended that, but I don't claim to know better than Intel! > If you are worried about the fact that all stacks and heaps/ > pools must be cache-line aligned (32, 64 bytes?), you have missed the > RAM revolution that has been going on for the last two decades ;-) That's only possible if you build a new OS from the ground up. Stacks aren't aligned in Windows or Linux. So you have to pay a penalty to make them so; and because of interrupt handlers and the like, you can't even trust your own stack. Heap allocations aren't aligned in Windows, either. (Although you could build your own heap on top of the page management in Windows -- but you better be prepared to allocate 64K at a time.) Again, you can fix this with run-time overhead. But if you're willing to spend run-time overhead, an address clause does the same thing without any work. Randy. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-24 1:42 ` Randy Brukardt @ 2004-07-24 15:14 ` Nick Roberts 2004-07-26 23:48 ` Randy Brukardt 0 siblings, 1 reply; 44+ messages in thread From: Nick Roberts @ 2004-07-24 15:14 UTC (permalink / raw) On Fri, 23 Jul 2004 20:42:53 -0500, Randy Brukardt <randy@rrsoftware.com> wrote: > ... > Well, first of all, books don't necessarily equal practice. In other words, you /are/ trying to say all those computer scientists got it wrong ;-) > If aligning things causes a program to use more pages, it > can make it run slower, because it makes it load code from > disk more frequently. But we (Robert and I) are talking about using alignments sparingly, to improve the efficiency of the speed-critical parts of a program. Surely you've heard of the 80-20 rule? (Which is, of course, silly, being the 99-1 rule in reality.) > Anyway, I wasn't arguing that alignment per-se is a bad > idea. We do it on integers, for instance, and I think that > virtually all compilers do that. > I was arguing that on the x86, stack alignments beyond 4 > can only be done at run-time. (Unless *all* software in > the system in under your control, and there are no > interrupts/signals on your stack -- never true in > practice.) But Randy, if you get a signal/interrupt on your stack, it all happens on the top of your stack. It doesn't affect the stack's alignment! Were you actually talking about callbacks? In any event, all the compiler has to do to align the stack to 2^n bytes just prior to (parameter pushing and) subroutine call is to emit: and esp, -2^n et voila! > That's a distributed penalty that gets paid everywhere. No it isn't. Only in calling those subroutines which require alignment, and even then the penalty is an 'and' instruction which, as you know, can probably be scheduled to take zero time on a superscalar target. 
> Similarly, existing Windows linkers don't support alignments > beyond 16 to my knowledge -- so again you would have to do > something at runtime with a penalty. But then the point is that the linkers /should/ support other alignments. It's no good saying "Oh, we can't do that because the linker doesn't support it!" Obviously, you need to change the linker. It's called not letting the tail wag the dog :-) > In both cases, the penalty might very well cost more than > the time savings possible. I think I've demonstrated that this is very unlikely. > Given there is a penalty, doing alignments automatically is > a bad idea. All I can say is that, given that there /isn't/ a penalty, doing (cache-line) alignments automatically is a /good/ idea :-) > Last time I checked, Intel was recommending that labels in > code not be aligned further than 4 byte boundaries. The latest advice is: Loop entry labels should be 16-byte-aligned when less than eight bytes away from a 16-byte boundary. Labels that follow a conditional branch need not be aligned. Labels that follow an unconditional branch or function call should be 16-byte-aligned when less than eight bytes away from a 16-byte boundary. Use a compiler that will assure these rules are met for the generated code. [Section 2, Intel Architecture Optimization Reference Manual, Copyright (c) 1998, 1999 Intel Corporation All Rights Reserved Issued in U.S.A., Order Number: 245127-001] > I don't know precisely why they recommended that, but I don't > claim to know better than Intel! Well, I don't think they ever did; maybe you need to do some re-reading. >> If you are worried about the fact that all stacks and heaps/ >> pools must be cache-line aligned (32, 64 bytes?), you have >> missed the RAM revolution that has been going on for the last >> two decades ;-) > > That's only possible if you build a new OS from the ground up. Hehe :-) > Stacks aren't aligned in Windows or Linux. 
So you have to pay > a penalty to make them so; Again, I think the penalty is tiny (or zero), and not universal. > and because of interrupt handlers and the like, Did you mean callbacks? > you can't even trust your own stack. Indeed, so you have to align it yourself using an 'and'. > Heap allocations aren't aligned in Windows, either. (Although you could > build your own heap on top of the page management in > Windows -- but you better be prepared to allocate 64K at a > time.) Again, you can fix this with run-time overhead. Okay, but the example that Robert gave was of a (presumably) stack allocated object, and nobody mentioned anything about Windows or the IA-32 before you did. In general, there's nothing to prevent heaps/pools being capable of cache-line aligned allocation; I guess it would be harder to use the gaps for smaller allocations, but I'm sure that doesn't really matter. > But if you're willing to spend run-time overhead, an > address clause does the same thing without any work. Well, I would argue that a good highly optimising compiler should provide a convenient and portable way of enabling the programmer to achieve cache-line optimisations, for both code and data. Probably the best way is by providing appropriate pragmas (that will be harmlessly ignored when irrelevant). A possibility is to interpret the humble pragma Optimize(Time); to mean doing the cache-line alignments recommended for the target processor (group or architecture). In general, it is better for the compiler to make decisions about code or data placement for optimisation purposes, since only the compiler can know /all/ the other implementational details which could affect these decisions. I think it is best for the compiler to make these decisions guided by hints given in the form of pragmas. 
However, if a compiler does not do cache-line optimisations itself (automatically), it ought to support some reasonable method by which it can be done explicitly (and I don't think using an address clause is ideal for this purpose). I think it is implicit that by 'compiler' Robert and I mean 'the toolchain necessary to get from source to executable'. -- Nick Roberts ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-24 15:14 ` Nick Roberts @ 2004-07-26 23:48 ` Randy Brukardt 2004-07-27 12:08 ` Nick Roberts 0 siblings, 1 reply; 44+ messages in thread From: Randy Brukardt @ 2004-07-26 23:48 UTC (permalink / raw) "Nick Roberts" <nick.roberts@acm.org> wrote in message news:opsbndy0o1p4pfvb@bram-2... > On Fri, 23 Jul 2004 20:42:53 -0500, Randy Brukardt <randy@rrsoftware.com> > wrote: > > > ... > > Well, first of all, books don't necessarily equal practice. > > In other words, you /are/ trying to say all those computer > scientists got it wrong ;-) No, they're just ignoring the realities of the target systems. Most articles I see make that mistake. (Including this one. :-) > > If aligning things causes a program to use more pages, it > > can make it run slower, because it makes it load code from > > disk more frequently. > > But we (Robert and I) are talking about using alignments > sparingly, to improve the efficiency of the speed-critical > parts of a program. Surely you've heard of the 80-20 rule? > (Which is, of course, silly, being the 99-1 rule in reality.) The largest alignment that you allow impacts the design of your stack and of your storage pool, at least if you intend to do it at compile-time. That's a distributed overhead - it's small, but certainly not zero. ... > In any event, all the compiler has to do to align the stack > to 2^n bytes just prior to (parameter pushing and) subroutine > call is to emit: > > and esp, -2^n > > et voila! How do you undo this when you leave the scope? You have to save the ESP value somewhere and restore it to do that, and *that* is an extra overhead. ... > > Similarly, existing Windows linkers don't support alignments > > beyond 16 to my knowledge -- so again you would have to do > > something at runtime with a penalty. > > But then the point is that the linkers /should/ support other > alignments. It's no good saying "Oh, we can't do that because > the linker doesn't support it!" 
Obviously, you need to change > the linker. It's called not letting the tail wag the dog :-) You know as well as I do that you don't get to change your target system to your whim. You have to use the tools that users want to use, such as the Microsoft linker. But even if you wrote your own linker, I don't think that there is any guarantee of alignment in the loading of the parts of an .EXE file. So I don't know if any alignment that you have in your linker would actually be preserved. ... > > Last time I checked, Intel was recommending that labels in > > code not be aligned further than 4 byte boundaries. > > The latest advice is: > > Loop entry labels should be 16-byte-aligned when less than > eight bytes away from a 16-byte boundary. > > Labels that follow a conditional branch need not be aligned. > > Labels that follow an unconditional branch or function call > should be 16-byte-aligned when less than eight bytes away > from a 16-byte boundary. > > Use a compiler that will assure these rules are met for the > generated code. > > [Section 2, Intel Architecture Optimization Reference Manual, > Copyright (c) 1998, 1999 Intel Corporation All Rights Reserved > Issued in U.S.A., Order Number: 245127-001] > > > I don't know precisely why they recommended that, but I don't > > claim to know better than Intel! > > Well, I don't think they ever did; maybe you need to do some > re-reading. That's it. That's the third time in the last few months that you've essentially called me a liar - or senile - and I'm done taking it without comment. Either we're going to talk without personal attacks, or we're not going to talk at all. OK? For the record, my knowledge of Intel's recommendations primarily comes from an Intel seminar I attended some years ago. Since it was covered by an NDA (non-disclosure agreement), I can't even show you - or tell you for that matter - much more than that. 
In any case, the rules that you gave above are weaker in most areas than the ones I remember (labels at 4, subprograms at 16), and certainly give no indication of the value of cache-line sized optimizations -- which is what I think we were talking about. I see nothing above recommending alignments greater than 16 for anything. Randy. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-26 23:48 ` Randy Brukardt @ 2004-07-27 12:08 ` Nick Roberts 2004-07-27 23:24 ` Robert I. Eachus 2004-07-29 0:53 ` Randy Brukardt 0 siblings, 2 replies; 44+ messages in thread From: Nick Roberts @ 2004-07-27 12:08 UTC (permalink / raw) [I've put my replies out of order, because I think there's a bit in the middle that needs to be said first.] On Mon, 26 Jul 2004 18:48:04 -0500, Randy Brukardt <randy@rrsoftware.com> wrote: > ... >> > Last time I checked, Intel was recommending that labels in >> > code not be aligned further than 4 byte boundaries. >> >> The latest advice is: >> >> Loop entry labels should be 16-byte-aligned when less than >> eight bytes away from a 16-byte boundary. >> >> Labels that follow a conditional branch need not be aligned. >> >> Labels that follow an unconditional branch or function call >> should be 16-byte-aligned when less than eight bytes away >> from a 16-byte boundary. >> >> Use a compiler that will assure these rules are met for the >> generated code. >> >> [Section 2, Intel Architecture Optimization Reference Manual, >> Copyright (c) 1998, 1999 Intel Corporation All Rights Reserved >> Issued in U.S.A., Order Number: 245127-001] >> >> > I don't know precisely why they recommended that, but I don't >> > claim to know better than Intel! >> >> Well, I don't think they ever did; maybe you need to do some >> re-reading. > > That's it. That's the third time in the last few months that > you've essentially called me a liar - or senile - and I'm done > taking it without comment. Either we're going to talk without > personal attacks, or we're not going to talk at all. OK? Well, that comes as a bolt out of the blue, Randy. Let me first assure you that neither this time nor at any time in the past have I intended to imply that you were lying or to make any personal slight against you. 
On consideration, I feel that I should not have made the remark "maybe you need to do some re-reading", and I do truly apologise for it. It was intended to be lighthearted and to be taken in a friendly manner. Usenet is a medium given to stripping away all the extra cues that a different medium (such as a telephone call) would convey that help to disambiguate communications. It is easy, sometimes, to forget this, but I should have known better. In fact, I'm very unhappy that this seems to be the impression that you have got of me Randy, because the truth is -- though sadly you may not believe it now -- I have the greatest respect for you, and I honestly admire you: for what you have done and continue to do for the ARG and Ada standards and to champion the use of Ada; for your contributions to the Ada community (as I know it, in terms of Usenet and other Internet venues), and the friendly and helpful manner of those contributions. We may have had disagreements about lots of things during the course of discussions between us, but there is a big, big difference, as far as I am concerned, between disagreeing with someone and having less respect for them. I do really hope that I have not permanently destroyed any faith you may have had in me, and I regret anything I may have said in the past to this effect. I often have a clumsy and hasty style of writing on Usenet, and I'm sure that often what I say comes across with a different meaning or emphasis to what I intended. That said, I hope my remaining replies will be taken in good part. > For the record, my knowledge of Intel's recommendations primarily > comes from an Intel seminar I attended some years ago. Since it > was covered by an NDA (non-disclosure agreement), I can't even > show you - or tell you for that matter - much more than that. I think I once read a magazine article that said Intel were no longer recommending cache-line (or half-line) alignments for code, for their (as it was then) upcoming Pentium model. 
I have read this sort of thing before, and dismissed it as hype or gossip, since the official (published) Intel recommendations never changed in the event. So I have tended to assume that repetitions of the idea have simply been repetitions of gossip. Obviously, since your information in fact comes direct from Intel, I was wrong, and I was wrong to have doubted you. > In any case, the rules that you gave above are weaker in most > areas than the ones I remember (labels at 4, subprograms at 16), > and certainly give no indication of the value of cache-line > sized optimizations -- which is what I think we were talking > about. I see nothing above recommending alignments greater than > 16 for anything. According to the manual, the 16-byte alignments are to do with the way the instruction pre-decoding unit loads code, which is 16-bytes (a cache 'half-line') at a time. But is the manual correct? >> > If aligning things causes a program to use more pages, >> > it can make it run slower, because it makes it load >> > code from disk more frequently. >> >> But we (Robert and I) are talking about using alignments >> sparingly, to improve the efficiency of the speed-critical >> parts of a program. Surely you've heard of the 80-20 rule? >> (Which is, of course, silly, being the 99-1 rule in >> reality.) > > The largest alignment that you allow impacts the design of > your stack and of your storage pool, at least if you intend > to do it at compile-time. That's a distributed overhead - > it's small, but certainly not zero. Well, that's true and I cannot argue with it per se. 
However, based on the presumption that typical software does spend something like 99% of the time in 1% of the code (and that 1% tends to be fairly 'tight' loops), I am not convinced that the extra memory space that a program will take up (both code and data) due to cache-line alignments is more likely to cause the program to slow down more than it will cause it to speed up (in that critical 1% of the code). This will be dependent on how big the working set is during the execution of that speed-critical code, in particular whether the working set is caused to exceed available RAM; if it is, then the program will indeed be slowed down. But, of course, I am saying that even cheap computers have a lot of RAM these days, so I think that eventuality is unlikely. > ... >> In any event, all the compiler has to do to align the stack >> to 2^n bytes just prior to (parameter pushing and) subroutine >> call is to emit: >> >> and esp, -2^n >> >> et voila! > > How do you undo this when you leave the scope? You have to > save the ESP value somewhere and restore it to do that, and > *that* is an extra overhead. Well, I don't think so. The usual thing to do is to save ESP in the EBP register at stack frame creation, and restore it from EBP just prior to return. There is, I grant, a need for a little care, in that one would (I guess) need to do the stack alignment I suggested before pushing anything onto the stack that you might want to pop off it afterwards. Otherwise, I think the 'and' instruction is the only extra thing required. I vaguely remember that I have actually used this technique, but a long time ago. > > ... >> > Similarly, existing Windows linkers don't support >> > alignments beyond 16 to my knowledge -- so again you would >> > have to do something at runtime with a penalty. >> >> But then the point is that the linkers /should/ support other >> alignments. It's no good saying "Oh, we can't do that because >> the linker doesn't support it!" 
Obviously, you need to change >> the linker. It's called not letting the tail wag the dog :-) > > You know as well as I do that you don't get to change your > target system to your whim. You have to use the tools that > users want to use, such as the Microsoft linker. > > But even if you wrote your own linker, I don't think that there > is any guarantee of alignment in the loading of the parts of an > .EXE file. So I don't know if any alignment that you have in > your linker would actually be preserved. I can't quickly find information on the subject, but I rather suspect that an .EXE or .DLL is likely to be loaded page aligned. That would mean alignments up to the page size would be safe. Also, I think possibly we're arguing at crossed purposes on this point. I'm only arguing that linkers and execution environments /should/ support cache-line alignments. I accept that many do not, in practice, and I accept that a compiler targeting such a linker or environment cannot be expected to do so either. I think this is how Robert's original comment can be construed, also. -- Nick Roberts ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-27 12:08 ` Nick Roberts @ 2004-07-27 23:24 ` Robert I. Eachus 2004-07-29 0:55 ` Randy Brukardt 2004-07-29 0:53 ` Randy Brukardt 1 sibling, 1 reply; 44+ messages in thread From: Robert I. Eachus @ 2004-07-27 23:24 UTC (permalink / raw) Nick Roberts wrote: > Let me first assure you that neither this time nor at any time in > the past have I intended to imply that were lying or to make any > personal slight against you. I didn't read Nick's words as indicating anything other than "things have changed in this area." But I wasn't the target. Personally, though, I think this is a VERY important discussion, and I hope we can keep to the issues. I was surprised to see GNAT saying it would only do doubleword (4-byte) alignment, because 8-byte alignment has gone into and out of programming guides with each new hardware generation. > According to the manual, the 16-byte alignments are to do with > the way the instruction pre-decoding unit loads code, which is > 16-bytes (a cache 'half-line') at a time. But is the manual > correct? Don't know about the Intel IA-32 manual, but the AMD "Software Optimization Guide for AMD Athlon™ 64 and AMD Opteron™ Processors" http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25112.PDF indicates that the latest AMD processors now use a 32-byte code decoding window. The Intel Itanium2 also loads two instruction bundles (32-bytes) at a time. > Well, I don't think so. The usual thing is to do is to save > ESP in the EBP register at stack frame creation, and restore it > from EBP just prior to return. There is, I grant, a need for a > little care, in that one would (I guess) need to do the stack > alignment I suggested before pushing anything onto the stack > that you might want to pop off it afterwards. Otherwise, I > think the 'and' instruction is the only extra thing required. > > I vaguely remember that I have actually used this technique, > but a long time ago. 
The AMD manual referenced above gives the example code to do this on page 128 (in Section 5.13):

prologue:
    push ebp
    mov  ebp, esp
    sub  esp, SIZE_OF_LOCALS  ; Size of local variables
    and  esp, -8
    ...                       ; Push registers that need to be preserved.

epilogue:                     ; Pop register that needed to be preserved.
    leave
    ret

This example is explicitly showing a quadword alignment (8-bytes). Compilers definitely should do this for code with quadword (usually Long_Float in Ada) values. Of course, to do cache boundary alignment as well, you replace -8 with -64 (or -256 on Intel Pentium4 CPUs). The waste of space on the stack is minor, or should be if it is only done when the programmer explicitly requests it. Again, in the code where I need to do this, the _execution_time_ cost should be zero, since the stack frame needs to be quad-word aligned for other reasons.

> Also, I think possibly we're arguing at cross purposes on
> this point. I'm only arguing that linkers and execution
> environments /should/ support cache-line alignments. I accept
> that many do not, in practice, and I accept that a compiler
> targeting such a linker or environment cannot be expected
> to do so either. I think this is how Robert's original comment
> can be construed, also.

Right. But as discussed above, aligning stack frames is something any compiler can do, whether on x86 or elsewhere. Also the heap management software can/should allow for an allocation request to specify alignment. MicroQuill sells a very nice library to replace malloc and free with better performing versions, if the 'native' OS functions are not aware of cache line and disk page sizes. (Heap objects should never be allocated across virtual memory page boundaries unless they are too big to fit in a single page. But some versions of malloc ignore page boundaries when allocating objects in the heap.)

-- Robert I.
Eachus "The flames kindled on the Fourth of July, 1776, have spread over too much of the globe to be extinguished by the feeble engines of despotism; on the contrary, they will consume these engines and all who work them." -- Thomas Jefferson, 1821 ^ permalink raw reply [flat|nested] 44+ messages in thread
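The effect of the "and esp, -8" step in the prologue quoted above can be illustrated in Ada with a modular type. This is only a sketch of the mask arithmetic, not of how any compiler emits its prologue; the type Word, the function Align_Down and the sample stack address are hypothetical names and values chosen for the illustration.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Align_Demo is
   --  Hypothetical 32-bit unsigned "address" type, for illustration only.
   type Word is mod 2**32;

   --  Round Value down to a multiple of Alignment, which must be a
   --  power of two.  "not (Alignment - 1)" has the same bit pattern as
   --  -Alignment in two's complement, i.e. the -8 in "and esp, -8".
   function Align_Down (Value, Alignment : Word) return Word is
   begin
      return Value and not (Alignment - 1);
   end Align_Down;

   --  Made-up stack pointer value after "sub esp, SIZE_OF_LOCALS".
   Stack_Ptr : constant Word := 16#0012_FF7C# - 20;
begin
   Put_Line ("unaligned:      " & Word'Image (Stack_Ptr));
   Put_Line ("8-byte aligned: " & Word'Image (Align_Down (Stack_Ptr, 8)));
   Put_Line ("64-byte aligned:" & Word'Image (Align_Down (Stack_Ptr, 64)));

   --  Aligning down can skip at most Alignment - 1 bytes of stack.
   Put_Line ("bytes skipped:  "
             & Word'Image (Stack_Ptr - Align_Down (Stack_Ptr, 64)));
end Align_Demo;
```

Because the stack grows downwards on x86, rounding the pointer down only ever reserves a little extra space; it never overlaps the caller's frame.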
* Re: reading a text file into a string 2004-07-27 23:24 ` Robert I. Eachus @ 2004-07-29 0:55 ` Randy Brukardt 0 siblings, 0 replies; 44+ messages in thread From: Randy Brukardt @ 2004-07-29 0:55 UTC (permalink / raw)

"Robert I. Eachus" <rieachus@comcast.net> wrote in message news:KOadnc3XCbaleZvcRVn-qQ@comcast.com...
...
> The AMD manual referenced above gives the example code to do this on
> page 128 (in Section 5.13):
>
> prologue:
>     push ebp
>     mov  ebp, esp
>     sub  esp, SIZE_OF_LOCALS  ; Size of local variables
>     and  esp, -8
>     ...                       ; Push registers that need to be preserved.
>
> epilogue:                     ; Pop register that needed to be preserved.
>     leave
>     ret

"Leave" used to be one of the instructions that Intel told you to avoid, although they were rather ambiguous about it. Anyway, we put EBP at the bottom of the frame, so "leave" doesn't work.

Randy.
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-27 12:08 ` Nick Roberts 2004-07-27 23:24 ` Robert I. Eachus @ 2004-07-29 0:53 ` Randy Brukardt 2004-07-29 7:25 ` Martin Dowie 2004-07-29 20:08 ` Robert I. Eachus 1 sibling, 2 replies; 44+ messages in thread From: Randy Brukardt @ 2004-07-29 0:53 UTC (permalink / raw) "Nick Roberts" <nick.roberts@acm.org> wrote in message news:opsbspb4bep4pfvb@bram-2... ... > >> > I don't know precisely why they recommended that, but I don't > >> > claim to know better than Intel! > >> > >> Well, I don't think they ever did; maybe you need to do some > >> re-reading. > > > > That's it. That's the third time in the last few months that > > you've essentially called me a liar - or senile - and I'm done > > taking it without comment. Either we're going to talk without > > personal attacks, or we're not going to talk at all. OK? > > Well, that comes as a bolt out of the blue, Randy. > > Let me first assure you that neither this time nor at any time in > the past have I intended to imply that were lying or to make any > personal slight against you. > > On consideration, I feel that I should not have made the remark > "maybe you need to do some re-reading", and I do truly apologise > for it. It was intended to be lighthearted and to be taken in a > friendly manner. Usenet is a medium given to stripping away all > the extra cues that a different medium (such as a telephone call) > would convey that help to disambiguate communications. It is easy, > sometimes, to forget this, but I should have known better. For the record, I was most upset about the first part, not the second. I have no problem believing that the recommendations have changed - and you quoted some that are different (they seem rather old, but that's another story). But to say "I don't think that they ever did" recommend what I originally reported says that they *never* recommended what I remember and essentially that I was trying to mislead the conversation by saying that. 
Not good. Anyway, I accept your apology, and I'll try to be less sensitive next time. ... > > How do you undo this when you leave the scope? You have to > > save the ESP value somewhere and restore it to do that, and > > *that* is an extra overhead. > > Well, I don't think so. The usual thing is to do is to save > ESP in the EBP register at stack frame creation, and restore it > from EBP just prior to return. There is, I grant, a need for a > little care, in that one would (I guess) need to do the stack > alignment I suggested before pushing anything onto the stack > that you might want to pop off it afterwards. Otherwise, I > think the 'and' instruction is the only extra thing required. I realize that we're weird here, but EBP points at the bottom of the stack frame in Janus/Ada; that gives us positive stack offsets. We had a lot of trouble with negative ones in the early days, and I just gave up on that. In any case, we spend quite a bit of effort trying to avoid setting EBP at all. For small leaf subprograms, the overhead of writing then restoring EBP can be a significant percentage of the cost of the whole routine. Thus, we get rid of the stack frame with an Add, and that leaves us with no obvious way to do an alignment. (Alignment is not reflected in our intermediate code, as that is supposed to be done by the data layout earlier in the compiler. So it's either all or nothing - it has to be done for all stack frames or not supported; I suspect many other compilers are similar.) Anyway, my opinion these days is that spending a lot of effort making something run 2% faster is wasted effort. You're always better off changing to a different way of solving the problem. The most recent instance was in my web log analyzer. It was running too slow on the AdaIC site's logs, and I wasted a lot of time trying to improve it. 
But replacing the binary lookups (log N, N being around 200,000) by a hashed lookup (very similar to switching from Sorted_Sets to Hashed_Maps in the Containers library) improved the speed by a factor of 5 (a result I didn't expect, because I had to use an expensive hash function -- all of the cheap ones I tried didn't work well on the actual data -- and log N wasn't that large -- between 12 and 19 on the data I tested with).

Moral: make sure you've exhausted algorithmic improvements before even thinking about squeezing a few extra percent out of the code. And when you think you've exhausted algorithmic improvements, try again, because sometimes non-obvious things work! (We hadn't originally used a hash because of the need to write sorted reports. But it turned out that using a hash and a quicksort on the report was faster than keeping the data sorted.)

Randy.

> I vaguely remember that I have actually used this technique,
> but a long time ago.
>
>
> > ...
> >> > Similarly, existing Windows linkers don't support
> >> > alignments beyond 16 to my knowledge -- so again you would
> >> > have to do something at runtime with a penalty.
> >>
> >> But then the point is that the linkers /should/ support other
> >> alignments. It's no good saying "Oh, we can't do that because
> >> the linker doesn't support it!" Obviously, you need to change
> >> the linker. It's called not letting the tail wag the dog :-)
> >
> > You know as well as I do that you don't get to change your
> > target system to your whim. You have to use the tools that
> > users want to use, such as the Microsoft linker.
> >
> > But even if you wrote your own linker, I don't think that there
> > is any guarantee of alignment in the loading of the parts of an
> > .EXE file. So I don't know if any alignment that you have in
> > your linker would actually be preserved.
>
> I can't quickly find information on the subject, but I rather
> suspect that an .EXE or .DLL is likely to be loaded page
> aligned.
That would mean alignments up to the page size would
> be safe.
>
> Also, I think possibly we're arguing at cross purposes on
> this point. I'm only arguing that linkers and execution
> environments /should/ support cache-line alignments. I accept
> that many do not, in practice, and I accept that a compiler
> targeting such a linker or environment cannot be expected
> to do so either. I think this is how Robert's original comment
> can be construed, also.
>
> --
> Nick Roberts
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-29 0:53 ` Randy Brukardt @ 2004-07-29 7:25 ` Martin Dowie 2004-07-29 20:08 ` Robert I. Eachus 1 sibling, 0 replies; 44+ messages in thread From: Martin Dowie @ 2004-07-29 7:25 UTC (permalink / raw)

Randy Brukardt wrote:
> Anyway, my opinion these days is that spending a lot of effort making
> something run 2% faster is wasted effort. You're always better off
> changing to a different way of solving the problem. The most recent
> instance was in my web log analyzer. It was running too slow on the
> AdaIC site's logs, and I wasted a lot of time trying to improve it.
> But replacing the binary lookups (log N, N being around 200,000) by a
> hashed lookup (very similar to switching from Sorted_Sets to
> Hashed_Maps in the Containers library) improved the speed by a factor
> of 5 (a result I didn't expect, because I had to use an expensive
> hash function -- all of the cheap ones I tried didn't work well on
> the actual data -- and log N wasn't that large -- between 12 and 19
> on the data I tested with). Moral: make sure you've exhausted
> algorithmic improvements before even thinking about squeezing a few
> extra percent out of the code. And when you think you've exhausted
> algorithmic improvements, try again, because sometimes non-obvious
> things work! (We hadn't originally used a hash because of the need to
> write sorted reports. But it turned out that using a hash and a
> quicksort on the report was faster than keeping the data sorted.)

I'd wholeheartedly second this advice. It reminds me of a recent case where a colleague had a program that seemed to be taking forever. I can't recall what data structure he was using but he worked out that at its current speed it was going to need something fast approaching the entire life of the universe so far to complete! He switched to a 'quadtree' and "bingo" - it only took a few (tens of) hours!

Cheers
-- Martin
^ permalink raw reply [flat|nested] 44+ messages in thread
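The "hash while counting, sort only for the report" approach described in the quoted advice might look roughly like the following sketch. It is not Randy's analyzer: it assumes the container names as they were eventually standardized in Ada 2005 (Ada.Containers.Indefinite_Hashed_Maps, Ada.Containers.Indefinite_Vectors, Ada.Strings.Hash) rather than the AI-302 draft names used in this thread, and the procedure Record_Hit and the sample URLs are made up for illustration.

```ada
with Ada.Containers.Indefinite_Hashed_Maps;
with Ada.Containers.Indefinite_Vectors;
with Ada.Strings.Hash;
with Ada.Text_IO; use Ada.Text_IO;

procedure Hit_Report is
   --  URL -> hit count, looked up by hashing instead of binary search.
   package Hit_Maps is new Ada.Containers.Indefinite_Hashed_Maps
     (Key_Type        => String,
      Element_Type    => Natural,
      Hash            => Ada.Strings.Hash,
      Equivalent_Keys => "=");

   package Key_Vectors is new Ada.Containers.Indefinite_Vectors
     (Index_Type => Positive, Element_Type => String);
   package Key_Sorting is new Key_Vectors.Generic_Sorting;

   Hits : Hit_Maps.Map;
   Keys : Key_Vectors.Vector;

   --  Hypothetical helper: count one request for URL.
   procedure Record_Hit (URL : String) is
      Position : constant Hit_Maps.Cursor := Hits.Find (URL);
   begin
      if Hit_Maps.Has_Element (Position) then
         Hits.Replace_Element (Position, Hit_Maps.Element (Position) + 1);
      else
         Hits.Insert (URL, 1);
      end if;
   end Record_Hit;
begin
   --  Made-up log entries.
   Record_Hit ("/index.html");
   Record_Hit ("/docs/ada.html");
   Record_Hit ("/index.html");

   --  The data is unsorted while counting; sort the keys once at the end.
   declare
      Position : Hit_Maps.Cursor := Hits.First;
   begin
      while Hit_Maps.Has_Element (Position) loop
         Keys.Append (Hit_Maps.Key (Position));
         Hit_Maps.Next (Position);
      end loop;
   end;
   Key_Sorting.Sort (Keys);

   for I in Keys.First_Index .. Keys.Last_Index loop
      Put_Line (Keys.Element (I)
                & Natural'Image (Hits.Element (Keys.Element (I))));
   end loop;
end Hit_Report;
```

Whether this beats an always-sorted structure depends on the data and the hash function, as the quoted message notes; the sketch only shows the shape of the change.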
* Re: reading a text file into a string 2004-07-29 0:53 ` Randy Brukardt 2004-07-29 7:25 ` Martin Dowie @ 2004-07-29 20:08 ` Robert I. Eachus 2004-07-30 0:14 ` tmoran 1 sibling, 1 reply; 44+ messages in thread From: Robert I. Eachus @ 2004-07-29 20:08 UTC (permalink / raw) Randy Brukardt wrote: > Anyway, my opinion these days is that spending a lot of effort making > something run 2% faster is wasted effort. You're always better off changing > to a different way of solving the problem... In general agreed. The only place I currently go through the pain of cache aligning data structures is in matrix multiplication and other linear algebra code for large matrix sizes. But there the difference is often a factor of 3 or more in execution time. How many people actually WRITE such code? Very few. That is what ATLAS is all about. It allows you to create a LINPACK and LAPACK version that is optimized for your exact execution environment, without worrying about all this. Of course that means that the people who port ATLAS to different architectures, are the ones that have to worry about such grody details. (And note that the right ATLAS version for Pentium 3 is not the right version for Pentium 4, same for Athlon XP and Athlon64 and so on.) -- Robert I. Eachus "The flames kindled on the Fourth of July, 1776, have spread over too much of the globe to be extinguished by the feeble engines of despotism; on the contrary, they will consume these engines and all who work them." -- Thomas Jefferson, 1821 ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-29 20:08 ` Robert I. Eachus @ 2004-07-30 0:14 ` tmoran 0 siblings, 0 replies; 44+ messages in thread From: tmoran @ 2004-07-30 0:14 UTC (permalink / raw) >cache aligning data structures is in matrix multiplication and other >linear algebra code for large matrix sizes. > >But there the difference is often a factor of 3 or more in execution time. > >How many people actually WRITE such code? Very few. That is what ATLAS Are there compilers with a "pragma Really_Really_Optimize" option? Or post-compilers that read a small piece of object code, mull over it for quite some time, and write a very highly optimized replacement? ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: reading a text file into a string 2004-07-23 0:49 ` Randy Brukardt 2004-07-23 21:56 ` Nick Roberts @ 2004-07-24 2:56 ` Robert I. Eachus 1 sibling, 0 replies; 44+ messages in thread From: Robert I. Eachus @ 2004-07-24 2:56 UTC (permalink / raw) Randy Brukardt wrote: > Indeed, it would make the most sense to allocate such objects from a storage > pool (with enough extra memory to support the alignment); align the > resulting address, and use an address clause to force the object to use that > memory. That would get the performance benefit in the rare case where it > would help without costing anything to implementors or to users of programs > that don't need the alignment. You may have a winner Randy. The problem (to me) is that the pain of aligning buffers 'by hand' is high enough that I only do it when it is necessary to get decent performance. I am talking of cases where there is a 2x or 3x speedup if some objects are cache aligned. For ordinary programs where there may be a 5 to 10% benefit for aligning a particular buffer, it is just too much work without compiler support. But if I create a cache-aligned storage pool, where all objects are allocated on natural cache boundaries then I just need a couple extra lines to do the alignment. Of course the buffers will have to be on the heap, I'm thinking about the right garbage collection approach to use... -- Robert I. Eachus "The flames kindled on the Fourth of July, 1776, have spread over too much of the globe to be extinguished by the feeble engines of despotism; on the contrary, they will consume these engines and all who work them." -- Thomas Jefferson, 1821 ^ permalink raw reply [flat|nested] 44+ messages in thread
* Ada2005 (was Re: reading a text file into a string 2004-07-17 2:27 ` Robert I. Eachus 2004-07-17 11:31 ` Mats Weber 2004-07-19 8:07 ` Dale Stanbrough @ 2004-07-19 11:51 ` Peter Hermann 2004-07-19 12:51 ` Dmitry A. Kazakov 2004-07-19 13:01 ` Nick Roberts 2 siblings, 2 replies; 44+ messages in thread From: Peter Hermann @ 2004-07-19 11:51 UTC (permalink / raw)

Robert I. Eachus <rieachus@comcast.net> wrote:
> For this reason, I find myself constructing or using a Get_Line FUNCTION
> inside a loop and a declare block:
>
>     while not End_of_Line(Somefile) loop
>        declare
>           Buffer: String := Get_Line(Somefile);
>        begin
>           -- process buffer
>        exception
>           ...
>        end;
>     end loop;

There is no compelling reason why such a FUNCTION get_line should not be in package specification Ada.text_io of Ada2005. Or did I miss something?

--
--Peter Hermann(49)0711-685-3611 fax3758 ica2ph@csv.ica.uni-stuttgart.de
--Pfaffenwaldring 27 Raum 114, D-70569 Stuttgart Uni Computeranwendungen
--http://www.csv.ica.uni-stuttgart.de/homes/ph/
--Team Ada: "C'mon people let the world begin" (Paul McCartney)
^ permalink raw reply [flat|nested] 44+ messages in thread
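For reference, a function form of Get_Line did become part of Ada.Text_IO in Ada 2005. With it, the original question of this thread (reading a whole text file into an Unbounded_String) can be sketched as below. This is a minimal illustration, assuming an Ada 2005 compiler; the file name slurp_demo.txt and its contents are made up, and the loop tests End_Of_File rather than End_Of_Line, since it is meant to run until the file is exhausted.

```ada
with Ada.Text_IO;            use Ada.Text_IO;
with Ada.Strings.Unbounded;  use Ada.Strings.Unbounded;
with Ada.Characters.Latin_1;

procedure Slurp is
   --  Hypothetical file name, used only for this demonstration.
   File_Name : constant String := "slurp_demo.txt";
   File      : File_Type;
   Text      : Unbounded_String;
begin
   --  Write a small sample file so that the example is self-contained.
   Create (File, Out_File, File_Name);
   Put_Line (File, "first line");
   Put_Line (File, "second line");
   Close (File);

   Open (File, In_File, File_Name);
   while not End_Of_File (File) loop
      --  The function form returns one whole line, whatever its length,
      --  so no fixed-size buffer and no overlong-line handling is needed.
      Append (Text, Get_Line (File));
      --  Get_Line drops the line terminator; keep the line structure by
      --  appending an explicit line feed.
      Append (Text, Ada.Characters.Latin_1.LF);
   end loop;
   Close (File);

   Put (To_String (Text));
end Slurp;
```

Storing LF as the separator is a design choice of this sketch; any other marker character, or a vector of lines, would work as well.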
* Re: Ada2005 (was Re: reading a text file into a string 2004-07-19 11:51 ` Ada2005 (was " Peter Hermann @ 2004-07-19 12:51 ` Dmitry A. Kazakov 2004-07-19 13:01 ` Nick Roberts 1 sibling, 0 replies; 44+ messages in thread From: Dmitry A. Kazakov @ 2004-07-19 12:51 UTC (permalink / raw)

On Mon, 19 Jul 2004 11:51:52 +0000 (UTC), Peter Hermann wrote:

> Robert I. Eachus <rieachus@comcast.net> wrote:
>> For this reason, I find myself constructing or using a Get_Line FUNCTION
>> inside a loop and a declare block:
>>
>>     while not End_of_Line(Somefile) loop
>>        declare
>>           Buffer: String := Get_Line(Somefile);
>>        begin
>>           -- process buffer
>>        exception
>>           ...
>>        end;
>>     end loop;
>
> There is no compelling reason why such a FUNCTION get_line
> should not be in package specification Ada.text_io of Ada2005.

It would be nice.

> Or did I miss something?

In the Ada community there is strong resistance to functions having side effects, even when the side effect is hidden in an *in* File_Type parameter. A counterexample would be:

   What_Is_This : String := Get_Line (File) & Get_Line (File);

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
^ permalink raw reply [flat|nested] 44+ messages in thread
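The trouble with the What_Is_This example is that Ada does not fix the order in which the two operands of "&" are evaluated, so either line could end up first. One way to sidestep that, sketched here under the assumption of the Ada 2005 function form of Get_Line, is to give each call its own declaration, because declarations are elaborated in textual order. The procedure name Two_Lines is made up; the program reads from standard input and raises End_Error if fewer than two lines are supplied.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Two_Lines is
   --  Declarations are elaborated one after another, in the order
   --  written, so First is always read before Second.
   First  : constant String := Get_Line;
   Second : constant String := Get_Line;
begin
   --  No side effects remain in this expression, so the order in which
   --  the operands of "&" are evaluated no longer matters.
   Put_Line (First & Second);
end Two_Lines;
```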
* Re: Ada2005 (was Re: reading a text file into a string 2004-07-19 11:51 ` Ada2005 (was " Peter Hermann 2004-07-19 12:51 ` Dmitry A. Kazakov @ 2004-07-19 13:01 ` Nick Roberts 2004-07-19 13:35 ` Martin Dowie 2004-07-19 23:50 ` Randy Brukardt 1 sibling, 2 replies; 44+ messages in thread From: Nick Roberts @ 2004-07-19 13:01 UTC (permalink / raw) On Mon, 19 Jul 2004 11:51:52 +0000 (UTC), Peter Hermann <ica2ph@sinus.csv.ica.uni-stuttgart.de> wrote: > ... > There is no compelling reason why such a FUNCTION get_line > should not be in package specification Ada.text_io of Ada2005. > Or did I miss something? AI95-301 suggests: I/O operations on unbounded strings are provided in a new child package of Ada.Text_IO. But I'm not sure if this one will get in. -- Nick Roberts ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Ada2005 (was Re: reading a text file into a string 2004-07-19 13:01 ` Nick Roberts @ 2004-07-19 13:35 ` Martin Dowie 2004-07-19 17:22 ` Nick Roberts 2004-07-19 23:50 ` Randy Brukardt 1 sibling, 1 reply; 44+ messages in thread From: Martin Dowie @ 2004-07-19 13:35 UTC (permalink / raw) Nick Roberts wrote: > On Mon, 19 Jul 2004 11:51:52 +0000 (UTC), Peter Hermann > <ica2ph@sinus.csv.ica.uni-stuttgart.de> wrote: > >> ... >> There is no compelling reason why such a FUNCTION get_line >> should not be in package specification Ada.text_io of Ada2005. >> Or did I miss something? > > AI95-301 suggests: I/O operations on unbounded strings are provided > in a new child package of Ada.Text_IO. > > But I'm not sure if this one will get in. Its current state is "Amendment 200Y", so I'd imagine its chances are "quite good"! :-) ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Ada2005 (was Re: reading a text file into a string 2004-07-19 13:35 ` Martin Dowie @ 2004-07-19 17:22 ` Nick Roberts 0 siblings, 0 replies; 44+ messages in thread From: Nick Roberts @ 2004-07-19 17:22 UTC (permalink / raw) On Mon, 19 Jul 2004 14:35:30 +0100, Martin Dowie <martin.dowie@baesystems.com> wrote: >> AI95-301 suggests: I/O operations on unbounded strings are >> provided in a new child package of Ada.Text_IO. >> >> But I'm not sure if this one will get in. > > Its current state is "Amendment 200Y", so I'd imagine its > chances are "quite good"! :-) Hooray! I hope it does. I notice this amendment does also include a string function to get a line of text. Although I usually dislike functions with side effects (in a procedural language), I think this one makes sense. -- Nick Roberts ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Ada2005 (was Re: reading a text file into a string 2004-07-19 13:01 ` Nick Roberts 2004-07-19 13:35 ` Martin Dowie @ 2004-07-19 23:50 ` Randy Brukardt 1 sibling, 0 replies; 44+ messages in thread From: Randy Brukardt @ 2004-07-19 23:50 UTC (permalink / raw) "Nick Roberts" <nick.roberts@acm.org> wrote in message news:opsbdygatdp4pfvb@bram-2... > On Mon, 19 Jul 2004 11:51:52 +0000 (UTC), Peter Hermann > <ica2ph@sinus.csv.ica.uni-stuttgart.de> wrote: > > > ... > > There is no compelling reason why such a FUNCTION get_line > > should not be in package specification Ada.text_io of Ada2005. > > Or did I miss something? > > AI95-301 suggests: I/O operations on unbounded strings are provided > in a new child package of Ada.Text_IO. > > But I'm not sure if this one will get in. It was approved by WG9 at the June meeting, so it's in at this time. Of course, things remain subject to change because of integration issues, but I wouldn't expect the string functions to need modifications. So it's probably going to be in the Amendment. Randy. ^ permalink raw reply [flat|nested] 44+ messages in thread