comp.lang.ada
* Unix text handling on stdin
@ 2001-08-24 21:36 David Starner
  2001-08-24 23:44 ` sk
  2001-08-25  2:56 ` sk
  0 siblings, 2 replies; 3+ messages in thread
From: David Starner @ 2001-08-24 21:36 UTC


I've worked on a couple of things where I've had problems with this,
so I figured I'd ask here. How do I read in a text file from
stdin as text, do some analysis on it, and then spit out a
binary-identical version on stdout? That is, every form feed (probably
interpreted as a new page by Ada.Text_IO), every carriage return, and
every line feed in the original exists in the output in the exact
same place; likewise the presence or absence of a trailing LF or CRLF.

The problems I had were:

(a) Ada.Text_IO will happily munch a form feed in such a way that 
it's hard to tell whether we had a form feed or new line. 

(b) From my reading, it's entirely standards-compliant to treat LF and
CRLF the same way on Unix; it would certainly simplify things. This is
only a theoretical problem for me, as GNAT doesn't do this.

(c) GNAT would always terminate the output with a new line when the
program ended. (This was a serious problem when I was trying to write
up a submission for the latest ICFP. The output couldn't be larger
than the input; but if the input was minimal and had no trailing
newline, the added trailing newline would cause the program to lose.)

I understand that Ada.Text_IO is supposed to be portable, but I need
it to work in a system-correct way more than I need it to be portable.
Is there any maximally standards-compliant way to solve this? How about
an 'it should work on any reasonable compiler' way? A GNAT-specific way
will work; restricting myself to GNAT on Linux is not much different
than just restricting myself to Linux.

-- 
David Starner - dstarner98@aasaa.ofe.org
Pointless website: http://dvdeug.dhis.org
"I don't care if Bill personally has my name and reads my email and 
laughs at me. In fact, I'd be rather honored." - Joseph_Greg




* Re: Unix text handling on stdin
From: sk @ 2001-08-24 23:44 UTC
  To: comp.lang.ada


Hi, 

I tried the ICFP thing for S&G's (didn't, and didn't intend to, 
really enter) and ran into the same problem ...

So, my solution involved

Ada.Text_Io;
Ada.Text_Io.C_Streams;
Interfaces.C_Streams;

and here is a distilled way to take stdin and copy it to
stdout without TIO adding extra line feeds when closing
the file etc.

I fully intended to look up the issue in both the LRM and
Ada Issues to see if the compiler was behaving according to
spec, but once I found a work-around, I never got round
to checking about the legitimacy of this TIO behaviour.

The output of this procedure will be of exactly the same
size as the input ... unlike when exclusively using TIO
which always added an extra byte or two when closing.

sknipe@ktc.com

PS. Please note that this is a QAD[1] and not a very careful one:
exceptions can and will be raised by values going out of range

[1] Quick and Dirty :-)

PPS. Text attachment, sorry, to preserve formatting ...

PPPS. Linux, GNAT 3.13p

----------------------------------------------------------------

[-- Attachment #2: standard_io_play.adb --]
[-- Type: text/plain, Size: 2945 bytes --]

with Interfaces.C_Streams;

with Ada.Streams;
with Ada.Streams.Stream_Io;

with Ada.Text_Io;
with Ada.Text_Io.C_Streams;

procedure Standard_IO_Play is

	package ICS				renames Interfaces.C_Streams;
	
	package AS				renames Ada.Streams;
	package ASSIO			renames Ada.Streams.Stream_Io;

	package TIO				renames Ada.Text_Io;
	package TIOCS			renames Ada.Text_Io.C_Streams;
	
	function Is_File_A_Tty (
		File : Ada.Text_Io.File_Type
	) return Boolean is
	begin
		return (
			Interfaces.C_Streams.Isatty (
				Interfaces.C_Streams.FileNo (
					Ada.Text_Io.C_Streams.C_Stream (File)
				)
			) > 0
		);
	end Is_File_A_Tty;

	Buffer_Size	: constant := 10240; 	-- 10k buffer size


	Buffer	: String (1 .. Buffer_Size) := (Others => ' ');
	Last	: Natural := Buffer'Last;

	Buffer_Address	: constant ICS.Voids := Buffer(Buffer'First)'Address;
	Element_Size	: constant := Character'Size / 8;

	Bytes_Read		: ICS.Size_t := 0;
	Bytes_Written   : ICS.Size_t := 0;

	Flush_Result	: Integer;

begin
	-- First, check whether Std_In is attached to a tty or is
	-- something else.
	-- If stdin is a tty, then expecting keyboard user input
	-- else (not a tty) assuming that the OS is piping a file
	--     to this procedure
	-- (Remember this is just an example and that it is probably
	--  a bad assumption and oversimplification).
	--

	if Is_File_A_Tty (TIO.Current_Input) then
	
	
		TIO.Put_Line (
			File => TIO.Standard_Error,
			Item => "Trying to show piping. So please pipe into this procedure"
		);
		TIO.Put_Line (
			File => TIO.Standard_Error,
			Item => "ie # ""cat test-file | ./standard_io_play > result-file"""
		);
	
	else 
	
		Std_In_Read : loop
		
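			-- Read using ICS ...

			Bytes_Read := ICS.Fread (
				Buffer	=> Buffer_Address,
				Size	=> Element_Size,
				Count	=> ICS.Size_t (Buffer'Length),
				Stream	=> TIOCS.C_Stream (TIO.Current_Input)
			);

			-- Fread returns 0 at end of input. Testing TIO.End_Of_File
			-- here instead is risky: it looks ahead in the file and may
			-- swallow a line terminator before Fread sees it.

			exit Std_In_Read when Natural (Bytes_Read) = 0;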
			
			-- Do your thing to the buffer ... (Note the possibility of
			-- incompatible ranges for Natural (Bytes_Read) and subsequent
			-- buffer overflows).
			
			Processing : for Char in Buffer'First .. Natural (Bytes_Read) loop

				if Buffer(Char) = ASCII.Lf then

					TIO.Put_Line (
						File	=> TIO.Standard_Error,
						Item    => "Found LF"
					);
					
				elsif Buffer(Char) = ASCII.Cr then

					TIO.Put_Line (
						File	=> TIO.Standard_Error,
						Item    => "Found CR"
					);
					
				elsif Buffer(Char) = ASCII.Ht then

					TIO.Put_Line (
						File	=> TIO.Standard_Error,
						Item    => "Found TAB"
					);
				
				end if;

			end loop Processing;
			
			-- Write using ICS ...

			Bytes_Written := ICS.Fwrite (
				Buffer	=> Buffer_Address,
				Size	=> Element_Size,
				Count	=> ICS.Size_t (Bytes_Read),
				Stream	=> TIOCS.C_Stream (TIO.Current_Output)
			);

		end loop Std_In_Read;

		Flush_Result := ICS.Fflush (TIOCS.C_Stream (TIO.Current_Output));

	end if;


end Standard_IO_Play;



* Re: Unix text handling on stdin
From: sk @ 2001-08-25  2:56 UTC
  To: comp.lang.ada

Hi again,

I wrote:
>I fully intended to look up the issue in both the LRM and
>Ada Issues to see if the compiler was behaving according to
>spec, but once I found a work-around, I never got round
>to checking about the legitimacy of this TIO behaviour.

Firstly, I should have said text-io behaving according to spec,
and not "the compiler" ... and then I got around to actually
reading up on the issue.


From the Ada95 Reference Manual (Electronic, provided with GNAT 
.../gnat-3.13p-unx-docs/html/arm95.html#SEC247)

:Text File Management
:
:...
:
:2.For the procedure Close: If the file has the current mode 
:Out_File or Append_File, has the effect of calling New_Page,
:unless the current page is already terminated; then outputs
:a file terminator. 
:
:...
:

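So the behaviour of appending a terminating byte when Text_IO
closes an output file agrees with the ARM. (New_Page first
terminates the current line if it is still open, which is where
the extra line terminator comes from.)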
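
The effect can be observed with a small sketch, assuming a writable
current directory. The procedure name and the scratch file name
close_demo.tmp are made up for illustration. It writes three
characters with Put, never calls New_Line, closes the file, and then
reports the file size through Ada.Streams.Stream_IO. The size printed
depends on the implementation; with GNAT on Linux one would expect
one byte more than was written.

```ada
with Ada.Text_IO;
with Ada.Streams.Stream_IO;

procedure Close_Terminator_Demo is

   package TIO renames Ada.Text_IO;
   package SIO renames Ada.Streams.Stream_IO;

   --  Hypothetical scratch file, created and deleted by this sketch.
   Name : constant String := "close_demo.tmp";
   Text : constant String := "abc";

   T : TIO.File_Type;
   S : SIO.File_Type;

begin
   --  Write the text without ever calling New_Line, then close.
   TIO.Create (T, TIO.Out_File, Name);
   TIO.Put (T, Text);
   TIO.Close (T);

   --  Reopen as a raw stream file to see how many bytes really exist.
   SIO.Open (S, SIO.In_File, Name);
   TIO.Put_Line ("Characters written:" & Integer'Image (Text'Length));
   TIO.Put_Line ("Bytes in file:     " & SIO.Count'Image (SIO.Size (S)));

   --  Delete requires an open file and removes it from the file system.
   SIO.Delete (S);
end Close_Terminator_Demo;
```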

It also appears that Interfaces.C_Streams and Ada.Text_Io.C_Streams
are GNAT-specific packages rather than part of the standard library.
They are not documented in the above-mentioned electronic ARM; they
are, however, documented in the GNAT RM
(.../gnat-3.13p-unx-docs/html/gnat_rm.html#SEC42).
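
For comparison, a route that avoids the GNAT-only packages might be
the standard Ada.Text_IO.Text_Streams (ARM A.12.2). The following is
a minimal sketch, not a tested replacement for the attachment; the
procedure name Stream_Copy is made up. It copies stdin to stdout one
character at a time through stream views of the standard files. The
assumption, which should hold on GNAT but is not verified here for
other compilers, is that stream reads and writes bypass Text_IO's
line and page handling, so LF, CR and FF pass through unchanged and
no terminator is appended at exit.

```ada
with Ada.Text_IO;
with Ada.Text_IO.Text_Streams;

procedure Stream_Copy is

   package TIO renames Ada.Text_IO;
   package TS  renames Ada.Text_IO.Text_Streams;

   --  Stream views of the standard files.
   Input  : constant TS.Stream_Access := TS.Stream (TIO.Standard_Input);
   Output : constant TS.Stream_Access := TS.Stream (TIO.Standard_Output);

   C : Character;

begin
   loop
      --  Raises End_Error once the input is exhausted.
      Character'Read (Input, C);

      --  Analysis goes here; C may be LF, CR, FF or anything else.

      Character'Write (Output, C);
   end loop;
exception
   when TIO.End_Error =>
      null;  --  Normal termination: all input has been copied.
end Stream_Copy;
```

Usage would be the same as for the attachment, e.g.
"cat test-file | ./stream_copy > result-file", followed by cmp on
the two files to check that they are identical.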

sknipe@ktc.com

PS. The example makes no use of Ada.Streams and Ada.Streams.Stream_Io;
as I mentioned, it was a QAD, so the unnecessary items left over from
a cut-and-paste job were never edited out.



