From: Chris Townley
Newsgroups: comp.lang.ada
Subject: Re: Using "pure" (?) Ada, how to determine whether a file is a "text" file, not a binary?
Date: Sun, 2 Jul 2023 02:08:17 +0100
In-Reply-To: <48b33023-a38e-4ccc-855e-fe6de7b12ea5n@googlegroups.com>

On 01/07/2023 22:50, Kenneth Wolcott wrote:
> On Saturday, July 1, 2023 at 2:39:06 PM UTC-7, Keith Thompson wrote:
>> Kenneth Wolcott writes:
>>> On Saturday, July 1, 2023 at 1:39:30 PM UTC-7, Jeffrey R.Carter wrote:
>>>> On 2023-07-01 19:15, Kenneth Wolcott wrote:
>> [...]
>>>> For example, if a text file is one in which all the characters, except line
>>>> terminators, are graphic characters, then it should be clear how to determine
>>>> whether a file meets that definition of a text file.
>>>
>>> I think that is the definition that I'm going to pursue as the
>>> simplest and effective definition.
>> Think about how you want to handle tab characters (non-graphic but
>> common in some text) and carriage return characters (non-graphic but
>> part of a line terminator for Windows-style text files).
>>
>> Also think about the various ways of representing text: ASCII, Latin-1,
>> UTF-8, UTF-16, etc.
>
> Thanks, Keith!
>
> It looks like just need to more carefully examine the existing Ada I/O packages and experiment with the possibilities...
>
> Ken

It may be worth looking at the Unix file utility; its docs and source
are available.

-- 
Chris
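
For illustration only, here is a rough sketch of the definition discussed in the quoted text: a file counts as "text" if every byte is a graphic character, a tab, or part of a line terminator (LF or CR LF). It is written in Python rather than Ada purely to keep it short. The function names are hypothetical, and the graphic ranges (32..126 and 160..255) assume a Latin-1 reading of the bytes, which is the same pair of ranges Ada.Characters.Handling.Is_Graphic uses.

```python
# Sketch only. Assumed definition of "text": every byte is a Latin-1
# graphic character, a horizontal tab, or a line-terminator byte (LF, CR).

ALLOWED_CONTROLS = {0x09, 0x0A, 0x0D}  # HT, LF, CR


def is_graphic(byte: int) -> bool:
    """True for Latin-1 graphic positions: 32..126 or 160..255."""
    return 32 <= byte <= 126 or 160 <= byte <= 255


def looks_like_text(data: bytes) -> bool:
    """Check one block of bytes against the assumed definition."""
    return all(is_graphic(b) or b in ALLOWED_CONTROLS for b in data)


def file_looks_like_text(path: str, chunk_size: int = 65536) -> bool:
    """Read the file in binary mode, chunk by chunk, and stop at the
    first byte that is neither graphic nor an allowed control."""
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            if not looks_like_text(chunk):
                return False
    return True
```

Note that this byte-level test is encoding-blind, which is the concern raised above: UTF-16 text is rejected because of its NUL bytes, and UTF-8 text is rejected whenever a multi-byte sequence contains a byte in 128..159. A real solution would have to decide which encodings to accept first.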