* File output and buffering @ 2008-08-19 20:27 Maciej Sobczak 2008-08-20 6:45 ` Georg Bauhaus 2008-08-20 8:43 ` Maciej Sobczak 0 siblings, 2 replies; 20+ messages in thread From: Maciej Sobczak @ 2008-08-19 20:27 UTC (permalink / raw) It seems to me that the file output in standard Ada library is not buffered: 1. There is no buffer-related operation in the whole library. 2. The semantics of output operations is defined in terms of the effects on external file. 3. The performance of simple test is consistent with what can be obtained in equivalent C code that flushes the channel after every operation (ie. some 15-20x slower than with default buffering). Let's suppose that I want to add buffering to my output. I can write the stream type that does the necessary magic, but how can I reuse the formatting machinery that is already available in Ada.Text_IO and related packages? -- Maciej Sobczak * www.msobczak.com * www.inspirel.com Database Access Library for Ada: www.inspirel.com/soci-ada ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: File output and buffering 2008-08-19 20:27 File output and buffering Maciej Sobczak @ 2008-08-20 6:45 ` Georg Bauhaus 2008-08-20 8:43 ` Maciej Sobczak 1 sibling, 0 replies; 20+ messages in thread From: Georg Bauhaus @ 2008-08-20 6:45 UTC (permalink / raw) Maciej Sobczak wrote: > Let's suppose that I want to add buffering to my output. I can write > the stream type that does the necessary magic, but how can I reuse the > formatting machinery that is already available in Ada.Text_IO and > related packages? Some formatting procedures from {Number}_IO and from Editing can write to a String instead of to a File_Type. Can you stream the strings to a buffer? There is an article on AdaPower entitled something like "How to access memory as a String". I think it will illustrate reliable tricks, possibly of some use when handling data in the "external" world. In any case, char_array values are good for OS procedures with names like write, read, and so on. -- Georg Bauhaus Y A Time Drain http://www.9toX.de ^ permalink raw reply [flat|nested] 20+ messages in thread
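The suggestion above (format into a string first, then collect the strings in a buffer of your own) can be sketched in C, the language the thread compares against. This is only an analogy and not Ada code: `append_int` is a hypothetical helper name, and `snprintf` stands in for the Ada procedures that write to a String.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format `value` right-aligned in `width` columns
   and append the text to buf[0 .. cap-1] at offset *used.
   Returns 0 on success, -1 if the text does not fit. */
int append_int(char *buf, size_t cap, size_t *used, long value, int width)
{
    assert(buf != NULL && used != NULL && *used <= cap);

    size_t room = cap - *used;
    int n = snprintf(buf + *used, room, "%*ld", width, value);
    if (n < 0 || (size_t)n >= room)
        return -1;              /* did not fit: *used is left unchanged */

    *used += (size_t)n;         /* the terminating NUL is not counted */
    return 0;
}
```

Once several items have been appended this way, the caller can hand `buf` and `used` to a single `fwrite` call, so formatting and output are decoupled.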
* Re: File output and buffering 2008-08-19 20:27 File output and buffering Maciej Sobczak 2008-08-20 6:45 ` Georg Bauhaus @ 2008-08-20 8:43 ` Maciej Sobczak 2008-08-20 8:59 ` Maciej Sobczak 1 sibling, 1 reply; 20+ messages in thread From: Maciej Sobczak @ 2008-08-20 8:43 UTC (permalink / raw) On 19 Sie, 22:27, Maciej Sobczak <see.my.homep...@gmail.com> wrote: > It seems to me that the file output in standard Ada library is not > buffered: > 1. There is no buffer-related operation in the whole library. > 2. The semantics of output operations is defined in terms of the > effects on external file. > 3. The performance of simple test is consistent with what can be > obtained in equivalent C code that flushes the channel after every > operation (ie. some 15-20x slower than with default buffering). Now I'm puzzled, because it looks like the files are written in chunks of 32kB. In other words, nothing is written to the file until the total output accumulated to 32kB and the step is preserved for each future write - this indicates that the buffering is actually in use. My original observations become questions: 1. Why there is no buffer-related operation in the whole library? In particular: how can I *flush* the buffer? This is very important for log files. I have discovered this exactly because the log is not written synchronously with Put operations, which makes it "a bit" less useful. How can I make sure that what I Put is actually written? Closing a file after each Put does not make much sense. 2. What about the semantics of Put? 3. Why is buffered Ada.Text_IO as slow as non-buffered C's stdio? Who is eating the 20x factor? -- Maciej Sobczak * www.msobczak.com * www.inspirel.com Database Access Library for Ada: www.inspirel.com/soci-ada ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: File output and buffering 2008-08-20 8:43 ` Maciej Sobczak @ 2008-08-20 8:59 ` Maciej Sobczak 2008-08-20 9:21 ` Dmitry A. Kazakov 2008-08-20 13:19 ` Georg Bauhaus 0 siblings, 2 replies; 20+ messages in thread From: Maciej Sobczak @ 2008-08-20 8:59 UTC (permalink / raw) On 20 Sie, 10:43, Maciej Sobczak <see.my.homep...@gmail.com> wrote: I will answer myself: > 1. Why there is no buffer-related operation in the whole library? Heh, there is. > In particular: how can I *flush* the buffer? By calling Ada.Text_IO.Flush. Which means that Georg Bauhaus fell into the trap of my confusion. :-) Still valid question: > 3. Why is buffered Ada.Text_IO as slow as non-buffered C's stdio? Who > is eating the 20x factor? -- Maciej Sobczak * www.msobczak.com * www.inspirel.com Database Access Library for Ada: www.inspirel.com/soci-ada ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: File output and buffering 2008-08-20 8:59 ` Maciej Sobczak @ 2008-08-20 9:21 ` Dmitry A. Kazakov 2008-08-20 14:44 ` Maciej Sobczak 2008-08-20 13:19 ` Georg Bauhaus 1 sibling, 1 reply; 20+ messages in thread From: Dmitry A. Kazakov @ 2008-08-20 9:21 UTC (permalink / raw) On Wed, 20 Aug 2008 01:59:52 -0700 (PDT), Maciej Sobczak wrote: > On 20 Sie, 10:43, Maciej Sobczak <see.my.homep...@gmail.com> wrote: > > Still valid question: > >> 3. Why is buffered Ada.Text_IO as slow as non-buffered C's stdio? Who >> is eating the 20x factor? Because of page formatting, I suggest. You can use text streams instead. [But don't use String'Write! Although, the newest GNAT optimized that, AFAIK.] BTW, buffering does not make I/O faster. It obviously does the opposite. Certainly, you didn't mean the "last-mile" buffer held by the driver, which is usually inaccessible. In some elder OSes you could directly write from a user buffer mapped by the kernel, have record files etc. That was *fast*. But then came C, Unix and Co., you know... (:-)) -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: File output and buffering 2008-08-20 9:21 ` Dmitry A. Kazakov @ 2008-08-20 14:44 ` Maciej Sobczak 2008-08-20 15:39 ` Dmitry A. Kazakov 0 siblings, 1 reply; 20+ messages in thread From: Maciej Sobczak @ 2008-08-20 14:44 UTC (permalink / raw) On 20 Sie, 11:21, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de> wrote: > BTW, buffering does not make I/O faster. It obviously does the opposite. You must be using some strange timer or a specially distorted definition of I/O. Buffering allows to minimize the overhead that is there per each physical output operation. If you can produce the same amount of data but with less operations, then the total overhead is smaller. -- Maciej Sobczak * www.msobczak.com * www.inspirel.com Database Access Library for Ada: www.inspirel.com/soci-ada ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: File output and buffering 2008-08-20 14:44 ` Maciej Sobczak @ 2008-08-20 15:39 ` Dmitry A. Kazakov 2008-08-21 7:10 ` Maciej Sobczak 0 siblings, 1 reply; 20+ messages in thread From: Dmitry A. Kazakov @ 2008-08-20 15:39 UTC (permalink / raw) On Wed, 20 Aug 2008 07:44:48 -0700 (PDT), Maciej Sobczak wrote: > On 20 Sie, 11:21, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de> > wrote: > >> BTW, buffering does not make I/O faster. It obviously does the opposite. > > You must be using some strange timer or a specially distorted > definition of I/O. > > Buffering allows to minimize the overhead that is there per each > physical output operation. Buffering is used to make I/O in an asynchronous and/or conveyered way. That does not make I/O faster in terms of latencies. Any language buffer on top of numerous layered buffers, typical for an OS, adds nothing, but overhead. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: File output and buffering 2008-08-20 15:39 ` Dmitry A. Kazakov @ 2008-08-21 7:10 ` Maciej Sobczak 2008-08-21 9:24 ` Dmitry A. Kazakov 0 siblings, 1 reply; 20+ messages in thread From: Maciej Sobczak @ 2008-08-21 7:10 UTC (permalink / raw) On 20 Sie, 17:39, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de> wrote: > Buffering is used to make I/O in an asynchronous and/or conveyered way. No, it is not asynchronous. Nothing happens in the background, the operations are only grouped. The group is (usually) transmitted in the synchronous fashion. I do not know what is "conveyered". > That does not make I/O faster in terms of latencies. It does make it faster in terms of throughput. Note: I do not imply that throughput is more valuable for optimization than latency - these can be different goals and usually are. > Any language buffer on top of numerous layered buffers, typical for an OS, > adds nothing, but overhead. It can reduce the overhead that is associated with the number of requests. System calls are not free and there is also a significant latency of the medium that is better to be avoided (like network roundtrips or disk seek times). -- Maciej Sobczak * www.msobczak.com * www.inspirel.com Database Access Library for Ada: www.inspirel.com/soci-ada ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: File output and buffering 2008-08-21 7:10 ` Maciej Sobczak @ 2008-08-21 9:24 ` Dmitry A. Kazakov 2008-08-21 20:54 ` Maciej Sobczak 0 siblings, 1 reply; 20+ messages in thread From: Dmitry A. Kazakov @ 2008-08-21 9:24 UTC (permalink / raw) On Thu, 21 Aug 2008 00:10:52 -0700 (PDT), Maciej Sobczak wrote: > On 20 Sie, 17:39, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de> > wrote: > >> Buffering is used to make I/O in an asynchronous and/or conveyered way. > > No, it is not asynchronous. Nothing happens in the background, the > operations are only grouped. The group is (usually) transmitted in the > synchronous fashion. > > I do not know what is "conveyered". Pipelined processing. When you refer to throughput, then it is increased only because of existence of hidden conveyers, which ultimately always boils down to some asynchronously working elements. >> That does not make I/O faster in terms of latencies. > > It does make it faster in terms of throughput. > > Note: I do not imply that throughput is more valuable for optimization > than latency - these can be different goals and usually are. > >> Any language buffer on top of numerous layered buffers, typical for an OS, >> adds nothing, but overhead. > > It can reduce the overhead that is associated with the number of > requests. System calls are not free and there is also a significant > latency of the medium that is better to be avoided (like network > roundtrips or disk seek times). Well, here we need to clarify what is the I/O end point. When you say "system call" it presumes that the end point is the driver. Let us fix it. Now, the next question is where coalescing/pipelining is to happen. See where it goes? Is the driver's interface a stream of units or else, also, of blocks of units? Case A. There is no back door to the driver, you have only a stream. What can buffering add? Nothing, but overhead. Case B. There is a back door for pushing bigger chunks of units. 
Then use it in your application and it will go *faster* than whatever buffered interface on top of the same thing! Note also that A and B usually refer different protocol layers. It is common to put a stream layer onto something block-oriented beneath, and reverse. That stream is buffering and necessarily an overhead. Buffering is always overhead. We buy it only because the alternative is inaccessible, like to do DMA transfers from the application. But a language library is in the *same* position as the application, so buffering there would gain nothing, *from* performance perspective. Ada.Text_IO is slow because of the buffering it does in order to implement a protocol (pages) which you do not need. Classic abstraction inversion case. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: File output and buffering 2008-08-21 9:24 ` Dmitry A. Kazakov @ 2008-08-21 20:54 ` Maciej Sobczak 2008-08-21 21:27 ` Dmitry A. Kazakov 0 siblings, 1 reply; 20+ messages in thread From: Maciej Sobczak @ 2008-08-21 20:54 UTC (permalink / raw) On 21 Sie, 11:24, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de> wrote: > > I do not know what is "conveyered". > > Pipelined processing. When you refer to throughput, then it is increased > only because of existence of hidden conveyers, which ultimately always > boils down to some asynchronously working elements. No, there is no asynchronous processing there (usually). There is grouping that leads to smaller number of still synchronous operations. > Well, here we need to clarify what is the I/O end point. No, we do not need to, especially when it is already clear that we would spiral down in an endless philosophy discussion about definitions. It is enough to get a clock and measure two simple test programs. I can offer the test programs if needed. > Ada.Text_IO is slow because of the buffering it does in order to implement > a protocol (pages) which you do not need. I do not see how paging could be related here. Or at least I can imagine an implementation where the overhead of bookkeeping pages is less than 15-20x. -- Maciej Sobczak * www.msobczak.com * www.inspirel.com Database Access Library for Ada: www.inspirel.com/soci-ada ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: File output and buffering 2008-08-21 20:54 ` Maciej Sobczak @ 2008-08-21 21:27 ` Dmitry A. Kazakov 2008-08-22 11:53 ` Maciej Sobczak 0 siblings, 1 reply; 20+ messages in thread From: Dmitry A. Kazakov @ 2008-08-21 21:27 UTC (permalink / raw) On Thu, 21 Aug 2008 13:54:25 -0700 (PDT), Maciej Sobczak wrote: > On 21 Sie, 11:24, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de> > wrote: > >>> I do not know what is "conveyered". >> >> Pipelined processing. When you refer to throughput, then it is increased >> only because of existence of hidden conveyers, which ultimately always >> boils down to some asynchronously working elements. > > No, there is no asynchronous processing there (usually). There is > grouping that leads to smaller number of still synchronous operations. "Still synchronous operations" of items in the group? Come on, grouping brings nothing if items are output synchronously to the caller. Coalescing helps if and only if individual items in the group are output asynchronously to the caller and to the receiver. In other words when the interested parties re-synchronize only at the ends of a group. In which state relatively to the output is the caller between the ends of a group? > It is enough to get a clock and measure two simple test programs. > I can offer the test programs if needed. No thanks. We are actually paid for designing such tests, so we have plenty of. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: File output and buffering 2008-08-21 21:27 ` Dmitry A. Kazakov @ 2008-08-22 11:53 ` Maciej Sobczak 2008-08-22 13:22 ` Dmitry A. Kazakov 0 siblings, 1 reply; 20+ messages in thread From: Maciej Sobczak @ 2008-08-22 11:53 UTC (permalink / raw) On 21 Sie, 23:27, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de> wrote: > "Still synchronous operations" of items in the group? Come on, grouping > brings nothing if items are output synchronously to the caller. Of course it brings a lot - it minimizes the total overhead due to smaller number of requests. Ever tried to send each character in a separate mail instead of sending one mail containing many characters? > In which > state relatively to the output is the caller between the ends of a group? Why should I care? Sometimes I care only about throughput. > > It is enough to get a clock and measure two simple test programs. > > I can offer the test programs if needed. > > No thanks. We are actually paid for designing such tests, so we have plenty > of. Then why do you try so hard to distort this discussion? -- Maciej Sobczak * www.msobczak.com * www.inspirel.com Database Access Library for Ada: www.inspirel.com/soci-ada ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: File output and buffering 2008-08-22 11:53 ` Maciej Sobczak @ 2008-08-22 13:22 ` Dmitry A. Kazakov 2008-08-22 21:41 ` Maciej Sobczak 0 siblings, 1 reply; 20+ messages in thread From: Dmitry A. Kazakov @ 2008-08-22 13:22 UTC (permalink / raw) On Fri, 22 Aug 2008 04:53:56 -0700 (PDT), Maciej Sobczak wrote: > On 21 Sie, 23:27, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de> > wrote: > >> "Still synchronous operations" of items in the group? Come on, grouping >> brings nothing if items are output synchronously to the caller. > > Of course it brings a lot - it minimizes the total overhead due to > smaller number of requests. > > Ever tried to send each character in a separate mail instead of > sending one mail containing many characters? It seems that you didn't read my posts. One last try. In your example, when characters of a message are sent *synchronously* (assuming E-mail as the transport layer, no back doors, etc) then each single character has to be sent as a reply to the answer to the earlier mail. The very ability to send multiple characters in one mail means that they are sent in parallel = asynchronously. Compare it to parallel vs. serial communication. For the rest see http://en.wikipedia.org/wiki/Buffer_%28telecommunication%29 Note the category of the article, read the purposes of buffering. >> In which >> state relatively to the output is the caller between the ends of a group? > > Why should I care? Because it debunks your claim that the transfer of individual items is synchronous. It is asynchronous, when makes sense. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: File output and buffering 2008-08-22 13:22 ` Dmitry A. Kazakov @ 2008-08-22 21:41 ` Maciej Sobczak 2008-08-23 10:25 ` Dmitry A. Kazakov [not found] ` <Q7adnfmCI6Ly6S3VnZ2dnUVZ_jOdnZ2d@earthlink.com> 0 siblings, 2 replies; 20+ messages in thread From: Maciej Sobczak @ 2008-08-22 21:41 UTC (permalink / raw) On 22 Sie, 15:22, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de> wrote: > It seems that you didn't read my posts. I've read them, but did not understand. > One last try. In your example, when > characters of a message are sent *synchronously* (assuming E-mail as the > transport layer, no back doors, etc) then each single character has to be > sent as a reply to the answer to the earlier mail. Then we have a different notion of "synchronously". When I write something to the file, the operation is synchronous when the program *waits* for the transfer to complete. > The very ability to send > multiple characters in one mail means that they are sent in parallel = > asynchronously. Then we have a different notion of "asynchronously". When I write something to the file, the operation is asynchronous when the program can continue while the transfer is being handled. And we have also a different notion of "parallel". When I send a mail, it is transferred serially over a network cable. The longer is the mail the longer it takes (hint: with parallel communication the time of transmission would not depend on the number of characters in the mail, since they would be sent, well, in parallel). > Compare it to parallel vs. serial communication. Nothing to compare. > For the > rest see > > http://en.wikipedia.org/wiki/Buffer_%28telecommunication%29 Short, but nice. Especially point d). > Note the category of the article, read the purposes of buffering. Yes, the purpose d) is what I'm talking about. I use buffers to group data into smaller number of bigger units. This is where the performance gain comes from. 
> Because it debunks your claim that the transfer of individual items is > synchronous. It is asynchronous, when makes sense. No, it is synchronous, since the program has to wait until the transfer completes (if the transfer is triggered at all - the buffer makes that happen less frequently). -- Maciej Sobczak * www.msobczak.com * www.inspirel.com Database Access Library for Ada: www.inspirel.com/soci-ada ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: File output and buffering 2008-08-22 21:41 ` Maciej Sobczak @ 2008-08-23 10:25 ` Dmitry A. Kazakov 2008-08-23 13:41 ` Steve [not found] ` <Q7adnfmCI6Ly6S3VnZ2dnUVZ_jOdnZ2d@earthlink.com> 1 sibling, 1 reply; 20+ messages in thread From: Dmitry A. Kazakov @ 2008-08-23 10:25 UTC (permalink / raw) On Fri, 22 Aug 2008 14:41:18 -0700 (PDT), Maciej Sobczak wrote: > On 22 Sie, 15:22, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de> > wrote: > >> One last try. In your example, when >> characters of a message are sent *synchronously* (assuming E-mail as the >> transport layer, no back doors, etc) then each single character has to be >> sent as a reply to the answer to the earlier mail. > > Then we have a different notion of "synchronously". > When I write something to the file, the operation is synchronous when > the program *waits* for the transfer to complete. The transfer of the group, not the transfers of the individual items of. >> The very ability to send >> multiple characters in one mail means that they are sent in parallel = >> asynchronously. > > Then we have a different notion of "asynchronously". > When I write something to the file, the operation is asynchronous when > the program can continue while the transfer is being handled. That is the same notion. Asynchronous = not synchronous. The semantics of a transfer of a group of items does not depend on the order and exact timing of the transfers of individual items. If any, because they might be not transferred at all. Consider protocols which recode the group, digital fountains, etc. > And we have also a different notion of "parallel". > When I send a mail, it is transferred serially over a network cable. Wrong, they are printed and then sent per pigeon post. You have defined the transport layer as E-mail. That's it. Don't make suggestions about how E-mail might work, there are lots of ways. 
> The longer is the mail the longer it takes (hint: with parallel > communication the time of transmission would not depend on the number > of characters in the mail, since they would be sent, well, in > parallel). Nope, I have a huge rack of multiplexed modems installed in the cellar. You again make assumptions about possible implementations of the transport layer, which weren't there when you presented the example. If the transport were rather a synchronous byte stream, then buffering obviously would bring *nothing* to the throughput. >> http://en.wikipedia.org/wiki/Buffer_%28telecommunication%29 > > Short, but nice. Especially point d). Right, it says "operated on as a unit", read my previous posts. Who operates on them as "a unit"? You need an independent asynchronous agent capable of doing so, otherwise it is not a unit. If you have such an agent, and you can talk to it in terms of such units, then that is *without* buffering, and it is faster than anything else. The purpose of d) is to collect, it is merely an adapter between two protocols. Layered protocols are always slower. >> Note the category of the article, read the purposes of buffering. > > Yes, the purpose d) is what I'm talking about. I use buffers to group > data into smaller number of bigger units. This is where the > performance gain comes from. No, it is where you lose performance, because I just send bigger units directly. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: File output and buffering 2008-08-23 10:25 ` Dmitry A. Kazakov @ 2008-08-23 13:41 ` Steve 2008-08-23 14:33 ` Dmitry A. Kazakov 0 siblings, 1 reply; 20+ messages in thread From: Steve @ 2008-08-23 13:41 UTC (permalink / raw) "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message news:yiw2f938342v.xzb47swyx5h4$.dlg@40tude.net... > On Fri, 22 Aug 2008 14:41:18 -0700 (PDT), Maciej Sobczak wrote: > >> On 22 Aug, 15:22, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de> >> wrote: >> >>> One last try. In your example, when >>> characters of a message are sent *synchronously* (assuming E-mail as the >>> transport layer, no back doors, etc) then each single character has to be >>> sent as a reply to the answer to the earlier mail. >> >> Then we have a different notion of "synchronously". >> When I write something to the file, the operation is synchronous when >> the program *waits* for the transfer to complete. > > The transfer of the group, not the transfers of the individual items of. > Dmitry, I have read enough of your posts on this newsgroup to know you're not a troll, but it sure is hard to tell from reading this thread. In my experience (theory aside) sending one character at a time to an OS is considerably slower than buffering the data and sending blocks of data. Several years ago I rewrote a driver on one of our systems that we used to communicate serially (using RS232) with a PLC (Programmable Logic Controller). The driver was originally written to make separate calls to the OS for each character sent to the PLC. The original implementation utilized approximately 15% of the CPU. When I re-wrote the driver to buffer the characters into blocks of up to 128 characters (defined by the PLC protocol) and make one OS call for the buffered data, the CPU utilization dropped to less than 1% of the CPU. This behavior makes perfect sense to me because for each call to the OS a buffer is allocated containing the data to be transmitted and placed in a queue for the OS. 
The buffer itself contains more than just the data to be sent, it includes some overhead, sometimes significant in size. The addition of the buffer to the OS queue often includes considerable overhead, context switches, mutexes, etc. When the number of characters in the buffer is increased the overhead is not significantly increased. Sure, if you're talking directly to hardware that only handles one character at a time then buffering and unbuffering data adds overhead. But it is rare these days to talk directly with the hardware. Even the simpler systems often use a kernel or OS that makes buffering worthwhile. If you're using TCP/IP to send data, if you're going to send a bunch of data at a time it would be silly to send one byte at a time. IP has considerable overhead for each block. You should try to include as much data as possible to minimize the number of packets sent and minimize the overhead. I find it interesting that you seem to be arguing that it is always better to not buffer, when the original poster has indicated that he has tried both buffered and unbuffered approaches and observed that unbuffered was considerably slower on his system. Either you are miscommunicating or you are just plain wrong. Regards, Steve > -- > Regards, > Dmitry A. Kazakov > http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: File output and buffering 2008-08-23 13:41 ` Steve @ 2008-08-23 14:33 ` Dmitry A. Kazakov 0 siblings, 0 replies; 20+ messages in thread From: Dmitry A. Kazakov @ 2008-08-23 14:33 UTC (permalink / raw) I have repeated the argument, provided all possible explanations and examples more than three times. Since it goes in circles, let's put an end to this. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <Q7adnfmCI6Ly6S3VnZ2dnUVZ_jOdnZ2d@earthlink.com>]
* Re: File output and buffering [not found] ` <Q7adnfmCI6Ly6S3VnZ2dnUVZ_jOdnZ2d@earthlink.com> @ 2008-08-23 22:00 ` Maciej Sobczak 0 siblings, 0 replies; 20+ messages in thread From: Maciej Sobczak @ 2008-08-23 22:00 UTC (permalink / raw) On 23 Sie, 22:34, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote: > > Then we have a different notion of "synchronously". > > When I write something to the file, the operation is synchronous when > > the program *waits* for the transfer to complete. > > If I may slip in, since this thread has wandered into comparisons > that even I can't follow... > > Define "complete" So that the program can immediately terminate and still have the data reliably stored. Think about log files of various kinds (including database write ahead logs) and the importance of having something confirmed. > Most I/O systems I've encountered are buffered by the OS... Of course - and not only that. There are buffers everywhere, even in hard drives. The semantics of output operation from the program point of view can be, however, described in terms of reasonably understood best-effort or pushing data as far as it makes sense. For example, if the hard drive can guarantee reliable storage at the level of its own buffers, then it can confirm reception of the data without actually storing them on plates. From the point of view of the program, the I/O operation can be considered as finished, because from that point nothing can mess things up. > As far > as an application is concerned, an I/O "write" operation is "complete" > when the OS accepts the packet for buffering. Exactly - provided that the packet was *copied* to OS buffers as opposed to just passing pointer to programs data. In practical terms: Ada.Text_IO.Put_Line (File, "Hello"); Ada.Text_IO.Flush (File); -- here we can crash without losing data I consider the output operation above (triggered or ensured by Flush) to be *synchronous with respect to the program*. 
When the Flush operation returns the control back to the program, the data is already stored in the external file (as AARM calls it), whatever that means, even if the "external file" includes several layers of buffers. From the program's perspective, it is "done". If you want to contrast the above with asynchronous version, the output operation can be initiated by the program but the program would be allowed to continue without any guarantee related to the amount of data being stored (and with some provisions to get the status later on). Short coverage of what all this means in the context of databases: http://www.orafaq.com/node/93 It is really well written. I hope that you get what I'm trying to say here. Well, at least I'm sure that I'm not inventing anything new. -- Maciej Sobczak * www.msobczak.com * www.inspirel.com Database Access Library for Ada: www.inspirel.com/soci-ada ^ permalink raw reply [flat|nested] 20+ messages in thread
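For comparison with the C side of this thread, the Put_Line / Flush pair above corresponds roughly to `fputs` followed by `fflush` in C stdio. The sketch below is a minimal illustration under stated assumptions: `log_line` and its `to_device` parameter are hypothetical names, and the optional `fsync` step assumes a POSIX system. It separates the two senses of "complete" discussed above: handed over to the OS, versus pushed further toward the device.

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno and fsync (POSIX assumed) */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write one log record and push it out of the
   process before returning. Returns 0 on success, -1 on error. */
int log_line(FILE *log, const char *text, int to_device)
{
    assert(log != NULL && text != NULL);

    if (fputs(text, log) == EOF || fputc('\n', log) == EOF)
        return -1;

    /* Hand stdio's user-space buffer over to the OS. Once this has
       succeeded, a crash of the program itself does not lose the record. */
    if (fflush(log) == EOF)
        return -1;

    /* Optionally ask the OS to push its own buffers to the device. */
    if (to_device && fsync(fileno(log)) != 0)
        return -1;

    return 0;
}
```

Calling `log_line(f, "Hello", 0)` mirrors the Ada snippet. Passing a non-zero `to_device` is the stricter variant that a write-ahead log would more likely want, at the cost of one more system call per record.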
* Re: File output and buffering 2008-08-20 8:59 ` Maciej Sobczak 2008-08-20 9:21 ` Dmitry A. Kazakov @ 2008-08-20 13:19 ` Georg Bauhaus 2008-08-20 14:41 ` Maciej Sobczak 1 sibling, 1 reply; 20+ messages in thread From: Georg Bauhaus @ 2008-08-20 13:19 UTC (permalink / raw) Maciej Sobczak wrote: >> In particular: how can I *flush* the buffer? > > By calling Ada.Text_IO.Flush. > > Which means that Georg Bauhaus fell into the trap of my confusion. :-) Sort of, but, as you say, the issue remains. > Still valid question: > >> 3. Why is buffered Ada.Text_IO as slow as non-buffered C's stdio? Who >> is eating the 20x factor? Text_IO is demonstrably slow. There are some speedy shortcuts in the GNAT implementation of Put (e.g. Write_Buf). But AFAICS there is (and has to be) a lot of protecting code around the OS calls. Using the following stupid programs for comparison, and using strace, I get 3370 calls to write(2) from C, but 50_000 from both C++ and Ada. Among other things open to speculation (or open to inspection). There are 4622 different lines in the 50_000 lines of output. I think that if you have a formatted (constrained) string, system I/O using fputs and flush might be a lot faster (modulo threading issues).

    #include <stdio.h>

    int main()
    {
        /* 68 asterisks plus the terminating NUL */
        char s[68 + 1] = "********************************************************************";

        for (int k = 0; k < 50000; ++k) {
            s[k % 68] = (char)(33 + k % 67);
            fputs(s, stdout), fputc('\n', stdout);
        }
        return 0;
    }

    #include <iostream>
    #include <string>

    int main()
    {
        std::string s = "********************************************************************";

        for (int k = 0; k < 50000; ++k) {
            s[k % 68] = static_cast<char>(33 + k % 67);
            std::cout << s << std::endl;
        }
        return 0;
    }

    with Ada.Text_IO;

    procedure Ada_Wrt is
       S : String := (1 .. 68 => '*');
    begin
       for K in 0 .. 50_000 - 1 loop
          S (1 + K rem 68) := Character'Val (33 + K rem 67);
          Ada.Text_IO.Put_Line (S);
       end loop;
    end Ada_Wrt;

-- Georg Bauhaus Y A Time Drain http://www.9toX.de ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: File output and buffering 2008-08-20 13:19 ` Georg Bauhaus @ 2008-08-20 14:41 ` Maciej Sobczak 0 siblings, 0 replies; 20+ messages in thread From: Maciej Sobczak @ 2008-08-20 14:41 UTC (permalink / raw) On 20 Sie, 15:19, Georg Bauhaus <rm.dash-bauh...@futureapps.de> wrote: > Using the following stupid programs for comparison, > and using strace, I get 3370 calls to write(2) from C, > but 50_000 from both C++ and Ada. The C++ part can be explained by the fact that you did not use it properly. > std::cout << s << std::endl; Try this instead: std::cout << s << '\n'; The difference is that std::endl performs *two* actions on the given stream: it inserts the newline and... flushes. If you intend to only insert the newline character, do what you mean. It is even less typing. (yes, 99% of "benchmarks" available on the web are broken for the same reason) -- Maciej Sobczak * www.msobczak.com * www.inspirel.com Database Access Library for Ada: www.inspirel.com/soci-ada ^ permalink raw reply [flat|nested] 20+ messages in thread
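The two counts quoted above can be reproduced with a small simulation. Each output line is 68 characters plus a newline, so 50,000 lines amount to 3,450,000 bytes. The sketch below assumes a 1024-byte user-space buffer; that size is an inference from the numbers (it is not stated anywhere in the thread), and `counted_writer`, `cw_put`, `cw_flush` and `count_writes` are hypothetical names. The "physical write" is only a counter, not a real system call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { BUF_CAP = 1024 };      /* assumed buffer size, see the text above */

struct counted_writer {
    char   buf[BUF_CAP];
    size_t used;              /* bytes currently waiting in buf */
    size_t writes;            /* simulated calls to write(2) */
};

/* Simulated physical write: empties the buffer and counts one call. */
static void cw_flush(struct counted_writer *w)
{
    if (w->used > 0) {
        w->writes += 1;
        w->used = 0;
    }
}

/* Buffered output: copy into buf, write only when buf is full. */
static void cw_put(struct counted_writer *w, const char *data, size_t len)
{
    while (len > 0) {
        size_t room = BUF_CAP - w->used;
        size_t n = len < room ? len : room;
        memcpy(w->buf + w->used, data, n);
        w->used += n;
        data += n;
        len -= n;
        if (w->used == BUF_CAP)
            cw_flush(w);
    }
}

/* Emit `lines` records of `len` bytes each and return the number of
   simulated writes; optionally flush after every record. */
size_t count_writes(size_t lines, size_t len, int flush_each_line)
{
    struct counted_writer w;
    char record[128];

    assert(len <= sizeof record);
    memset(&w, 0, sizeof w);
    memset(record, '*', len);

    for (size_t k = 0; k < lines; ++k) {
        cw_put(&w, record, len);
        if (flush_each_line)
            cw_flush(&w);     /* what a flush after every line amounts to */
    }
    cw_flush(&w);             /* final flush, as at program exit */
    return w.writes;
}
```

Under that assumption `count_writes(50000, 69, 0)` yields 3370 and `count_writes(50000, 69, 1)` yields 50000, which matches the strace figures: 3369 full buffers plus one partial buffer of 144 bytes in the first case, and one write per line in the second.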