* Re: Interesting performance quirk.
2008-10-28 11:35 ` Peter C. Chapin
@ 2008-10-28 14:21 ` Robert A Duff
2008-10-29 1:42 ` Peter C. Chapin
2008-10-28 18:27 ` Jeffrey R. Carter
` (3 subsequent siblings)
4 siblings, 1 reply; 28+ messages in thread
From: Robert A Duff @ 2008-10-28 14:21 UTC (permalink / raw)
"Peter C. Chapin" <pcc482719@gmail.com> writes:
> I'm expecting my code to be slower because a) It is Ada and there are
> probably some extra run time checks, ...
You should also compile with all checks suppressed, to determine the
cost of the checks. This doesn't mean you have to suppress all checks
in order to get good performance. After this experiment, you might be
able to determine that suppressing checks in some inner loop (e.g. the
procedure you intend to inline, and its call site(s)) is good enough.
>... and b) I just wrote this thing and
> I haven't yet tried to tweak it for performance.
Sure.
> I tried running gprof over my code to see what I could learn, but I
> don't know how to use that tool very well yet so I didn't learn much.
It's worth learning. The gprof output can be confusing at first, but if
you puzzle over it for a while, it begins to make more sense. ;-)
>...I
> discovered that a certain procedure is called a HUGE number of times,
> but that's not unexpected considering where it's used. It's a small
> procedure so I might try inlining it.
It might also make sense to look at the machine code for that procedure.
And/or look at the expanded Ada code (using gnat's -gnatG switch).
You might also want to profile the C version, and look at whatever
tricks it uses in its inner loops, and perhaps duplicate them in
the Ada version.
Please continue to report your progress here. I, and I'm sure others,
are interested.
- Bob
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Interesting performance quirk.
2008-10-28 14:21 ` Robert A Duff
@ 2008-10-29 1:42 ` Peter C. Chapin
0 siblings, 0 replies; 28+ messages in thread
From: Peter C. Chapin @ 2008-10-29 1:42 UTC (permalink / raw)
Robert A Duff wrote:
> You should also...
Thanks for all the great suggestions. I have made a note of them for
future reference. And yes, I definitely intend to practice with gprof.
It looks useful.
Thanks also for your words of encouragement. Unfortunately progress on
this project tends to be spotty because I can only work on it here and
there between my "real" work. However, if I find out anything else
interesting or have any more questions I'll definitely post.
Peter
* Re: Interesting performance quirk.
2008-10-28 11:35 ` Peter C. Chapin
2008-10-28 14:21 ` Robert A Duff
@ 2008-10-28 18:27 ` Jeffrey R. Carter
2008-10-29 1:39 ` Peter C. Chapin
2008-10-28 23:22 ` Ludovic Brenta
` (2 subsequent siblings)
4 siblings, 1 reply; 28+ messages in thread
From: Jeffrey R. Carter @ 2008-10-28 18:27 UTC (permalink / raw)
Peter C. Chapin wrote:
>
> I'm expecting my code to be slower because a) It is Ada and there are
> probably some extra run time checks, and b) I just wrote this thing and
> I haven't yet tried to tweak it for performance. The performance ratio
> on the Linux system between my immature Ada code and the much more
> mature OpenSSL seems pretty reasonable to me. I'm less content with the
> ratio in the Window XP environment.
Have you verified that your implementation and the OpenSSL library give the same
encryption results? There's not much point in worrying about relative speeds if
they aren't doing the same thing.
--
Jeff Carter
"Now look, Col. Batguano, if that really is your name."
Dr. Strangelove
31
* Re: Interesting performance quirk.
2008-10-28 18:27 ` Jeffrey R. Carter
@ 2008-10-29 1:39 ` Peter C. Chapin
2008-10-29 5:27 ` Jeffrey R. Carter
0 siblings, 1 reply; 28+ messages in thread
From: Peter C. Chapin @ 2008-10-29 1:39 UTC (permalink / raw)
Jeffrey R. Carter wrote:
> Have you verified that your implementation and the OpenSSL library give
> the same encryption results? There's not much point in worrying about
> relative speeds if they aren't doing the same thing.
Well, I have unit tests for my code that are based on the Blowfish test
vectors on Schneier's page. My code passes those cases, so the
encryption results are apparently what Schneier thinks they should be. I
can only assume that OpenSSL can also pass those test cases (didn't try it).
I can see the sense in verifying interoperability between my code and
another major crypto library. On the other hand, when there are
published test cases it may not be worth it. For instance the Blowfish
ECB tests on Schneier's page all use 64 bit keys. Considering that the
algorithm can handle key lengths from 32 bits to 448 bits, those test
cases aren't as broad as they could be. Checking interoperability with OpenSSL
(say) for a range of key sizes might be worth doing.
But... this is getting off the topic of Ada a bit. I apologize for rambling.
Peter
* Re: Interesting performance quirk.
2008-10-29 1:39 ` Peter C. Chapin
@ 2008-10-29 5:27 ` Jeffrey R. Carter
0 siblings, 0 replies; 28+ messages in thread
From: Jeffrey R. Carter @ 2008-10-29 5:27 UTC (permalink / raw)
Peter C. Chapin wrote:
>
> Well, I have unit tests for my code that are based on the Blowfish test
> vectors on Schneier's page. My code passes those cases, so the
> encryption results are apparently what Schneier thinks they should be. I
> can only assume that OpenSSL can also pass those test cases (didn't try it).
That's an assumption you ought to test. OpenSSL might be very fast because it's
wrong.
Doesn't seem likely, though.
--
Jeff Carter
"Now look, Col. Batguano, if that really is your name."
Dr. Strangelove
31
* Re: Interesting performance quirk.
2008-10-28 11:35 ` Peter C. Chapin
2008-10-28 14:21 ` Robert A Duff
2008-10-28 18:27 ` Jeffrey R. Carter
@ 2008-10-28 23:22 ` Ludovic Brenta
2008-10-29 8:42 ` oenone
2008-10-29 9:59 ` Peter C. Chapin
2008-10-29 9:54 ` Alex R. Mosteo
2008-10-29 16:12 ` Colin Paul Gloster
4 siblings, 2 replies; 28+ messages in thread
From: Ludovic Brenta @ 2008-10-28 23:22 UTC (permalink / raw)
Peter C. Chapin writes:
> Windows XP Laptop (GNAT GPL 2008 for the Ada, Cygwin gcc for the C)
> -----
> My Library => 11 MB/s (with -O2 option and no debugging support)
> OpenSSL => 65 MB/s (Wow!)
>
> SUSE Linux in a VM on the same box
> -----
> My Library => 25 MB/s (odd)
> OpenSSL => 65 MB/s (now this makes sense at least)
I believe OpenSSL uses hand-written and carefully optimised assembly
for its inner loops.
Besides that, I was wondering if by any chance you were using one 32-
bit and one 64-bit operating system; if so, which one is Linux and
which one is Windows?
--
Ludovic Brenta.
* Re: Interesting performance quirk.
2008-10-28 23:22 ` Ludovic Brenta
@ 2008-10-29 8:42 ` oenone
2008-10-29 9:59 ` Peter C. Chapin
1 sibling, 0 replies; 28+ messages in thread
From: oenone @ 2008-10-29 8:42 UTC (permalink / raw)
On 29 Okt., 00:22, Ludovic Brenta <ludo...@ludovic-brenta.org> wrote:
> Besides that, I was wondering if by any chance you were using one 32-
> bit and one 64-bit operating system; if so, which one is Linux and
> which one is Windows?
And maybe check if there is a difference in static/dynamic linking...
The VM might apply some JIT optimizations to the executing binaries. You
could try running a Linux Live-CD with the same GNAT version.
* Re: Interesting performance quirk.
2008-10-28 23:22 ` Ludovic Brenta
2008-10-29 8:42 ` oenone
@ 2008-10-29 9:59 ` Peter C. Chapin
2008-10-29 10:19 ` Martin
2008-11-17 6:31 ` David Thompson
1 sibling, 2 replies; 28+ messages in thread
From: Peter C. Chapin @ 2008-10-29 9:59 UTC (permalink / raw)
Ludovic Brenta wrote:
> I believe OpenSSL uses hand-written and carefully optimised assembly
> for its inner loops.
That's interesting. It would help to explain the speed difference...
although as I said I'm sure I can do better, at least a little better,
with the compiled Ada than I am currently.
In any event I don't want to "resort" to assembly language or to
removing the Ada mandated checks (at least not in the final version). My
assumption is that the sort of person who would be interested in using a
crypto library in Ada would have that interest precisely because of the
safety afforded by the Ada language. This is why I'm content with the
library being a bit slower than the competitors that are throwing safety
to the wind by, for example, using assembly language.
Considering that cryptography is, almost by definition, used in security
sensitive programs it seems like issues of safety and correctness should
take priority over raw efficiency. Fast is good, of course, but it's
even more important to be both right and secure. Hence Ada and not C
(and definitely not assembly language).
> Besides that, I was wondering if by any chance you were using one 32-
> bit and one 64-bit operating system; if so, which one is Linux and
> which one is Windows?
No, they are both 32-bit systems. The Linux VM is on the same hardware
as the Windows system.
Peter
* Re: Interesting performance quirk.
2008-10-29 9:59 ` Peter C. Chapin
@ 2008-10-29 10:19 ` Martin
2008-11-17 6:31 ` David Thompson
1 sibling, 0 replies; 28+ messages in thread
From: Martin @ 2008-10-29 10:19 UTC (permalink / raw)
On Oct 29, 9:59 am, "Peter C. Chapin" <pcc482...@gmail.com> wrote:
> In any event I don't want to "resort" to assembly language or to
> removing the Ada mandated checks (at least not in the final version).
With a bit of care a lot of checks can be avoided or removed by the
compiler. A fairly common example of this wrt loops is:
   type Array_Index_Type is range ...;
   type Array_Type is array (Array_Index_Type) of ...;

   procedure Example (A : Array_Type) is
   begin
      for I in Array_Index_Type loop
         --  Do something with A (I) - a check is often included
      end loop;
   end Example;

is better written as:

   procedure Example (A : Array_Type) is
   begin
      for I in A'Range loop
         --  Do something with A (I) - no need for a check
      end loop;
   end Example;
I'm not a language lawyer, so I can't tell if that's compiler
inefficiencies or what the language requires but this was certainly
the behaviour of several compilers I've used***.
Cheers
-- Martin
*** I haven't checked this recently!!! :-)
* Re: Interesting performance quirk.
2008-10-29 9:59 ` Peter C. Chapin
2008-10-29 10:19 ` Martin
@ 2008-11-17 6:31 ` David Thompson
2008-11-17 11:51 ` Peter C. Chapin
1 sibling, 1 reply; 28+ messages in thread
From: David Thompson @ 2008-11-17 6:31 UTC (permalink / raw)
On Wed, 29 Oct 2008 05:59:24 -0400, "Peter C. Chapin"
<pcc482719@gmail.com> wrote:
> Ludovic Brenta wrote:
>
> > I believe OpenSSL uses hand-written and carefully optimised assembly
> > for its inner loops.
>
For some algorithms and some targets, including x86 Blowfish.
Unless deselected at build (specifically configuration) time.
(The actual assembler source is generated by Perl scripts acting
effectively as macros, so it might be more exact to call it hand-designed.)
> That's interesting. It would help to explain the speed difference...
> although as I said I'm sure I can do better, at least a little better,
> with the compiled Ada than I am currently.
>
> In any event I don't want to "resort" to assembly language or to
> removing the Ada mandated checks (at least not in the final version). My
> assumption is the the sort of person who would be interested in using a
> crypto library in Ada would have that interest precisely because of the
> safety afforded by the Ada language. This is why I'm content with the
> library being a bit slower than the competitors that are throwing safety
> to the wind by, for example, using assembly language.
>
> Considering that cryptography is, almost by definition, used in security
> sensitive programs it seems like issues of safety and correctness should
> take priority over raw efficiency. Fast is good, of course, but it's
> even more important to be both right and secure. Hence Ada and not C
> (and definitely not assembly language).
>
Correctness is certainly vital, but the extreme nonlinearity of any
good crypto algorithm means that even a small functional error will
almost certainly show up very quickly and obviously. However, in
modern systems it has sometimes become important to control exactly
which instructions are executed, and how the CPU executes them
(typically using assembler-dependent special operations), because of
'side-channel' (mostly timing and power) attacks that have been
developed to take advantage of things like the scheduling and caching
behavior of the crypto machine code. So carefully written assembler
can actually be more secure! (But I don't know whether all of
OpenSSL's assembler is.)
- formerly david.thompson1 || achar(64) || worldnet.att.net
* Re: Interesting performance quirk.
2008-11-17 6:31 ` David Thompson
@ 2008-11-17 11:51 ` Peter C. Chapin
0 siblings, 0 replies; 28+ messages in thread
From: Peter C. Chapin @ 2008-11-17 11:51 UTC (permalink / raw)
David Thompson wrote:
>> Considering that cryptography is, almost by definition, used in security
>> sensitive programs it seems like issues of safety and correctness should
>> take priority over raw efficiency. Fast is good, of course, but it's
>> even more important to be both right and secure. Hence Ada and not C
>> (and definitely not assembly language).
>>
> Correctness is certainly vital, but the extreme nonlinearity of any
> good crypto algorithm means that even a small functional error will
> almost certainly show up very quickly and obviously.
I understand your point, but it's not necessarily that simple. Take a look at
http://www.schneier.com/blowfish-bug.txt
which describes a "common" implementation bug where the code
inadvertently overwrites a large number of key bits with 1 bits. Because
encryption and decryption process the key in the same way, this bug does
not reveal itself as a problem with round trip processing.
Now in this case one would expect any reasonable degree of testing to
show the problem... but that assumes the test cases were generated with
an implementation that is free of this issue. If multiple carefully
written implementations all agree that increases one's confidence in the
result, but what does that really prove? They could all be wrong in
exactly the same way. Here "being wrong" might well mean "weakened
security relative to published theoretical results."
For example, consider Blowfish again... the algorithm requires a large
number of digits from the hexadecimal expansion of pi during its
initialization. Where did I get those initial values? I copied them from
Schneier's reference implementation. That's probably what a lot
of Blowfish implementors do. Yet what if they are wrong? My test cases
based on Schneier's published test vectors (on the site above) wouldn't
show the problem because we'd both be using the wrong initial values.
Should I believe those are really the digits from the hexadecimal
expansion of pi just because Schneier says so? No... more checking needs
to be done.
What I *should* do is write my own program to compute those values, use
a tool like SPARK to prove my program correct, and then cross check its
output with several other independently computed values for pi. In fact,
it's on my to-do list to do exactly this (but I admit it's fairly low in
priority right now). I'd be very surprised, of course, if Schneier's
value of pi is wrong but, hey, you never know.
> However, in
> modern systems it has sometimes become important to control exactly
> which instructions are executed, and how the CPU executes them
> (typically using assembler-dependent special operations), because of
> 'side-channel' (mostly timing and power) attacks that have been
> developed to take advantage of things like the scheduling and caching
> behavior of the crypto machine code. So carefully written assembler
> can actually be more secure!
That's interesting. I've heard of these kinds of attacks before but I
admit that I don't know much about them. I'll have to look into it.
Thanks for the heads-up.
Peter
* Re: Interesting performance quirk.
2008-10-28 11:35 ` Peter C. Chapin
` (2 preceding siblings ...)
2008-10-28 23:22 ` Ludovic Brenta
@ 2008-10-29 9:54 ` Alex R. Mosteo
2008-10-30 11:16 ` Peter C. Chapin
2008-10-29 16:12 ` Colin Paul Gloster
4 siblings, 1 reply; 28+ messages in thread
From: Alex R. Mosteo @ 2008-10-29 9:54 UTC (permalink / raw)
Peter C. Chapin wrote:
> I tried running gprof over my code to see what I could learn, but I
> don't know how to use that tool very well yet so I didn't learn much. I
> discovered that a certain procedure is called a HUGE number of times,
> but that's not unexpected considering where it's used. It's a small
> procedure so I might try inlining it.
I'd also recommend valgrind with, e.g., kcachegrind to visualize the
results. It might be easier to get into than gprof, and doesn't require
special parameters to build the executable.
* Re: Interesting performance quirk.
2008-10-29 9:54 ` Alex R. Mosteo
@ 2008-10-30 11:16 ` Peter C. Chapin
0 siblings, 0 replies; 28+ messages in thread
From: Peter C. Chapin @ 2008-10-30 11:16 UTC (permalink / raw)
Alex R. Mosteo wrote:
> I'd also recommend valgrind with, e.g., kcachegrind to visualize the
> results. It might be easier to get into than gprof, and doesn't require
> special parameters to build the executable.
Thanks for the suggestion. I've never used valgrind before, but it is
actually on my to-do list to figure it out in connection with an
unrelated project (that actually pertains to my day job).
Peter
* Re: Interesting performance quirk.
2008-10-28 11:35 ` Peter C. Chapin
` (3 preceding siblings ...)
2008-10-29 9:54 ` Alex R. Mosteo
@ 2008-10-29 16:12 ` Colin Paul Gloster
2008-10-30 11:23 ` Peter C. Chapin
4 siblings, 1 reply; 28+ messages in thread
From: Colin Paul Gloster @ 2008-10-29 16:12 UTC (permalink / raw)
In
news:4904516f$0$28741$4d3efbfe@news.sover.net
on October 26th, 2008, Peter C. Chapin submitted:
|------------------------------------------------------------------------|
|"[..] |
| |
|I am also planning to write a similar program in C using OpenSSL's |
|Blowfish implementation. I want to get a point of comparison to see if |
|my implementation has "reasonable" performance or not (here I'm assuming|
|that OpenSSL is reasonable). I'm curious now if the C/OpenSSL version |
|will also demonstrate this same performance quirk." |
|------------------------------------------------------------------------|
I had intended to reply with a claim that any language would be sped
up in the emulated system, however instead in
news:4906f908$0$5781$4d3efbfe@news.sover.net
on Tue, 28 Oct 2008, Peter C. Chapin submitted:
|------------------------------------------------------------------------|
|"[..] |
| |
|In further developments... |
| |
|I wrote a version of the benchmark program in C that uses OpenSSL's |
|Blowfish implementation. The program is the "same" in that it creates |
|the same amount of data and processes it in the same way. The only |
|significant difference is that it calls into OpenSSL. It's quite a bit |
|faster. Right now I get |
| |
|Windows XP Laptop (GNAT GPL 2008 for the Ada, Cygwin gcc for the C) |
|----- |
|My Library => 11 MB/s (with -O2 option and no debugging support) |
|OpenSSL => 65 MB/s (Wow!) |
| |
|SUSE Linux in a VM on the same box |
|----- |
|My Library => 25 MB/s (odd) |
|OpenSSL => 65 MB/s (now this makes sense at least) |
| |
|[..]" |
|------------------------------------------------------------------------|
Earlier this year I had used QEMU on Windows (possibly not Windows XP)
to have a GNU/Linux distribution (possibly RedHat) emulated. I ran a
Bourne shell script or a Bourne Again SHell script in the emulated
system which made thousands of fairly short I/O transactions. The
emulated system including its pretend harddisk were kept small enough
(no more than a few hundred megabytes) to be kept solely in the real
physical primary memory instead of relying on virtual memory.
It was faster than running the same script on Cygwin on the same
machine. (The throughput of my shell script was higher in the emulated
system, but the emulated system's human interface, whose feel is
dominated by how quickly it starts to respond rather than by
throughput, suffered from sluggish latencies.) Conversely, on
WWW.Debian.org/intro/why_debian
it was claimed:
!-------------------------------------------------------------------!
!"If you are not already a GNU/Linux user, you may also enjoy the !
!following benefits: !
! !
![..] !
!Fast and easy on memory !
! Other operating systems may be as fast in one or two areas, but!
! being based on GNU/Linux, Debian is lean and mean. Windows !
! software run from GNU/Linux using an emulator sometimes runs !
! faster than when run in the native environment. !
![..]" !
!-------------------------------------------------------------------!
That Debian webpage lacks important information for a fair comparison
with Windows.
However, I do not know why Peter C. Chapin's Ada code is sped up but
OpenSSL is not.
Yours sincerely,
Colin Paul Gloster
* Re: Interesting performance quirk.
2008-10-29 16:12 ` Colin Paul Gloster
@ 2008-10-30 11:23 ` Peter C. Chapin
2008-10-31 13:41 ` Colin Paul Gloster
0 siblings, 1 reply; 28+ messages in thread
From: Peter C. Chapin @ 2008-10-30 11:23 UTC (permalink / raw)
Colin Paul Gloster wrote:
> Earlier this year I had used QEMU on Windows (possibly not Windows XP)
> to have a GNU/Linux distribution (possibly RedHat) emulated. I ran a
> Bourne shell script or a Bourne Again Shell script in the emulated
> system which made thousands of fairly short I/O transactions. The
> emulated system including its pretend harddisk were kept small enough
> (no more than a few hundred megabytes) to be kept solely in the real
> physical primary memory instead of relying on virtual memory.
>
> It was faster than running the same script on Cygwin on the same
> machine.
That's interesting. I think it's probably conventional wisdom that doing
I/O in a VM would be slower than outside the virtual machine. I'm sure
that's true in many cases, although the situation you described shows
that it's not always true.
My program doesn't do any I/O during its main loop. Also the memory
block I work over is only 1 MB long so I don't think paging would be an
issue (there is no disk activity when I run it). In some respects the
program is ideal for performance analysis in that there are relatively
few complicating factors involved. In fact, that was my intention when I
wrote it.
One complicating issue that remains is the behavior of the memory cache.
I wonder if one of the programs is missing the cache more than the
other. I'm not clear on why it would do that, however. The same hardware
is being used after all. Perhaps the Windows compiler has organized the
executable in some cache un-friendly way.
Peter
* Re: Interesting performance quirk.
2008-10-30 11:23 ` Peter C. Chapin
@ 2008-10-31 13:41 ` Colin Paul Gloster
2008-11-01 15:41 ` Gene
0 siblings, 1 reply; 28+ messages in thread
From: Colin Paul Gloster @ 2008-10-31 13:41 UTC (permalink / raw)
On Thu, 30 Oct 2008, Peter C. Chapin wrote:
|------------------------------------------------------------------------|
|"Colin Paul Gloster wrote: |
| |
|> Earlier this year I had used QEMU on Windows (possibly not Windows XP)|
|> to have a GNU/Linux distribution (possibly RedHat) emulated. I ran a |
|> Bourne shell script or a Bourne Again Shell script in the emulated |
|> system which made thousands of fairly short I/O transactions. The |
|> emulated system including its pretend harddisk were kept small enough |
|> (no more than a few hundred megabytes) to be kept solely in the real |
|> physical primary memory instead of relying on virtual memory. |
|> |
|> It was faster than running the same script on Cygwin on the same |
|> machine. |
| |
|That's interesting. I think it's probably conventional wisdom that doing|
|I/O in a VM would be slower than outside the virtual machine. I'm sure |
|that's true in many cases, although the situation you described shows |
|that it's not always true." |
|------------------------------------------------------------------------|
For clarity, I explain that the I/O of the virtual machine which I
referred to was merely I/O to its emulated filesystems, all of which
together plus the emulated memory were small enough to fit into the
genuine physical memory of the host operating system.
|------------------------------------------------------------------------|
|"My program [..] the memory |
|block I work over is only 1 MB long so I don't think paging would be an |
|issue (there is no disk activity when I run it). [..] |
| |
|[..]" |
|------------------------------------------------------------------------|
Your program's memory block's size might be of the order of one
megabyte, but I do not know whether the emulated filesystems which you
used were also small enough to fit into emulated memory. However, this
does not explain why one program you have tried has been sped up by
emulation whereas another has not been sped up.
* Re: Interesting performance quirk.
2008-10-31 13:41 ` Colin Paul Gloster
@ 2008-11-01 15:41 ` Gene
0 siblings, 0 replies; 28+ messages in thread
From: Gene @ 2008-11-01 15:41 UTC (permalink / raw)
On Oct 31, 9:41 am, Colin Paul Gloster <Colin_Paul_Glos...@ACM.org>
wrote:
> On Thu, 30 Oct 2008, Peter C. Chapin wrote:
>
> |------------------------------------------------------------------------|
> |"Colin Paul Gloster wrote: |
>
> | |
> |> Earlier this year I had used QEMU on Windows (possibly not Windows XP)|
> |> to have a GNU/Linux distribution (possibly RedHat) emulated. I ran a |
> |> Bourne shell script or a Bourne Again Shell script in the emulated |
> |> system which made thousands of fairly short I/O transactions. The |
> |> emulated system including its pretend harddisk were kept small enough |
> |> (no more than a few hundred megabytes) to be kept solely in the real |
> |> physical primary memory instead of relying on virtual memory. |
> |> |
> |> It was faster than running the same script on Cygwin on the same |
> |> machine. |
> | |
> |That's interesting. I think it's probably conventional wisdom that doing|
> |I/O in a VM would be slower than outside the virtual machine. I'm sure |
> |that's true in many cases, although the situation you described shows |
> |that it's not always true." |
> |------------------------------------------------------------------------|
>
> For clarity, I explain that the I/O of the virtual machine which I
> referred to was merely I/O to its emulated filesystems, all of which
> together plus the emulated memory were small enough to fit into the
> genuine physical memory of the host operating system.
>
> |------------------------------------------------------------------------|
> |"My program [..] the memory |
> |block I work over is only 1 MB long so I don't think paging would be an |
> |issue (there is no disk activity when I run it). [..] |
> | |
> |[..]" |
> |------------------------------------------------------------------------|
>
> Your program's memory block's size might be of the order of one
> megabyte, but I do not know whether the emulated filesystems which you
> used were also small enough to fit into emulated memory. However, this
> does not explain why one program you have tried has been sped up by
> emulation whereas another has not been sped up.
Exactly right. The most obvious explanation is that system-dependent
code or build conventions have led to some important difference in the
run-time support. Detailed profiling is probably the only way to
figure out where. FWIW, I remember a similar situation that finally
turned out to be explained by compilation with the Windows
multithreaded debugging libraries. When we switched to production,
single-threaded libraries, the differences vanished or went in favor
of Windows.