comp.lang.ada
* Possible heap problem on Windows, help sought
From: Niklas Holsti @ 2006-11-29 11:43 UTC


Hi all,

I'm using Gnat 3.15p (why change if it works ;-) on both MS Windows XP 
and Linux (Debian) and having some problems on the Windows platform that 
I suspect may be due to lack of heap space in my application. I'm not 
experienced with Windows programming so I may be omitting something obvious.

My compiled Ada application runs fine on Linux both for large and small 
jobs. On Windows it works for small jobs but stops in the middle for 
larger ones, with no error or exception message. Here "large" means 
about 250 - 300 MB of heap. I have some catch-all exception handlers at 
various levels of the software that should report exceptions using 
Text_IO, but nothing appears.

The funny thing is that if I run the application under gdb on Windows it 
works for large jobs too.

I've tried to link the application with larger heap space using the 
advice in the GNAT User Guide, for example

    gnatlink -g -v appname -Wl,--heap=0x18000000,--stack=0x1300000

but this did not help; the symptoms are the same. I'd be grateful for 
any help.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .



* Re: Possible heap problem on Windows, help sought
From: Duncan Sands @ 2006-11-29 11:54 UTC
  To: comp.lang.ada; +Cc: Niklas Holsti

Are you using tasking?

D.



* Re: Possible heap problem on Windows, help sought
From: Niklas Holsti @ 2006-11-29 13:42 UTC


Duncan Sands wrote:
> Are you using tasking?
> 
> D.

No, there are no tasks (except the predefined "environment task"), and 
it's a vanilla "console" application, not a GUI one. At the point where 
it stops, the application has not had any special interaction with the 
O/S (Windows in this case), but later on it starts a couple of child 
processes with text pipes (in Unix-speak) in and out. The child 
processes work in small jobs and, as I said, in large jobs the program 
stops before it comes to create the children.

I'm not excluding a programming error such as some uninitialized 
variable, but I'd be happy to exclude a heap-size problem first. Maybe I 
should write a small test program to see how much heap I can allocate 
with my usual compilation and linking options.

I seem to remember a discussion on comp.lang.ada, perhaps a year ago, 
about heap size with Gnat in Windows, but a google search did not turn 
it up.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .



* Re: Possible heap problem on Windows, help sought
From: Niklas Holsti @ 2006-11-29 14:06 UTC


Niklas Holsti (that's me) wrote:

> I'm using Gnat 3.15p (why change if it works ;-) on both MS Windows XP 
> and Linux (Debian) and having some problems on the Windows platform
...
> My compiled Ada application runs fine on Linux both for large and small 
> jobs. On Windows it works for small jobs but stops in the middle for 
> larger ones, with no error or exception message. Here "large" means 
> about 250 - 300 MB of heap. I have some catch-all exception handlers at 
> various levels of the software that should report exceptions using 
> Text_IO, but nothing appears.
> 
> The funny thing is that if I run the application under gdb on Windows it 
> works for large jobs too.

Well I made a small test program using my ordinary compilation options 
(-g -O2 -gnato -fstack-check) and it happily allocated and used 300 MB 
of heap, as I asked it to, so it seems that this is not a heap problem. 
Sorry for a mistaken hypothesis, I should have checked it before posting 
my question. But the problem remains and any advice will be gratefully 
received, although I now suspect that the cause may be a programming 
error, a Heisenbug that disappears in the debugger and does not manifest 
on Linux for some reason.
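
A heap test of this kind could be sketched roughly as follows. This is only an illustration, not the program that was actually run: the names Heap_Test, Block_Size and Target_MB and the 1 MB block size are assumptions chosen for the example. It allocates the target amount in blocks, writes to every byte so the memory is really used, and reports how far it got if Storage_Error is raised.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Heap_Test is

   --  Assumed sizes for illustration: 300 blocks of 1 MB each.
   Block_Size : constant := 1024 * 1024;
   Target_MB  : constant := 300;

   type Block is array (1 .. Block_Size) of Character;
   type Block_Ref is access Block;

   Blocks    : array (1 .. Target_MB) of Block_Ref;
   Allocated : Natural := 0;   --  number of MB allocated and written so far

begin
   for I in Blocks'Range loop
      Blocks (I) := new Block;

      --  Write to every element so that the memory is really used,
      --  not merely reserved.
      for J in Blocks (I)'Range loop
         Blocks (I) (J) := 'x';
      end loop;

      Allocated := Allocated + 1;
   end loop;

   Put_Line ("Allocated and filled" & Natural'Image (Allocated) & " MB");

exception
   when Storage_Error =>
      Put_Line ("Storage_Error after" & Natural'Image (Allocated) & " MB");
end Heap_Test;
```

Building it with the same options as the real application (here -g -O2 -gnato -fstack-check) keeps the comparison fair.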

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .



* Re: Possible heap problem on Windows, help sought
From: Duncan Sands @ 2006-11-29 14:20 UTC
  To: comp.lang.ada; +Cc: Niklas Holsti

> Well I made a small test program using my ordinary compilation options 
> (-g -O2 -gnato -fstack-check) and it happily allocated and used 300 MB 
> of heap, as I asked it to, so it seems that this is not a heap problem. 
> Sorry for a mistaken hypothesis, I should have checked it before posting 
> my question. But the problem remains and any advice will be gratefully 
> received, although I now suspect that the cause may be a programming 
> error, a Heisenbug that disappears in the debugger and does not manifest 
> on Linux for some reason.

It could well be memory corruption or the use of an uninitialized variable.
These both have a tendency to manifest themselves differently in the debugger.
Try running your program under valgrind.

Ciao,

Duncan.



* Re: Possible heap problem on Windows, help sought
From: Alex R. Mosteo @ 2006-11-29 15:04 UTC


Duncan Sands wrote:

>> Well I made a small test program using my ordinary compilation options
>> (-g -O2 -gnato -fstack-check) and it happily allocated and used 300 MB
>> of heap, as I asked it to, so it seems that this is not a heap problem.
>> Sorry for a mistaken hypothesis, I should have checked it before posting
>> my question. But the problem remains and any advice will be gratefully
>> received, although I now suspect that the cause may be a programming
>> error, a Heisenbug that disappears in the debugger and does not manifest
>> on Linux for some reason.
> 
> It could well be memory corruption or the use of an uninitialized
> variable. These both have a tendency to manifest themselves differently in
> the debugger. Try running your program under valgrind.

Also check Gnat.Debug_Pools (Is it already in Gnat 3.15p?)

With the default stack size, a Windows program will lock up when creating
roughly its 250th task, but you said you didn't use tasks.



* Re: Possible heap problem on Windows, help sought
From: Niklas Holsti @ 2006-11-29 16:40 UTC


Alex R. Mosteo wrote:
> Duncan Sands wrote:
>>...
>>It could well be memory corruption or the use of an uninitialized
>>variable. These both have a tendency to manifest themselves differently in
>>the debugger. Try running your program under valgrind.
> 
> 
> Also check Gnat.Debug_Pools (Is it already in Gnat 3.15p?)

Yes, Gnat.Debug_Pools seems to be in 3.15p (at least the .ads/.adb files 
are in adainclude). I'll try that, or valgrind, also -gnatVa. (It would 
be nice to have a compiler option telling GNAT to use Debug_Pool for all 
access types by default, but I couldn't find one.)
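
For a single access type, attaching a debug pool might look like the sketch below. It follows the pattern shown in later GNAT documentation; the exact GNAT.Debug_Pools interface in 3.15p may differ (Print_Info_Stdout in particular is assumed to be available), and the names Debug_Pool_Sketch, Int_Ref and Pool are made up for the example.

```ada
with Ada.Text_IO;                use Ada.Text_IO;
with Ada.Exceptions;             use Ada.Exceptions;
with Ada.Unchecked_Deallocation;
with GNAT.Debug_Pools;

procedure Debug_Pool_Sketch is

   type Int_Ref is access Integer;

   --  All allocations and deallocations for Int_Ref go through Pool,
   --  and dereferences are checked against it.
   Pool : GNAT.Debug_Pools.Debug_Pool;
   for Int_Ref'Storage_Pool use Pool;

   procedure Free is new Ada.Unchecked_Deallocation (Integer, Int_Ref);

   A, B : Int_Ref;

begin
   A := new Integer'(42);
   B := A;      --  second reference to the same object
   Free (A);    --  B is now a dangling pointer

   begin
      --  Use after free: the debug pool is expected to detect this,
      --  either by raising an exception or by logging a message,
      --  depending on how the pool is configured.
      Put_Line (Integer'Image (B.all));
   exception
      when E : others =>
         Put_Line ("raised: " & Exception_Name (E));
   end;

   --  Summary of allocations and deallocations seen by the pool.
   GNAT.Debug_Pools.Print_Info_Stdout (Pool);
end Debug_Pool_Sketch;
```

The Storage_Pool clause has to be repeated for every access type to be checked, which is why a compiler-wide default would be convenient.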

I'm grateful for your advice. Thanks to the Ada run-time checks it has 
been a while since I had to fight with this kind of bug, but I hope it 
will come back to me -- and then go away again :-)

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .



* Re: Possible heap problem on Windows, help sought
From: Ludovic Brenta @ 2006-11-29 17:23 UTC


Niklas Holsti writes:
> I'm grateful for your advice. Thanks to the Ada run-time checks it
> has been a while since I had to fight with this kind of bug, but I
> hope it will come back to me -- and then go away again :-)

Same here; for the past couple of months, I've been trying, on and
off, to fight a nasty heap corruption bug in GPS 3.1.3, 4.0.0 and now
4.1.1.  I've documented the various symptoms in Debian bugs #393636,
#400876 and #400883.  I realised that my skills are insufficient for
such a large-scale undertaking; Valgrind detected more than 10 million
errors during a short run of GPS, most of which in Python or GTK+
libraries, and not all of which are necessarily serious.  Any advice?

Today I replaced one instance of Unchecked_Deallocation with a no-op
and behold, GPS no longer crashes.  Obviously, it leaks, but that's
not as bad as crashing or hanging.  Because there was nothing
obviously wrong with the Unchecked_Deallocation that I removed, I
suspect there is still some heap corruption going on unnoticed.  See
my latest comment on #400876.

I'm thinking maybe I should instrument access types to use debug
pools, but there are too many of them.

That's obviously not on Windows; sorry for hijacking your thread,
Niklas :)

-- 
Ludovic Brenta.



* Re: Possible heap problem on Windows, help sought
From: Niklas Holsti @ 2006-11-29 17:53 UTC


Ludovic Brenta wrote:
> Niklas Holsti writes:
> 
>>I'm grateful for your advice. Thanks to the Ada run-time checks it
>>has been a while since I had to fight with this kind of bug, but I
>>hope it will come back to me -- and then go away again :-)
> 
> 
> Same here; for the past couple of months, I've been trying, on and
> off, to fight a nasty heap corruption bug in GPS 3.1.3, 4.0.0 and now
> 4.1.1.  I've documented the various symptoms in Debian bugs #393636,
> #400876 and #400883.  I realised that my skills are insufficient for
> such a large-scale undertaking; Valgrind detected more than 10 million
> errors during a short run of GPS, most of which in Python or GTK+
> libraries, and not all of which are necessarily serious.  Any advice?

Ouch. Makes me feel comforted, in a way, but sorry, no advice yet.

> Today I replaced one instance of Unchecked_Deallocation with a no-op
> and behold, GPS no longer crashes.  Obviously, it leaks, but that's
> not as bad as crashing or hanging.

My application is a batch computation, not interactive, so I 
deliberately avoid deallocation and cheerfully let all low-turnover 
allocations leak. (I know this will bite me if/when I try to build a GUI 
for this application.) So I have only a small number of access types 
where dangling pointers may occur (I think).

> I'm thinking maybe I should instrument access types to use debug
> pools, but there are too many of them.

What do you think of a GNAT option to make debug-pools the default? 
Would it make sense?

> That's obviously not on Windows; sorry for hijacking your thread,
> Niklas :)

Feel free, it seems my problem is not Windows-specific, either.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .



* Re: Possible heap problem on Windows, help sought
From: Ludovic Brenta @ 2006-11-29 18:00 UTC


Niklas Holsti <niklas.holsti@nospam.please> writes:
> What do you think of a GNAT option to make debug-pools the default?
> Would it make sense?

Yes, definitely, and especially if it would synthesise a different
pool for each access type.  But I'm not aware of such an option.

>> That's obviously not on Windows; sorry for hijacking your thread,
>> Niklas :)
>
> Feel free, it seems my problem is not Windows-specific, either.

Thanks :)

-- 
Ludovic Brenta.



* Re: Possible heap problem on Windows, help sought
From: Niklas Holsti @ 2006-11-30 20:48 UTC


Björn Lundin wrote:
> 
> On 29 Nov 2006, at 12:43, Niklas Holsti wrote:
> 
>> Hi all,
>>
>> I'm using Gnat 3.15p (why change if it works ;-)
> 
> 
> Seems like it does not work?

Touché. But I do need the GMGPL at present. I'm considering a transition 
to GPL but it's not my choice alone.

> You may want to try a newer compiler. I've experienced silent deaths
> with 3.15p, more often when the program grows in complexity.
> 
> With the 5-series, I do not get that anymore.

Very interesting. Were the silent deaths on Windows, Linux or both?

> Test-run it with the GPL'd 2005 compiler, just to rule out compiler
> problems that are solved in later releases.

Also good advice, thanks. I probably should test-compile my application 
with the newest Gnat GPL anyway, to check for possible illegalities that 
3.15p may not have detected -- I understand that there may be such.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .



* Re: Possible heap problem on Windows, help sought
From: Randy Brukardt @ 2006-12-01  1:05 UTC


"Niklas Holsti" <niklas.holsti@nospam.please> wrote in message
news:456dc966$0$31542$39db0f71@news.song.fi...
> Ludovic Brenta wrote:
...
> > Same here; for the past couple of months, I've been trying, on and
> > off, to fight a nasty heap corruption bug in GPS 3.1.3, 4.0.0 and now
> > 4.1.1.  I've documented the various symptoms in Debian bugs #393636,
> > #400876 and #400883.  I realised that my skills are insufficient for
> > such a large-scale undertaking; Valgrind detected more than 10 million
> > errors during a short run of GPS, most of which in Python or GTK+
> > libraries, and not all of which are necessarily serious.  Any advice?
>
> Ouch. Makes me feel comforted, in a way, but sorry, no advice yet.

I fought a problem like that in our spam filter for several months. I
introduced debug pools on virtually every access type, and only managed to
slow the filter down a lot. I'd pretty much given up on trying to find the
problem when I ran the program under a Windows debugger, and while tracing
it I got some bizarre messages about heap problems. I later found out that
Windows automatically uses a debug heap when a program is run under the
debugger. Anyway, tracing the messages led to a Free call in a DNS binding.
The Ada code for the allocator and the deallocator was similar, but the
allocator used **DNS_Rec and the deallocator used *DNS_Rec. So I was passing
a pointer into the stack to the deallocator; no wonder things got
confused. Oddly, the binding in question had been in production use for more
than 3 years; apparently, it was never used in a program that ran long
enough to show the corruption.

Moral: memory corruption bugs can be anywhere, even in code that's
supposedly well-tested. And a likely culprit is bindings to non-Ada code,
because the checking that Ada provides is lost there (the compiler can't
detect mistakes).

                             Randy.




