Linear Search

comp.lang.ada
 help / color / mirror / Atom feed

* Linear Search
@ 2016-09-28 16:41 Jeffrey R. Carter
  2016-09-28 17:23 ` Niklas Holsti
  2016-09-29  4:39 ` Paul Rubin
  0 siblings, 2 replies; 10+ messages in thread
From: Jeffrey R. Carter @ 2016-09-28 16:41 UTC (permalink / raw)


This is not really Ada related, but on an Intel Atom x5-Z8500 computer running
Linux Mint 18, I can find the maximum of 70 million random Unsigned_32s
(generated using the Threefry RNG), using a sequential linear search, in about
200 ms. This is fast enough that it appears instantaneous to a human.

Since the processor has 4 cores, a parallelization of the search using 8 tasks
should be able to search 500 million values in about the same time (if they fit
in memory).

I can't imagine any real application in which I would keep that many values in
memory. Usually they'd be in some sort of DB that would do the search for me. So
now I'm wondering if there's really any need for O(log N)-search-time structures.

-- 
Jeff Carter
"Death awaits you all, with nasty, big, pointy teeth!"
Monty Python & the Holy Grail
20


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Linear Search
  2016-09-28 16:41 Linear Search Jeffrey R. Carter
@ 2016-09-28 17:23 ` Niklas Holsti
  2016-09-28 18:24   ` Jeffrey R. Carter
  2016-09-28 20:21   ` Randy Brukardt
  2016-09-29  4:39 ` Paul Rubin
  1 sibling, 2 replies; 10+ messages in thread
From: Niklas Holsti @ 2016-09-28 17:23 UTC (permalink / raw)


On 16-09-28 18:41 , Jeffrey R. Carter wrote:
> This is not really Ada related, but on an Intel Atom x5-Z8500 computer running
> Linux Mint 18, I can find the maximum of 70 million random Unsigned_32s
> (generated using the Threefry RNG), using a sequential linear search, in about
> 200 ms. This is fast enough that it appears instantaneous to a human.
>
> Since the processor has 4 cores, a parallelization of the search using 8 tasks
> should be able to search 500 million values in about the same time (if they fit
> in memory).
>
> I can't imagine any real application in which I would keep that many values in
> memory. Usually they'd be in some sort of DB that would do the search for me. So
> now I'm wondering if there's really any need for O(log N)-search-time structures.

Sure there is: when you have to do the search, say, 1000 times per 
second... because you are servicing 1000 net queries per second...

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Linear Search
  2016-09-28 17:23 ` Niklas Holsti
@ 2016-09-28 18:24   ` Jeffrey R. Carter
  2016-09-28 21:03     ` Niklas Holsti
  2016-09-28 20:21   ` Randy Brukardt
  1 sibling, 1 reply; 10+ messages in thread
From: Jeffrey R. Carter @ 2016-09-28 18:24 UTC (permalink / raw)


On 09/28/2016 10:23 AM, Niklas Holsti wrote:
> 
> Sure there is: when you have to do the search, say, 1000 times per second...
> because you are servicing 1000 net queries per second...

But how many values are you keeping in memory? If you keep 100k values in
memory, which still seems like a lot to me, then a linear search takes about 0.3
ms, and you could easily do 1000 of those per second.

-- 
Jeff Carter
"Brave Sir Robin ran away."
Monty Python and the Holy Grail
59


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Linear Search
  2016-09-28 17:23 ` Niklas Holsti
  2016-09-28 18:24   ` Jeffrey R. Carter
@ 2016-09-28 20:21   ` Randy Brukardt
  2016-09-28 23:08     ` Jeffrey R. Carter
  1 sibling, 1 reply; 10+ messages in thread
From: Randy Brukardt @ 2016-09-28 20:21 UTC (permalink / raw)


"Niklas Holsti" <niklas.holsti@tidorum.invalid> wrote in message 
news:e52ck2Fos4U1@mid.individual.net...
> On 16-09-28 18:41 , Jeffrey R. Carter wrote:
>> This is not really Ada related, but on an Intel Atom x5-Z8500 computer 
>> running
>> Linux Mint 18, I can find the maximum of 70 million random Unsigned_32s
>> (generated using the Threefry RNG), using a sequential linear search, in 
>> about
>> 200 ms. This is fast enough that it appears instantaneous to a human.
>>
>> Since the processor has 4 cores, a parallelization of the search using 8 
>> tasks
>> should be able to search 500 million values in about the same time (if 
>> they fit
>> in memory).
>>
>> I can't imagine any real application in which I would keep that many 
>> values in
>> memory. Usually they'd be in some sort of DB that would do the search for 
>> me. So
>> now I'm wondering if there's really any need for O(log N)-search-time 
>> structures.
>
> Sure there is: when you have to do the search, say, 1000 times per 
> second... because you are servicing 1000 net queries per second...

Or, say compiling Ada and the search is in generated parse tables. (I was 
able to speed up the parsing speed of our compiler by an order of magitude 
by replacing the linear search for the next symbol by a tuned search.)

                                      Randy.



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Linear Search
  2016-09-28 18:24   ` Jeffrey R. Carter
@ 2016-09-28 21:03     ` Niklas Holsti
  2016-09-28 22:20       ` Jeffrey R. Carter
  0 siblings, 1 reply; 10+ messages in thread
From: Niklas Holsti @ 2016-09-28 21:03 UTC (permalink / raw)


On 16-09-28 21:24 , Jeffrey R. Carter wrote:
> On 09/28/2016 10:23 AM, Niklas Holsti wrote:
>>
>> Sure there is: when you have to do the search, say, 1000 times per second...
>> because you are servicing 1000 net queries per second...
>
> But how many values are you keeping in memory?

All that will fit, if you aim at maximum performance. In-RAM databases 
are very common nowadays...

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Linear Search
  2016-09-28 21:03     ` Niklas Holsti
@ 2016-09-28 22:20       ` Jeffrey R. Carter
  0 siblings, 0 replies; 10+ messages in thread
From: Jeffrey R. Carter @ 2016-09-28 22:20 UTC (permalink / raw)


On 09/28/2016 02:03 PM, Niklas Holsti wrote:
> 
> All that will fit, if you aim at maximum performance. In-RAM databases are very
> common nowadays...

That could be many (AKA a number > 70M).

-- 
Jeff Carter
"Brave Sir Robin ran away."
Monty Python and the Holy Grail
59

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Linear Search
  2016-09-28 20:21   ` Randy Brukardt
@ 2016-09-28 23:08     ` Jeffrey R. Carter
  2016-09-29  5:40       ` Randy Brukardt
  0 siblings, 1 reply; 10+ messages in thread
From: Jeffrey R. Carter @ 2016-09-28 23:08 UTC (permalink / raw)

On 09/28/2016 01:21 PM, Randy Brukardt wrote:
> 
> Or, say compiling Ada and the search is in generated parse tables. (I was 
> able to speed up the parsing speed of our compiler by an order of magitude 
> by replacing the linear search for the next symbol by a tuned search.)

Sure, I've done that, too. And when I did it, the time difference made the
program significantly more responsive to users. But I was never searching
anything close to 7M items. If I had written the same program today, with the
same linear search, it seems likely that it would have been fast enough, and I'd
never have needed to change it.

-- 
Jeff Carter
"Perfidious English mouse-dropping hoarders."
Monty Python & the Holy Grail
10

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Linear Search
  2016-09-28 16:41 Linear Search Jeffrey R. Carter
  2016-09-28 17:23 ` Niklas Holsti
@ 2016-09-29  4:39 ` Paul Rubin
  2016-09-29  4:47   ` Jeffrey R. Carter
  1 sibling, 1 reply; 10+ messages in thread
From: Paul Rubin @ 2016-09-29  4:39 UTC (permalink / raw)


"Jeffrey R. Carter" <spam.jrcarter.not@spam.not.acm.org> writes:
> I can't imagine any real application in which I would keep that many
> values in memory. Usually they'd be in some sort of DB that would do
> the search for me. 

What if your program is a DB?  Its purpose is to do big searches...

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Linear Search
  2016-09-29  4:39 ` Paul Rubin
@ 2016-09-29  4:47   ` Jeffrey R. Carter
  0 siblings, 0 replies; 10+ messages in thread
From: Jeffrey R. Carter @ 2016-09-29  4:47 UTC (permalink / raw)


On 09/28/2016 09:39 PM, Paul Rubin wrote:
> 
> What if your program is a DB?  Its purpose is to do big searches...

The one's I've used don't keep everything in memory.

-- 
Jeff Carter
"Gentlemen, you can't fight in here. This is the War Room!"
Dr. Strangelove
30

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Linear Search
  2016-09-28 23:08     ` Jeffrey R. Carter
@ 2016-09-29  5:40       ` Randy Brukardt
  0 siblings, 0 replies; 10+ messages in thread
From: Randy Brukardt @ 2016-09-29  5:40 UTC (permalink / raw)

"Jeffrey R. Carter" <spam.jrcarter.not@spam.not.acm.org> wrote in message 
news:nshih3$lk$1@dont-email.me...
> On 09/28/2016 01:21 PM, Randy Brukardt wrote:
>>
>> Or, say compiling Ada and the search is in generated parse tables. (I was
>> able to speed up the parsing speed of our compiler by an order of 
>> magitude
>> by replacing the linear search for the next symbol by a tuned search.)
>
> Sure, I've done that, too. And when I did it, the time difference made the
> program significantly more responsive to users. But I was never searching
> anything close to 7M items. If I had written the same program today, with 
> the
> same linear search, it seems likely that it would have been fast enough, 
> and I'd
> never have needed to change it.

The problem with that logic is that people seem to expect to be able to 
compile ever-bigger programs with their much faster machines. I can't take 
all of that performance improvement for the compiler itself. (Indeed, we 
have a formal policy to take no more than the square root of the improvement 
for the optimizer, which is handled by limiting the maximum chunk that can 
be optimized.) And they also want the compiles to get done as fast as 
possible; million line programs aren't that uncommon. Sloppy algorithms can 
matter a lot. (Of course, the key is to only improve the ones that really 
matter, and not the other 90% that have no effect. But that 10% can be a big 
deal.)

                                     Randy.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2016-09-29  5:40 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-28 16:41 Linear Search Jeffrey R. Carter
2016-09-28 17:23 ` Niklas Holsti
2016-09-28 18:24   ` Jeffrey R. Carter
2016-09-28 21:03     ` Niklas Holsti
2016-09-28 22:20       ` Jeffrey R. Carter
2016-09-28 20:21   ` Randy Brukardt
2016-09-28 23:08     ` Jeffrey R. Carter
2016-09-29  5:40       ` Randy Brukardt
2016-09-29  4:39 ` Paul Rubin
2016-09-29  4:47   ` Jeffrey R. Carter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox