BTW, binary_search 
Author Message
 BTW, binary_search

if  binary_search(..) would return iterator instead of bool, it would be
possible to write code like this:
// in case we fill some container in one step and then use it many times

typedef C *iterator;
C        arr[N];
iterator begin= &arr[0];
iterator end= &arr[N];
    fill(arr, N);
    sort(begin, end);
// it's several times faster then filling set<C>

.....

iterator it= binary_search(begin,end,key);
    if( it!=end )
        use(*it);
// it takes approximately the same time as set<C>::find(key)

it's easy to solve this problem by writing code like this

template<class T, class I> inline I bin_search(I begin, I end, const T &k)
{ I     it= lower_bound(begin, end, k);
    return  (it==end || less<T>()(k,*it)) ? end : it;

Quote:
}

but why should I do this?


Sun, 16 May 2004 03:40:08 GMT  
 BTW, binary_search
Which of potentially many equivalent elements should binary_search return an
iterator to? The answer is unambiguous for lower_bound, upper_bound and
equal_range.
--
With best wishes,
    Igor Tandetnik

"For every complex problem, there is a solution that is simple, neat, and
wrong." H.L. Mencken


Quote:
> if  binary_search(..) would return iterator instead of bool, it would be
> possible to write code like this:
> // in case we fill some container in one step and then use it many times

> typedef C *iterator;
> C        arr[N];
> iterator begin= &arr[0];
> iterator end= &arr[N];
>     fill(arr, N);
>     sort(begin, end);
> // it's several times faster then filling set<C>

> .....

> iterator it= binary_search(begin,end,key);
>     if( it!=end )
>         use(*it);
> // it takes approximately the same time as set<C>::find(key)

> it's easy to solve this problem by writing code like this

> template<class T, class I> inline I bin_search(I begin, I end, const T &k)
> { I     it= lower_bound(begin, end, k);
>     return  (it==end || less<T>()(k,*it)) ? end : it;
> }

> but why should I do this?



Sun, 16 May 2004 03:49:04 GMT  
 BTW, binary_search
obviously the first one, like multiset<>::find(..) does.


Quote:
> Which of potentially many equivalent elements should binary_search return
an
> iterator to? The answer is unambiguous for lower_bound, upper_bound and
> equal_range.
> --
> With best wishes,
>     Igor Tandetnik

> "For every complex problem, there is a solution that is simple, neat, and
> wrong." H.L. Mencken



Sun, 16 May 2004 04:04:09 GMT  
 BTW, binary_search
And it would be also fine to have smth like this

template<class T,class A> map<T,A>::iterator
binary_search(map<T,A>::iterator beg, map<T,A>::iterator en, const T &key)
{ somehow map::iterator      it= TheMap.find(key);  // implementation
specific

    return (it<en && !(it<beg)) ? it : en;

Quote:
}



Quote:
> obviously the first one, like multiset<>::find(..) does.



> > Which of potentially many equivalent elements should binary_search
return
> an
> > iterator to? The answer is unambiguous for lower_bound, upper_bound and
> > equal_range.
> > --
> > With best wishes,
> >     Igor Tandetnik

> > "For every complex problem, there is a solution that is simple, neat,
and
> > wrong." H.L. Mencken



Sun, 16 May 2004 04:25:04 GMT  
 BTW, binary_search
There is no requirement that multimap::find return an iterator to the first
element. It returns an iterator to arbitrary element with a key equivalent
to the given key, if any.

It probably does make sense to have a generic algorithm mimicking this
functionality. There isn't any, but equal_range comes closest I think. Not
sure what the rationale is behind current definition of binary_search.
--
With best wishes,
    Igor Tandetnik

"For every complex problem, there is a solution that is simple, neat, and
wrong." H.L. Mencken


Quote:
> obviously the first one, like multiset<>::find(..) does.



> > Which of potentially many equivalent elements should binary_search
return
> an
> > iterator to? The answer is unambiguous for lower_bound, upper_bound and
> > equal_range.
> > --
> > With best wishes,
> >     Igor Tandetnik

> > "For every complex problem, there is a solution that is simple, neat,
and
> > wrong." H.L. Mencken



Sun, 16 May 2004 04:26:14 GMT  
 BTW, binary_search


Quote:
> Which of potentially many equivalent elements should binary_search return
an
> iterator to?

I would say any. If this was sorted vector where comparisons are expensive,
any  element that compares as equivalent will do. This makes it like C's
bsearch().
There could be many possibilites where it is useful to have an iterator
rather than true/false returned.

In contrast lower_bound() has to continue with comparisons until the element
found partitions the range searched.

Stephen Howe



Sun, 16 May 2004 04:42:32 GMT  
 BTW, binary_search
BTW, both STL implementations (SGI and MS) of multiset<>::find(..) return
"the earliest element in the controlled sequence whose sort key equals key".
As for equal_range: "function effectively returns pair( lower_bound(first,
last, val), upper_bound(first, last, val)), "...  Do we need the
upper_bound? Not necessarily. At the same time it's slower...

Quote:

> There is no requirement that multimap::find return an iterator to the
first
> element. It returns an iterator to arbitrary element with a key equivalent
> to the given key, if any.

> It probably does make sense to have a generic algorithm mimicking this
> functionality. There isn't any, but equal_range comes closest I think. Not
> sure what the rationale is behind current definition of binary_search.
> --
> With best wishes,
>     Igor Tandetnik



Sun, 16 May 2004 05:08:13 GMT  
 BTW, binary_search
BTW, bsearch probably makes 2 comparisons on each step
    x<key,  key<x
lower_bound does 1 comparison
    x<key
Do you realy belive that bsearch is faster? Especially when comparisons are
expensive? You can run some tests, lower_bound is faster.


Quote:



> > Which of potentially many equivalent elements should binary_search
return
> an
> > iterator to?

> I would say any. If this was sorted vector where comparisons are
expensive,
> any  element that compares as equivalent will do. This makes it like C's
> bsearch().
> There could be many possibilites where it is useful to have an iterator
> rather than true/false returned.

> In contrast lower_bound() has to continue with comparisons until the
element
> found partitions the range searched.

> Stephen Howe



Sun, 16 May 2004 05:21:03 GMT  
 BTW, binary_search
All the standard says is (23.1.2 Table 69):

a.find(k)
returns an iterator pointing to an element with the key equivalent to k, or
a.end() if such an element is not found.

I have already agreed with you that a generic algorithm with similar
behavior would be desirable.
--
With best wishes,
    Igor Tandetnik

"For every complex problem, there is a solution that is simple, neat, and
wrong." H.L. Mencken


Quote:
> BTW, both STL implementations (SGI and MS) of multiset<>::find(..) return
> "the earliest element in the controlled sequence whose sort key equals
key".
> As for equal_range: "function effectively returns pair( lower_bound(first,
> last, val), upper_bound(first, last, val)), "...  Do we need the
> upper_bound? Not necessarily. At the same time it's slower...


> > There is no requirement that multimap::find return an iterator to the
> first
> > element. It returns an iterator to arbitrary element with a key
equivalent
> > to the given key, if any.

> > It probably does make sense to have a generic algorithm mimicking this
> > functionality. There isn't any, but equal_range comes closest I think.
Not
> > sure what the rationale is behind current definition of binary_search.
> > --
> > With best wishes,
> >     Igor Tandetnik



Sun, 16 May 2004 05:22:27 GMT  
 BTW, binary_search

Quote:



> > Which of potentially many equivalent elements should binary_search return
> an
> > iterator to? The answer is unambiguous for lower_bound, upper_bound and
> > equal_range.

> obviously the first one, like multiset<>::find(..) does.

So binary_search should be O(n) for all equal elements instead of
O(logn)?

--
Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)



Sun, 16 May 2004 05:30:15 GMT  
 BTW, binary_search

Quote:
> BTW, bsearch probably makes 2 comparisons on each step
>     x<key,  key<x

No. Not necessarily true. The compare function should check to see if elem1
is < , == or > than elem2. 1 comparison may be enough, 2 at most are
necessary. Each call of the compare function returns enough information to
partition the array into 3.

Quote:
> lower_bound does 1 comparison
>     x<key
> Do you realy belive that bsearch is faster?

In general, no I don't, but for a different reason. bsearch() _always_ has
to call an out of line function pointer. lower_bound() may inline the
comparison.

But there will be circumstances where bsearch() will be faster. For example,
if you had an 1024 elements in an array and they were the same element.
bsearch() will only ever do one call of the compare function() and whatever
element it picks it will return. lower_bound() will have to do at least 10
comparisons just to work out the lower boundary. But this is loading the
dice :-)

Quote:
> Especially when comparisons are
> expensive? You can run some tests, lower_bound is faster.

I did those tests months ago. But what you said is only the general case, i
can cook up specific instances of data where bsearch() _IS_ faster.

Stephen Howe



Sun, 16 May 2004 11:21:26 GMT  
 BTW, binary_search
Not necessarily. It will be O(longN)+1 in any case, if imlemented with
lower_bownd<>, as I wrote before

template<class T, class I> inline I bin_search(I begin, I end, const T &k)
{ I     it= lower_bound(begin, end, k);
    return  (it==end || less<T>()(k,*it)) ? end : it;

Quote:
}

> > > Which of potentially many equivalent elements should binary_search
return
> > an
> > > iterator to? The answer is unambiguous for lower_bound, upper_bound
and
> > > equal_range.

> > obviously the first one, like multiset<>::find(..) does.

> So binary_search should be O(n) for all equal elements instead of
> O(logn)?

> --
> Pete Becker
> Dinkumware, Ltd. (http://www.dinkumware.com)



Sun, 16 May 2004 22:28:32 GMT  
 BTW, binary_search
OK, you are right. bsearch is faster in some cases. For example in case of
array of long strings which differ in last characters. BTW it's possible to
write inline version of bsearch, smth like this

template <class I, class T> I bsearch(I first, I last, T val)
{
  int     len = last-first;
  int     half;
  I        middle;

 while( len > 0)
 {int    res;
    half = len >> 1;
    middle = first+half;
    if( (res=compare(*middle,val)<0 )
   {
      first= middle;
      ++first;
      len = len-half-1;
    }else if( res>0 )
          len = half;
    }else
    {   first= middle;  // in case of several equal array elements result is
unpredictable
        break;
    }
  }
  return  len>0 ? first : last;

Quote:
}

I didn't test it, just wanted to show the idea. BTW it will be 25% faster at
best (I mean faster than binary_search). That's because of pyramid nature,
1/2 of elements are at the bottom line, than means in 50% we will have the
same number of comparisons as in binary_search.


Quote:
> > BTW, bsearch probably makes 2 comparisons on each step
> >     x<key,  key<x

> No. Not necessarily true. The compare function should check to see if
elem1
> is < , == or > than elem2. 1 comparison may be enough, 2 at most are
> necessary. Each call of the compare function returns enough information to
> partition the array into 3.

> > lower_bound does 1 comparison
> >     x<key

> > Do you realy belive that bsearch is faster?

> In general, no I don't, but for a different reason. bsearch() _always_ has
> to call an out of line function pointer. lower_bound() may inline the
> comparison.

> But there will be circumstances where bsearch() will be faster. For
example,
> if you had an 1024 elements in an array and they were the same element.
> bsearch() will only ever do one call of the compare function() and
whatever
> element it picks it will return. lower_bound() will have to do at least 10
> comparisons just to work out the lower boundary. But this is loading the
> dice :-)

> > Especially when comparisons are
> > expensive? You can run some tests, lower_bound is faster.

> I did those tests months ago. But what you said is only the general case,
i
> can cook up specific instances of data where bsearch() _IS_ faster.

> Stephen Howe



Sun, 16 May 2004 22:55:25 GMT  
 BTW, binary_search

Quote:

> Not necessarily. It will be O(longN)+1 in any case, if imlemented with
> lower_bownd<>,

Sorry, I was hallucinating.

--
Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)



Mon, 17 May 2004 22:32:20 GMT  
 
 [ 14 post ] 

 Relevant Pages 

1. binary_search

2. binary_search weirdness

3. binary_search of vector

4. binary_search like algorithm

5. multimap and binary_search

6. diff btw struct and union

7. Different btw Implicit and Explicit

8. conv btw diff integr types

9. Compilers (and C, BTW :-))

10. btw!! what next

11. newbie: whats the difference btw ADO and DAO

12. BTW: How to get screen dimention? :-)

 

 
Powered by phpBB® Forum Software