BTW, binary_search
Author Message
BTW, binary_search

if  binary_search(..) would return iterator instead of bool, it would be
possible to write code like this:
// in case we fill some container in one step and then use it many times

typedef C *iterator;
C        arr[N];
iterator begin= &arr[0];
iterator end= &arr[N];
fill(arr, N);
sort(begin, end);
// it's several times faster then filling set<C>

.....

iterator it= binary_search(begin,end,key);
if( it!=end )
use(*it);
// it takes approximately the same time as set<C>::find(key)

it's easy to solve this problem by writing code like this

template<class T, class I> inline I bin_search(I begin, I end, const T &k)
{ I     it= lower_bound(begin, end, k);
return  (it==end || less<T>()(k,*it)) ? end : it;

Quote:
}

but why should I do this?

Sun, 16 May 2004 03:40:08 GMT
BTW, binary_search
Which of potentially many equivalent elements should binary_search return an
iterator to? The answer is unambiguous for lower_bound, upper_bound and
equal_range.
--
With best wishes,
Igor Tandetnik

"For every complex problem, there is a solution that is simple, neat, and
wrong." H.L. Mencken

Quote:
> if  binary_search(..) would return iterator instead of bool, it would be
> possible to write code like this:
> // in case we fill some container in one step and then use it many times

> typedef C *iterator;
> C        arr[N];
> iterator begin= &arr[0];
> iterator end= &arr[N];
>     fill(arr, N);
>     sort(begin, end);
> // it's several times faster then filling set<C>

> .....

> iterator it= binary_search(begin,end,key);
>     if( it!=end )
>         use(*it);
> // it takes approximately the same time as set<C>::find(key)

> it's easy to solve this problem by writing code like this

> template<class T, class I> inline I bin_search(I begin, I end, const T &k)
> { I     it= lower_bound(begin, end, k);
>     return  (it==end || less<T>()(k,*it)) ? end : it;
> }

> but why should I do this?

Sun, 16 May 2004 03:49:04 GMT
BTW, binary_search
obviously the first one, like multiset<>::find(..) does.

Quote:
> Which of potentially many equivalent elements should binary_search return
an
> iterator to? The answer is unambiguous for lower_bound, upper_bound and
> equal_range.
> --
> With best wishes,
>     Igor Tandetnik

> "For every complex problem, there is a solution that is simple, neat, and
> wrong." H.L. Mencken

Sun, 16 May 2004 04:04:09 GMT
BTW, binary_search
And it would be also fine to have smth like this

template<class T,class A> map<T,A>::iterator
binary_search(map<T,A>::iterator beg, map<T,A>::iterator en, const T &key)
{ somehow map::iterator      it= TheMap.find(key);  // implementation
specific

return (it<en && !(it<beg)) ? it : en;

Quote:
}

Quote:
> obviously the first one, like multiset<>::find(..) does.

> > Which of potentially many equivalent elements should binary_search
return
> an
> > iterator to? The answer is unambiguous for lower_bound, upper_bound and
> > equal_range.
> > --
> > With best wishes,
> >     Igor Tandetnik

> > "For every complex problem, there is a solution that is simple, neat,
and
> > wrong." H.L. Mencken

Sun, 16 May 2004 04:25:04 GMT
BTW, binary_search
There is no requirement that multimap::find return an iterator to the first
element. It returns an iterator to arbitrary element with a key equivalent
to the given key, if any.

It probably does make sense to have a generic algorithm mimicking this
functionality. There isn't any, but equal_range comes closest I think. Not
sure what the rationale is behind current definition of binary_search.
--
With best wishes,
Igor Tandetnik

"For every complex problem, there is a solution that is simple, neat, and
wrong." H.L. Mencken

Quote:
> obviously the first one, like multiset<>::find(..) does.

> > Which of potentially many equivalent elements should binary_search
return
> an
> > iterator to? The answer is unambiguous for lower_bound, upper_bound and
> > equal_range.
> > --
> > With best wishes,
> >     Igor Tandetnik

> > "For every complex problem, there is a solution that is simple, neat,
and
> > wrong." H.L. Mencken

Sun, 16 May 2004 04:26:14 GMT
BTW, binary_search

Quote:
> Which of potentially many equivalent elements should binary_search return
an
> iterator to?

I would say any. If this was sorted vector where comparisons are expensive,
any  element that compares as equivalent will do. This makes it like C's
bsearch().
There could be many possibilites where it is useful to have an iterator
rather than true/false returned.

In contrast lower_bound() has to continue with comparisons until the element
found partitions the range searched.

Stephen Howe

Sun, 16 May 2004 04:42:32 GMT
BTW, binary_search
BTW, both STL implementations (SGI and MS) of multiset<>::find(..) return
"the earliest element in the controlled sequence whose sort key equals key".
As for equal_range: "function effectively returns pair( lower_bound(first,
last, val), upper_bound(first, last, val)), "...  Do we need the
upper_bound? Not necessarily. At the same time it's slower...

Quote:

> There is no requirement that multimap::find return an iterator to the
first
> element. It returns an iterator to arbitrary element with a key equivalent
> to the given key, if any.

> It probably does make sense to have a generic algorithm mimicking this
> functionality. There isn't any, but equal_range comes closest I think. Not
> sure what the rationale is behind current definition of binary_search.
> --
> With best wishes,
>     Igor Tandetnik

Sun, 16 May 2004 05:08:13 GMT
BTW, binary_search
BTW, bsearch probably makes 2 comparisons on each step
x<key,  key<x
lower_bound does 1 comparison
x<key
Do you realy belive that bsearch is faster? Especially when comparisons are
expensive? You can run some tests, lower_bound is faster.

Quote:

> > Which of potentially many equivalent elements should binary_search
return
> an
> > iterator to?

> I would say any. If this was sorted vector where comparisons are
expensive,
> any  element that compares as equivalent will do. This makes it like C's
> bsearch().
> There could be many possibilites where it is useful to have an iterator
> rather than true/false returned.

> In contrast lower_bound() has to continue with comparisons until the
element
> found partitions the range searched.

> Stephen Howe

Sun, 16 May 2004 05:21:03 GMT
BTW, binary_search
All the standard says is (23.1.2 Table 69):

a.find(k)
returns an iterator pointing to an element with the key equivalent to k, or

I have already agreed with you that a generic algorithm with similar
behavior would be desirable.
--
With best wishes,
Igor Tandetnik

"For every complex problem, there is a solution that is simple, neat, and
wrong." H.L. Mencken

Quote:
> BTW, both STL implementations (SGI and MS) of multiset<>::find(..) return
> "the earliest element in the controlled sequence whose sort key equals
key".
> As for equal_range: "function effectively returns pair( lower_bound(first,
> last, val), upper_bound(first, last, val)), "...  Do we need the
> upper_bound? Not necessarily. At the same time it's slower...

> > There is no requirement that multimap::find return an iterator to the
> first
> > element. It returns an iterator to arbitrary element with a key
equivalent
> > to the given key, if any.

> > It probably does make sense to have a generic algorithm mimicking this
> > functionality. There isn't any, but equal_range comes closest I think.
Not
> > sure what the rationale is behind current definition of binary_search.
> > --
> > With best wishes,
> >     Igor Tandetnik

Sun, 16 May 2004 05:22:27 GMT
BTW, binary_search

Quote:

> > Which of potentially many equivalent elements should binary_search return
> an
> > iterator to? The answer is unambiguous for lower_bound, upper_bound and
> > equal_range.

> obviously the first one, like multiset<>::find(..) does.

So binary_search should be O(n) for all equal elements instead of
O(logn)?

--
Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)

Sun, 16 May 2004 05:30:15 GMT
BTW, binary_search

Quote:
> BTW, bsearch probably makes 2 comparisons on each step
>     x<key,  key<x

No. Not necessarily true. The compare function should check to see if elem1
is < , == or > than elem2. 1 comparison may be enough, 2 at most are
necessary. Each call of the compare function returns enough information to
partition the array into 3.

Quote:
> lower_bound does 1 comparison
>     x<key
> Do you realy belive that bsearch is faster?

In general, no I don't, but for a different reason. bsearch() _always_ has
to call an out of line function pointer. lower_bound() may inline the
comparison.

But there will be circumstances where bsearch() will be faster. For example,
if you had an 1024 elements in an array and they were the same element.
bsearch() will only ever do one call of the compare function() and whatever
element it picks it will return. lower_bound() will have to do at least 10
comparisons just to work out the lower boundary. But this is loading the
dice :-)

Quote:
> Especially when comparisons are
> expensive? You can run some tests, lower_bound is faster.

I did those tests months ago. But what you said is only the general case, i
can cook up specific instances of data where bsearch() _IS_ faster.

Stephen Howe

Sun, 16 May 2004 11:21:26 GMT
BTW, binary_search
Not necessarily. It will be O(longN)+1 in any case, if imlemented with
lower_bownd<>, as I wrote before

template<class T, class I> inline I bin_search(I begin, I end, const T &k)
{ I     it= lower_bound(begin, end, k);
return  (it==end || less<T>()(k,*it)) ? end : it;

Quote:
}

> > > Which of potentially many equivalent elements should binary_search
return
> > an
> > > iterator to? The answer is unambiguous for lower_bound, upper_bound
and
> > > equal_range.

> > obviously the first one, like multiset<>::find(..) does.

> So binary_search should be O(n) for all equal elements instead of
> O(logn)?

> --
> Pete Becker
> Dinkumware, Ltd. (http://www.dinkumware.com)

Sun, 16 May 2004 22:28:32 GMT
BTW, binary_search
OK, you are right. bsearch is faster in some cases. For example in case of
array of long strings which differ in last characters. BTW it's possible to
write inline version of bsearch, smth like this

template <class I, class T> I bsearch(I first, I last, T val)
{
int     len = last-first;
int     half;
I        middle;

while( len > 0)
{int    res;
half = len >> 1;
middle = first+half;
if( (res=compare(*middle,val)<0 )
{
first= middle;
++first;
len = len-half-1;
}else if( res>0 )
len = half;
}else
{   first= middle;  // in case of several equal array elements result is
unpredictable
break;
}
}
return  len>0 ? first : last;

Quote:
}

I didn't test it, just wanted to show the idea. BTW it will be 25% faster at
best (I mean faster than binary_search). That's because of pyramid nature,
1/2 of elements are at the bottom line, than means in 50% we will have the
same number of comparisons as in binary_search.

Quote:
> > BTW, bsearch probably makes 2 comparisons on each step
> >     x<key,  key<x

> No. Not necessarily true. The compare function should check to see if
elem1
> is < , == or > than elem2. 1 comparison may be enough, 2 at most are
> necessary. Each call of the compare function returns enough information to
> partition the array into 3.

> > lower_bound does 1 comparison
> >     x<key

> > Do you realy belive that bsearch is faster?

> In general, no I don't, but for a different reason. bsearch() _always_ has
> to call an out of line function pointer. lower_bound() may inline the
> comparison.

> But there will be circumstances where bsearch() will be faster. For
example,
> if you had an 1024 elements in an array and they were the same element.
> bsearch() will only ever do one call of the compare function() and
whatever
> element it picks it will return. lower_bound() will have to do at least 10
> comparisons just to work out the lower boundary. But this is loading the
> dice :-)

> > Especially when comparisons are
> > expensive? You can run some tests, lower_bound is faster.

> I did those tests months ago. But what you said is only the general case,
i
> can cook up specific instances of data where bsearch() _IS_ faster.

> Stephen Howe

Sun, 16 May 2004 22:55:25 GMT
BTW, binary_search

Quote:

> Not necessarily. It will be O(longN)+1 in any case, if imlemented with
> lower_bownd<>,

Sorry, I was hallucinating.

--
Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)

Mon, 17 May 2004 22:32:20 GMT

 Page 1 of 1 [ 14 post ]

Relevant Pages