Binary Search

Wow, that is ancient technology, you might think. Can more be written about it? Probably yes 🙂

So binary search has been around for as long as we’ve had sorted lists and the need to locate a specific element in them. If you have studied Computer Science, you have most probably come across binary search algorithms.

If you need to locate an element in a list and just start searching from one end, you will on average do n/2 comparisons if there are n elements in the list and the element is present. That is O(n) performance.

That is probably fine if the list always contains a limited number of elements or you don’t have performance as a requirement.

If the elements you are dealing with can be compared, you can sort the list and use binary search to locate the element. The algorithm starts in the middle of the sorted list. If the element you are searching for is “smaller” than the middle element, the algorithm repeats with the middle element of the first half of the list; if it is “larger”, it repeats with the middle element of the last half. And so on, until the element is located or there is nothing left to search.
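The halving described above can be sketched in a few lines of Java. This is only a minimal illustration for elements that implement Comparable; the class name BinarySearchSketch and the method name indexOf are made up for this example and are not part of any library.

```java
import java.util.List;

final class BinarySearchSketch {
    /** Returns the index of key in the sorted list, or -1 if it is absent. */
    static <T extends Comparable<? super T>> int indexOf(List<? extends T> sorted, T key) {
        int low = 0;
        int high = sorted.size() - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;      // unsigned shift avoids int overflow
            int cmp = key.compareTo(sorted.get(mid));
            if (cmp < 0) {
                high = mid - 1;                // key is "smaller": continue in the first half
            } else if (cmp > 0) {
                low = mid + 1;                 // key is "larger": continue in the last half
            } else {
                return mid;                    // found
            }
        }
        return -1;                             // nothing left to search
    }
}
```

The list should offer fast random access (an ArrayList, for instance), otherwise each get(mid) is itself a linear walk.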

So with n elements in the list the algorithm does about log₂(n) comparisons, and never more than ⌊log₂(n)⌋ + 1. That is O(log n) performance. If there are 1 million elements in the sorted list the algorithm will do at most 20 comparisons (2¹⁹ = 524,288 is less than 1 million, so 19 is not quite enough) before the element is found or it is determined that the element does not exist.
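If you want to check the arithmetic yourself, the worst case can be counted by halving until nothing is left. In this small sketch (ProbeCountSketch and maxProbes are names invented for the example) one "probe" means one look at a middle element, which is what the text above calls a comparison. Each probe that misses leaves at most half of the remaining elements (rounded down).

```java
final class ProbeCountSketch {
    /** Worst-case number of middle elements a binary search inspects in a list of size n. */
    static int maxProbes(int n) {
        int probes = 0;
        while (n > 0) {
            probes++;   // look at the middle element
            n /= 2;     // the larger remaining half has at most n / 2 elements
        }
        return probes;
    }

    public static void main(String[] args) {
        System.out.println(maxProbes(1_000_000)); // prints 20
    }
}
```

For n = 1,000,000 this gives 20, which matches ⌊log₂(n)⌋ + 1.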

So why are we discussing binary search again? Doesn’t Java have binary search built in?

Absolutely. For example, Collections has a few variants and so does Arrays. But my use case is a bit special, so neither fits, and I felt like rolling my own 🙂

So in my case I have a number of ranges and I need to locate, relatively fast, the range that a given number belongs to. For example, 4455000000 belongs to the range 4450000000..4459999999. In this particular challenge the strings of digits are not actually treated as numbers and they might have different lengths, so 44550000 would also belong to the range 4450000000..4459999999…

So I have a list of ranges, they can be sorted, and I have a string of digits and need to know which range it belongs to. That could of course be solved by normalizing my single string of digits and turning it into a one-value range like 4455000000..4455000000, and the standard Collections.binarySearch() would probably do nicely.
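For the curious, here is roughly what that "boring" approach could look like. Everything in this sketch is an assumption made for illustration: the Range class is a stand-in with only the two bounds, the ranges are assumed not to overlap, and the normalization simply right-pads the digits with zeros to a fixed width, which may not be what a real implementation does. The comparator treats overlapping ranges as equal, so it is only meant for searching, not for general sorting.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class RangeLookupSketch {
    /** Hypothetical range type holding just the inclusive bounds. */
    static final class Range {
        final long low;
        final long high;
        Range(long low, long high) { this.low = low; this.high = high; }
    }

    /** Orders disjoint ranges by position; overlapping ranges compare as equal. */
    static final Comparator<Range> BY_POSITION = (a, b) ->
            a.high < b.low ? -1 : (a.low > b.high ? 1 : 0);

    /** Assumed normalization: right-pad the digits with zeros up to the given width. */
    static long normalize(String digits, int width) {
        StringBuilder sb = new StringBuilder(digits);
        while (sb.length() < width) {
            sb.append('0');
        }
        return Long.parseLong(sb.toString());
    }

    /** Returns the index of the range containing the digits, or a negative value if none does. */
    static int find(List<Range> sortedRanges, String digits, int width) {
        long value = normalize(digits, width);
        // The single value is wrapped in a degenerate range so it has the element type.
        return Collections.binarySearch(sortedRanges, new Range(value, value), BY_POSITION);
    }
}
```

With a sorted list containing 4450000000..4459999999, both "4455000000" and "44550000" (padded to width 10) land in that range.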

But that is no fun. Also, the range contains other stuff than just the low and high values, and it would feel unnatural to turn my string of digits into a range. In my “day-time job” implementation (which is not the same as the one presented here) the normalization takes place in the comparison, so we never have to bother about which length of the string of digits is the right one. It could be 16, it could be 18 or even 19. We don’t care and do not have to make a decision about that 😉

So I decided to roll my own completely generic BinarySearcher<T> that can do binary searches on sorted lists, locating anything that can be compared to the elements in the list. It uses a BiComparator<T, K> because it needs to be able to compare elements of potentially different types. As the BinarySearcherSimpleIntegerTest illustrates, it can also be used with simple types.
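To give an idea of the shape, here is a minimal sketch of how such a pair of types might look. It borrows the names BinarySearcher and BiComparator from the text, but the method names, signatures and the sign convention of compare are guesses made for this example, not the actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Assumed contract: negative if element sorts before key, positive if after, 0 on a match. */
interface BiComparator<T, K> {
    int compare(T element, K key);
}

final class BinarySearcher<T> {
    // Must already be sorted in an order consistent with the comparators used in find().
    private final List<T> sorted;

    BinarySearcher(List<T> sorted) {
        this.sorted = new ArrayList<>(sorted); // copy, so the searcher owns a random-access list
    }

    /** Returns the element matching key, or an empty Optional if there is none. */
    <K> Optional<T> find(K key, BiComparator<? super T, ? super K> comparator) {
        int low = 0;
        int high = sorted.size() - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            T candidate = sorted.get(mid);
            int cmp = comparator.compare(candidate, key);
            if (cmp < 0) {
                low = mid + 1;              // candidate lies before the key
            } else if (cmp > 0) {
                high = mid - 1;             // candidate lies after the key
            } else {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

With simple types the comparator can be as small as `(e, k) -> Integer.compare(e, k)`. For ranges, the comparator would receive a range and a string of digits and could do any normalization itself, which is the whole point of allowing two different types.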

In my setup the BinarySearcher owns the list of elements to be searched because that makes the most sense there. But it can easily be turned into a static method and used in other setups.

Enjoy 😀

About Jesper Udby
