Mar 27, 2023

An introduction to the most important problem in Computer Science and one of the most clever algorithms ever devised.

8 Comments

Mar 27, 2023

‪Nicely written 👌👌. In the past, when I’ve had to introduce binary search to students, I’ve used the same example of the books.‬

‪

It’s worth noticing that in real life, we (us as humans) don’t really do binary search, but randomised binary search instead, since it’s hard for us to pick the exact middle.

We could think of it as picking the book to check randomly. It’s not hard to prove that this way, the expected number of steps until having a single book is logarithmic wrt to the amount of books.

Expand full comment

Reply (2)

Anier Velasco Sotomayor

Mar 27, 2023

If we just do it uniformly, it guarantees a logarithmic expected number of steps anyways.

A rather informal explanation could be this:

We could think of the sequence to be divided in 3 contiguos pieces of length=1/3*N. Then, with probability=1/3 the picked element belongs to the middle piece, and we get rid of at least 1/3 of the elements.

Hence, with probability 1/3, from N elements we move to 2/3 * N elements, so N is reduced by a factor of 3/2.

Reducing N by a constant factor c > 1 with a constant probability p > 0 makes the expected number of steps to be logarithmic.

Expand full comment

Reply (1)

Alejandro Piad Morffis

Mar 28, 2023

Can this be shown with the master theorem?

Expand full comment

Reply (1)

Anier Velasco Sotomayor

Apr 16, 2023

I’m not sure. It seems that in order to formulate it formally we need to include the answer as an argument of the state (something like T(n,s) being the time complexity of the state (n,s) where n is the cardinality of the search space, and s is the cardinality of the space containing only elements less than the answer). For sure running Monte-Carlos’s is convincing enough.

Let me know if you succeed in proving it with Master theorem.

Expand full comment

Alejandro Piad Morffis

Mar 27, 2023

Cool! When you say randomly, it's uniformly distributed or biased towards the median? Does the answer change the expected cost?

Expand full comment

Dave Dolan

Jun 5, 2023

In around 2006-2009 I was working on my own database engine. A Graph db before they were cool. For the prototype, I mostly only needed one real on-disk data structure in addition to the blobs for attribute values, and that was the b-tree. Some structures I was using are like B-trees of b-trees, or even b-trees of b-trees of b-trees, but still yes this concept is central to the whole shebang. I eventually migrated to using Lucene, and wrote my own query language and server for it, just in time to close my company down and get a real job. But it was indeed the binary search that was central to the whole thing. I still have a soft spot in my heart for the algorithm, and I see binary trees or the potential to build them, everywhere. When it comes to data, I see it in two ways now, a 'pay now' or 'pay later' for consuming it: you either sort your data now when you're about to store it, or spend time sorting it on the way out when you're fetching it, at least if it has comparable keys!

Expand full comment

Reply (1)

Alejandro Piad Morffis

Jun 5, 2023

I only played with B-trees at college, but yes, they are incredible (and incredibly hard to code compared to AVLs or red-black trees or any other classic in-memory data structure). Now I'm teaching Algorithm Design and binary search appears everywhere, in the middle of a DP or a flow algorithm. It's one of the most clever little ideas in CS, that's why it's so beautiful.

Expand full comment

Gold Bassey Edem

Mar 29, 2023

Beautiful piece Alejandro!

I am deeply curious to know though if you could talk about how search engines such as Google's search and sorting systems work.

Expand full comment

Binary Search