Sunday, October 23, 2011

Insert & Median

Maintain a data structure that supports two operations 'insert' & 'get median'. Preferably the median operation should be in constant time.

First solution that came to mind was using a linear linked list and maintain a sorted list of elements. The cost of both the operations are O(log n) in this case.

When looking for a better solution, I stumbled upon this ingenious solution of using 2 heaps to achieve the same. The basic idea is to maintain 2 heaps and a single element having the total number of elements seen so far. The first heap should have half the total number of elements & maintained as a max heap whereas the second heap should have the other half of the elements & maintained as a min heap. This way, the median is always the root of the max heap (if the #elements is odd) or the average of the roots of both heaps (if the #elements is even). The trick is in how do we maintain this structure / property all throughout the insertions. As the heaps are equally split in the #elements, we also know right upfront on which heap a given insertion which land up in. Once we know that, we can always find if the given element directly fits into the heap or it replaces (or rather moves to the other heap) an existing element in a heap. Using this approach we can provide a O(1) 'get median' operation & O(log n) insertion time.

Heap is my favorite data structure. Something about it just fascinates me no end. I would love to find out if there are other such interesting problems that can be solved using heaps.

No comments:

Post a Comment