Sunday, October 2, 2011

Find the median of two sorted arrays

This is one of those rare problems where the solution for the generalized problem is just as costly as the solution for a specific case. In other words, we can find any kth item of this combined array with the same complexity.

The brute-force approach is to do a merge-sort & find the median. The cost of this would be O(n+m) where n is the size of the first array & m is the size of the second. Second simple way is to see that we don't need to do a complete merge-sort but rather just till we find the kth element. Hence the cost would be down to O(k). Obviously, we need something better than this. Turns out that this can be done in O(lg n + lg m).

First, compare the middle elements of A and B, which we identify as Ai and Bj. If Ai is between Bj and Bj-1, we have just found the i+j-1 smallest element. Therefore, if we choose i and j such that i+j = k+1, we are able to find the kth smallest element. Basically, the above can be stated as,

Maintaining the invariant
i + j = k + 1,

If Bj-1 < Ai < Bj, then Ai must be the kth smallest,
or else if Ai-1 < Bj < Ai, then Bj must be the kth smallest.

When one of the above condition is satisfied for our initial i & j, we are done. Otherwise, we need to think about which set to use for the further comparisons.

If Ai < Bj, then Ai < Bj-1, because otherwise, we must have chosen Ai. This means that, the chosen Ai is quite low and we must include bigger elements from the set A (i.e., from i+1 to n). On the other hand, the chosen value from B is already too large. Hence, we shall not consider anything above the current chosen value of B (i.e., from 1 to j-1). So, basically we are shrinking our sample space in a logarithmic fashion & hence the final cost would be O(lg n + lg m).


No comments:

Post a Comment