
QuickSelect - Medium

Problem : Given an unsorted array, find the kth smallest element.

The problem is called the selection problem. It has been studied intensively and has a couple of very interesting algorithms that do the job. I'll be describing an algorithm called QuickSelect. The algorithm derives its name from QuickSort, and you will probably recognise that most of the code is borrowed directly from it. The only difference is that QuickSelect makes a single recursive call where QuickSort makes two.

The naive solution is obvious: simply sort the array in `O(nlogn)` and return the kth element. In fact, you only need to sort it partially: running just the first k passes of selection sort gives the solution in `O(nk)`.
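As a baseline, here is a minimal sketch of both naive approaches. The function names are my own, and I'm assuming k is 1-based (k = 1 means the smallest element); neither detail comes from a particular library.

```python
def kth_smallest_by_sorting(arr, k):
    """Sort a copy of the whole array and index into it: O(n log n)."""
    if not 1 <= k <= len(arr):
        raise ValueError("k must be between 1 and len(arr)")
    return sorted(arr)[k - 1]


def kth_smallest_by_partial_selection_sort(arr, k):
    """Run only the first k passes of selection sort: O(n * k)."""
    if not 1 <= k <= len(arr):
        raise ValueError("k must be between 1 and len(arr)")
    a = list(arr)  # work on a copy so the caller's list is untouched
    for i in range(k):
        # Find the smallest element in a[i:] and move it to position i.
        smallest = i
        for j in range(i + 1, len(a)):
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]
    return a[k - 1]
```

The partial version only wins when k is small compared to `log n`; for k close to n/2 it degrades to `O(n^2)`.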

An interesting side effect of finding the kth smallest element is you end up finding the k smallest elements. This also effectively gives you (n - k) largest elements in the array as well. These elements are not in any particular order though.

The version I'm using picks a random pivot. Pivot selection is the part of the algorithm that usually decides how fast it will be. Ideally you would pick the median, but calculating the median of an array is basically a selection problem itself (where k = n/2)!

So we usually take the cheap route and pick a random pivot. We use the `partition` function from `quickSort` to separate the array into 2 subarrays. The left one will contain all elements less than or equal to the pivot value. The right one will have elements greater than the pivot.

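At this point we compare k (as an index into the array) with pivotIndex, the pivot's final position. If they are equal, the pivot is our answer. Otherwise k lies either to the left of pivotIndex or to the right of it.
We then recurse in that direction, ignoring the other subarray; it doesn't interest us anymore since it can't contain our solution.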
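The steps above can be sketched in Python roughly as follows. Some assumptions of mine: k is 1-based, the partition is the Lomuto scheme (one of several that would work), and the single recursive call is written as a loop so that Python's recursion limit never comes into play. The names `partition` and `quickselect_in_place` are only illustrative.

```python
import random


def partition(a, lo, hi):
    """Partition a[lo..hi] around a random pivot (Lomuto scheme).

    Returns the pivot's final index p. Afterwards every element of
    a[lo..p-1] is <= a[p] and every element of a[p+1..hi] is > a[p].
    """
    r = random.randint(lo, hi)       # random pivot position, inclusive bounds
    a[r], a[hi] = a[hi], a[r]        # park the pivot at the end
    pivot = a[hi]
    store = lo                       # next slot for an element <= pivot
    for i in range(lo, hi):
        if a[i] <= pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[hi] = a[hi], a[store]  # move the pivot into its final place
    return store


def quickselect_in_place(a, k):
    """Return the kth smallest element of a (k is 1-based), reordering a."""
    if not 1 <= k <= len(a):
        raise ValueError("k must be between 1 and len(a)")
    target = k - 1                   # 0-based index the answer will end up at
    lo, hi = 0, len(a) - 1
    while True:
        p = partition(a, lo, hi)
        if p == target:
            return a[p]
        if target < p:
            hi = p - 1               # answer is in the left subarray
        else:
            lo = p + 1               # answer is in the right subarray


if __name__ == "__main__":
    data = [7, 2, 9, 4, 1, 8]
    print(quickselect_in_place(data, 3))  # 3rd smallest element
    print(sorted(data[:3]))               # the 3 smallest, sorted for display
```

Because the array is rearranged in place, `data[:k]` holds the k smallest elements once the call returns (in no particular order). Pass in a copy if the original order matters.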



Complexity: O(n) on average, O(n^2) in the worst case.
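A rough way to see where the running time comes from: each partition pass is linear in the size of the subarray it works on, and only one side is processed afterwards. As an idealised sketch (assuming, unrealistically, that every pivot splits the subarray exactly in half), the total work is a geometric series:

$$
n + \frac{n}{2} + \frac{n}{4} + \dots + 1 \;<\; 2n \;=\; O(n)
$$

A random pivot will not split this evenly every time, but on average it still discards a constant fraction of the elements, so the expected cost stays linear with a somewhat larger constant. In the unlucky case where every pivot is the smallest or largest remaining element, only one element is discarded per pass:

$$
n + (n-1) + (n-2) + \dots + 1 \;=\; \frac{n(n+1)}{2} \;=\; O(n^2)
$$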

Theoretically, it is possible to guarantee O(n) even in the worst case (the median-of-medians algorithm does this). But the implementation hides huge constants and complexities, which make it not worth the effort.
