Find the running median hackerrank solution
Given that integers are read from a data stream. Find median of elements read so for in an efficient way. For simplicity assume, there are no duplicates. For example, let us consider the stream 5, 15, 1, 3 … Making it clear, when the input size is odd, we take the middle element of sorted data. If the input size is even, we pick the average of the middle two elements in the sorted stream. Method 1: Insertion Sort If we can sort the data as it appears, we can easily locate the median element. Insertion Sort is one such online algorithm that sorts the data appeared so far. At any instance of sorting, say after sorting i-th element, the first i elements of the array are sorted. The insertion sort doesn’t depend on future data to sort data input till that point. In other words, insertion sort considers data sorted so far while inserting the next element. This is the key part of insertion sort that makes it an online algorithm. However, insertion sort takes O(n2) time to sort n elements. Perhaps we can use binary search on insertion sort to find the location of the next element in O(log n) time. Yet, we can’t do data movement in O(log n) time. No matter how efficient the implementation is, it takes polynomial time in case of insertion sort.
Output Median after reading 1 element is 5 Median after reading 2 elements is 10 Median after reading 3 elements is 5 Median after reading 4 elements is 4 Median after reading 5 elements is 3 Median after reading 6 elements is 4 Median after reading 7 elements is 5 Median after reading 8 elements is 6 Median after reading 9 elements is 7 Median after reading 10 elements is 6 Median after reading 11 elements is 7 Median after reading 12 elements is 6 Time Complexity: O(n2) Method 2: Augmented self-balanced binary search tree (AVL, RB, etc…) If the left and right subtrees contain a same number of elements, the root node holds the average of left and right subtree root data. Otherwise, the root contains the same data as the root of subtree which is having more elements. After processing an incoming element, the left and right subtrees (BST) are differed utmost by 1. Self-balancing BST is costly in managing the balancing factor of BST. However, they provide sorted data which we don’t need. We need median only. The next method makes use of Heaps to trace the median. Method 3: Heaps After processing an incoming element, the number of elements in heaps differs utmost by 1 element. When both heaps contain the same number of elements, we pick the average of heaps root data as effective median. When the heaps are not balanced, we select effective median from the root of the heap containing more elements. Given below is the implementation of the above method. For the algorithm to build these heaps, please read the highlighted code.
Time Complexity: If we omit the way how stream was read, complexity of median finding is O(N log N), as we need to read the stream, and due to heap insertions/deletions. Auxiliary Space: O(N) Median of Stream of Running Integers using STL |