Find Median from Data Stream
Explanation & Solution
Description
Given an array of integers nums representing a data stream, process the numbers one by one and find the median of all elements seen so far after each insertion.
The median is the middle value in an ordered list. If the list size is even, the median is the average of the two middle values.
Return an array of medians — one after each number is added to the stream.
Examples
Explanation:
- After adding 1: list = [1], median = 1
- After adding 2: list = [1, 2], median = (1 + 2) / 2 = 1.5
- After adding 3: list = [1, 2, 3], median = 2
Explanation:
- After adding 5: list = [5], median = 5
- After adding 2: list = [2, 5], median = (2 + 5) / 2 = 3.5
- After adding 8: list = [2, 5, 8], median = 5
Explanation:
- After adding 6: [6], median = 6
- After adding 10: [6, 10], median = 8
- After adding 2: [2, 6, 10], median = 6
- After adding 6: [2, 6, 6, 10], median = 6
- After adding 5: [2, 5, 6, 6, 10], median = 6
Constraints
1 <= nums.length <= 10^5-10^5 <= nums[i] <= 10^5- Median values should be exact (no floating-point rounding issues for the given test cases).
Approach
Top K Elements pattern
1. Use Two Heaps to Track the Median
- Maintain a max-heap (
lo) for the lower half of the numbers. - Maintain a min-heap (
hi) for the upper half of the numbers. - The max-heap's top gives the largest value in the lower half, and the min-heap's top gives the smallest value in the upper half.
2. Insert Each Number
- For each new number, first push it into the max-heap (
lo). - Then pop the max of
loand push it into the min-heap (hi). This ensures every element inlois ≤ every element inhi.
3. Rebalance the Heaps
- If
hihas more elements thanlo, pop the min ofhiand push it intolo. - This guarantees
loalways has equal or one more element thanhi.
4. Compute the Median
- If the total count is odd, the median is
lo.peek()(the top of the max-heap). - If the total count is even, the median is
(lo.peek() + hi.peek()) / 2(average of the two middle values).
5. Collect All Medians
- After processing each number, append the current median to the result array.
- Return the result array after all numbers have been processed.
Key Insight
By splitting the stream into two halves using a max-heap and a min-heap, we can find the median in O(1) time after each insertion, with each insertion costing O(log n) for heap operations. This gives an overall O(n log n) time complexity and O(n) space complexity — far more efficient than sorting after every insertion.
Visualization
No animation available