Design an algorithm that, given a list of n elements in an array, finds all the elements that appear more than n/3 times in the list.
The algorithm should run in linear time. (n >=0 )
The algorithm should run in linear time. (n >=0 )
You are expected to use comparisons and achieve linear time. No hashing/excessive space/ and don't use standard linear time deterministic selection algo.
Approach:
Let N = size of the input array A.
Now we need to find all the elements in the array which occur more than (N/K) times in the input Array A.
maintain a Map (Key=A[i], and value is the frequency of Key in array A so far).
as soon as the size of the Map reaches K, decrement the value of all the Keys by 1, if the value is 1, then after decrementing it will be 0 - Delete those keys from Map.
At max you will have to do (N/K) deletions.
In the end you will be left with atmost (K-1) unique Keys - which are candidate answers.
Now again re-iterate through the array A and report the Keys whose frequency is greater than (N/K).
Time Complexity: O(nlogk).
space complexity: O(k).
Ex:
int [] A = {5, 4, 4, 3, 2, 3, 4, 5, 5, 8};
N = 10;
K = 5;
we need to find all the elements in A which occurs more than (N/K) = 2 times.
As you iterate through the Array A, the steps would be - Imagine you are playing tetris, where same keys stack upon each other and you put the different keys in the adjacent column, when you have a row of size K - you delete that row:
When 8 is coming, the tetris would look like:
5 4
5 4 3
5 4 3 2
When 8 has dropped tetris would look like:
5 4
5 4 3
5 4 3 2 8 <= This row is now full (means its size has K=5)
After deleting the tetris snapshot would be:
5 4
5 4 3
So now 3,4,5 are your candidate elements which can occur more than (N/K) = (10/5) = 2 times in A.
Again iterate through the Array A, and report elements among 3,4,5 which occur more than 2 times.
Ans is:
Key: 4 Value: 3
Key: 5 Value: 3
you can prove that in the end at most (K-1) unique values would be remaining as candidate answer.
output for the below code:
Key: 4 Value: 6
Key: 10 Value: 6
End!
output for the below code:
Key: 4 Value: 6
Key: 10 Value: 6
End!
Wow Awesome !!!!!
ReplyDeleteyou say no hashing , But yet use a hash table?
ReplyDeleteyou say no hashing , But yet use a hash table?
ReplyDeletewhat are you doing in b/w lines 79-89 ?
ReplyDeleteIn the explanation you say that
>> Again iterate through the Array A, and report elements among 3,4,5 which occur more than 2 times. >>
So, shouldn't this give you time = n*k ? Or you are again modifying the counts in kBagMap<> ?
ReplyDeleteheres the python code:
let l=[1,2,3,4,5,6,1,1,2,3,4,5,6,2,1,2,47,7,8,5,5,5,5,5,5,5,6,4,12,7,78,4,5,5,5,5]
c=max(l)
b=[]
for i in range(c+1):
b.append(0)
for i in l:
b[i]=b[i]+1
if b[i]>=(len(l))/3:
print i
why there is log factor in time complexity ,can u explain that??/
ReplyDeleteExtreme cleverness! How does that come to your mind? :-)
ReplyDelete