Sunday, February 19, 2012

Print all the anagrams for a word from dictionary.

Imagine you had a dictionary. How would you print all anagrams of a word? What if you had to do this repeatedly? Could you optimize it?


Approach:
We need to develop a key for a given word such that all the words in an anagram set can be represented uniquely.
one way is to convert the word string to lowecase and sort the words lexicographically.
example: Tea, Eat, Ate will become: aet.
Then we can use a HashMap to store the anagram set, with "aet" as key and "Tea, Eat, Ate" as the list of words.


Another approach could be we can assign each letters from a..z a prime numbers (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, .. so on)
and then for any word, we can calculate its key as the multiples of all the prime number corresponding to characters in the word.


say tea = 71*11*2 = 1562
similarly eat and ate will yeild 1562. Now 1562 is the key in your hashMap and value will be the list of words - "tea, eat, ate";


a=2, b=3, c=5, d=7, e=11, f=13, g=17, h=19, i=23, j=29, 
k=31, l=37, m=41, n=43, o=47, p=53, q=59, r=61, s=67, t=71, 
u=73, v=79, w=83, x=89, y=97, z=101


If we have to do the operation repeatedly then we will preprocess and calculate all the anagrams sets in the dictionary and cache it.

1 comment:

  1. Ideally, you would need not only "complete" anagrams, but also stuff with multiple words like "Joseph Stalin" <-> "Hapless joint" and "The late President of America, Ronald Wilson Reagan" <-> "Elect me, for singlehanded war on Asian proletariat". Makes things a bit more complicated, and just checking all permutations of the letters in the input string would be far too costly, right?

    ReplyDelete