119 lines
5.1 KiB
Markdown
119 lines
5.1 KiB
Markdown
|
---
|
||
|
title: Data Structure Trie
|
||
|
---
|
||
|
## Introduction to Trie
|
||
|
|
||
|
The word trie is an inflix of the word "re**trie**val", because the trie can find a single word in a dictionary with only a prefix of the word.
|
||
|
Trie is an efficient data retrieval data structure, using trie, search complexities can be brought to an optimal limit, i.e. length of the string.
|
||
|
It is a multi-way tree structure useful for storing strings over an alphabet, when we are storing them.
|
||
|
It has been used to store large dictionaries of English, say, words in spell-checking programs.
|
||
|
However, the penalty on tries is the storage requirement.
|
||
|
|
||
|
## What is a trie?
|
||
|
|
||
|
A trie is a tree like data structure which stores strings, and helps you find the data associated with that string using the prefix of the string.
|
||
|
For example, say you plan on building a dictionary to store strings along with their meanings. You must be wondering why can't I simply use a hash table, to get the information.
|
||
|
Yes, you obviously can get information using a hash table, but, the <a>hash tables</a> can only find data where the string exactly matches the one we've added. But trie will give us the capability to find strings with common prefixes, a missing character etc in lesser time, in comparison to a hash table.
|
||
|
A trie typically, looks something like this,
|
||
|
|
||
|
![Trie](//discourse-user-assets.s3.amazonaws.com/original/2X/c/c43e222a6f9152512d73f97b8117db5c074bbc8e.png)
|
||
|
|
||
|
This is an image of a Trie, which stores the words {assoc, algo, all, also, tree, trie}.
|
||
|
|
||
|
## How to implement a trie?
|
||
|
|
||
|
Let's implement a trie in python, for storing words with their meanings from english dictionary.
|
||
|
|
||
|
ALPHABET_SIZE = 26 # For English
|
||
|
|
||
|
class TrieNode:
|
||
|
def __init__(self):
|
||
|
self.edges = [None]*(ALPHABET_SIZE) # Each index respective to each character.
|
||
|
self.meaning = None # Meaning of the word.
|
||
|
self.ends_here = False # Tells us if the word ends here.
|
||
|
|
||
|
As you can see, edges are 26 in length, each index referring to each character in the alphabet. 'A' corresponding to 0, 'B' to 1, 'C' to 2 ... 'Z' to 25th index. If the character you are looking for is pointing to `None`, that implies the word is not there in the trie.
|
||
|
|
||
|
A typical Trie should implement at least these two functions:
|
||
|
|
||
|
* `add_word(word,meaning)`
|
||
|
* `search_word(word)`
|
||
|
* `delete_word(word)`
|
||
|
|
||
|
Additionally, one can also add something like
|
||
|
|
||
|
* `get_all_words()`
|
||
|
* `get_all_words_with_prefix(prefix)`
|
||
|
|
||
|
#### Adding Word to the trie
|
||
|
|
||
|
def add_word(self,word,meaning):
|
||
|
if len(word)==0:
|
||
|
self.ends_here = True # Because we have reached the end of the word
|
||
|
self.meaning = meaning # Adding the meaning to that node
|
||
|
return
|
||
|
ch = word[0] # First character
|
||
|
# ASCII value of the first character (minus) the ASCII value of 'a'-> the first character of our ALPHABET gives us the index of the edge we have to look up.
|
||
|
index = ord(ch) - ord('a')
|
||
|
if self.edges[index] == None:
|
||
|
# This implies that there's no prefix with this character yet.
|
||
|
new_node = TrieNode()
|
||
|
self.edges[index] = new_node
|
||
|
|
||
|
self.edges[index].add(word[1:],meaning) #Adding the remaining word
|
||
|
|
||
|
#### Retrieving data
|
||
|
|
||
|
def search_word(self,word):
|
||
|
if len(word)==0:
|
||
|
if self.ends_here:
|
||
|
return True
|
||
|
else:
|
||
|
return "Word doesn't exist in the Trie"
|
||
|
ch = word[0]
|
||
|
index = ord(ch)-ord('a')
|
||
|
if self.edge[index]== None:
|
||
|
return False
|
||
|
else:
|
||
|
return self.edge[index].search_word(word[1:])
|
||
|
|
||
|
The `search_word` function will tell us if the word exists in the Trie or not. Since ours is a dictionary, we need to fetch the meaning as well, now lets declare a function to do that.
|
||
|
|
||
|
def get_meaning(self,word):
|
||
|
if len(word)==0 :
|
||
|
if self.ends_here:
|
||
|
return self.meaning
|
||
|
else:
|
||
|
return "Word doesn't exist in the Trie"
|
||
|
ch = word[0]
|
||
|
index = ord(ch) - ord('a')
|
||
|
if self.edges[index] == None:
|
||
|
return "Word doesn't exist in the Trie"
|
||
|
else:
|
||
|
return self.edges[index].get_meaning(word[1:])
|
||
|
|
||
|
#### Deleting data
|
||
|
|
||
|
By deleting data, you just need to change the variable `ends_here` to `False`. Doing that doesn't alter the prefixes, but stills deletes the meaning and the existence of the word from the trie.
|
||
|
|
||
|
def delete_word(self,word):
|
||
|
if len(word)==0:
|
||
|
if self.ends_here:
|
||
|
self.ends_here = False
|
||
|
self.meaning = None
|
||
|
return "Deleted"
|
||
|
else:
|
||
|
return "Word doesn't exist in the Trie"
|
||
|
ch = word[0]
|
||
|
index = ord(ch) - ord('a')
|
||
|
if self.edges[index] == None:
|
||
|
return "Word doesn't exist in the Trie"
|
||
|
else:
|
||
|
return self.edges[index].delete_word(word[1:])
|
||
|
|
||
|
![:rocket:](//forum.freecodecamp.com/images/emoji/emoji_one/rocket.png?v=2 ":rocket:") <a href='https://repl.it/CWbr' target='_blank' rel='nofollow'>Run Code</a>
|
||
|
|
||
|
## Resources
|
||
|
|
||
|
* For further reading, you can try this <a href='https://www.topcoder.com/community/data-science/data-science-tutorials/using-tries/' target='_blank' rel='nofollow'>topcoder</a> tutorial.
|
||
|
* Also, a tutorial from <a href='http://www.geeksforgeeks.org/trie-insert-and-search/' target='_blank' rel='nofollow'>geeksforgeeks</a>
|