Trie

A trie (or prefix tree) is a tree‑shaped data structure specialized for storing and querying strings (or sequences) by their prefixes.

Instead of storing each string as a single value, a trie stores characters along the path from root to a node.

Structure

Each node in a trie typically contains:

  • A map from character → child node.
  • A flag indicating whether this node marks the end of a word.

Example set of words:

  • ["cat", "car", "dog"]

Conceptually, the trie looks like:

  • Root
    • 'c'
      • 'a'
        • 't' (end of word: "cat")
        • 'r' (end of word: "car")
    • 'd'
      • 'o'
        • 'g' (end of word: "dog")

Core Operations

For a dictionary of n words with total length L:

  • Insert(word):
    • Walk from the root, following/creating edges for each character.
    • Mark the last node as an end of word.
    • Time: O(len(word)).
  • Search(word):
    • Follow edges for each character.
    • Check if last node is marked as end of word.
    • Time: O(len(word)).
  • StartsWith(prefix):
    • Follow edges for each character in prefix.
    • Return true if traversal succeeds (regardless of end-of-word flag).
    • Time: O(len(prefix)).

Trie operations depend on string length, not total number of words, which is powerful when strings are long and share prefixes.

Use Cases

Tries are a good fit for:

  • Prefix queries:
    • Autocomplete, suggestions for search.
    • Check if any word starts with a given prefix.
  • Dictionary problems:
    • Word search in a grid.
    • Matching with wildcards or patterns.
  • String sets with many shared prefixes:
    • DNS or URL routing (in conceptual variants).

In such cases, tries can be more efficient or more expressive than hash tables.

Trade‑offs

Pros:

  • Fast prefix queries (O(length)).
  • Naturally groups words by prefix.
  • Useful for problems where ordering/lexicographical traversal matters.

Cons:

  • Higher memory usage than hash sets for small or random word sets.
  • Constant factors can be larger (pointers/maps per node).

Trie variants (compressed tries, DAWGs, radix trees) address some of these downsides, especially in production systems.

Example: Basic Trie Implementation (Java)

Java
class TrieNode {
    Map<Character, TrieNode> children = new HashMap<>();
    boolean isWord = false;
}

class Trie {
    private final TrieNode root = new TrieNode();

    public void insert(String word) {
        TrieNode node = root;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new TrieNode());
        }
        node.isWord = true;
    }

    public boolean search(String word) {
        TrieNode node = findNode(word);
        return node != null && node.isWord;
    }

    public boolean startsWith(String prefix) {
        return findNode(prefix) != null;
    }

    private TrieNode findNode(String s) {
        TrieNode node = root;
        for (char c : s.toCharArray()) {
            node = node.children.get(c);
            if (node == null) return null;
        }
        return node;
    }
}

This code follows the conceptual structure described above: characters along a path form prefixes, and isWord marks complete words.