How to Use a Reverse Trie for Fast Disposable Email Domain Detection

Learn how to use a reverse Trie to efficiently detect disposable email domains. Optimize your domain lookups with a scalable, memory-efficient solution tailored for fast and precise results.


Written on December 5, 2022.

Disposable emails can cause issues like fake signups and spam. The user grabs an address from one of thousands of temporary email generators and hands it over. Not even the GOAT email regex can save you here.

Personally, I find that keeping a big list of all disposable email domains is the simplest and most effective solution. But before you assemble that list and start a for ... of loop to check against it, think of the O(n) complexity: every check may scan the whole list!

A great way to check a domain against that list is a reverse Trie, an efficient data structure for fast lookups.

What Is a Reverse Trie?

First, let's grasp what a Trie is. It is a data structure where strings are:

  • chopped up, char per char
  • assembled in a tree structure

For example, if we feed it boa, bro, and brie, it assembles them (using Map) as:

b
 ├── o ── a
 └── r
      ├── o
      └── i ── e

This approach allows direct lookups without cycling through the entire list. Each character guides the search deeper.

It trades memory for speed. The time it takes to find a string depends not on the size of the list, but on the length of the string!
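To make the idea concrete, here is a minimal sketch of a plain (forward) Trie built on Map. The names insertWord, hasWord, and the WORD_END marker are assumptions made for this example and are not part of the implementation shown later in this post.

```typescript
// Hypothetical marker for "a word ends here". It is longer than one
// character, so it can never collide with a single-character key.
const WORD_END = "$end";

// Each node maps one character to the node that follows it.
type CharNode = Map<string, CharNode>;

function insertWord(root: CharNode, word: string): void {
  let node = root;
  for (const char of word) {
    let child = node.get(char);
    if (!child) {
      child = new Map();
      node.set(char, child);
    }
    node = child;
  }
  // Mark the end so that prefixes are not mistaken for stored words.
  node.set(WORD_END, new Map());
}

function hasWord(root: CharNode, word: string): boolean {
  let node = root;
  // One Map lookup per character: the cost follows the word length,
  // not the number of stored words.
  for (const char of word) {
    const child = node.get(char);
    if (!child) return false;
    node = child;
  }
  return node.has(WORD_END);
}

const wordRoot: CharNode = new Map();
for (const word of ["boa", "bro", "brie"]) insertWord(wordRoot, word);

console.log(hasWord(wordRoot, "brie")); // true
console.log(hasWord(wordRoot, "br"));   // false: only a prefix
console.log(hasWord(wordRoot, "bat"));  // false
```

Looking up brie takes four Map lookups whether the Trie holds three words or three million.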

A reverse Trie stores strings in reverse order, ideal for domains:

  • mailinator.com becomes moc.rotaniliam
  • trashmail.com becomes moc.liamhsart
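As a quick illustration, the reversal can be sketched in one line. The helper name reverseDomain is an assumption made for this example; the implementation further down never builds the reversed string and simply walks the original one from its last index to its first, which visits the characters in the same order.

```typescript
// Hypothetical helper, for illustration only.
function reverseDomain(domain: string): string {
  return domain.split("").reverse().join("");
}

console.log(reverseDomain("mailinator.com")); // "moc.rotaniliam"
console.log(reverseDomain("trashmail.com"));  // "moc.liamhsart"

// Walking backwards by index yields the same order without allocating
// a reversed copy of the string.
function charsInReverse(domain: string): string[] {
  const chars: string[] = [];
  for (let i = domain.length - 1; i >= 0; i--) {
    chars.push(domain[i]);
  }
  return chars;
}

console.log(charsInReverse("a.io").join("")); // "oi.a"
```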

Note on This Implementation

By reversing domains, searches start at the TLD (e.g., .com), which is shared across many domains. To optimize further, it stores TLDs as a single key (com), rather than splitting them into characters. The rest of the domain follows a standard Trie structure.

Reverse Trie Domains Implementation

Since this is a tree structure, each node will reference its children:

type TrieNode = Map<string, TrieNode>;

First, a utility function to split the TLD from the rest of the domain:

private splitTLDFromRest(input: string) {
    const dot = input.lastIndexOf('.');
    const TLD = input.substring(dot + 1);
    const rest = input.substring(0, dot);
    return [TLD, rest];
}

Using lastIndexOf ensures subdomains like foo.bar.baz.com are handled correctly.
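To see what the split returns, here is the same logic as a standalone function (the class method above is private, so this free-standing copy exists only for this example). Note that only the last label counts as the TLD, so a multi-part suffix such as co.uk is split after uk.

```typescript
// Standalone copy of the splitting logic, for experimentation only.
function splitTLDFromRest(input: string): [string, string] {
  const dot = input.lastIndexOf(".");
  const TLD = input.substring(dot + 1);
  const rest = input.substring(0, dot);
  return [TLD, rest];
}

console.log(splitTLDFromRest("didof.dev"));       // ["dev", "didof"]
console.log(splitTLDFromRest("foo.bar.baz.com")); // ["com", "foo.bar.baz"]
console.log(splitTLDFromRest("example.co.uk"));   // ["uk", "example.co"]

// Without any dot, lastIndexOf returns -1: the whole input becomes the
// "TLD" and the rest is empty.
console.log(splitTLDFromRest("localhost"));       // ["localhost", ""]
```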

Next, the constructor will assemble the Trie:

export class ReverseTrieDomains {
	private root: TrieNode = new Map();
 
	// ...
 
	constructor(...domains: string[]) {
		for (const domain of domains) {
			// For "didof.dev"
			const [TLD, rest] = this.splitTLDFromRest(domain);
			// dev, didof
 
			// Keep the reference to the TLD node for the final set
			let node = this.root.get(TLD);
			if (!node) node = new Map();
 
			// Start from TLD node, walk along the string in reverse
			let currentNode: TrieNode = node;
			for (let i = rest.length - 1; i >= 0; i--) {
				const char = rest[i];
				let childNode = currentNode.get(char);
				if (!childNode) {
					childNode = new Map();
					currentNode.set(char, childNode);
				}
				currentNode = childNode;
			}
 
			this.root.set(TLD, node);
		}
	}
}

To check if a domain is disposable, traverse the Trie:

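export class ReverseTrieDomains {
	// ...

	public has(domain: string): boolean {
		const [TLD, rest] = this.splitTLDFromRest(domain);

		// No domain with this TLD was ever stored
		const node = this.root.get(TLD);
		if (!node) return false;

		// Walk the rest of the domain from its last character to its first
		let currentNode: TrieNode = node;
		let isFullDomainFound = false;
		for (let i = rest.length - 1; i >= 0; i--) {
			const char = rest[i];
			const childNode = currentNode.get(char);
			if (!childNode) return false;
			currentNode = childNode;
			if (i === 0) {
				// A match requires ending on a leaf. Caveat: a stored domain
				// that is also the tail of a longer stored domain (mail.com
				// next to trashmail.com) does not end on a leaf and is missed.
				isFullDomainFound = currentNode.size === 0;
			}
		}

		return isFullDomainFound;
	}
}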
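One limitation of the leaf check (size === 0) is worth knowing: if the list contains both mail.com and trashmail.com, the node where mail.com ends has a child, so has("mail.com") returns false. A possible variation, sketched below, stores an explicit end marker instead. The class name MarkedReverseTrieDomains, the END key, and the add method are assumptions made for this sketch and are not part of the original implementation.

```typescript
// Hypothetical end-of-domain marker. It is longer than one character,
// so it cannot collide with a per-character key.
const END = "$end";
type MarkedNode = Map<string, MarkedNode>;

class MarkedReverseTrieDomains {
  private root: MarkedNode = new Map();

  constructor(...domains: string[]) {
    for (const domain of domains) this.add(domain);
  }

  private splitTLDFromRest(input: string): [string, string] {
    const dot = input.lastIndexOf(".");
    return [input.substring(dot + 1), input.substring(0, dot)];
  }

  private add(domain: string): void {
    const [TLD, rest] = this.splitTLDFromRest(domain);

    // The TLD is stored as a single key, as in the original design.
    let node = this.root.get(TLD);
    if (!node) {
      node = new Map();
      this.root.set(TLD, node);
    }

    // Walk the rest of the domain from its last character to its first.
    for (let i = rest.length - 1; i >= 0; i--) {
      const char = rest[i];
      let child = node.get(char);
      if (!child) {
        child = new Map();
        node.set(char, child);
      }
      node = child;
    }

    // Mark that a complete domain ends at this node.
    node.set(END, new Map());
  }

  public has(domain: string): boolean {
    const [TLD, rest] = this.splitTLDFromRest(domain);

    const tldNode = this.root.get(TLD);
    if (!tldNode) return false;

    let node: MarkedNode = tldNode;
    for (let i = rest.length - 1; i >= 0; i--) {
      const child = node.get(rest[i]);
      if (!child) return false;
      node = child;
    }

    // Found only if a stored domain ends exactly here.
    return node.has(END);
  }
}

const disposable = new MarkedReverseTrieDomains("mail.com", "trashmail.com");
console.log(disposable.has("mail.com"));      // true
console.log(disposable.has("trashmail.com")); // true
console.log(disposable.has("gmail.com"));     // false
console.log(disposable.has("ail.com"));       // false
```

The marker costs one extra Map entry per stored domain, in exchange for lookups that no longer depend on which other domains happen to be in the list.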

Conclusion

Using a reverse Trie offers several benefits:

  • Fast Lookups: The cost of a lookup depends on the length of the domain, not on the size of the list.
  • Shared Suffixes: Common suffixes like .com are stored only once.
  • Scalability: Adding more domains grows the tree but does not slow down individual lookups.

If you’re dealing with disposable emails, this is a smart, scalable solution to implement.