Vinh Ngo - PhD Portfolio

How It Works

Use the interactive visualization above to explore Bloom filters hands-on. Switch between Insert and Query modes, type or generate random elements, and watch how the bit array fills up. Observe how false positive probability climbs as the bit array becomes saturated.

Algorithm Breakdown

A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that answers a simple question: "Is this element in the set?" It trades perfect accuracy for memory efficiency. A Bloom filter can represent a set of millions of items in just a few kilobytes.

The key insight is that it allows false positives (saying "maybe" when the answer should be "no") but never false negatives (if it says "no", the element is definitely not in the set). This asymmetry makes it ideal for use as a fast pre-filter before expensive lookups.

Insertion

To add an element, feed it to k independent hash functions. Each hash function maps the element to one of m positions in the bit array. Set all k corresponding bits to 1. This is an O(k) operation — constant time regardless of how many elements are already in the filter.

Query

To query for an element, feed it to the same k hash functions. If any of the resulting bit positions is 0, the element is definitely not in the set. If all positions are 1, the element is probably in the set — but could be a false positive caused by hash collisions from other inserted elements.

False Positives

As more elements are inserted, more bits are set to 1, increasing the chance that a query for a non-member element will find all its hash positions already set. This is the false positive phenomenon. The visualization above shows the false positive probability in real time — watch it climb as the filter fills up.

Bloom Filter

Technical Implementation

Core Logic

How It Works

Algorithm Breakdown