How It Works
Use the interactive visualization above to explore the ReSketch algorithm hands-on. Insert individual items or use Batch Mode to generate data streams. Query items to see frequency estimates via the median-of-rows approach. Use the Expand and Shrink buttons to dynamically resize the sketch and watch how items are remapped across buckets using consistent hashing.
Algorithm Breakdown
ReSketch is a frequency estimation data structure designed for distributed and parallel environments. Unlike traditional sketches (e.g., Count-Min Sketch), ReSketch supports structural operations, i.e., expand, shrink, merge, and split, that allow the sketch to adapt its memory footprint at runtime without losing accuracy.
Consistent Hashing Ring
Each row of the sketch uses a consistent hashing ring, a sorted list of random hash points mapped to bucket IDs. When an item is inserted, its hash is placed on the ring and assigned to the nearest bucket. This design enables seamless resizing: adding or removing ring points only redistributes a fraction of items.
KLL Quantile Sketch
Each bucket contains a KLL sketch (a compactor-based quantile summary). Instead of storing raw counts, the KLL tracks the distribution of hash values that land in the bucket. To estimate an item's frequency, we query the KLL for how many times that specific hash value appears.
Multi-Row Median Estimation
The sketch maintains multiple independent rows (depth), each with its own hash function and ring. The final frequency estimate is the median across all rows, which provides robustness against hash collisions, which is similar to how Count-Min Sketch uses the minimum across rows.
Expand & Shrink
Expand adds new hash points to the ring, creating new buckets. Items from existing buckets are redistributed to the new buckets using the KLL's range-query capability. Shrink removes hash points and merges the affected items back into neighboring buckets. Both operations preserve the statistical properties of the KLL sketches.
Structural Operations
Expand
When expanding from width w to w':
- New random hash points are added to each ring
- For each new bucket boundary, the KLL sketch of the overlapping old bucket is split
- Items in the split range are redistributed to the new bucket
- The ring is re-sorted to maintain the consistent hashing invariant
Shrink
When shrinking from width w to w':
- Hash points are randomly selected for removal from each ring
- Items from removed buckets are merged into their neighboring buckets
- KLL sketches are merged to combine the frequency information
- Bucket IDs are reassigned sequentially
Implementation Notes
This visualization uses MurmurHash3 (32-bit) for all hashing. The consistent hashing ring operates in the 32-bit hash space. The KLL sketch uses a compaction factor of c = 2/3 and supports weighted item insertion for accurate redistribution during structural operations.