Cache & Bloom Filters

NoKV’s LSM tier layers a multi-level block cache with bloom filter caching to accelerate lookups. The implementation is in lsm/cache.go.


1. Components

| Component | Purpose | Source |
|---|---|---|
| cache.indexs | Table index cache (fid → *pb.TableIndex) reused across reopen. | utils/cache |
| blockCache | Ristretto-based block cache (L0/L1 only) with per-table direct slots. | lsm/cache.go |
| bloomCache | LRU cache of bloom filter bitsets per SST. | lsm/cache.go |
| cacheMetrics | Atomic hit/miss counters for L0/L1 blocks and blooms. | lsm/cache.go#L30-L110 |

Badger uses a similar block cache split (Pinner/Cache) while RocksDB exposes block cache(s) via the BlockBasedTableOptions. NoKV keeps it Go-native and GC-friendly.


1.1 Index Cache & Handles

  • SSTable metadata stays with the table struct, while decoded protobuf indexes are stored in cache.indexs. Lookups first hit the cache before falling back to disk.
  • SST handles are reopened on demand for lower levels. L0/L1 tables keep their file descriptors pinned, while deeper levels close them once no iterator is using the table.
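The lookup order above can be sketched as follows. This is a minimal stand-in, not the real `cache.indexs` implementation: `TableIndex`, `indexCache`, and the `decode` callback are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// TableIndex stands in for the decoded *pb.TableIndex protobuf.
type TableIndex struct{ MaxKey string }

// indexCache sketches cache.indexs: decoded indexes are kept per file ID
// so reopening a table does not re-decode the protobuf from disk.
type indexCache struct {
	decoded map[uint64]*TableIndex
	decodes int // counts decode calls, for illustration only
}

func (c *indexCache) get(fid uint64, decode func() *TableIndex) *TableIndex {
	if idx, ok := c.decoded[fid]; ok {
		return idx // hit: reuse the decoded index
	}
	idx := decode() // miss: decode from the SST on disk
	c.decoded[fid] = idx
	c.decodes++
	return idx
}

func main() {
	c := &indexCache{decoded: map[uint64]*TableIndex{}}
	decode := func() *TableIndex { return &TableIndex{MaxKey: "zzz"} }
	c.get(7, decode)
	c.get(7, decode) // second lookup is served from the cache
	fmt.Println("decodes:", c.decodes)
}
```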

2. Block Cache Strategy

The strategy splits caching in two: a user-space block cache for L0/L1 (parsed blocks, Ristretto's approximately-LFU policy), while deeper levels rely on the OS page cache plus mmap readahead.
  • Options.BlockCacheSize sets capacity in blocks (cost=1 per block). Entries keep parsed blocks (data slice + offsets/baseKey/checksum), so hits avoid re-parsing.
  • Per-table direct slots (table.cacheSlots[idx]) give a lock-free fast path. Misses fall back to the shared Ristretto cache (approx LFU with admission).
  • Evictions clear the table slot via OnEvict; user-space cache only tracks L0/L1 blocks. Deeper levels depend on the OS page cache.
  • Access patterns: getBlock also updates hit/miss metrics for L0/L1; deeper levels bypass the cache and do not affect metrics.
flowchart LR
  Read --> CheckCache
  CheckCache -->|hit| Return
  CheckCache -->|miss| LoadFromTable["LoadFromTable (mmap + OS page cache)"]
  LoadFromTable --> InsertCache
  InsertCache --> Return

By default only L0 and L1 blocks are cached (level > 1 short-circuits), reflecting the higher re-use for top levels.
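The lookup order (level short-circuit, per-table slot, shared cache) can be sketched as below. This is a dependency-free sketch, not the actual `getBlock`: a locked map stands in for Ristretto, and the names `table`, `block`, and the `load` callback are illustrative assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// block stands in for a parsed SST block (data + offsets already decoded).
type block struct{ id string }

// table models the per-table direct slots: slot indices map 1:1 to block
// indices, so a hit needs no hashing and no lock on this fast path.
type table struct {
	level int
	slots []*block
}

// shared stands in for the Ristretto cache keyed by (fid, blockIdx);
// a plain locked map keeps this sketch dependency-free.
var shared = struct {
	sync.Mutex
	m map[string]*block
}{m: map[string]*block{}}

// getBlock mirrors the documented order: deeper levels bypass user-space
// caching entirely and rely on mmap + the OS page cache.
func getBlock(t *table, idx int, key string, load func() *block) *block {
	if t.level > 1 {
		return load() // L2+ short-circuit: OS page cache only
	}
	if b := t.slots[idx]; b != nil {
		return b // lock-free per-table fast path
	}
	shared.Lock()
	b, ok := shared.m[key]
	shared.Unlock()
	if !ok {
		b = load() // miss: parse the block from the mmap'd file
		shared.Lock()
		shared.m[key] = b
		shared.Unlock()
	}
	t.slots[idx] = b // repopulate the direct slot for the next read
	return b
}

func main() {
	loads := 0
	load := func() *block { loads++; return &block{id: "b0"} }
	t := &table{level: 0, slots: make([]*block, 4)}
	getBlock(t, 0, "fid1/0", load)
	getBlock(t, 0, "fid1/0", load) // served from the direct slot
	fmt.Println("loads:", loads)
}
```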


3. Bloom Cache

  • bloomCache stores the raw filter bitset (utils.Filter) per table ID. Entries are deep-copied (SafeCopy) to avoid sharing memory with mmaps.
  • Cache policy is LRU.
  • Capacity is controlled by Options.BloomCacheSize.
  • Bloom hits/misses are recorded via cacheMetrics.recordBloom, feeding into StatsSnapshot.Cache.BloomHitRate.
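An LRU of deep-copied bitsets can be sketched with the standard library alone. Assume the hypothetical names `bloomLRU` and `safeCopy` here; the real code uses `utils.Filter` and `SafeCopy` as described above.

```go
package main

import (
	"container/list"
	"fmt"
)

// Filter stands in for utils.Filter (a raw bloom bitset).
type Filter []byte

// safeCopy mirrors SafeCopy: the cached bitset must not alias the table's
// mmap'd region, or closing the file could invalidate it under a reader.
func safeCopy(src Filter) Filter {
	dst := make(Filter, len(src))
	copy(dst, src)
	return dst
}

// bloomLRU is a minimal LRU keyed by table ID, capacity in entries.
type bloomLRU struct {
	cap   int
	order *list.List               // front = most recently used
	items map[uint64]*list.Element // table ID -> list node
}

type bloomEntry struct {
	fid    uint64
	filter Filter
}

func newBloomLRU(cap int) *bloomLRU {
	return &bloomLRU{cap: cap, order: list.New(), items: map[uint64]*list.Element{}}
}

func (c *bloomLRU) get(fid uint64) (Filter, bool) {
	if el, ok := c.items[fid]; ok {
		c.order.MoveToFront(el)
		return el.Value.(*bloomEntry).filter, true
	}
	return nil, false
}

func (c *bloomLRU) put(fid uint64, f Filter) {
	if el, ok := c.items[fid]; ok {
		c.order.MoveToFront(el)
		el.Value.(*bloomEntry).filter = safeCopy(f)
		return
	}
	if c.order.Len() >= c.cap { // evict the least recently used entry
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*bloomEntry).fid)
	}
	c.items[fid] = c.order.PushFront(&bloomEntry{fid: fid, filter: safeCopy(f)})
}

func main() {
	c := newBloomLRU(2)
	c.put(1, Filter{0xAA})
	c.put(2, Filter{0xBB})
	c.put(3, Filter{0xCC}) // evicts table 1
	_, ok := c.get(1)
	fmt.Println("table 1 cached:", ok)
}
```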

4. Metrics & Observability

cache.metricsSnapshot() produces:

type CacheMetrics struct {
    L0Hits, L0Misses uint64
    L1Hits, L1Misses uint64
    BloomHits, BloomMisses uint64
    IndexHits, IndexMisses uint64
}

Stats.Snapshot converts these into hit rates. Monitor them alongside the block cache sizes to decide when to scale memory.
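The counter-to-rate conversion is simple; a sketch (the `hitRate` helper is a hypothetical name, not necessarily the function `Stats.Snapshot` uses) looks like this:

```go
package main

import "fmt"

// CacheMetrics mirrors the struct shown above.
type CacheMetrics struct {
	L0Hits, L0Misses       uint64
	L1Hits, L1Misses       uint64
	BloomHits, BloomMisses uint64
	IndexHits, IndexMisses uint64
}

// hitRate converts one counter pair into a rate; a zero-traffic pair
// reports 0 rather than NaN.
func hitRate(hits, misses uint64) float64 {
	total := hits + misses
	if total == 0 {
		return 0
	}
	return float64(hits) / float64(total)
}

func main() {
	m := CacheMetrics{L0Hits: 900, L0Misses: 100, BloomHits: 60, BloomMisses: 40}
	fmt.Printf("l0=%.2f bloom=%.2f\n",
		hitRate(m.L0Hits, m.L0Misses),
		hitRate(m.BloomHits, m.BloomMisses))
}
```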


5. HotRing Integration

  • Hot detection: HotRing counts on read/write paths and triggers targeted prefetch for hot keys.
  • Cache warmup: prefetch loads target blocks into the normal L0/L1 block cache path.
  • Compaction coupling: HotRing top-k feeds compaction scoring; levels/ingest shards covering hot ranges get higher scores to trim overlap sooner.
  • Tuning: Hot thresholds come from HotRing options (window/decay configurable).
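The detection-plus-prefetch loop can be illustrated with a toy counter. This is not HotRing's ring-based structure: `hotTracker`, its `threshold`, and the `decay` halving are illustrative stand-ins for the window/decay options mentioned above.

```go
package main

import "fmt"

// hotTracker is a toy stand-in for HotRing's per-key counters: when a key's
// count inside the current window crosses the threshold, a prefetch fires
// so its blocks enter the L0/L1 block cache ahead of future reads.
type hotTracker struct {
	threshold int
	counts    map[string]int
	prefetch  func(key string)
}

func (h *hotTracker) touch(key string) {
	h.counts[key]++
	if h.counts[key] == h.threshold {
		h.prefetch(key) // fire once per window when the key turns hot
	}
}

// decay models the window rollover: halving counts ages out cold keys.
func (h *hotTracker) decay() {
	for k, v := range h.counts {
		if v /= 2; v == 0 {
			delete(h.counts, k)
		} else {
			h.counts[k] = v
		}
	}
}

func main() {
	var hot []string
	h := &hotTracker{threshold: 3, counts: map[string]int{},
		prefetch: func(k string) { hot = append(hot, k) }}
	for i := 0; i < 3; i++ {
		h.touch("user:42") // crosses the threshold on the third read
	}
	h.touch("user:7") // stays cold
	fmt.Println("prefetched:", hot)
}
```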

6. Interaction with Value Log

  • Keys stored as value pointers (large values) still populate block cache entries for the key/index block. The value payload is read directly from the vlog (valueLog.read), so block cache hit rates remain meaningful.
  • Discard stats from flushes can demote cached blocks via cache.dropBlock, ensuring obsolete SST data leaves the cache quickly.
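The indirection in the first bullet can be sketched as follows; `entry`, `valuePtr`, and the `vlogRead` callback are hypothetical names, not NoKV's actual types.

```go
package main

import "fmt"

// valuePtr models what an L0/L1 block stores for a large value: not the
// payload itself but a pointer into the value log.
type valuePtr struct {
	fid    uint32
	offset uint64
}

type entry struct {
	inline []byte    // small values live directly in the block
	vptr   *valuePtr // large values are indirected through the vlog
}

// readValue shows why block-cache hit rates stay meaningful for vlog-backed
// keys: the block lookup (cached) resolves the pointer, and only the payload
// read goes to the value log file.
func readValue(e entry, vlogRead func(valuePtr) []byte) []byte {
	if e.vptr == nil {
		return e.inline
	}
	return vlogRead(*e.vptr)
}

func main() {
	vlog := map[valuePtr][]byte{{fid: 3, offset: 128}: []byte("big-payload")}
	e := entry{vptr: &valuePtr{fid: 3, offset: 128}}
	v := readValue(e, func(p valuePtr) []byte { return vlog[p] })
	fmt.Println(string(v))
}
```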

7. Comparison

| Feature | RocksDB | BadgerDB | NoKV |
|---|---|---|---|
| Block cache policy | Configurable multiple caches | Single cache | Ristretto for L0/L1 + OS page cache for deeper levels |
| Bloom cache | Enabled per table, no explicit cache | Optional | Dedicated LRU storing filters |
| Metrics | Block cache stats via GetAggregatedIntProperty | Limited | Stats.cache.* hit rates |

8. Operational Tips

  • If bloom hit rate falls below ~60%, consider increasing bits-per-key or Bloom cache size.
  • Track nokv stats --json cache metrics over time; drops often indicate iterator misuse or working-set shifts.

More on SST layout lives in docs/manifest.md and docs/architecture.md.