Core Concepts

Tenant Isolation

Every tenant in Remem gets hard isolation across all storage layers:
Layer        Isolation Method
PostgreSQL   Row-Level Security (RLS) policies
Qdrant       Dedicated collection per tenant
S3           Tenant-prefixed object keys
Encryption   Per-tenant Data Encryption Key (DEK)
There is no shared vector space — a query for Tenant A can never return results from Tenant B.
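As a minimal sketch of how physical scoping like this can work (the helper names here are hypothetical, not Remem's actual internals), each tenant maps to its own Qdrant collection name and S3 key prefix, so a query or object lookup is scoped before any data is touched:

```python
# Illustrative per-tenant resource naming (hypothetical helpers, not
# Remem's real code): isolation comes from addressing, not filtering.

def qdrant_collection(tenant_id: str) -> str:
    # Dedicated collection per tenant -- no shared vector space.
    return f"tenant_{tenant_id}"

def s3_key(tenant_id: str, doc_id: str) -> str:
    # Tenant-prefixed object keys isolate blobs within a shared bucket.
    return f"{tenant_id}/documents/{doc_id}"

print(qdrant_collection("acme"))  # tenant_acme
print(s3_key("acme", "doc-42"))   # acme/documents/doc-42
```

Because Tenant A's queries only ever address `tenant_A`'s collection, there is no query-time filter to get wrong.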

Encryption Model

Remem uses envelope encryption with application-level field encryption:
KEK (Key Encryption Key)
 └── encrypts → DEK (Data Encryption Key, per tenant)
      └── encrypts → Document content, titles, metadata
  • Content, titles, and metadata are stored as ciphertext in PostgreSQL
  • S3 objects use client-side encryption before upload
  • Qdrant vectors contain only embeddings and minimal non-sensitive metadata
  • Logs are scrubbed to prevent sensitive data leakage

Crypto-Shredding

When a tenant requests data deletion, Remem can destroy the DEK, making all encrypted data permanently unrecoverable — even if database backups are retained.
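The key hierarchy above can be illustrated with a toy sketch. This uses XOR as a stand-in cipher purely to keep the example self-contained (it is not real cryptography; a production system would use an AEAD cipher such as AES-GCM). The point is the structure: the KEK wraps the DEK, the DEK encrypts data, and destroying the DEK orphans every ciphertext derived from it.

```python
import secrets

# Toy envelope-encryption sketch. XOR is NOT a real cipher -- it only
# demonstrates the KEK -> DEK -> data hierarchy and crypto-shredding.

def xor(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

kek = secrets.token_bytes(32)     # Key Encryption Key (long-lived)
dek = secrets.token_bytes(32)     # per-tenant Data Encryption Key
wrapped_dek = xor(dek, kek)       # only the wrapped DEK is persisted

ciphertext = xor(b"Q1 planning notes", dek)

# Normal read path: unwrap the DEK with the KEK, then decrypt.
assert xor(ciphertext, xor(wrapped_dek, kek)) == b"Q1 planning notes"

# Crypto-shredding: destroy the wrapped DEK. The ciphertext may survive
# in backups, but without the DEK it is permanently unrecoverable.
wrapped_dek = None
```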

Processing Pipeline

When a document is ingested, it flows through an async pipeline:
Ingest API
  → Redis Streams (job queue)
    → Worker picks up job
      1. Encrypt content with tenant DEK
      2. Store encrypted content in S3
      3. Chunk text into segments
      4. Generate embeddings (voyage-3.5-lite)
      5. Classify & extract entities (Grok 4.1 Fast via xAI API)
      6. Index chunks in Qdrant
      7. Store metadata in PostgreSQL
The API returns a job_id immediately. Processing typically completes within seconds.
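The worker's job handler can be sketched as follows. This is a simulation with stand-in helpers (all names are hypothetical; the real stages call S3, Qdrant, PostgreSQL, and the embedding and classification APIs), showing the shape of the chunk-then-embed-then-index flow:

```python
import hashlib

# Minimal simulation of the ingest worker (hypothetical helpers, not
# Remem's real pipeline code).

def chunk_text(text: str, size: int = 20) -> list[str]:
    # Step 3: split text into fixed-size segments (real chunking is smarter).
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(chunk: str) -> list[float]:
    # Step 4 stand-in for voyage-3.5-lite: derive a tiny fake vector.
    digest = hashlib.sha256(chunk.encode()).digest()
    return [b / 255 for b in digest[:4]]

def handle_job(job: dict) -> dict:
    chunks = chunk_text(job["content"])
    # Step 6 stand-in: in production these points go to the tenant's
    # Qdrant collection; here we just build the payload.
    index = [{"chunk": c, "vector": embed(c)} for c in chunks]
    return {"job_id": job["job_id"], "status": "completed",
            "chunks": len(index)}

result = handle_job({"job_id": "j-1", "tenant_id": "acme",
                     "content": "Q1 priorities: ship tenant isolation."})
print(result)
```

In production the same handler would also encrypt with the tenant DEK, upload to S3, classify via Grok 4.1 Fast, and write metadata rows before acknowledging the Redis Streams entry.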

Query Modes

Fast Mode (mode: "fast")

  • Latency: target <500ms; typically 200-600ms in practice
  • Method: Hybrid search combining vector similarity (Qdrant) and BM25 keyword matching
  • Results: Ranked chunks with Reciprocal Rank Fusion (RRF)
  • No LLM call — pure retrieval
{"query": "Q1 priorities", "mode": "fast"}
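Reciprocal Rank Fusion itself is simple enough to show in full. Each document scores the sum of 1/(k + rank) over every result list it appears in; k = 60 is the commonly used default (whether Remem uses that exact constant is an assumption):

```python
# Reciprocal Rank Fusion: merge ranked lists without comparing raw scores.

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["chunk_a", "chunk_b", "chunk_c"]  # Qdrant similarity order
bm25_hits   = ["chunk_b", "chunk_d", "chunk_a"]  # keyword-match order

print(rrf([vector_hits, bm25_hits]))
# ['chunk_b', 'chunk_a', 'chunk_d', 'chunk_c']
```

Because RRF works on ranks rather than raw scores, the vector and BM25 lists can be fused without normalizing their incompatible score scales.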

Rich Mode (mode: "rich")

  • Latency: <5s cold, <3s cached
  • Method: Query expansion → parallel retrieval → RRF fusion → LLM reranking → LLM synthesis (all using Grok 4.1 Fast via xAI API)
  • Results: Reranked chunks plus optional natural language synthesis field
  • Budget-aware: Skips reranking/synthesis if time budget exhausted
  • Caching: Expansion and rerank results cached in Redis for 15 minutes
{"query": "Q1 priorities", "mode": "rich", "synthesize": true}
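The budget-aware behavior can be sketched as a deadline check before each expensive stage (the structure and thresholds here are illustrative assumptions, not Remem's actual implementation):

```python
import time

# Budget-aware stage skipping: LLM stages run only if time remains.

def rich_query(query: str, budget_s: float = 5.0) -> dict:
    deadline = time.monotonic() + budget_s
    stages: list[str] = []

    def remaining() -> float:
        return deadline - time.monotonic()

    stages.append("expand")    # query expansion (result cached 15 min)
    stages.append("retrieve")  # parallel retrieval + RRF fusion
    if remaining() > 1.0:      # skip LLM reranking if budget is exhausted
        stages.append("rerank")
    if remaining() > 1.0:      # likewise for synthesis
        stages.append("synthesize")
    return {"query": query, "stages": stages}

print(rich_query("Q1 priorities"))
```

Degrading to retrieval-only output keeps the latency bound even when the LLM stages would blow the budget.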

Filtering

Query results can be filtered by metadata assigned during classification:
Filter                 Type              Example
categories             List[str]         ["meeting_notes", "planning"]
tags_any               List[str]         ["q1", "priorities"]
tags_all               List[str]         ["urgent", "approved"]
tags_prefix            str               "project-"
sensitivity            List[str]         ["public", "internal"]
date_from / date_to    str (ISO 8601)    "2026-01-01"
source_types           List[str]         ["email", "note"]
storage_types          List[str]         ["document", "chat_history"]
languages              List[str]         ["en", "fr"]
has_extractable_data   bool              true
classifier_models      List[str]         ["grok-4.1-fast"]
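Filters can be combined in a single query. As an illustration (the exact nesting under a filters key is an assumption based on the query payloads shown above):

```json
{"query": "Q1 priorities", "mode": "fast",
 "filters": {"categories": ["meeting_notes"],
             "tags_any": ["q1"],
             "date_from": "2026-01-01",
             "sensitivity": ["internal"]}}
```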