Data Lifecycle
Remem is designed for privacy-first data management with comprehensive controls for versioning, retention, deletion, and regulatory compliance.
Overview
Encrypted at Rest
All content encrypted with per-tenant keys using envelope encryption
Version Control
Documents can be versioned while preserving originals
Flexible Deletion
Soft delete (recoverable) or hard delete (permanent)
GDPR/CCPA Ready
Full data export and right-to-erasure support
Document Versioning
When you update a document, Remem creates a new version while preserving the original.
Creating a New Version
- Original document is marked as superseded
- New document is created with incremented version number
- New chunks are generated and indexed
- New embeddings are created with the same embedding version
- Old chunks/vectors remain queryable until cleanup
Query results return the latest version by default. Previous versions are excluded from search results but remain in storage.
Version Metadata
Each version includes:
- version_number: Increments with each update (starts at 1)
- document_id: Unique ID for each version
- supersedes: ID of the previous version (if applicable)
- content_hash: SHA-256 hash to detect duplicate content
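The version fields above can be modeled as a small record. The following is a minimal sketch, not Remem's implementation: the `DocumentVersion` class and the `first_version` and `next_version` helpers are hypothetical names, and returning `None` for unchanged content is an assumed way of using `content_hash` to detect duplicates.

```python
import hashlib
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DocumentVersion:
    document_id: str           # unique ID for this version
    version_number: int        # starts at 1, increments with each update
    supersedes: Optional[str]  # ID of the previous version, if any
    content_hash: str          # SHA-256 hex digest of the content


def content_hash(content: str) -> str:
    """Hash the content so identical uploads can be recognized."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def first_version(content: str) -> DocumentVersion:
    return DocumentVersion(str(uuid.uuid4()), 1, None, content_hash(content))


def next_version(previous: DocumentVersion, content: str) -> Optional[DocumentVersion]:
    """Return the superseding version, or None when the content is unchanged."""
    new_hash = content_hash(content)
    if new_hash == previous.content_hash:
        return None  # assumed behavior: duplicate content creates no new version
    return DocumentVersion(
        document_id=str(uuid.uuid4()),
        version_number=previous.version_number + 1,
        supersedes=previous.document_id,
        content_hash=new_hash,
    )


v1 = first_version("Invoice total: 100 EUR")
v2 = next_version(v1, "Invoice total: 120 EUR")
```

Here `v2.supersedes` points back at `v1.document_id`, which is how a chain of versions can be walked from newest to oldest.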
Document Deletion
Remem supports two deletion modes: soft delete (recoverable) and hard delete (permanent).
Soft Delete
Marks a document as deleted without removing data from storage.
- Document is excluded from search results immediately
- Data remains in storage (PostgreSQL, Qdrant, S3)
- Recoverable by admin if needed
- Does not count toward tenant storage quotas
Soft-deleted documents can be restored by support staff within 30 days. After 30 days, they are automatically hard-deleted.
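The soft-delete rules can be expressed as a deletion timestamp plus a 30-day window. This is an illustrative sketch only: the `Document` class, the `deleted_at` field, and the helper names are assumptions, not Remem's schema or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

RESTORE_WINDOW = timedelta(days=30)  # restore period stated above


@dataclass
class Document:
    document_id: str
    deleted_at: Optional[datetime] = None  # None means the document is live


def soft_delete(doc: Document, now: datetime) -> None:
    """Mark the document as deleted; the stored data is left untouched."""
    doc.deleted_at = now


def is_searchable(doc: Document) -> bool:
    """Soft-deleted documents drop out of search results immediately."""
    return doc.deleted_at is None


def can_restore(doc: Document, now: datetime) -> bool:
    """A soft-deleted document is restorable until the window closes."""
    return doc.deleted_at is not None and now - doc.deleted_at < RESTORE_WINDOW


def restore(doc: Document, now: datetime) -> None:
    if not can_restore(doc, now):
        raise ValueError("document is not restorable")
    doc.deleted_at = None


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
doc = Document("doc-1")
soft_delete(doc, t0)
```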
Hard Delete
Permanently removes the document and all associated data:
- Document metadata (PostgreSQL)
- All chunks and embeddings (PostgreSQL + Qdrant)
- Raw files from S3 storage
- All extracted entities and structured data
- Cache entries (Redis)
Data Export
For GDPR Article 15 (right of access) and CCPA right to know, Remem provides a complete data export.
Exporting Your Data
What’s Included
The export archive contains:
Documents with Content
- All documents with decrypted content (original plaintext)
- Document titles and summaries
- Metadata (categories, tags, sensitivity levels)
- Source information (file type, original filename, upload date)
Classification Results
- AI-generated categories and tags
- Extracted entities (people, organizations, dates, amounts)
- Structured data extracted from documents (invoice amounts, dates, etc.)
- Confidence scores for classifications
Raw Uploaded Files
- Original files uploaded to Remem (PDFs, images, text files)
- Included by default (include_raw_files=true)
- Can be excluded to reduce archive size: ?include_raw_files=false
Audit Trail (if available)
- Access logs for your data (who queried, when)
- Modification history (document updates, deletions)
- API key usage
Export Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| include_raw_files | boolean | true | Include original uploaded files in the archive |
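As a rough illustration of the parameter above, the sketch below builds (but does not send) an export request with Python's standard library. The endpoint `GET /v1/dsar/export` and the `X-API-Key` header come from elsewhere on this page; the host `api.example.com`, the key value, and the `build_export_request` helper are placeholders.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host, not a documented value


def build_export_request(api_key: str, include_raw_files: bool = True) -> urllib.request.Request:
    """Build a GET request for the export endpoint without sending it."""
    query = urllib.parse.urlencode(
        {"include_raw_files": str(include_raw_files).lower()}
    )
    return urllib.request.Request(
        f"{BASE_URL}/v1/dsar/export?{query}",
        headers={"X-API-Key": api_key},
        method="GET",
    )


# Exclude raw files to keep the archive small.
req = build_export_request("vlt_example_key", include_raw_files=False)
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out so the sketch has no network dependency.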
Right to Erasure
For GDPR Article 17 and CCPA right to delete, Remem supports complete tenant data deletion.
Deleting All Data
| Field | Type | Required | Description |
|---|---|---|---|
| confirm | boolean | Yes | Must be true to proceed (safety check) |
| crypto_shred | boolean | No (default: true) | Destroy the encryption key, making all data unrecoverable |
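A request body using these two fields might be assembled as follows. This is a sketch under assumptions: the endpoint `POST /v1/dsar/delete` and the headers appear elsewhere on this page, while the host, the key value, and the `build_delete_request` helper are placeholders. The local `confirm` check is an extra client-side guard, not documented server behavior.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host, not a documented value


def build_delete_request(api_key: str, confirm: bool, crypto_shred: bool = True) -> urllib.request.Request:
    """Build (without sending) the tenant data deletion request."""
    if not confirm:
        # Refuse locally rather than send a request the safety check would reject.
        raise ValueError("confirm must be True to delete all tenant data")
    body = json.dumps({"confirm": True, "crypto_shred": crypto_shred}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/dsar/delete",
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


delete_req = build_delete_request("vlt_example_key", confirm=True)
```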
What Gets Deleted
PostgreSQL Records
All rows with your tenant ID:
- Documents, chunks, entities, extracted data
- API keys (revoked immediately)
- Tenant metadata (name, email)
Qdrant Vector Index
Your dedicated Qdrant collection is deleted, removing all embeddings and vector metadata.
Deletion Confirmation
Upon success, the API returns a confirmation of the deletion.
Tenant Deletion
Admin users can delete tenants via the tenant management API.
Soft Tenant Deletion (Default)
Marks the tenant as inactive without removing data.
- Tenant is marked inactive
- All API keys stop working immediately
- Data is preserved (can be restored by admin)
- Tenant is excluded from billing
Hard Tenant Deletion
Permanently removes all tenant data:
- All documents, chunks, and embeddings
- Qdrant collection
- S3 objects
- PostgreSQL records
- Encryption keys (DEK destroyed)
Encryption Model
Understanding how encryption works in Remem helps explain the permanence of crypto-shredding.
Envelope Encryption
Remem uses a two-tier key system:
- KEK (Key Encryption Key): Stored in AWS KMS (or customer’s KMS for enterprise). This is the master key.
- DEK (Data Encryption Key): One per tenant. Encrypted by the KEK and stored in PostgreSQL.
When storing a document, Remem:
- Decrypts your tenant’s DEK using the KEK
- Encrypts your document content with the DEK (AES-256-GCM)
- Stores the ciphertext in PostgreSQL/S3
- Clears the DEK from memory
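The two-tier scheme can be sketched with AES-256-GCM from the third-party `cryptography` package, which is assumed to be installed. This is an illustration, not Remem's code: the KEK here is a local key standing in for one held in a KMS, and the helper names and the nonce-prefixed ciphertext layout are choices made for this example.

```python
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NONCE_SIZE = 12  # bytes; the usual nonce length for AES-GCM

# Stand-in for the KEK. In the design described above the KEK stays inside a
# KMS; a local key is used here only so the sketch runs on its own.
kek = AESGCM.generate_key(bit_length=256)


def wrap_dek(kek: bytes, dek: bytes) -> bytes:
    """Encrypt the tenant DEK with the KEK (this is what would be stored)."""
    nonce = os.urandom(NONCE_SIZE)
    return nonce + AESGCM(kek).encrypt(nonce, dek, None)


def unwrap_dek(kek: bytes, wrapped: bytes) -> bytes:
    nonce, ciphertext = wrapped[:NONCE_SIZE], wrapped[NONCE_SIZE:]
    return AESGCM(kek).decrypt(nonce, ciphertext, None)


def encrypt_content(kek: bytes, wrapped_dek: bytes, plaintext: bytes) -> bytes:
    dek = unwrap_dek(kek, wrapped_dek)  # decrypt the DEK using the KEK
    try:
        nonce = os.urandom(NONCE_SIZE)
        return nonce + AESGCM(dek).encrypt(nonce, plaintext, None)
    finally:
        # Drops the reference only; Python cannot guarantee the bytes are zeroed.
        del dek


def decrypt_content(kek: bytes, wrapped_dek: bytes, blob: bytes) -> bytes:
    dek = unwrap_dek(kek, wrapped_dek)
    try:
        return AESGCM(dek).decrypt(blob[:NONCE_SIZE], blob[NONCE_SIZE:], None)
    finally:
        del dek


wrapped_dek = wrap_dek(kek, AESGCM.generate_key(bit_length=256))
blob = encrypt_content(kek, wrapped_dek, b"quarterly report")
```

Because content can only be decrypted through the wrapped DEK, destroying every copy of that DEK leaves `blob` unreadable. That is the property crypto-shredding relies on.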
What's Encrypted
Encrypted fields (ciphertext in database):
- Document content, titles, summaries
- Metadata (categories, tags, extracted data)
- Original filenames
- Chunk content
- Entity values (names, amounts, account numbers)
Unencrypted fields (plaintext in database):
- Document IDs, timestamps
- Source type (e.g., “pdf”, “email”)
- Sensitivity level (e.g., “confidential”)
- Language code (e.g., “en”)
- Embedding vectors (mathematical representations, not human-readable)
S3 Storage
Files uploaded to S3 are encrypted client-side before upload:
- Remem encrypts the file with your tenant’s DEK
- Only ciphertext is stored in S3
- Object keys use UUIDs only (no filenames)
- Original filenames are stored encrypted in PostgreSQL
Qdrant Vector Storage
Qdrant stores:
- Embeddings: 1024-dimensional float arrays (not human-readable)
- Metadata: Only filter-relevant fields (category, tags, sensitivity, dates)
Qdrant does not store:
- Document content or summaries
- Extracted data values
- Titles
- Any other human-readable sensitive information
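One way to enforce this split is an allow-list applied before a vector is written. The sketch below is hypothetical: the field names in `FILTER_FIELDS` and the `build_vector_payload` helper are illustrative, and no Qdrant client call is shown.

```python
from typing import Any, Dict

# Hypothetical names for the filter-relevant fields listed above.
FILTER_FIELDS = ("category", "tags", "sensitivity", "created_at")


def build_vector_payload(document: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only allow-listed fields; everything else stays out of the payload."""
    return {key: document[key] for key in FILTER_FIELDS if key in document}


payload = build_vector_payload(
    {
        "title": "Q3 invoice",
        "content": "Total due: 120 EUR",
        "category": "finance",
        "tags": ["invoice"],
        "sensitivity": "confidential",
        "created_at": "2024-01-01",
    }
)
```

An allow-list fails closed: a field added to the document model later stays out of the vector store until someone adds it to `FILTER_FIELDS`.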
Retention & Time-to-Live (TTL)
Remem automatically cleans up old files to reduce storage costs.
Default Retention
| Resource | Default TTL | Configurable? |
|---|---|---|
| Raw uploaded files (S3) | 90 days | Yes (per-tenant) |
| Document metadata | Indefinite | Manual deletion only |
| Soft-deleted documents | 30 days | No |
| Query result caches (Redis) | 5-30 minutes | No |
Keep Forever
Documents can be flagged to prevent TTL expiration:
- user_starred flag for important documents
TTL Sweeper
A background job runs periodically to:
- Delete expired raw files from S3
- Hard-delete soft-deleted documents after 30 days
- Clean up orphaned vectors and cache entries
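The sweeper's selection logic might look like the sketch below, which only decides what is due and deletes nothing. The record shape and function names are hypothetical. It also assumes that `user_starred` protects raw files from TTL expiry but does not stop a soft-deleted document from being purged; the text above does not settle that point.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional

RAW_FILE_TTL = timedelta(days=90)     # default; configurable per tenant
SOFT_DELETE_TTL = timedelta(days=30)  # fixed


@dataclass
class DocumentRecord:
    document_id: str
    uploaded_at: datetime
    deleted_at: Optional[datetime] = None
    user_starred: bool = False


def raw_file_expired(doc: DocumentRecord, now: datetime, ttl: timedelta = RAW_FILE_TTL) -> bool:
    """Starred documents are exempt from raw-file expiry (assumed scope)."""
    return not doc.user_starred and now - doc.uploaded_at >= ttl


def due_for_hard_delete(doc: DocumentRecord, now: datetime) -> bool:
    return doc.deleted_at is not None and now - doc.deleted_at >= SOFT_DELETE_TTL


def sweep(docs: List[DocumentRecord], now: datetime) -> Dict[str, List[str]]:
    """Return the IDs each cleanup step would act on."""
    return {
        "expire_raw_files": [d.document_id for d in docs if raw_file_expired(d, now)],
        "hard_delete": [d.document_id for d in docs if due_for_hard_delete(d, now)],
    }


utc = timezone.utc
now = datetime(2024, 6, 1, tzinfo=utc)
plan = sweep(
    [
        DocumentRecord("old", datetime(2024, 1, 1, tzinfo=utc)),
        DocumentRecord("starred", datetime(2024, 1, 1, tzinfo=utc), user_starred=True),
        DocumentRecord("recent", datetime(2024, 5, 15, tzinfo=utc)),
        DocumentRecord("trashed", datetime(2024, 4, 1, tzinfo=utc),
                       deleted_at=datetime(2024, 4, 20, tzinfo=utc)),
    ],
    now,
)
```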
Compliance Checklist
Use this checklist to verify Remem meets your regulatory requirements.
GDPR Article 15: Right of Access
✅ GET /v1/dsar/export provides complete data export in machine-readable format (ZIP archive with JSON/CSV)
GDPR Article 17: Right to Erasure
✅ POST /v1/dsar/delete with crypto_shred: true makes data unrecoverable, even from backups
GDPR Article 20: Data Portability
✅ Export archive includes all data in structured formats (JSON), with original files
CCPA: Right to Know
✅ GET /v1/dsar/export discloses all personal information collected
CCPA: Right to Delete
✅ POST /v1/dsar/delete deletes all personal information, with crypto-shredding for assurance
Data Minimization
✅ Per-tenant isolation, application-level encryption, log scrubbing, and minimal Qdrant metadata
Additional Compliance Features
- Audit logging: All access, modification, and deletion events are logged with 7-year retention
- Encryption at rest: AES-256-GCM for all sensitive data
- Encryption in transit: TLS 1.3 for all API communications
- Key rotation: DEK rotation supported (zero-downtime re-encryption)
- Breach notification: Automated alerts for suspicious activity
API Reference
Export Data
GET /v1/dsar/export
Headers:
- X-API-Key: vlt_... (required)
Query parameters:
- include_raw_files (boolean, default: true): Include original uploaded files
Responses:
- 200 OK: ZIP archive (for small tenants)
- 202 Accepted: Job ID (for large tenants; poll /v1/jobs/{job_id})
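A client has to handle both outcomes of the export call. The following sketch shows one way to branch on the status code. It assumes the 202 body is JSON containing a `job_id` field; the `interpret_export_response` helper and its return shape are hypothetical.

```python
import json
from typing import Tuple, Union


def interpret_export_response(status: int, body: bytes) -> Tuple[str, Union[bytes, str]]:
    """Map an export response to either the archive bytes or a path to poll."""
    if status == 200:
        return ("archive", body)  # body is the ZIP archive itself
    if status == 202:
        job_id = json.loads(body)["job_id"]  # assumed response field name
        return ("poll", f"/v1/jobs/{job_id}")
    raise ValueError(f"unexpected status code: {status}")


kind, value = interpret_export_response(202, b'{"job_id": "job-123"}')
```

When `kind` is `"poll"`, the caller would request the returned path periodically until the job reports completion. The shape of the job status response is not described on this page, so that loop is left out.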
Delete All Tenant Data
POST /v1/dsar/delete
Headers:
- X-API-Key: vlt_... (required)
- Content-Type: application/json
Frequently Asked Questions
Can I recover data after crypto-shredding?
No. Crypto-shredding destroys the encryption key (DEK). All encrypted data becomes permanently unreadable, even if the ciphertext exists in backups. This is by design — it ensures complete deletion for regulatory compliance.
What happens if I delete a document that's been updated multiple times?
Only the specified version is deleted. If you delete the latest version, the previous version becomes queryable again. To delete all versions, delete the original document (the first version).
How long does a data export take?
For most tenants (under 10GB): under 1 minute. For larger tenants: the API returns 202 Accepted with a job_id. Poll /v1/jobs/{job_id} for status. Large exports (100GB+) can take 10-30 minutes.
Can I preview what will be deleted before confirming?
Yes. Call GET /v1/dsar/preview to see a summary of what will be deleted (document count, storage size, etc.) without actually deleting anything.
What if I soft-delete a document by accident?
Contact support within 30 days. Soft-deleted documents can be restored by admin staff. After 30 days, they are automatically hard-deleted and cannot be recovered.
Does Remem support HIPAA compliance?
Yes, for enterprise customers. Contact us for a Business Associate Agreement (BAA) and dedicated infrastructure with additional security controls.