Data Lifecycle

Remem is designed for privacy-first data management with comprehensive controls for versioning, retention, deletion, and regulatory compliance.

Overview

  • Encrypted at rest: all content is encrypted with per-tenant keys using envelope encryption.
  • Version control: documents can be versioned while preserving originals.
  • Flexible deletion: soft delete (recoverable) or hard delete (permanent).
  • GDPR/CCPA ready: full data export and right-to-erasure support.

Crypto-shredding destroys the encryption key, making all data unrecoverable.

Document Versioning

When you update a document, Remem creates a new version while preserving the original.

Creating a New Version

curl -X POST https://api.remem.io/v1/documents/{document_id}/update \
  -H "X-API-Key: vlt_..." \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Updated content here",
    "title": "Updated Title (optional)"
  }'
What happens:
  • Original document is marked as superseded
  • New document is created with incremented version number
  • New chunks are generated and indexed
  • New embeddings are created with the same embedding version
  • Old chunks/vectors remain queryable until cleanup
Query results return the latest version by default. Previous versions are excluded from search results but remain in storage.

Version Metadata

Each version includes:
  • version_number — Increments with each update (starts at 1)
  • document_id — Unique ID for each version
  • supersedes — ID of the previous version (if applicable)
  • content_hash — SHA-256 hash to detect duplicate content
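
The versioning rules above can be modelled in a few lines. This is a minimal sketch, not Remem's implementation: `DocumentVersion` and `create_version` are hypothetical names, and only the four metadata fields come from this page. Skipping a new version when the content hash is unchanged is an assumption about how duplicate detection might be used.

```python
import hashlib
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DocumentVersion:
    """Hypothetical in-memory model of the version metadata listed above."""
    document_id: str
    version_number: int
    supersedes: Optional[str]
    content_hash: str


def sha256_hex(content: str) -> str:
    """Return the SHA-256 hex digest used to detect duplicate content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def create_version(content: str, previous: Optional[DocumentVersion] = None) -> DocumentVersion:
    """Build the next version, or return `previous` if the content is identical."""
    digest = sha256_hex(content)
    if previous is not None and previous.content_hash == digest:
        # Assumption: identical content does not produce a new version.
        return previous
    return DocumentVersion(
        document_id=str(uuid.uuid4()),  # every version gets its own ID
        version_number=1 if previous is None else previous.version_number + 1,
        supersedes=None if previous is None else previous.document_id,
        content_hash=digest,
    )


v1 = create_version("Original content")
v2 = create_version("Updated content here", previous=v1)
```

Following `supersedes` from the latest version back to `None` walks the full history of a document.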

Document Deletion

Remem supports two deletion modes: soft delete (recoverable) and hard delete (permanent).

Soft Delete

Marks a document as deleted without removing data from storage.
curl -X DELETE https://api.remem.io/v1/documents/{document_id} \
  -H "X-API-Key: vlt_..."
Behavior:
  • Document is excluded from search results immediately
  • Data remains in storage (PostgreSQL, Qdrant, S3)
  • Recoverable by admin if needed
  • Does not count toward tenant storage quotas
Soft-deleted documents can be restored by support staff within 30 days. After 30 days, they are automatically hard-deleted.

Hard Delete

Permanently removes the document and all associated data.
curl -X DELETE "https://api.remem.io/v1/documents/{document_id}?hard_delete=true" \
  -H "X-API-Key: vlt_..."
What is deleted:
  • Document metadata (PostgreSQL)
  • All chunks and embeddings (PostgreSQL + Qdrant)
  • Raw files from S3 storage
  • All extracted entities and structured data
  • Cache entries (Redis)
Hard deletion is irreversible. Once a document is hard-deleted, it cannot be recovered — even from backups, since the data is encrypted and the ciphertext is overwritten.
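
Because a hard delete cannot be undone, client code may want a guard in front of it. The sketch below builds the same DELETE requests as the curl examples using Python's standard library, without sending them. The function names and the "retype the document ID" confirmation step are illustrative assumptions, not part of the Remem API.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.remem.io/v1"


def build_delete_request(document_id: str, api_key: str, hard_delete: bool = False) -> urllib.request.Request:
    """Build (but do not send) a DELETE request for one document."""
    url = f"{API_BASE}/documents/{urllib.parse.quote(document_id, safe='')}"
    if hard_delete:
        url += "?" + urllib.parse.urlencode({"hard_delete": "true"})
    return urllib.request.Request(url, method="DELETE", headers={"X-API-Key": api_key})


def confirm_and_build(document_id: str, api_key: str, typed_confirmation: str) -> urllib.request.Request:
    """Require the caller to retype the document ID before building a hard delete."""
    if typed_confirmation != document_id:
        raise ValueError("confirmation does not match document ID; refusing hard delete")
    return build_delete_request(document_id, api_key, hard_delete=True)


req = build_delete_request("doc-123", "vlt_example", hard_delete=True)
# To actually send it: urllib.request.urlopen(req)
```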

Data Export

For GDPR Article 15 (right of access) and CCPA right to know, Remem provides a complete data export.

Exporting Your Data

curl https://api.remem.io/v1/dsar/export \
  -H "X-API-Key: vlt_..." \
  -o remem-export.zip
Response format: ZIP archive

What’s Included

The export archive contains:
  • All documents with decrypted content (original plaintext)
  • Document titles and summaries
  • Metadata (categories, tags, sensitivity levels)
  • Source information (file type, original filename, upload date)
  • AI-generated categories and tags
  • Extracted entities (people, organizations, dates, amounts)
  • Structured data extracted from documents (invoice amounts, dates, etc.)
  • Confidence scores for classifications
  • Original files uploaded to Remem (PDFs, images, text files). Included by default (include_raw_files=true); pass ?include_raw_files=false to exclude them and reduce archive size.
  • Access logs for your data (who queried, when)
  • Modification history (document updates, deletions)
  • API key usage
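
After downloading an export, a quick inventory helps confirm it contains what you expect. The internal layout of the archive is not documented on this page, so the sketch below only counts members by file extension. The member names in the demo archive are invented for illustration.

```python
import io
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_export(archive: bytes) -> Counter:
    """Count archive members by file extension (directories are skipped)."""
    counts: Counter = Counter()
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            suffix = PurePosixPath(info.filename).suffix.lower() or "(none)"
            counts[suffix] += 1
    return counts


def _demo_archive() -> bytes:
    """Build a tiny stand-in archive; these member names are hypothetical."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("documents.json", "[]")
        zf.writestr("audit_log.csv", "timestamp,action\n")
        zf.writestr("raw/invoice.pdf", b"%PDF-")
    return buf.getvalue()


summary = summarize_export(_demo_archive())
```

For a real export, read `remem-export.zip` in binary mode and pass the bytes to `summarize_export`.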

Export Parameters

Parameter | Type | Default | Description
include_raw_files | boolean | true | Include original uploaded files in the archive
For large tenants, export may take several minutes. The API returns a 202 Accepted status with a job_id. Poll /v1/jobs/{job_id} for completion status.
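
The polling step could be sketched as follows. This page does not document the response shape of /v1/jobs/{job_id}, so the `status` field and its values (`completed`, `failed`) are assumptions. The fetcher is injected as a callable so that the loop can be exercised without network access.

```python
import time
from typing import Callable, Dict

# Assumption: the job resource reports a "status" field with these terminal values.
TERMINAL_STATES = {"completed", "failed"}


def poll_job(
    fetch_status: Callable[[], Dict],
    interval_seconds: float = 5.0,
    timeout_seconds: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict:
    """Call fetch_status() until the job reaches a terminal state or the timeout passes."""
    deadline = clock() + timeout_seconds
    while True:
        job = fetch_status()
        if job.get("status") in TERMINAL_STATES:
            return job
        if clock() >= deadline:
            raise TimeoutError("export job did not finish before the timeout")
        sleep(interval_seconds)


# Demo with a fake fetcher so the sketch runs offline.
_responses = iter([{"status": "pending"}, {"status": "running"}, {"status": "completed"}])
result = poll_job(lambda: next(_responses), sleep=lambda _seconds: None)
```

In real use, `fetch_status` would issue a GET to /v1/jobs/{job_id} with the X-API-Key header and return the decoded JSON body.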

Right to Erasure

For GDPR Article 17 and CCPA right to delete, Remem supports complete tenant data deletion.

Deleting All Data

curl -X POST https://api.remem.io/v1/dsar/delete \
  -H "Content-Type: application/json" \
  -H "X-API-Key: vlt_..." \
  -d '{
    "confirm": true,
    "crypto_shred": true
  }'
Request fields:
Field | Type | Required | Description
confirm | boolean | Yes | Must be true to proceed (safety check)
crypto_shred | boolean | No (default: true) | Destroy the encryption key, making all data unrecoverable
This action is irreversible. When crypto_shred is true (the default), Remem destroys the tenant’s Data Encryption Key (DEK). All encrypted content (documents, chunks, metadata, filenames) becomes permanently unreadable, even if database records persist in backups.
There is no undo. This is a nuclear option for privacy compliance.

What Gets Deleted

1. PostgreSQL Records

All rows with your tenant ID:
  • Documents, chunks, entities, extracted data
  • API keys (revoked immediately)
  • Tenant metadata (name, email)

2. Qdrant Vector Index

Your dedicated Qdrant collection is deleted, removing all embeddings and vector metadata.

3. S3 Storage

All objects in your tenant’s S3 prefix are deleted, including encrypted raw files.

4. Redis Caches

All cached query results and embeddings for your tenant are purged.

5. Encryption Key (if crypto_shred: true)

Your tenant’s DEK is destroyed. Even if ciphertext remains in backups, it cannot be decrypted.

Deletion Confirmation

Upon success, the API returns:
{
  "status": "deleted",
  "tenant_id": "uuid",
  "deleted_at": "2026-02-04T12:34:56Z",
  "crypto_shredded": true,
  "resources_deleted": {
    "documents": 1234,
    "chunks": 5678,
    "vectors": 5678,
    "s3_objects": 1234
  }
}
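
You may want to keep this response as evidence that an erasure request was fulfilled. The sketch below parses it into a typed record and fails loudly if the status is not `deleted`. The `DeletionReceipt` class and `parse_receipt` function are hypothetical helper names; the field names come from the sample response above.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Dict


@dataclass(frozen=True)
class DeletionReceipt:
    tenant_id: str
    deleted_at: datetime
    crypto_shredded: bool
    resources_deleted: Dict[str, int]


def parse_receipt(raw: str) -> DeletionReceipt:
    """Parse the deletion response; raise if it does not confirm deletion."""
    data = json.loads(raw)
    if data.get("status") != "deleted":
        raise ValueError(f"unexpected status: {data.get('status')!r}")
    return DeletionReceipt(
        tenant_id=data["tenant_id"],
        # Swap the trailing "Z" so fromisoformat also accepts it before Python 3.11.
        deleted_at=datetime.fromisoformat(data["deleted_at"].replace("Z", "+00:00")),
        crypto_shredded=bool(data["crypto_shredded"]),
        resources_deleted=dict(data["resources_deleted"]),
    )


SAMPLE = """{
  "status": "deleted",
  "tenant_id": "uuid",
  "deleted_at": "2026-02-04T12:34:56Z",
  "crypto_shredded": true,
  "resources_deleted": {"documents": 1234, "chunks": 5678, "vectors": 5678, "s3_objects": 1234}
}"""
receipt = parse_receipt(SAMPLE)
```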

Tenant Deletion

Admin users can delete tenants via the tenant management API.

Soft Tenant Deletion (Default)

Marks the tenant as inactive without removing data.
curl -X DELETE https://api.remem.io/v1/tenants/{tenant_id} \
  -H "Authorization: Bearer {admin_jwt}"
Behavior:
  • Tenant is marked inactive
  • All API keys stop working immediately
  • Data is preserved (can be restored by admin)
  • Tenant is excluded from billing

Hard Tenant Deletion

Permanently removes all tenant data.
curl -X DELETE "https://api.remem.io/v1/tenants/{tenant_id}?hard_delete=true" \
  -H "Authorization: Bearer {admin_jwt}"
What is deleted:
  • All documents, chunks, and embeddings
  • Qdrant collection
  • S3 objects
  • PostgreSQL records
  • Encryption keys (DEK destroyed)
Hard tenant deletion is equivalent to a DSAR deletion with crypto_shred: true. It is irreversible.

Encryption Model

Understanding how encryption works in Remem helps explain the permanence of crypto-shredding.
Remem uses a two-tier key system:
  • KEK (Key Encryption Key): Stored in AWS KMS (or customer’s KMS for enterprise). This is the master key.
  • DEK (Data Encryption Key): One per tenant. Encrypted by the KEK and stored in PostgreSQL.
When you upload a document, Remem:
  1. Decrypts your tenant’s DEK using the KEK
  2. Encrypts your document content with the DEK (AES-256-GCM)
  3. Stores the ciphertext in PostgreSQL/S3
  4. Clears the DEK from memory
The KEK never touches your data directly — it only wraps/unwraps the DEK.
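
The flow above, and why destroying the DEK is final, can be illustrated with a toy model. Important assumptions: the cipher below is a teaching stand-in built from `hashlib` and `hmac` so the example runs with only the standard library. It is not AES-256-GCM, it is not how Remem encrypts data, and it must not be used to protect real data. The KEK is a local variable here, whereas in Remem it lives in a KMS.

```python
import hashlib
import hmac
import secrets
from typing import Optional


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode. Stand-in for a real cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt, then authenticate. Layout: nonce || ciphertext || HMAC tag."""
    nonce = secrets.token_bytes(12)
    ct = bytes(p ^ k for p, k in zip(plaintext, _keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag


def open_sealed(key: bytes, blob: bytes) -> bytes:
    """Verify the tag and decrypt. Raises ValueError on a wrong key or tampering."""
    nonce, ct, tag = blob[:12], blob[12:-32], blob[-32:]
    expected = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return bytes(c ^ k for c, k in zip(ct, _keystream(key, nonce, len(ct))))


kek = secrets.token_bytes(32)                   # stands in for the key held in the KMS
dek = secrets.token_bytes(32)                   # one per tenant
wrapped_dek: Optional[bytes] = seal(kek, dek)   # what would be stored in PostgreSQL
del dek                                         # the plaintext DEK is not kept around

# Upload path: unwrap the DEK, encrypt the content, keep only ciphertext.
ciphertext = seal(open_sealed(kek, wrapped_dek), b"Invoice #42: $1,000")

# Read path: unwrap again and decrypt.
recovered = open_sealed(open_sealed(kek, wrapped_dek), ciphertext)

# Crypto-shredding: discard the wrapped DEK. The ciphertext may survive in
# backups, but no remaining key can open it. The KEK alone is not enough.
wrapped_dek = None
```

The design point is that only one small secret, the wrapped DEK, has to be destroyed to make every ciphertext for that tenant unreadable, wherever copies of the ciphertext exist.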
Encrypted fields (ciphertext in database):
  • Document content, titles, summaries
  • Metadata (categories, tags, extracted data)
  • Original filenames
  • Chunk content
  • Entity values (names, amounts, account numbers)
NOT encrypted (needed for filtering):
  • Document IDs, timestamps
  • Source type (e.g., “pdf”, “email”)
  • Sensitivity level (e.g., “confidential”)
  • Language code (e.g., “en”)
  • Embedding vectors (mathematical representations, not human-readable)
Files are encrypted by Remem before they are written to S3 (client-side encryption from S3’s perspective):
  • Remem encrypts the file with your tenant’s DEK
  • Only ciphertext is stored in S3
  • Object keys use UUIDs only (no filenames)
  • Original filenames are stored encrypted in PostgreSQL
Browsing the storage bucket (for example, in the DigitalOcean Spaces dashboard) shows only encrypted blobs.
Qdrant stores:
  • Embeddings: 1024-dimensional float arrays (not human-readable)
  • Metadata: Only filter-relevant fields (category, tags, sensitivity, dates)
Qdrant does NOT store:
  • Document content or summaries
  • Extracted data values
  • Titles
  • Any other human-readable sensitive information
This minimizes exposure if Qdrant access is compromised.
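
One way to enforce this kind of minimization is an allow-list applied when the vector payload is built. The sketch below is illustrative only: the function name and the exact key names (`category`, `tags`, `sensitivity`, `created_at`) are assumptions based on the field types listed above, not Remem's actual schema.

```python
from typing import Any, Dict

# Filter-relevant fields named in the text above; the exact key names are assumptions.
VECTOR_PAYLOAD_FIELDS = ("category", "tags", "sensitivity", "created_at")


def build_vector_payload(document: Dict[str, Any]) -> Dict[str, Any]:
    """Copy only allow-listed fields into the payload stored next to the embedding."""
    return {key: document[key] for key in VECTOR_PAYLOAD_FIELDS if key in document}


doc = {
    "title": "Q3 invoice",
    "content": "Full text ...",
    "summary": "...",
    "category": "finance",
    "tags": ["invoice"],
    "sensitivity": "confidential",
    "created_at": "2026-02-04T12:34:56Z",
}
payload = build_vector_payload(doc)
```

An allow-list is safer than a deny-list here: a field added to the document model later stays out of the vector store until someone adds it deliberately.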

Retention & Time-to-Live (TTL)

Remem automatically cleans up old files to reduce storage costs.

Default Retention

Resource | Default TTL | Configurable?
Raw uploaded files (S3) | 90 days | Yes (per-tenant)
Document metadata | Indefinite | Manual deletion only
Soft-deleted documents | 30 days | No
Query result caches (Redis) | 5-30 minutes | No

Keep Forever

Documents can be flagged to prevent TTL expiration:
curl -X PATCH https://api.remem.io/v1/documents/{document_id} \
  -H "X-API-Key: vlt_..." \
  -H "Content-Type: application/json" \
  -d '{"keep_forever": true}'
Alternatively, use the user_starred flag for important documents.

TTL Sweeper

A background job runs periodically to:
  • Delete expired raw files from S3
  • Hard-delete soft-deleted documents after 30 days
  • Clean up orphaned vectors and cache entries
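
The sweeper's per-document decision could look roughly like the sketch below. It is a guess at the logic, not Remem's code. The record shape is hypothetical, and two behaviors are assumptions: that `user_starred` exempts a document from TTL expiry in the same way as `keep_forever`, and that soft-deleted documents are purged after 30 days regardless of either flag.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

RAW_FILE_TTL = timedelta(days=90)      # default from the table above; configurable per tenant
SOFT_DELETE_TTL = timedelta(days=30)   # fixed


@dataclass
class DocumentRecord:
    """Hypothetical record; only keep_forever and user_starred are named on this page."""
    uploaded_at: datetime
    deleted_at: Optional[datetime] = None  # set when the document is soft-deleted
    keep_forever: bool = False
    user_starred: bool = False


def sweep_action(doc: DocumentRecord, now: datetime, raw_ttl: timedelta = RAW_FILE_TTL) -> str:
    """Return 'hard_delete', 'expire_raw_file', or 'keep' for one document."""
    if doc.deleted_at is not None:
        # Assumption: the soft-delete window applies even to flagged documents.
        return "hard_delete" if now - doc.deleted_at >= SOFT_DELETE_TTL else "keep"
    if doc.keep_forever or doc.user_starred:
        return "keep"
    return "expire_raw_file" if now - doc.uploaded_at >= raw_ttl else "keep"


now = datetime(2026, 6, 1, tzinfo=timezone.utc)
old_upload = datetime(2026, 1, 1, tzinfo=timezone.utc)
action = sweep_action(DocumentRecord(uploaded_at=old_upload), now)
```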

Compliance Checklist

Use this checklist to verify Remem meets your regulatory requirements.

GDPR Article 15: Right of Access

✅ GET /v1/dsar/export provides complete data export in machine-readable format (ZIP archive with JSON/CSV)

GDPR Article 17: Right to Erasure

✅ POST /v1/dsar/delete with crypto_shred: true makes data unrecoverable, even from backups

GDPR Article 20: Data Portability

✅ Export archive includes all data in structured formats (JSON), with original files

CCPA: Right to Know

✅ GET /v1/dsar/export discloses all personal information collected

CCPA: Right to Delete

✅ POST /v1/dsar/delete deletes all personal information, with crypto-shredding for assurance

Data Minimization

✅ Per-tenant isolation, application-level encryption, log scrubbing, and minimal Qdrant metadata

Additional Compliance Features

  • Audit logging: All access, modification, and deletion events are logged with 7-year retention
  • Encryption at rest: AES-256-GCM for all sensitive data
  • Encryption in transit: TLS 1.3 for all API communications
  • Key rotation: DEK rotation supported (zero-downtime re-encryption)
  • Breach notification: Automated alerts for suspicious activity

API Reference

Export Data

GET /v1/dsar/export?include_raw_files=true
Headers:
  • X-API-Key: vlt_... (required)
Query Parameters:
  • include_raw_files (boolean, default: true) — Include original uploaded files
Response:
  • 200 OK — ZIP archive (for small tenants)
  • 202 Accepted — Job ID (for large tenants, poll /v1/jobs/{job_id})

Delete All Tenant Data

POST /v1/dsar/delete
Headers:
  • X-API-Key: vlt_... (required)
  • Content-Type: application/json
Request Body:
{
  "confirm": true,
  "crypto_shred": true
}
Response:
{
  "status": "deleted",
  "tenant_id": "uuid",
  "deleted_at": "2026-02-04T12:34:56Z",
  "crypto_shredded": true,
  "resources_deleted": {
    "documents": 1234,
    "chunks": 5678,
    "vectors": 5678,
    "s3_objects": 1234
  }
}

Frequently Asked Questions

Can data be recovered after crypto-shredding?
No. Crypto-shredding destroys the encryption key (DEK). All encrypted data becomes permanently unreadable, even if the ciphertext exists in backups. This is by design: it ensures complete deletion for regulatory compliance.

What happens to other versions when I delete a document?
Only the specified version is deleted. If you delete the latest version, the previous version becomes queryable again. To delete all versions, delete the original document (the first version).

How long does a data export take?
For most tenants (under 10 GB): under 1 minute. For larger tenants, the API returns 202 Accepted with a job_id. Poll /v1/jobs/{job_id} for status. Large exports (100 GB+) can take 10-30 minutes.

Can I preview what will be deleted before running a DSAR deletion?
Yes. Call GET /v1/dsar/preview to see a summary of what will be deleted (document count, storage size, etc.) without actually deleting anything.

Can I recover a document I deleted by mistake?
Contact support within 30 days. Soft-deleted documents can be restored by admin staff. After 30 days, they are automatically hard-deleted and cannot be recovered.

Does Remem support HIPAA workloads?
Yes, for enterprise customers. Contact us for a Business Associate Agreement (BAA) and dedicated infrastructure with additional security controls.