Embeddings#
Embeddings are vector representations of text stored in embapi for similarity search and retrieval.
What are Embeddings?#
Embeddings are numerical representations (vectors) of text that capture semantic meaning:
- Vector: Array of floating-point numbers (e.g., 1536 or 3072 dimensions)
- Dimensions: Fixed length determined by the LLM service's model
- Similarity: Vectors of similar text are close in vector space
- Purpose: Enable semantic search and retrieval
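Closeness in vector space is usually measured with cosine similarity. A minimal sketch in plain Python (illustrative only, not part of the API):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # → 1.0 (identical direction)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # → 0.0 (orthogonal, unrelated)
```

Vectors produced from similar text score close to 1.0; unrelated text scores near 0.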
Embedding Structure#
Required Fields#
- text_id: Unique identifier for the document (max 300 characters)
- instance_handle: LLM service instance that generated the embedding
- vector: Array of float32 values (embedding vector)
- vector_dim: Declared dimension count (must match vector length)
Optional Fields#
- text: Original text content (for reference)
- metadata: Structured JSON data about the document
Example#
```
{
  "text_id": "doc-123",
  "instance_handle": "my-openai",
  "text": "Introduction to machine learning concepts",
  "vector": [0.023, -0.015, 0.087, ..., 0.042],
  "vector_dim": 3072,
  "metadata": {
    "title": "ML Introduction",
    "author": "Alice",
    "year": 2024,
    "category": "tutorial"
  }
}
```

Creating Embeddings#
Single Embedding#
```
POST /v1/embeddings/alice/research-docs
{
  "embeddings": [
    {
      "text_id": "doc1",
      "instance_handle": "my-openai",
      "vector": [0.1, 0.2, ..., 0.3],
      "vector_dim": 3072,
      "metadata": {"author": "Alice"}
    }
  ]
}
```

Batch Upload#
```
POST /v1/embeddings/alice/research-docs
{
  "embeddings": [
    {
      "text_id": "doc1",
      "instance_handle": "my-openai",
      "vector": [...],
      "vector_dim": 3072
    },
    {
      "text_id": "doc2",
      "instance_handle": "my-openai",
      "vector": [...],
      "vector_dim": 3072
    },
    ...
  ]
}
```

Batch upload tips:
- Upload 100-1000 embeddings per request
- Use a consistent `instance_handle`
- Ensure all vectors have the same dimensions
- Include metadata for searchability
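The tips above can be sketched as a small helper that splits a large list into request-sized batches (the payload shape follows the examples on this page; `batch_size=500` is an illustrative choice within the recommended range):

```python
def batch_payloads(embeddings, batch_size=500):
    """Split a list of embedding objects into upload payloads of at most batch_size items."""
    for i in range(0, len(embeddings), batch_size):
        yield {"embeddings": embeddings[i:i + batch_size]}

# Example: 1200 embeddings become three POST bodies (500, 500, 200 items)
docs = [{"text_id": f"doc{i}", "instance_handle": "my-openai",
         "vector": [0.0], "vector_dim": 1} for i in range(1200)]
payloads = list(batch_payloads(docs))
print([len(p["embeddings"]) for p in payloads])  # → [500, 500, 200]
```

Each yielded payload would then be POSTed to the project's embeddings endpoint.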
Text Identifiers#
Format#
Text IDs can be any string up to 300 characters.

Common patterns:
- URLs: `https://id.example.com/doc/123`
- URNs: `urn:example:doc:123`
- Paths: `/corpus/section1/doc123`
- IDs: `doc-abc-123-xyz`
URL Encoding#
URL-encode text IDs when using them in API paths:
```
# Original ID
text_id="https://id.example.com/texts/W0017:1.3.1"

# URL-encoded for API
encoded="https%3A%2F%2Fid.example.com%2Ftexts%2FW0017%3A1.3.1"

# Use in API call
GET /v1/embeddings/alice/project/$encoded
```

Uniqueness#
Text IDs must be unique within a project:
- Same ID in different projects: ✅ Allowed
- Same ID twice in one project: ❌ Conflict error
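The URL-encoding step described above can be reproduced with Python's standard library (the ID is the example from this page; `safe=""` forces `/` and `:` to be percent-encoded as well):

```python
from urllib.parse import quote

text_id = "https://id.example.com/texts/W0017:1.3.1"

# By default quote() leaves '/' alone; safe="" encodes every reserved character
encoded = quote(text_id, safe="")
print(encoded)  # → https%3A%2F%2Fid.example.com%2Ftexts%2FW0017%3A1.3.1

# The encoded ID can then be placed in an API path
path = f"/v1/embeddings/alice/project/{encoded}"
```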
Validation#
Dimension Validation#
The system automatically validates vector dimensions:
Checks performed:
- `vector_dim` matches the declared instance dimensions
- Actual `vector` array length matches `vector_dim`
- All embeddings in a project have consistent dimensions
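A client can mirror these checks before uploading (a sketch; `expected_dim` stands in for the dimension count configured on the LLM service instance):

```python
def validate_dimensions(embedding, expected_dim):
    """Mirror the server-side dimension checks before uploading an embedding."""
    declared = embedding["vector_dim"]
    actual = len(embedding["vector"])
    if declared != expected_dim:
        raise ValueError(
            f"embedding declares {declared} dimensions "
            f"but instance expects {expected_dim}")
    if actual != declared:
        raise ValueError(
            f"vector has {actual} values but vector_dim is {declared}")

# Passes silently when both checks hold
validate_dimensions({"vector": [0.1, 0.2, 0.3], "vector_dim": 3}, expected_dim=3)
```

Catching mismatches client-side avoids a round trip that would end in a 400 response.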
Example error:
```
{
  "title": "Bad Request",
  "status": 400,
  "detail": "dimension validation failed: vector dimension mismatch: embedding declares 3072 dimensions but LLM service 'my-openai' expects 1536 dimensions"
}
```

Metadata Validation#
If project has a metadata schema, all embeddings are validated:
Example error:
```
{
  "title": "Bad Request",
  "status": 400,
  "detail": "metadata validation failed for text_id 'doc1': metadata validation failed:\n - author is required\n - year must be integer"
}
```

See Metadata Validation Guide for details.
Retrieving Embeddings#
List All Embeddings#
```
GET /v1/embeddings/alice/research-docs?limit=100&offset=0
```

Returns a paginated list of embeddings with:
- text_id
- metadata
- vector_dim
- created_at
Vectors are included by default (can be large).
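A client can walk the full listing by advancing `offset` until a page comes back short (a sketch; `fetch_page` is a hypothetical callable wrapping the GET request above):

```python
def iter_embeddings(fetch_page, limit=100):
    """Yield every embedding from a paginated listing endpoint."""
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        yield from page
        if len(page) < limit:  # a short page means we've reached the end
            break
        offset += limit

# Simulated backend with 250 items, served in pages of 100
items = list(range(250))
fake_fetch = lambda limit, offset: items[offset:offset + limit]
print(len(list(iter_embeddings(fake_fetch))))  # → 250
```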
Get Single Embedding#
```
GET /v1/embeddings/alice/research-docs/doc1
```

Returns the complete embedding, including the vector.
Pagination#
Use `limit` and `offset` for large projects:
```
# First page (0-99)
GET /v1/embeddings/alice/research-docs?limit=100&offset=0

# Second page (100-199)
GET /v1/embeddings/alice/research-docs?limit=100&offset=100

# Third page (200-299)
GET /v1/embeddings/alice/research-docs?limit=100&offset=200
```

Updating Embeddings#
Currently, embeddings cannot be updated in place. To modify an embedding:
- Delete the existing embedding
- Upload a new version with the same `text_id`
```
# Delete old version
DELETE /v1/embeddings/alice/research-docs/doc1

# Upload new version
POST /v1/embeddings/alice/research-docs
{
  "embeddings": [{
    "text_id": "doc1",
    "instance_handle": "my-openai",
    "vector": [...new vector...],
    "vector_dim": 3072,
    "metadata": {...updated metadata...}
  }]
}
```

Deleting Embeddings#
Delete Single Embedding#
```
DELETE /v1/embeddings/alice/research-docs/doc1
```

Delete All Embeddings#
```
DELETE /v1/embeddings/alice/research-docs
```

Warning: This deletes all embeddings in the project permanently.
Metadata#
Purpose#
Metadata provides structured information about documents:
- Filtering: Exclude documents from similarity searches
- Organization: Categorize and group documents
- Context: Store additional document information
- Validation: Ensure consistent structure (with schema)
Structure#
Metadata is stored as JSONB in PostgreSQL:
```
{
  "author": "William Shakespeare",
  "title": "Hamlet",
  "year": 1603,
  "act": 1,
  "scene": 1,
  "genre": "drama",
  "language": "English"
}
```

Nested Metadata#
Complex structures are supported:
```
{
  "author": {
    "name": "William Shakespeare",
    "birth_year": 1564,
    "nationality": "English"
  },
  "publication": {
    "year": 1603,
    "publisher": "First Folio",
    "edition": 1
  },
  "tags": ["tragedy", "revenge", "madness"]
}
```

Filtering by Metadata#
Use metadata to exclude documents from similarity searches:
```
# Exclude documents from same author
GET /v1/similars/alice/project/doc1?metadata_path=author&metadata_value=Shakespeare
```

See Metadata Filtering Guide for details.
Storage Considerations#
Vector Storage#
Vectors are stored using pgvector extension:
- Type: `vector(N)`, where N is the dimension count
- Size: 4 bytes per dimension + overhead
- Example: 3072-dimension vector ≈ 12 KB
Storage Calculation#
Estimate storage per embedding:
```
Vector:   4 bytes × dimensions
Text ID:  length in bytes (avg ~50 bytes)
Text:     length in bytes (optional)
Metadata: JSON size (varies, avg ~500 bytes)
Overhead: ~100 bytes (indexes, etc.)
```

Example (3072-dim with metadata):

```
4 × 3072 + 50 + 500 + 100 ≈ 13 KB per embedding
```

Large Projects#
For projects with millions of embeddings:
- Use pagination when listing
- Consider partial indexes for metadata
- Monitor database size
- Plan backup strategy
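The per-embedding figures above scale linearly, so a rough capacity check is simple arithmetic (a sketch using the average sizes quoted on this page; real metadata and text sizes will vary):

```python
def storage_estimate_bytes(dimensions, count,
                           text_id_bytes=50, metadata_bytes=500, overhead_bytes=100):
    """Rough storage estimate: 4-byte floats plus per-row averages from this page."""
    per_embedding = 4 * dimensions + text_id_bytes + metadata_bytes + overhead_bytes
    return per_embedding * count

# One 3072-dimension embedding with metadata ≈ 13 KB
print(storage_estimate_bytes(3072, 1))          # → 12938 bytes

# A million such embeddings ≈ 13 GB
print(storage_estimate_bytes(3072, 1_000_000))  # → 12938000000 bytes
```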
Performance#
Upload Performance#
- Small batches (1-10): ~100ms per request
- Medium batches (100-500): ~500ms-2s per request
- Large batches (1000+): ~2-10s per request
Retrieval Performance#
- Single embedding: <10ms
- Paginated list (100 items): ~50ms
- Large project scan: Use pagination
Optimization Tips#
- Batch uploads when possible
- Use appropriate page sizes
- Include only needed fields
- Monitor query performance
Common Patterns#
Document Chunking#
Split long documents into chunks:
```
{
  "embeddings": [
    {
      "text_id": "doc1:chunk1",
      "text": "First part of document...",
      "vector": [...],
      "metadata": {"doc_id": "doc1", "chunk": 1}
    },
    {
      "text_id": "doc1:chunk2",
      "text": "Second part of document...",
      "vector": [...],
      "metadata": {"doc_id": "doc1", "chunk": 2}
    }
  ]
}
```

Versioned Documents#
Track document versions:
```
{
  "text_id": "doc1:v2",
  "vector": [...],
  "metadata": {
    "doc_id": "doc1",
    "version": 2,
    "updated_at": "2024-01-15T10:30:00Z"
  }
}
```

Multi-Language Documents#
Store embeddings for different languages:
```
{
  "embeddings": [
    {
      "text_id": "doc1:en",
      "text": "English version...",
      "vector": [...],
      "metadata": {"doc_id": "doc1", "language": "en"}
    },
    {
      "text_id": "doc1:de",
      "text": "Deutsche Version...",
      "vector": [...],
      "metadata": {"doc_id": "doc1", "language": "de"}
    }
  ]
}
```

Troubleshooting#
Dimension Mismatch#
Error: "vector dimension mismatch"

Cause: Vector dimensions don't match the instance configuration.

Solution:
- Check instance dimensions: `GET /v1/llm-services/owner/instance`
- Regenerate embeddings with the correct model
- Ensure `vector_dim` matches the actual vector length
Metadata Validation Failed#
Error: "metadata validation failed"

Cause: Metadata doesn't match the project schema.

Solution:
- Check the project schema: `GET /v1/projects/owner/project`
- Update metadata to match the schema
- Or update the schema to accept the metadata
Text ID Conflict#
Error: "embedding with text_id already exists"

Cause: Attempting to upload a duplicate text_id.
Solution:
- Use different text_id
- Delete existing embedding first
- Check for unintended duplicates