Store, search, and manage vector embeddings with the Weaviate Python client. Covers collections, CRUD, vector/hybrid/BM25 search, multi-tenancy, generative search, and batch import.
weaviate-client — Vector Database Client
What it is
Weaviate is an open-source vector database that stores objects alongside their embeddings and enables semantic search, keyword (BM25) search, and hybrid queries in a single request. The weaviate-client Python library (v4 API) connects to a local Weaviate instance, Weaviate Cloud Services (WCS), or an embedded in-process Weaviate, and provides a class-based, schema-first API for defining collections, inserting data, and querying by vector similarity or filter. Weaviate is designed for production-scale RAG pipelines and supports multi-tenancy, replication, and generative search natively.
Install
pip install weaviate-client # v4 client (recommended)
Output: pip's install log, ending with a "Successfully installed weaviate-client-..." line (exit code 0 on success)
Quick example
import weaviate
from weaviate.classes.config import Configure, Property, DataType
# Connect to a local Weaviate instance
client = weaviate.connect_to_local()
# Create a collection
client.collections.create(
    name="Article",
    vectorizer_config=Configure.Vectorizer.text2vec_openai(),
    properties=[
        Property(name="title", data_type=DataType.TEXT),
        Property(name="content", data_type=DataType.TEXT),
    ],
)
articles = client.collections.get("Article")
# Insert an object; Weaviate auto-vectorises it via the configured vectorizer
uuid = articles.data.insert({
    "title": "Attention Is All You Need",
    "content": "The transformer architecture replaces recurrence with self-attention.",
})
print(f"Inserted: {uuid}")
# Semantic (vector) search
results = articles.query.near_text(query="neural network attention", limit=3)
for obj in results.objects:
    print(obj.properties["title"])
client.close()
Output:
Inserted: 3f7a1b2c-...
Attention Is All You Need
When / why to use it
- Production-scale RAG pipelines where you need filtered vector search (combine semantic similarity with metadata conditions).
- Multi-tenant SaaS applications — Weaviate's built-in multi-tenancy isolates data per customer in one cluster.
- Hybrid search — combine vector similarity with BM25 keyword scoring in a single query for better recall on short or keyword-heavy questions.
- Generative search — ask Weaviate to pass retrieved objects directly to an LLM and return a synthesised answer in one API call.
- When you need replication, backups, and enterprise features on top of a vector store.
Common pitfalls
- Always call client.close(). The v4 client holds open connections (including gRPC), and leaving them open leaks resources in long-running processes. Use with weaviate.connect_to_local() as client: to close automatically.
- Existing properties cannot be changed in place. New properties can be added after creation with collection.config.add_property(...), but the name and data type of an existing property are fixed. Changing them means deleting and recreating the collection, which deletes all of its data, so design the schema carefully before inserting production data.
- Vectorizer must match your data. If you create a collection with text2vec_openai(), Weaviate calls the OpenAI embedding API for every inserted object. Set the OPENAI_APIKEY environment variable on the Weaviate server, or pass the key via headers={"X-OpenAI-Api-Key": "..."} to connect_to_local.
- Use batch.dynamic() for bulk imports. It auto-tunes batch size and parallelism. Inserting objects one at a time via data.insert() is up to 100× slower for large datasets.
- The near_text shorthand only works when the collection has a configured vectorizer. If you manage embeddings externally, use near_vector instead and pass the embedding array directly.
Connecting to Weaviate
The v4 client provides convenience functions for the three most common deployment modes.
import weaviate, os
# Pick ONE of the connection styles below. Each call opens its own connection,
# and every client you create must be closed.
# Local Docker or bare-metal instance (default: localhost:8080)
client = weaviate.connect_to_local()
# Local with custom host/port and API keys for external services
client = weaviate.connect_to_local(
    host="192.168.1.10",
    port=8080,
    headers={"X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"]},
)
# Weaviate Cloud Services (WCS)
client = weaviate.connect_to_weaviate_cloud(
    cluster_url=os.environ["WEAVIATE_URL"],
    auth_credentials=weaviate.auth.AuthApiKey(api_key=os.environ["WEAVIATE_API_KEY"]),
    headers={"X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"]},
)
# Embedded (in-process Weaviate, no Docker needed; good for testing)
client = weaviate.connect_to_embedded(
    version="1.26.4",  # pin to a specific version
    headers={"X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"]},
)
print(client.is_ready())
client.close()
Output:
True
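The connection examples above read OPENAI_API_KEY directly from os.environ, which fails with a bare KeyError when the variable is missing. A minimal sketch of a helper that fails with a clearer message follows. The name build_headers is hypothetical (it is not part of weaviate-client), and it assumes only the X-OpenAI-Api-Key header already used above.

```python
import os
from collections.abc import Mapping


def build_headers(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Build the extra headers for a Weaviate connection (hypothetical helper)."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        # Fail early with a message that says what is missing and why it matters.
        raise RuntimeError("OPENAI_API_KEY is not set; text2vec_openai needs it")
    return {"X-OpenAI-Api-Key": key}


# Demo with an explicit mapping, so no real key is required.
print(build_headers({"OPENAI_API_KEY": "sk-demo"}))
```

The result can then be passed as headers=build_headers() to any of the connect functions.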
Defining collections
A collection is Weaviate's equivalent of a table or index. It defines the schema (properties and their types), the vectorizer, and optional replication/quantisation settings.
import weaviate
from weaviate.classes.config import Configure, Property, DataType, VectorDistances
with weaviate.connect_to_local() as client:
    client.collections.create(
        name="Document",
        description="RAG knowledge base documents",
        vectorizer_config=Configure.Vectorizer.text2vec_openai(
            model="text-embedding-3-small",
            vectorize_collection_name=False,
        ),
        vector_index_config=Configure.VectorIndex.hnsw(
            distance_metric=VectorDistances.COSINE,
            ef_construction=128,
            max_connections=64,
        ),
        generative_config=Configure.Generative.openai(model="gpt-4o-mini"),
        properties=[
            Property(name="title", data_type=DataType.TEXT),
            Property(name="content", data_type=DataType.TEXT),
            Property(name="source", data_type=DataType.TEXT, skip_vectorization=True),
            Property(name="page", data_type=DataType.INT, skip_vectorization=True),
            Property(name="published", data_type=DataType.DATE, skip_vectorization=True),
        ],
    )
    print("Collection 'Document' created")
Output:
Collection 'Document' created
CRUD operations
import weaviate
from weaviate.util import generate_uuid5
with weaviate.connect_to_local() as client:
    docs = client.collections.get("Document")
    # Insert one object (auto-vectorised)
    uuid = docs.data.insert({
        "title": "Introduction to Transformers",
        "content": "Transformers use multi-head self-attention to process sequences.",
        "source": "paper.pdf",
        "page": 1,
    })
    print(f"Inserted: {uuid}")
    # Insert with a deterministic UUID: the same key always maps to the same ID.
    # Note: data.insert() raises an error if that UUID already exists.
    det_uuid = docs.data.insert(
        properties={
            "title": "BERT: Pre-training of Deep Bidirectional Transformers",
            "content": "BERT uses masked language modelling for pre-training.",
            "source": "bert.pdf",
            "page": 1,
        },
        uuid=generate_uuid5("bert.pdf-page1"),
    )
    # Get by UUID
    obj = docs.query.fetch_object_by_id(det_uuid)
    print(obj.properties["title"])
    # Update (partial update: only the specified fields are changed)
    docs.data.update(uuid=det_uuid, properties={"page": 2})
    # Replace (full overwrite)
    docs.data.replace(
        uuid=det_uuid,
        properties={"title": "BERT", "content": "Updated content.", "source": "bert.pdf", "page": 2},
    )
    # Delete by UUID
    docs.data.delete_by_id(uuid)
Output:
Inserted: a1b2c3d4-...
BERT: Pre-training of Deep Bidirectional Transformers
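The idea behind a deterministic UUID can be illustrated with the standard library alone. This is a sketch of the concept, not of generate_uuid5 itself: deterministic_id is a hypothetical name, the namespace is an arbitrary choice, and the IDs it produces are not guaranteed to match those from weaviate.util.generate_uuid5.

```python
import uuid


def deterministic_id(key: str) -> uuid.UUID:
    """Derive a stable UUIDv5 from a natural key (illustrative, stdlib only)."""
    # NAMESPACE_DNS is an arbitrary namespace picked for this sketch.
    return uuid.uuid5(uuid.NAMESPACE_DNS, key)


first = deterministic_id("bert.pdf-page1")
second = deterministic_id("bert.pdf-page1")
other = deterministic_id("bert.pdf-page2")
print(first == second)  # True: same key, same ID
print(first == other)   # False: different key, different ID
```

Deriving the ID from a natural key such as source file plus page lets a re-run of an import address the same object instead of creating a duplicate.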
Batch import
Batch import is essential for loading large datasets. batch.dynamic() auto-tunes concurrency and batch size.
import weaviate
from weaviate.util import generate_uuid5
documents = [
    {"title": f"Document {i}", "content": f"Content about topic {i}.", "source": f"doc{i}.pdf", "page": i}
    for i in range(500)
]
with weaviate.connect_to_local() as client:
    docs = client.collections.get("Document")
    with docs.batch.dynamic() as batch:
        for doc in documents:
            batch.add_object(
                properties=doc,
                uuid=generate_uuid5(doc["source"]),
            )
    # Check for failed objects once the batch context has exited
    if docs.batch.failed_objects:
        for failed in docs.batch.failed_objects:
            print(f"Failed: {failed.message}")
    else:
        print(f"Imported {len(documents)} objects")
Output:
Imported 500 objects
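Batching simply means sending many objects per request instead of one. The sketch below shows that grouping step in plain Python with a fixed chunk size. It is only an illustration: batch.dynamic() chooses its own sizes, and chunked is a hypothetical helper, not part of weaviate-client.

```python
from collections.abc import Iterable, Iterator
from itertools import islice


def chunked(items: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Yield successive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    # islice pulls up to `size` items; an empty list means the input is exhausted.
    while chunk := list(islice(iterator, size)):
        yield chunk


docs = [{"title": f"Document {i}"} for i in range(5)]
print([len(chunk) for chunk in chunked(docs, 2)])  # [2, 2, 1]
```

Because it works on any iterable, the same helper can stream rows from a file or database cursor without loading everything into memory first.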
Vector search
Weaviate supports three search modes: semantic (near_text / near_vector), keyword (BM25), and hybrid.
import weaviate
from weaviate.classes.query import MetadataQuery
with weaviate.connect_to_local() as client:
    docs = client.collections.get("Document")
    # --- Semantic search (near_text) ---
    # Weaviate vectorises the query using the collection's vectorizer
    results = docs.query.near_text(
        query="self-attention mechanism in neural networks",
        limit=3,
        return_metadata=MetadataQuery(distance=True, score=True),
    )
    for obj in results.objects:
        print(f"[{obj.metadata.distance:.3f}] {obj.properties['title']}")
    print()
    # --- BM25 keyword search ---
    results = docs.query.bm25(
        query="transformer architecture",
        query_properties=["title", "content"],
        limit=3,
        return_metadata=MetadataQuery(score=True),
    )
    for obj in results.objects:
        print(f"[score={obj.metadata.score:.3f}] {obj.properties['title']}")
Output:
[0.082] Introduction to Transformers
[0.104] Attention Is All You Need
[0.131] BERT: Pre-training of Deep Bidirectional Transformers
[score=2.841] Introduction to Transformers
[score=2.234] Attention Is All You Need
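The bracketed numbers in the semantic results are distances, so smaller means closer. With the cosine metric configured for the Document collection, the distance is one minus the cosine similarity, which ranges from 0 (same direction) to 2 (opposite direction). A minimal pure-Python sketch of that calculation, using toy two-dimensional vectors rather than real embeddings:

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


print(cosine_distance([1.0, 0.0], [1.0, 0.0]))   # 0.0 (identical direction)
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))   # 1.0 (orthogonal)
print(cosine_distance([1.0, 0.0], [-1.0, 0.0]))  # 2.0 (opposite direction)
```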
Hybrid search
Hybrid search combines BM25 and vector similarity using a weighted fusion algorithm. It improves recall for short or keyword-heavy queries compared to pure vector search.
import weaviate
from weaviate.classes.query import MetadataQuery, HybridFusion
with weaviate.connect_to_local() as client:
    docs = client.collections.get("Document")
    results = docs.query.hybrid(
        query="transformer attention",
        alpha=0.75,  # 0 = pure BM25, 1 = pure vector, 0.75 = mostly semantic
        fusion_type=HybridFusion.RELATIVE_SCORE,
        limit=5,
        return_metadata=MetadataQuery(score=True),
    )
    for obj in results.objects:
        print(f"[hybrid={obj.metadata.score:.4f}] {obj.properties['title']}")
Output:
[hybrid=0.9873] Introduction to Transformers
[hybrid=0.9741] Attention Is All You Need
[hybrid=0.8934] BERT: Pre-training of Deep Bidirectional Transformers
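The role of alpha is easier to see in a toy model. The sketch below is a simplified illustration of relative-score fusion: each result list is min-max normalised to the 0..1 range and the two are blended with alpha. It is not Weaviate's implementation. The function names, the treatment of missing results as 0, and the choice to map equal scores to 1.0 are assumptions of this sketch, and the vector scores are taken to be similarities (higher is better).

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to the 0..1 range (all-equal scores map to 1.0)."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {key: 1.0 for key in scores}
    return {key: (value - low) / (high - low) for key, value in scores.items()}


def relative_score_fusion(
    vector_scores: dict[str, float],
    keyword_scores: dict[str, float],
    alpha: float,
) -> list[tuple[str, float]]:
    """Blend normalised vector and keyword scores; alpha=1 is pure vector."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    vec, kw = min_max(vector_scores), min_max(keyword_scores)
    fused = {
        key: alpha * vec.get(key, 0.0) + (1.0 - alpha) * kw.get(key, 0.0)
        for key in set(vec) | set(kw)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


vector = {"a": 1.0, "b": 0.5, "c": 0.0}  # toy similarity scores
keyword = {"b": 4.0, "c": 2.0}           # toy BM25 scores; "a" has no keyword hit
print(relative_score_fusion(vector, keyword, alpha=0.75))
# [('a', 0.75), ('b', 0.625), ('c', 0.0)]
```

Lowering alpha towards 0 shifts the ranking towards the keyword scores, so "b" would overtake "a" in this toy data.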
Filtered search
Weaviate supports structured metadata filters that can be combined with any search type.
import weaviate
from weaviate.classes.query import Filter, MetadataQuery
with weaviate.connect_to_local() as client:
    docs = client.collections.get("Document")
    # Vector search + metadata filter
    results = docs.query.near_text(
        query="language model pre-training",
        limit=5,
        filters=Filter.by_property("source").equal("bert.pdf"),
        return_metadata=MetadataQuery(distance=True),
    )
    # Compound filters: combine conditions with & (AND) or | (OR)
    results_compound = docs.query.near_text(
        query="attention",
        limit=5,
        filters=(
            Filter.by_property("page").greater_than(0) &
            Filter.by_property("source").contains_any(["bert.pdf", "paper.pdf"])
        ),
    )
    # Print the results of the first (single-filter) query
    for obj in results.objects:
        print(f"{obj.properties['title']}, {obj.properties['source']}")
Output:
BERT: Pre-training of Deep Bidirectional Transformers — bert.pdf
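Combining filters with & and | relies on Python operator overloading. The toy sketch below mimics that composition over plain dictionaries to show how the two conditions in the compound example narrow a result set. Pred, greater_than, and contains_any are hypothetical names for this illustration, not Weaviate classes, and contains_any is simplified here to exact matching.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Pred:
    """Hypothetical predicate wrapper that supports & (AND) and | (OR)."""

    test: Callable[[dict[str, Any]], bool]

    def __and__(self, other: Pred) -> Pred:
        return Pred(lambda obj: self.test(obj) and other.test(obj))

    def __or__(self, other: Pred) -> Pred:
        return Pred(lambda obj: self.test(obj) or other.test(obj))

    def __call__(self, obj: dict[str, Any]) -> bool:
        return self.test(obj)


def greater_than(prop: str, value: float) -> Pred:
    return Pred(lambda obj: obj.get(prop) is not None and obj[prop] > value)


def contains_any(prop: str, values: list[str]) -> Pred:
    allowed = set(values)  # simplification: exact match only
    return Pred(lambda obj: obj.get(prop) in allowed)


rows = [
    {"source": "bert.pdf", "page": 1},
    {"source": "other.pdf", "page": 3},
    {"source": "paper.pdf", "page": 0},
]
flt = greater_than("page", 0) & contains_any("source", ["bert.pdf", "paper.pdf"])
print([row["source"] for row in rows if flt(row)])  # ['bert.pdf']
```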
Bringing your own vectors
If you generate embeddings externally (e.g. with sentence-transformers or a hosted embedding API), skip the vectorizer and pass vectors directly.
import weaviate
import numpy as np
from weaviate.classes.config import Configure, Property, DataType
def embed(texts: list[str]) -> list[list[float]]:
    """Stub: replace with your embedding model."""
    return [np.random.rand(1536).tolist() for _ in texts]
with weaviate.connect_to_local() as client:
    # Collection with NO vectorizer (stated explicitly)
    if not client.collections.exists("ManualVec"):
        client.collections.create(
            name="ManualVec",
            vectorizer_config=Configure.Vectorizer.none(),
            properties=[
                Property(name="text", data_type=DataType.TEXT),
            ],
        )
    coll = client.collections.get("ManualVec")
    texts = ["Transformers use self-attention.", "BERT is bidirectional."]
    vectors = embed(texts)
    with coll.batch.dynamic() as batch:
        for text, vec in zip(texts, vectors):
            batch.add_object(properties={"text": text}, vector=vec)
    # Query with an externally generated vector
    query_vec = embed(["What is attention?"])[0]
    results = coll.query.near_vector(near_vector=query_vec, limit=2)
    for obj in results.objects:
        print(obj.properties["text"])
Output (order may vary, because the stub returns random vectors):
Transformers use self-attention.
BERT is bidirectional.
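Because the embed stub above returns random vectors, repeated runs give different rankings. For tests it can help to swap in a deterministic stand-in. The sketch below derives a vector from a hash of the text. fake_embed is a hypothetical helper, and its vectors carry no semantic meaning, so it is only suitable for checking plumbing, not search quality.

```python
import hashlib


def fake_embed(text: str, dim: int = 8) -> list[float]:
    """Deterministic stand-in for an embedding model (no semantic meaning)."""
    if dim < 1:
        raise ValueError("dim must be at least 1")
    values: list[float] = []
    counter = 0
    while len(values) < dim:
        # Hash the text with a counter so any number of dimensions can be filled.
        digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
        # Map each byte (0..255) to a float in the range -1.0..1.0.
        values.extend(byte / 127.5 - 1.0 for byte in digest)
        counter += 1
    return values[:dim]


same = fake_embed("What is attention?") == fake_embed("What is attention?")
print(same)                                 # True: same text, same vector
print(len(fake_embed("BERT", dim=1536)))    # 1536
```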
Generative search (RAG)
Weaviate can pass retrieved objects directly to an LLM and return a synthesised answer — one round trip to the server does both retrieval and generation.
import weaviate
with weaviate.connect_to_local() as client:
    docs = client.collections.get("Document")  # must have generative_config set
    # Grouped task: the LLM receives ALL retrieved objects and generates one answer
    response = docs.generate.near_text(
        query="transformer attention mechanism",
        limit=3,
        grouped_task="Summarise the key ideas about attention in transformers.",
    )
    print(response.generated)
    # Single prompt: the LLM generates a response per retrieved object
    response = docs.generate.near_text(
        query="BERT pre-training",
        limit=2,
        single_prompt="Explain this document in one sentence: {content}",
    )
    for obj in response.objects:
        print(f"{obj.properties['title']}: {obj.generated}")
Output:
Attention in transformers computes weighted sums over input tokens using query-key similarity, enabling models to focus on relevant parts of the input simultaneously.
BERT: Pre-training uses masked language modelling to learn bidirectional representations.
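In the single_prompt string above, {content} is a placeholder that is filled with each retrieved object's content property before the prompt reaches the LLM. Weaviate does this substitution on the server. The sketch below only illustrates the idea locally: render_prompt is a hypothetical helper, and raising KeyError for an unknown property is a choice of this sketch, not documented server behaviour.

```python
import re

PLACEHOLDER = re.compile(r"\{(\w+)\}")


def render_prompt(template: str, properties: dict[str, object]) -> str:
    """Fill {property} placeholders in a prompt from one object's properties."""

    def substitute(match: re.Match[str]) -> str:
        name = match.group(1)
        if name not in properties:
            raise KeyError(f"object has no property {name!r}")
        return str(properties[name])

    return PLACEHOLDER.sub(substitute, template)


doc = {
    "title": "BERT",
    "content": "BERT uses masked language modelling for pre-training.",
}
print(render_prompt("Explain this document in one sentence: {content}", doc))
# Explain this document in one sentence: BERT uses masked language modelling for pre-training.
```

Rendering a prompt locally like this is a cheap way to check that the template references property names that actually exist in the collection.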
Multi-tenancy
Multi-tenancy isolates data between tenants (e.g. customers) inside a single collection — no separate collections or clusters needed.
import weaviate
from weaviate.classes.config import Configure, Property, DataType
from weaviate.classes.tenants import Tenant
with weaviate.connect_to_local() as client:
    # Create a multi-tenant collection
    client.collections.create(
        name="TenantDocs",
        multi_tenancy_config=Configure.multi_tenancy(enabled=True),
        properties=[Property(name="text", data_type=DataType.TEXT)],
    )
    coll = client.collections.get("TenantDocs")
    # Add tenants
    coll.tenants.create([Tenant(name="acme"), Tenant(name="globex")])
    # Insert into a specific tenant
    acme = coll.with_tenant("acme")
    acme.data.insert({"text": "ACME internal document."})
    globex = coll.with_tenant("globex")
    globex.data.insert({"text": "Globex internal document."})
    # Queries are tenant-scoped: acme cannot see other tenants' data
    results = acme.query.fetch_objects(limit=5)
    for obj in results.objects:
        print(obj.properties["text"])
Output:
ACME internal document.
LangChain integration
from langchain_weaviate.vectorstores import WeaviateVectorStore
from langchain_openai import OpenAIEmbeddings
import weaviate, os
client = weaviate.connect_to_local(
    headers={"X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"]}
)
embeddings = OpenAIEmbeddings(api_key=os.environ["OPENAI_API_KEY"])
vectorstore = WeaviateVectorStore(
    client=client,
    index_name="LangChainDocs",
    text_key="text",
    embedding=embeddings,
)
vectorstore.add_texts(
    texts=["Transformers use self-attention.", "BERT is bidirectional."],
    metadatas=[{"source": "paper"}, {"source": "bert"}],
)
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
docs = retriever.invoke("What is self-attention?")
for doc in docs:
    print(doc.page_content)
client.close()  # release the connection once the vector store is no longer needed
Quick reference
| Task | Code |
|---|---|
| Connect local | weaviate.connect_to_local() |
| Connect WCS | weaviate.connect_to_weaviate_cloud(cluster_url=..., auth_credentials=AuthApiKey(...)) |
| Embedded | weaviate.connect_to_embedded() |
| Create collection | client.collections.create(name=..., vectorizer_config=..., properties=[...]) |
| Get collection | client.collections.get("Name") |
| Insert one | coll.data.insert({"field": "value"}) |
| Batch import | with coll.batch.dynamic() as batch: batch.add_object(...) |
| Semantic search | coll.query.near_text(query=..., limit=n) |
| Vector search | coll.query.near_vector(near_vector=[...], limit=n) |
| Keyword search | coll.query.bm25(query=..., limit=n) |
| Hybrid search | coll.query.hybrid(query=..., alpha=0.75, limit=n) |
| Filter | Filter.by_property("field").equal("value") |
| Generative search | coll.generate.near_text(query=..., grouped_task="Summarise...") |
| Delete object | coll.data.delete_by_id(uuid) |
| Multi-tenancy | coll.with_tenant("name").data.insert(...) |
| Close | client.close() or with ... as client: |