cheat sheet

weaviate-client

Store, search, and manage vector embeddings with the Weaviate Python client. Covers collections, CRUD, vector/hybrid/BM25 search, multi-tenancy, generative search, and batch import.

weaviate-client — Vector Database Client

What it is

Weaviate is an open-source vector database that stores objects alongside their embeddings and enables semantic search, keyword (BM25) search, and hybrid queries in a single request. The weaviate-client Python library (v4 API) connects to a local Weaviate instance, Weaviate Cloud Services (WCS), or an embedded in-process Weaviate, and provides a class-based, schema-first API for defining collections, inserting data, and querying by vector similarity or filter. Weaviate is designed for production-scale RAG pipelines and supports multi-tenancy, replication, and generative search natively.

Install

bash
pip install weaviate-client           # v4 client (recommended)

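Output: pip prints download and install progress, ending with a "Successfully installed weaviate-client-..." line, and exits 0 on success.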

Quick example

python
import weaviate
from weaviate.classes.config import Configure, Property, DataType

# Connect to a local Weaviate instance
client = weaviate.connect_to_local()

# Create a collection
client.collections.create(
    name="Article",
    vectorizer_config=Configure.Vectorizer.text2vec_openai(),
    properties=[
        Property(name="title",   data_type=DataType.TEXT),
        Property(name="content", data_type=DataType.TEXT),
    ],
)

articles = client.collections.get("Article")

# Insert an object — Weaviate auto-vectorises via the configured vectorizer
uuid = articles.data.insert({
    "title":   "Attention Is All You Need",
    "content": "The transformer architecture replaces recurrence with self-attention.",
})
print(f"Inserted: {uuid}")

# Semantic (vector) search
results = articles.query.near_text(query="neural network attention", limit=3)
for obj in results.objects:
    print(obj.properties["title"])

client.close()

Output:

text
Inserted: 3f7a1b2c-...
Attention Is All You Need

When / why to use it

  • Production-scale RAG pipelines where you need filtered vector search (combine semantic similarity with metadata conditions).
  • Multi-tenant SaaS applications — Weaviate's built-in multi-tenancy isolates data per customer in one cluster.
  • Hybrid search — combine vector similarity with BM25 keyword scoring in a single query for better recall on short or keyword-heavy questions.
  • Generative search — ask Weaviate to pass retrieved objects directly to an LLM and return a synthesised answer in one API call.
  • When you need replication, backups, and enterprise features on top of a vector store.

Common pitfalls

Always call client.close() — the v4 client manages gRPC connections. Not closing it causes resource leaks in long-running processes. Use with weaviate.connect_to_local() as client: to close automatically.

Changing existing properties requires collection deletion: once a collection is created, the names and data types of its existing properties are fixed. You can add new properties later, but renaming a property or changing its type means deleting and recreating the collection, which deletes all of its data. Design the schema carefully before inserting production data.

The vectorizer needs an API key: if you create a collection with text2vec_openai(), Weaviate calls the OpenAI embedding API for every inserted object. Set the OPENAI_APIKEY environment variable on the Weaviate server, or pass the key from the client via headers={"X-OpenAI-Api-Key": "..."} when connecting (for example in connect_to_local).

Use batch.dynamic() for bulk imports — it auto-tunes batch size and parallelism. Inserting objects one at a time via data.insert() is up to 100× slower for large datasets.

The near_text shorthand only works when the collection has a configured vectorizer. If you manage embeddings externally, use near_vector instead and pass the embedding array directly.
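
The first pitfall above (always closing the client) comes down to ordinary context-manager behaviour. The sketch below uses a hypothetical `StubClient` class, not the real Weaviate client, purely to show that a `with` block calls `close()` even when the body raises.

```python
class StubClient:
    """Hypothetical stand-in for a client object; not part of weaviate-client."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        # A real client would release its network connections here.
        self.closed = True

    def __enter__(self) -> "StubClient":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.close()
        return False  # do not swallow the exception


client = StubClient()
try:
    with client:
        raise RuntimeError("query failed")
except RuntimeError:
    pass

print(client.closed)  # True: close() ran despite the error
```

With a bare `client.close()` at the end of a script, the same exception would skip the call unless it sits in a `try/finally`.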

Connecting to Weaviate

The v4 client provides convenience functions for the three most common deployment modes.

python
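import os
import weaviate

# Pick ONE of the connection styles below. Each call opens its own connection,
# so running them all in sequence leaves the earlier clients unclosed.

# Local Docker or bare-metal instance (default: localhost:8080)
client = weaviate.connect_to_local()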

# Local with custom host/port and API keys for external services
client = weaviate.connect_to_local(
    host="192.168.1.10",
    port=8080,
    headers={"X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"]},
)

# Weaviate Cloud Services (WCS)
client = weaviate.connect_to_weaviate_cloud(
    cluster_url=os.environ["WEAVIATE_URL"],
    auth_credentials=weaviate.auth.AuthApiKey(api_key=os.environ["WEAVIATE_API_KEY"]),
    headers={"X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"]},
)

# Embedded (in-process Weaviate, no Docker needed — good for testing)
client = weaviate.connect_to_embedded(
    version="1.26.4",  # pin to a specific version
    headers={"X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"]},
)

print(client.is_ready())
client.close()

Output:

text
True

Defining collections

A collection is Weaviate's equivalent of a table or index. It defines the schema (properties and their types), the vectorizer, and optional replication/quantisation settings.

python
import weaviate
from weaviate.classes.config import Configure, Property, DataType, VectorDistances

with weaviate.connect_to_local() as client:
    client.collections.create(
        name="Document",
        description="RAG knowledge base documents",
        vectorizer_config=Configure.Vectorizer.text2vec_openai(
            model="text-embedding-3-small",
            vectorize_collection_name=False,
        ),
        vector_index_config=Configure.VectorIndex.hnsw(
            distance_metric=VectorDistances.COSINE,
            ef_construction=128,
            max_connections=64,
        ),
        generative_config=Configure.Generative.openai(model="gpt-4o-mini"),
        properties=[
            Property(name="title",    data_type=DataType.TEXT),
            Property(name="content",  data_type=DataType.TEXT),
            Property(name="source",   data_type=DataType.TEXT,  skip_vectorization=True),
            Property(name="page",     data_type=DataType.INT,   skip_vectorization=True),
            Property(name="published",data_type=DataType.DATE,  skip_vectorization=True),
        ],
    )
    print("Collection 'Document' created")

Output:

text
Collection 'Document' created

CRUD operations

python
import weaviate
from weaviate.util import generate_uuid5

with weaviate.connect_to_local() as client:
    docs = client.collections.get("Document")

    # Insert one object (auto-vectorised)
    uuid = docs.data.insert({
        "title":   "Introduction to Transformers",
        "content": "Transformers use multi-head self-attention to process sequences.",
        "source":  "paper.pdf",
        "page":    1,
    })
    print(f"Inserted: {uuid}")

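    # Insert with a deterministic UUID: the same key always maps to the same ID,
    # so inserting it a second time is rejected instead of creating a duplicate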
    det_uuid = docs.data.insert(
        properties={
            "title":   "BERT: Pre-training of Deep Bidirectional Transformers",
            "content": "BERT uses masked language modelling for pre-training.",
            "source":  "bert.pdf",
            "page":    1,
        },
        uuid=generate_uuid5("bert.pdf-page1"),
    )

    # Get by UUID
    obj = docs.query.fetch_object_by_id(det_uuid)
    print(obj.properties["title"])

    # Update (partial update — only specified fields are changed)
    docs.data.update(uuid=det_uuid, properties={"page": 2})

    # Replace (full overwrite)
    docs.data.replace(
        uuid=det_uuid,
        properties={"title": "BERT", "content": "Updated content.", "source": "bert.pdf", "page": 2},
    )

    # Delete by UUID
    docs.data.delete_by_id(uuid)

Output:

text
Inserted: a1b2c3d4-...
BERT: Pre-training of Deep Bidirectional Transformers
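
To see why a deterministic UUID makes duplicates detectable, here is a standard-library sketch of the idea. The helper name `deterministic_id` is hypothetical, and the output is not guaranteed to be identical to what `generate_uuid5` returns; the point is only that a UUIDv5 is a pure function of its input.

```python
import uuid


def deterministic_id(identifier: str, namespace: str = "") -> str:
    """Derive a stable UUIDv5 string from an identifier (illustrative helper)."""
    # uuid5 hashes the namespace UUID together with the name,
    # so equal input always produces equal output.
    return str(uuid.uuid5(uuid.NAMESPACE_DNS, namespace + identifier))


first = deterministic_id("bert.pdf-page1")
second = deterministic_id("bert.pdf-page1")
other = deterministic_id("bert.pdf-page2")

print(first == second)  # True: same key, same UUID
print(first == other)   # False: different key, different UUID
```

Choosing a key that identifies the object uniquely (here, source file plus page) is what keeps re-imports from producing duplicates.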

Batch import

Batch import is essential for loading large datasets. batch.dynamic() auto-tunes concurrency and batch size.

python
import weaviate
from weaviate.util import generate_uuid5

documents = [
    {"title": f"Document {i}", "content": f"Content about topic {i}.", "source": f"doc{i}.pdf", "page": i}
    for i in range(500)
]

with weaviate.connect_to_local() as client:
    docs = client.collections.get("Document")

    with docs.batch.dynamic() as batch:
        for doc in documents:
            batch.add_object(
                properties=doc,
                uuid=generate_uuid5(doc["source"]),
            )

    # Check for failed objects
    if docs.batch.failed_objects:
        for failed in docs.batch.failed_objects:
            print(f"Failed: {failed.message}")
    else:
        print(f"Imported {len(documents)} objects")

Output:

text
Imported 500 objects
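
For intuition about what batching buys you: grouping turns 500 single-object requests into a handful of multi-object requests. The sketch below is plain Python with a hypothetical `chunked` helper and an arbitrary batch size of 200; it illustrates the grouping only and is not how batch.dynamic() is implemented.

```python
from itertools import islice
from typing import Iterable, Iterator


def chunked(items: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


documents = [{"title": f"Document {i}"} for i in range(500)]
batch_sizes = [len(batch) for batch in chunked(documents, 200)]

print(batch_sizes)  # [200, 200, 100]
```

Three requests instead of 500 is the saving; the dynamic batcher additionally adjusts the size while the import runs.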

Vector and keyword search

Weaviate supports three search modes: semantic (near_text / near_vector), keyword (BM25), and hybrid.

python
import weaviate
from weaviate.classes.query import MetadataQuery

with weaviate.connect_to_local() as client:
    docs = client.collections.get("Document")

    # --- Semantic search (near_text) ---
    # Weaviate vectorises the query using the collection's vectorizer
    results = docs.query.near_text(
        query="self-attention mechanism in neural networks",
        limit=3,
        return_metadata=MetadataQuery(distance=True, score=True),
    )
    for obj in results.objects:
        print(f"[{obj.metadata.distance:.3f}] {obj.properties['title']}")

    print()

    # --- BM25 keyword search ---
    results = docs.query.bm25(
        query="transformer architecture",
        query_properties=["title", "content"],
        limit=3,
        return_metadata=MetadataQuery(score=True),
    )
    for obj in results.objects:
        print(f"[score={obj.metadata.score:.3f}] {obj.properties['title']}")

Output:

text
[0.082] Introduction to Transformers
[0.104] Attention Is All You Need
[0.131] BERT: Pre-training of Deep Bidirectional Transformers

[score=2.841] Introduction to Transformers
[score=2.234] Attention Is All You Need
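
The bracketed numbers in the first result list are distances, so smaller means closer. Assuming the collection uses the cosine metric (as the "Document" collection defined earlier does), the distance is 1 minus the cosine similarity. A plain-Python sketch, with a hypothetical helper name:

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance: 0 = same direction, 1 = orthogonal, 2 = opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


query = [1.0, 0.0]

print(cosine_distance(query, [2.0, 0.0]))   # 0.0 (same direction, length ignored)
print(cosine_distance(query, [0.0, 3.0]))   # 1.0 (orthogonal)
print(cosine_distance(query, [-1.0, 0.0]))  # 2.0 (opposite direction)
```

BM25 scores work the other way round: higher means a better keyword match.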

Hybrid search

Hybrid search combines BM25 and vector similarity using a weighted fusion algorithm. It improves recall for short or keyword-heavy queries compared to pure vector search.

python
import weaviate
from weaviate.classes.query import MetadataQuery, HybridFusion

with weaviate.connect_to_local() as client:
    docs = client.collections.get("Document")

    results = docs.query.hybrid(
        query="transformer attention",
        alpha=0.75,                          # 0 = pure BM25, 1 = pure vector, 0.75 = mostly semantic
        fusion_type=HybridFusion.RELATIVE_SCORE,
        limit=5,
        return_metadata=MetadataQuery(score=True),
    )

    for obj in results.objects:
        print(f"[hybrid={obj.metadata.score:.4f}] {obj.properties['title']}")

Output:

text
[hybrid=0.9873] Introduction to Transformers
[hybrid=0.9741] Attention Is All You Need
[hybrid=0.8934] BERT: Pre-training of Deep Bidirectional Transformers
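
To make the role of alpha concrete, here is a simplified plain-Python sketch of relative-score fusion: each result list is min-max normalised to the range 0 to 1, then combined as alpha × vector + (1 − alpha) × BM25. The function names and the sample scores are made up for illustration, and the server-side implementation may differ in detail.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Scale scores so the best is 1.0 and the worst is 0.0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def fuse(vector_scores: dict[str, float],
         bm25_scores: dict[str, float],
         alpha: float) -> list[tuple[str, float]]:
    """Weighted sum of normalised scores; a missing score counts as 0."""
    vec = min_max(vector_scores)
    kw = min_max(bm25_scores)
    fused = {
        doc_id: alpha * vec.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in set(vec) | set(kw)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores: "a" wins on vector similarity, "b" wins on keywords.
vector_scores = {"a": 0.9, "b": 0.8, "c": 0.5}
bm25_scores = {"b": 3.0, "c": 1.0}

print([doc_id for doc_id, _ in fuse(vector_scores, bm25_scores, alpha=0.75)])  # ['b', 'a', 'c']
print([doc_id for doc_id, _ in fuse(vector_scores, bm25_scores, alpha=1.0)])   # ['a', 'b', 'c']
```

At alpha=0.75 document "b" overtakes "a" because it scores well in both lists, which is the recall benefit hybrid search is after.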

Filtered search

Weaviate supports structured metadata filters that can be combined with any search type.

python
import weaviate
from weaviate.classes.query import Filter, MetadataQuery

with weaviate.connect_to_local() as client:
    docs = client.collections.get("Document")

    # Vector search + metadata filter
    results = docs.query.near_text(
        query="language model pre-training",
        limit=5,
        filters=Filter.by_property("source").equal("bert.pdf"),
        return_metadata=MetadataQuery(distance=True),
    )

    # Compound filters (AND / OR)
    results_compound = docs.query.near_text(
        query="attention",
        limit=5,
        filters=(
            Filter.by_property("page").greater_than(0) &
            Filter.by_property("source").contains_any(["bert.pdf", "paper.pdf"])
        ),
    )

    # Print the results of the first (single-filter) query
    for obj in results.objects:
        print(f"{obj.properties['title']} - {obj.properties['source']}")

Output:

text
BERT: Pre-training of Deep Bidirectional Transformers - bert.pdf

Bringing your own vectors

If you generate embeddings externally (e.g. with sentence-transformers or a hosted embedding API), skip the vectorizer and pass vectors directly.

python
import weaviate
import numpy as np
from weaviate.classes.config import Configure, DataType, Property

def embed(texts: list[str]) -> list[list[float]]:
    """Stub: returns random vectors. Replace with your embedding model."""
    return [np.random.rand(1536).tolist() for _ in texts]

with weaviate.connect_to_local() as client:
    # Collection with NO vectorizer: every vector is supplied by the caller
    if not client.collections.exists("ManualVec"):
        client.collections.create(
            name="ManualVec",
            vectorizer_config=Configure.Vectorizer.none(),
            properties=[Property(name="text", data_type=DataType.TEXT)],
        )

    coll = client.collections.get("ManualVec")
    texts = ["Transformers use self-attention.", "BERT is bidirectional."]
    vectors = embed(texts)

    with coll.batch.dynamic() as batch:
        for text, vec in zip(texts, vectors):
            batch.add_object(properties={"text": text}, vector=vec)

    # Query with an externally generated vector
    query_vec = embed(["What is attention?"])[0]
    results = coll.query.near_vector(near_vector=query_vec, limit=2)
    for obj in results.objects:
        print(obj.properties["text"])

Output (the order varies between runs because the stub returns random vectors):

text
Transformers use self-attention.
BERT is bidirectional.
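
Because the random stub makes search results meaningless, a deterministic stand-in can be handier for local testing. The sketch below is a toy hashed bag-of-words embedding built from the standard library; `hashed_embedding`, the 64-dimension size, and the punctuation handling are all illustrative choices, and it is no substitute for a real embedding model.

```python
import hashlib
import math


def hashed_embedding(text: str, dims: int = 64) -> list[float]:
    """Toy embedding: hash each token into one of `dims` buckets, then L2-normalise."""
    vec = [0.0] * dims
    for raw in text.lower().split():
        token = raw.strip(".,!?")
        if not token:
            continue
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity because the inputs are unit vectors."""
    return sum(x * y for x, y in zip(a, b))


doc = hashed_embedding("Transformers use self-attention.")
query = hashed_embedding("self-attention in transformers")

print(doc == hashed_embedding("Transformers use self-attention."))  # True: deterministic
print(cosine_similarity(query, doc) > 0)  # True: the texts share tokens
```

Texts that share words get overlapping buckets, so nearest-neighbour results become repeatable, though still far cruder than a learned embedding.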

Generative search (RAG)

Weaviate can pass retrieved objects directly to an LLM and return a synthesised answer — one round trip to the server does both retrieval and generation.

python
import weaviate

with weaviate.connect_to_local() as client:
    docs = client.collections.get("Document")   # must have generative_config set

    # Grouped task — LLM receives ALL retrieved objects and generates one answer
    response = docs.generate.near_text(
        query="transformer attention mechanism",
        limit=3,
        grouped_task="Summarise the key ideas about attention in transformers.",
    )
    print(response.generated)

    # Single prompt — LLM generates a response per retrieved object
    response = docs.generate.near_text(
        query="BERT pre-training",
        limit=2,
        single_prompt="Explain this document in one sentence: {content}",
    )
    for obj in response.objects:
        print(f"{obj.properties['title']}: {obj.generated}")

Output:

text
Attention in transformers computes weighted sums over input tokens using query-key similarity, enabling models to focus on relevant parts of the input simultaneously.
BERT: Pre-training uses masked language modelling to learn bidirectional representations.
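
The difference between the two modes is easiest to see in how the prompt is assembled. The following plain-Python sketch approximates it: a single prompt is a template filled once per object from its properties, while a grouped task is sent once together with all retrieved objects. The helper names and the exact layout of the grouped prompt are assumptions; what Weaviate actually sends to the LLM is internal to the server.

```python
import re


def render_single_prompt(template: str, properties: dict) -> str:
    """Fill {property} placeholders from one object's properties."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in properties:
            raise KeyError(f"object has no property {name!r}")
        return str(properties[name])

    return re.sub(r"\{(\w+)\}", substitute, template)


def render_grouped_prompt(task: str, objects: list[dict]) -> str:
    """One prompt containing the task plus every retrieved object (assumed layout)."""
    context = "\n".join(str(obj) for obj in objects)
    return f"{task}\n\n{context}"


objects = [
    {"title": "BERT", "content": "BERT uses masked language modelling."},
    {"title": "Transformers", "content": "Transformers use self-attention."},
]

# single_prompt: one LLM call per object
for obj in objects:
    print(render_single_prompt("Explain this document in one sentence: {content}", obj))

# grouped_task: one LLM call for the whole result set
print(render_grouped_prompt("Summarise the key ideas.", objects))
```

A placeholder that names a property the object lacks is treated as an error here, which is a convenient place to catch template typos early.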

Multi-tenancy

Multi-tenancy isolates data between tenants (e.g. customers) inside a single collection — no separate collections or clusters needed.

python
import weaviate
from weaviate.classes.config import Configure, Property, DataType
from weaviate.classes.tenants import Tenant

with weaviate.connect_to_local() as client:
    # Create a multi-tenant collection
    client.collections.create(
        name="TenantDocs",
        multi_tenancy_config=Configure.multi_tenancy(enabled=True),
        properties=[Property(name="text", data_type=DataType.TEXT)],
    )
    coll = client.collections.get("TenantDocs")

    # Add tenants
    coll.tenants.create([Tenant(name="acme"), Tenant(name="globex")])

    # Insert into a specific tenant
    acme = coll.with_tenant("acme")
    acme.data.insert({"text": "ACME internal document."})

    globex = coll.with_tenant("globex")
    globex.data.insert({"text": "Globex internal document."})

    # Query is tenant-scoped — can't see other tenants' data
    results = acme.query.fetch_objects(limit=5)
    for obj in results.objects:
        print(obj.properties["text"])

Output:

text
ACME internal document.

LangChain integration

python
from langchain_weaviate.vectorstores import WeaviateVectorStore
from langchain_openai import OpenAIEmbeddings
import weaviate, os

client = weaviate.connect_to_local(
    headers={"X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"]}
)
embeddings = OpenAIEmbeddings(api_key=os.environ["OPENAI_API_KEY"])

vectorstore = WeaviateVectorStore(
    client=client,
    index_name="LangChainDocs",
    text_key="text",
    embedding=embeddings,
)

vectorstore.add_texts(
    texts=["Transformers use self-attention.", "BERT is bidirectional."],
    metadatas=[{"source": "paper"}, {"source": "bert"}],
)

retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
docs = retriever.invoke("What is self-attention?")
for doc in docs:
    print(doc.page_content)

client.close()  # release the connection once the vector store is no longer needed

Quick reference

| Task | Code |
|---|---|
| Connect local | weaviate.connect_to_local() |
| Connect WCS | weaviate.connect_to_weaviate_cloud(cluster_url=..., auth_credentials=AuthApiKey(...)) |
| Embedded | weaviate.connect_to_embedded() |
| Create collection | client.collections.create(name=..., vectorizer_config=..., properties=[...]) |
| Get collection | client.collections.get("Name") |
| Insert one | coll.data.insert({"field": "value"}) |
| Batch import | with coll.batch.dynamic() as batch: batch.add_object(...) |
| Semantic search | coll.query.near_text(query=..., limit=n) |
| Vector search | coll.query.near_vector(near_vector=[...], limit=n) |
| Keyword search | coll.query.bm25(query=..., limit=n) |
| Hybrid search | coll.query.hybrid(query=..., alpha=0.75, limit=n) |
| Filter | Filter.by_property("field").equal("value") |
| Generative search | coll.generate.near_text(query=..., grouped_task="Summarise...") |
| Delete object | coll.data.delete_by_id(uuid) |
| Multi-tenancy | coll.with_tenant("name").data.insert(...) |
| Close | client.close() or with ... as client: |