ChromaDB is an open-source vector database designed to make it easy to build AI applications with embeddings. This guide shows you how to integrate ChromaDB into your Cycls agent to build a Retrieval-Augmented Generation (RAG) workflow. You will learn how to:
  1. Add ChromaDB as a dependency.
  2. Store and query document embeddings.
  3. Retrieve context to use in your agent’s response.

Prerequisites

  • Python 3.8+
  • cycls package installed
  • Docker installed (for local testing)
  • OpenAI API key
Install the cycls package if you haven't already:
pip install cycls

Step 1: Import Cycls

Create a new file called agent.py and import the cycls package:
import cycls

Step 2: Initialize the Agent

Initialize your agent and specify chromadb and openai as dependencies.
# Initialize the agent with dependencies
agent = cycls.Agent(pip=["chromadb", "openai"])

Step 3: Define the Agent Logic

Use the @agent decorator to register your async handler function. We’ll configure ChromaDB to use OpenAI’s embedding model instead of the default local model (which would require downloading large files).
@agent("chroma-agent", title="RAG Agent")
async def search_agent(context):
    import chromadb
    from chromadb.utils import embedding_functions
    
    # 1. Setup OpenAI Embedding Function (Get your OpenAI API key from https://platform.openai.com)
    openai_ef = embedding_functions.OpenAIEmbeddingFunction(
        api_key="YOUR_OPENAI_KEY", # Replace with your actual key
        model_name="text-embedding-3-small"
    )
    
    # 2. Initialize ChromaDB client
    client = chromadb.Client()
    
    # 3. Create/Get collection with specific embedding function
    collection = client.get_or_create_collection(
        name="docs", 
        embedding_function=openai_ef
    )

    # 4. Add documents (embeddings are generated automatically via OpenAI)
    collection.add(
        documents=["I love cats", "I love dogs", "The weather is nice"],
        ids=["1", "2", "3"]
    )

    # 5. Get user query
    query = context.messages[-1]["content"]

    # 6. Perform similarity search
    results = collection.query(
        query_texts=[query], 
        n_results=1
    )
    
    # 7. Return the retrieved context
    retrieved_doc = results['documents'][0][0]
    yield f"Found context: {retrieved_doc}"
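
Under the hood, collection.query embeds the query text with the same embedding function and ranks stored documents by vector similarity. To build intuition for what that ranking does, here is a self-contained sketch of cosine similarity over toy vectors — the 3-dimensional vectors below are made-up illustrations, not real embeddings (real OpenAI embeddings have over a thousand dimensions):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" for the three documents (illustrative values only)
docs = {
    "I love cats": [0.9, 0.1, 0.0],
    "I love dogs": [0.6, 0.5, 0.0],
    "The weather is nice": [0.0, 0.1, 0.9],
}
query_vec = [0.85, 0.2, 0.05]  # pretend embedding of "tell me about cats"

# Rank documents by similarity to the query, like collection.query does
best = max(docs, key=lambda d: cosine_similarity(docs[d], query_vec))
print(best)  # -> "I love cats"
```

With real embeddings the principle is the same: documents whose vectors point in a similar direction to the query vector are returned first.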

Step 4: Deploy the Agent

Finally, add the deployment command. We’ll use prod=False to run it locally first.
# Run locally
agent.deploy(prod=False)

Full Code

Here is the complete agent.py file:
import cycls

# Initialize the agent with dependencies
agent = cycls.Agent(pip=["chromadb", "openai"])

@agent("chroma-agent", title="RAG Agent")
async def search_agent(context):
    import chromadb
    from chromadb.utils import embedding_functions
    
    # Setup OpenAI Embedding Function
    openai_ef = embedding_functions.OpenAIEmbeddingFunction(
        api_key="YOUR_OPENAI_KEY", # Replace with your actual key
        model_name="text-embedding-3-small"
    )
    
    # Initialize ChromaDB client
    client = chromadb.Client()
    
    # Create collection with the embedding function
    collection = client.get_or_create_collection(
        name="docs", 
        embedding_function=openai_ef
    )

    # Add documents to the collection
    collection.add(
        documents=["I love cats", "I love dogs", "The weather is nice"],
        ids=["1", "2", "3"]
    )

    # Query using the latest message
    query = context.messages[-1]["content"]
    results = collection.query(query_texts=[query], n_results=1)

    # Return retrieved context
    retrieved_doc = results['documents'][0][0]
    yield f"Context: {retrieved_doc}"

# Run locally
agent.deploy(prod=False)

Step 5: Run the Agent

Execute your agent script:
python agent.py
Cycls will build a local Docker image and start your agent. You can then chat with it to test the semantic search functionality.
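
The example agent yields the retrieved document directly. In a fuller RAG workflow, you would typically combine the retrieved context with the user's question into a prompt for a chat model inside the handler. A minimal sketch of that prompt assembly — the function name and template here are illustrative, not part of the Cycls or ChromaDB APIs:

```python
def build_rag_prompt(query: str, context_docs: list) -> str:
    # Join the retrieved documents into a context block for the model
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

# Example: pass the ChromaDB results (results['documents'][0]) as context_docs
prompt = build_rag_prompt("Do you like cats?", ["I love cats"])
print(prompt)
```

You could then send this prompt to OpenAI's chat completion API (the openai dependency is already declared on the agent) and yield the model's answer instead of the raw document.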