ChromaDB is an open-source vector database designed to make it easy to build AI applications with embeddings. This guide shows you how to integrate ChromaDB into your Cycls agent to build a Retrieval-Augmented Generation (RAG) workflow. You will learn how to:
  1. Add ChromaDB as a dependency.
  2. Store and query document embeddings.
  3. Retrieve context to use in your agent’s response.

Prerequisites

  • Python 3.8+
  • cycls package installed
  • Docker installed (for local testing)
  • OpenAI API key
Install the cycls package if you haven't already:
pip install cycls

Step 1: Import Cycls

Create a new file called agent.py and import the cycls package:
import cycls

Step 2: Initialize the Agent

Initialize your agent and specify chromadb and openai as dependencies.
# Initialize the agent with dependencies
agent = cycls.Agent(pip=["chromadb", "openai"])

Step 3: Define the Agent Logic

Use the @agent decorator to register your async handler function. We’ll configure ChromaDB to use OpenAI’s embedding model instead of the default local model (which would require downloading large files).
@agent("chroma-agent", title="RAG Agent")
async def search_agent(context):
    import chromadb
    from chromadb.utils import embedding_functions
    
    # 1. Setup OpenAI Embedding Function (Get your OpenAI API key from https://platform.openai.com)
    openai_ef = embedding_functions.OpenAIEmbeddingFunction(
        api_key="YOUR_OPENAI_KEY", # Replace with your actual key
        model_name="text-embedding-3-small"
    )
    
    # 2. Initialize ChromaDB client
    client = chromadb.Client()
    
    # 3. Create/Get collection with specific embedding function
    collection = client.get_or_create_collection(
        name="docs", 
        embedding_function=openai_ef
    )

    # 4. Add documents (embeddings are generated automatically via OpenAI)
    collection.add(
        documents=["I love cats", "I love dogs", "The weather is nice"],
        ids=["1", "2", "3"]
    )

    # 5. Get user query
    query = context.messages[-1]["content"]

    # 6. Perform similarity search
    results = collection.query(
        query_texts=[query], 
        n_results=1
    )
    
    # 7. Return the retrieved context
    retrieved_doc = results['documents'][0][0]
    yield f"Found context: {retrieved_doc}"
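
Under the hood, collection.query embeds the query text with the same embedding function and ranks stored documents by vector similarity. To build intuition for what that ranking does, here is a self-contained sketch of cosine similarity over toy vectors — the 3-dimensional vectors below are made-up illustrations, not real embeddings (real OpenAI embeddings have over a thousand dimensions):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" for the three documents (illustrative values only)
docs = {
    "I love cats": [0.9, 0.1, 0.0],
    "I love dogs": [0.6, 0.5, 0.0],
    "The weather is nice": [0.0, 0.1, 0.9],
}
query_vec = [0.85, 0.2, 0.05]  # pretend embedding of "tell me about cats"

# Rank documents by similarity to the query, like collection.query does
best = max(docs, key=lambda d: cosine_similarity(docs[d], query_vec))
print(best)  # -> "I love cats"
```

With real embeddings the principle is the same: documents whose vectors point in a similar direction to the query vector are returned first.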

Step 4: Deploy the Agent

Finally, add the deployment command. We’ll use prod=False to run it locally first.
# Run locally
agent.deploy(prod=False)

Full Code

Here is the complete agent.py file:
import cycls

# Initialize the agent with dependencies
agent = cycls.Agent(pip=["chromadb", "openai"])

@agent("chroma-agent", title="RAG Agent")
async def search_agent(context):
    import chromadb
    from chromadb.utils import embedding_functions
    
    # Setup OpenAI Embedding Function
    openai_ef = embedding_functions.OpenAIEmbeddingFunction(
        api_key="YOUR_OPENAI_KEY", # Replace with your actual key
        model_name="text-embedding-3-small"
    )
    
    # Initialize ChromaDB client
    client = chromadb.Client()
    
    # Create collection with the embedding function
    collection = client.get_or_create_collection(
        name="docs", 
        embedding_function=openai_ef
    )

    # Add documents to the collection
    collection.add(
        documents=["I love cats", "I love dogs", "The weather is nice"],
        ids=["1", "2", "3"]
    )

    # Query using the latest message
    query = context.messages[-1]["content"]
    results = collection.query(query_texts=[query], n_results=1)

    # Return retrieved context
    retrieved_doc = results['documents'][0][0]
    yield f"Context: {retrieved_doc}"

# Run locally
agent.deploy(prod=False)

Step 5: Run the Agent

Execute your agent script:
python agent.py
Cycls will build a local Docker image and start your agent. You can then chat with it to test the semantic search functionality.
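
The example agent yields the retrieved document directly. In a fuller RAG workflow, you would typically combine the retrieved context with the user's question into a prompt for a chat model inside the handler. A minimal sketch of that prompt assembly — the function name and template here are illustrative, not part of the Cycls or ChromaDB APIs:

```python
def build_rag_prompt(query: str, context_docs: list) -> str:
    # Join the retrieved documents into a context block for the model
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

# Example: pass the ChromaDB results (results['documents'][0]) as context_docs
prompt = build_rag_prompt("Do you like cats?", ["I love cats"])
print(prompt)
```

You could then send this prompt to OpenAI's chat completion API (the openai dependency is already declared on the agent) and yield the model's answer instead of the raw document.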