Uploading to a Qdrant Vector DB for AI Chatbot Retrieval Frameworks

Github Example: https://github.com/libraryofcelsus/Basic-Qdrant-Upload-and-Search-Example

Introduction

This tutorial walks through uploading data into a Qdrant vector database. We’ll use the Sentence Transformers library to generate the embeddings.

1. Create the Collection

A collection in Qdrant functions similarly to an index in databases like Pinecone. Before inserting vectors, we need to ensure a collection exists. Our script first checks whether a collection with the desired name already exists and, if not, creates one. The vector size of the collection is determined by the chosen Sentence Transformers embedding model; all-mpnet-base-v2, used here, produces 768-dimensional embeddings.

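    from qdrant_client import QdrantClient
    from qdrant_client.models import Distance, VectorParams
    from sentence_transformers import SentenceTransformer

    # Assumption: a Qdrant instance is running locally on its default port
    client = QdrantClient(url="http://localhost:6333")
    # Set Sentence Transformer Model
    model = SentenceTransformer('all-mpnet-base-v2')
    # Define the collection name
    collection_name = "SET_COLLECTION_NAME_HERE"
    # Create the collection only if fetching it fails (it does not exist yet)
    try:
        collection_info = client.get_collection(collection_name=collection_name)
    except Exception:
        client.create_collection(
            collection_name=collection_name,
            vectors_config=VectorParams(size=model.get_sentence_embedding_dimension(), distance=Distance.COSINE),
        )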
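The collection is created with Distance.COSINE, so search results are ranked by cosine similarity between the query vector and the stored vectors. As a rough illustration of what that metric computes (a plain-Python sketch on toy vectors, not how Qdrant implements it internally), the hypothetical helper below also shows why both vectors must have the same number of dimensions as the collection:

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors; real all-mpnet-base-v2 embeddings have 768 values.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

Vectors pointing the same way score close to 1 regardless of their length, and unrelated (orthogonal) vectors score 0, which is why cosine distance is a common choice for sentence embeddings.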

2. Vectorize Your Query for Upload

To store data in Qdrant, we first need to convert our textual data into numerical vectors (embeddings). The sentence transformers library helps us achieve this.

    # query is the text string you want to store
    embedding = model.encode([query])[0].tolist()

3. Set the Point ID

Every point in Qdrant needs a unique ID, which can be either an unsigned integer or a UUID. We’ll generate a random UUID and convert it to a string before passing it to the client.

    from uuid import uuid4
    unique_id = str(uuid4())
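A random UUID gives every upload a new point, even when the same text is uploaded twice. If you would rather have a repeated upload overwrite the earlier point, one option (an alternative, not what the original script does) is a deterministic UUID derived from the content. The sketch below compares the two approaches using only the standard library; the helper names and the "bot_name:text" key format are hypothetical.

```python
from uuid import NAMESPACE_URL, UUID, uuid4, uuid5

def random_point_id():
    """Return a new random ID on every call, as in the step above."""
    return str(uuid4())

def deterministic_point_id(bot_name, text):
    """Return the same ID whenever the same bot name and text are given."""
    # uuid5 hashes the namespace and the name, so equal inputs give equal UUIDs.
    return str(uuid5(NAMESPACE_URL, f"{bot_name}:{text}"))

first = deterministic_point_id("example_bot", "Hello there")
second = deterministic_point_id("example_bot", "Hello there")
other = deterministic_point_id("example_bot", "Something else")
```

Here first and second are identical while other differs, so upserting the same message again would update one point rather than add a second.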

4. Define Metadata for the Payload

Metadata enriches the information you store in Qdrant. Storing the time both as an integer and as a string lets you filter memories by time with a “Range” condition on the integer field, while the string stays readable for display. Classifiers such as memory_type narrow the results further.

    metadata = {
        'bot': bot_name,
        'time': timestamp,
        'message': query,
        'timestring': timestring,
        'memory_type': 'TYPE OF MEMORY',
    }
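The payload above leaves timestamp and timestring undefined. A minimal sketch of one way to produce them, assuming a Unix timestamp in UTC and a simple date format (both are assumptions, the original does not specify either), together with a filter written in the JSON shape that Qdrant’s REST API uses for range and match conditions. The helper names build_metadata and time_range_filter are hypothetical.

```python
from datetime import datetime, timezone

def build_metadata(bot_name, message, memory_type, now=None):
    """Build a payload with the time stored as an integer and as a string."""
    now = now or datetime.now(timezone.utc)
    return {
        'bot': bot_name,
        'time': int(now.timestamp()),  # integer, usable in a Range condition
        'message': message,
        'timestring': now.strftime('%Y-%m-%d %H:%M:%S'),  # human-readable
        'memory_type': memory_type,
    }

def time_range_filter(start, end, memory_type):
    """Filter for points in [start, end] with an exact memory_type match."""
    return {
        'must': [
            {'key': 'time', 'range': {'gte': int(start.timestamp()),
                                      'lte': int(end.timestamp())}},
            {'key': 'memory_type', 'match': {'value': memory_type}},
        ]
    }

moment = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
metadata = build_metadata('example_bot', 'Hello there', 'episodic', now=moment)
query_filter = time_range_filter(
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 2, 1, tzinfo=timezone.utc),
    'episodic',
)
```

In the Python client the same conditions are normally expressed with the Filter, FieldCondition, Range, and MatchValue classes from qdrant_client.models rather than a raw dictionary.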

5. Upsert to Qdrant DB

Finally, we’ll upsert (insert or update) our vector and its metadata to the specified collection in Qdrant.

    # PointStruct is imported from qdrant_client.models
    client.upsert(collection_name=collection_name,
                  points=[PointStruct(id=unique_id, payload=metadata, vector=embedding)])