Github Example: https://github.com/libraryofcelsus/Basic-Qdrant-Upload-and-Search-Example
Introduction #
This tutorial walks through uploading data into the Qdrant vector database, using the sentence-transformers library to generate the embeddings.
1. Create the Collection #
A collection in Qdrant functions similarly to an index in databases like Pinecone. Before inserting vectors, we need to ensure a collection exists. Our script will first check if a collection with the desired name already exists, and if not, it will create one. The dimensions of the collection are determined by the chosen embedding model from sentence transformers.
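from qdrant_client import QdrantClient, models
from sentence_transformers import SentenceTransformer

# Connect to Qdrant (assumes a local instance on the default port; adjust as needed)
client = QdrantClient(url="http://localhost:6333")
# Set Sentence Transformer Model
model = SentenceTransformer('all-mpnet-base-v2')
# Define the collection name
collection_name = "SET_COLLECTION_NAME_HERE"
# Create the collection only if looking it up fails
try:
    collection_info = client.get_collection(collection_name=collection_name)
except Exception:
    client.create_collection(
        collection_name=collection_name,
        vectors_config=models.VectorParams(
            size=model.get_sentence_embedding_dimension(),
            distance=models.Distance.COSINE,
        ),
    )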
2. Vectorize Your Query for Upload #
To store data in Qdrant, we first need to convert our textual data into numerical vectors (embeddings). The sentence transformers library helps us achieve this.
embedding = model.encode([query])[0].tolist()
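The collection in step 1 was created with cosine distance, so it may help to see what that metric actually measures. The following is a minimal pure-Python sketch for illustration only; the function name `cosine_similarity` and the toy 3-dimensional vectors are assumptions made for this example, and Qdrant computes the comparison internally rather than through code like this.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (hypothetical values)
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

Vectors pointing in the same direction score close to 1 regardless of their length, while unrelated (orthogonal) vectors score close to 0. The length check also mirrors why the collection's vector size has to match the embedding model's output dimension.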
3. Set the Point ID #
Every point in Qdrant needs a unique ID, which can be either an unsigned integer or a UUID. Here we generate a random UUID and convert it to a string for compatibility.
from uuid import uuid4
unique_id = str(uuid4())
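A random UUID gives every upload a new point, even when the same text is uploaded twice. As an alternative, and not something the original script does, the ID could be derived from the text itself so that repeated uploads of the same text reuse the same ID. The sketch below uses only the standard library; the helper names and the choice of `NAMESPACE_URL` as the namespace are assumptions for illustration.

```python
from uuid import NAMESPACE_URL, uuid4, uuid5


def random_point_id():
    """Random ID: a new point is created on every upload."""
    return str(uuid4())


def deterministic_point_id(text, namespace=NAMESPACE_URL):
    """Stable ID: the same text always maps to the same UUID."""
    return str(uuid5(namespace, text))


first = deterministic_point_id("Hello, world")
second = deterministic_point_id("Hello, world")
other = deterministic_point_id("Something else")
```

Because an upsert overwrites a point whose ID already exists, a deterministic ID turns a repeated upload into an update instead of a duplicate. Whether that is desirable depends on the use case.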
4. Define Metadata for the Payload #
Metadata enriches the information you store in Qdrant. Storing the time twice, once as a number and once as a human-readable string, lets you search memories with a "Range" filter on the numeric field while keeping a readable date in the payload. Including classifiers like memory_type gives you another field to filter on.
metadata = {
    'bot': bot_name,
    'time': timestamp,
    'message': query,
    'timestring': timestring,
    'memory_type': 'TYPE OF MEMORY',
}
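The snippet above relies on `bot_name`, `timestamp`, and `timestring` being defined elsewhere. One possible way to produce them with the standard library is sketched below; the helper name `build_metadata`, the UTC time zone, the date format, and the example values are all assumptions rather than part of the original script.

```python
import time
from datetime import datetime, timezone


def build_metadata(message, bot_name, memory_type):
    """Build a payload dict with the time stored as an integer and a string."""
    timestamp = int(time.time())  # seconds since the Unix epoch
    timestring = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime(
        "%Y-%m-%d %H:%M:%S UTC"
    )
    return {
        "bot": bot_name,
        "time": timestamp,
        "message": message,
        "timestring": timestring,
        "memory_type": memory_type,
    }


# Hypothetical example values
metadata = build_metadata("Hello, world", "ExampleBot", "EXAMPLE_TYPE")
```

Deriving the string from the same integer keeps the two fields consistent with each other.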
5. Upsert to Qdrant DB #
Finally, we’ll upsert (insert or update) our vector and its metadata to the specified collection in Qdrant.
from qdrant_client.models import PointStruct
client.upsert(collection_name=collection_name,
              points=[PointStruct(id=unique_id, payload=metadata, vector=embedding)])
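Once points are uploaded, the numeric time field from step 4 can be used in a "Range" filter at search time. The sketch below builds such a filter as a plain dictionary in the JSON shape described in Qdrant's filtering documentation (linked below). The helper name `time_range_filter` and the example values are assumptions; in the Python client, the same structure can be expressed with the `models.Filter`, `models.FieldCondition`, `models.Range`, and `models.MatchValue` classes.

```python
import time


def time_range_filter(start, end, memory_type=None):
    """Build a filter matching points whose 'time' lies in [start, end]."""
    conditions = [{"key": "time", "range": {"gte": start, "lte": end}}]
    if memory_type is not None:
        # Optionally narrow the results to one memory_type classifier
        conditions.append({"key": "memory_type", "match": {"value": memory_type}})
    return {"must": conditions}


# Example: memories from the last 24 hours with a hypothetical type label
now = int(time.time())
last_day_filter = time_range_filter(now - 86400, now, memory_type="EXAMPLE_TYPE")
```

All conditions under "must" have to match, so this filter returns only points that fall inside the time window and carry the given memory_type.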
Documentation Links
https://github.com/qdrant/qdrant-client
https://qdrant.tech/documentation/concepts/collections/
https://qdrant.tech/documentation/concepts/payload/
https://qdrant.tech/documentation/concepts/points/
https://qdrant.tech/documentation/concepts/filtering/
https://qdrant.tech/documentation/tutorials/search-beginners/