VectorDB

Submodules

Base

Abstract base class for vector database clients.

This module provides:

— VectorDB

class grag.components.vectordb.base.VectorDB

Bases: ABC

Abstract base class for vector database clients.

abstract async aadd_docs(docs: List[Document], verbose: bool = True) → None

Adds documents to the vector database (asynchronous).

Parameters:
  • docs – List of Documents

  • verbose – Show progress bar

Returns:

None

abstract add_docs(docs: List[Document], verbose: bool = True) → None

Adds documents to the vector database.

Parameters:
  • docs – List of Documents

  • verbose – Show progress bar

Returns:

None

abstract async aget_chunk(query: str, with_score: bool = False, top_k: int | None = None) → List[Document] | List[Tuple[Document, float]]

Returns the most similar chunks from the vector database (asynchronous).

Parameters:
  • query – A query string

  • with_score – If True, also return the score of each returned chunk

  • top_k – Number of most similar chunks to return; if None, defaults to self.top_k

Returns:

List of Documents, or list of (Document, score) tuples if with_score is True

abstract delete() → None

Delete all chunks in the vector database.

abstract get_chunk(query: str, with_score: bool = False, top_k: int | None = None) → List[Document] | List[Tuple[Document, float]]

Returns the most similar chunks from the vector database.

Parameters:
  • query – A query string

  • with_score – If True, also return the score of each returned chunk

  • top_k – Number of most similar chunks to return; if None, defaults to self.top_k

Returns:

List of Documents, or list of (Document, score) tuples if with_score is True
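The interface above can be illustrated with a toy subclass. This is a minimal sketch and not part of grag: the `Document` dataclass and the local `VectorDB` class are stand-ins that mirror the documented signatures so the block runs without grag installed, `InMemoryDB` is a hypothetical name, and the word-overlap score (higher is better) is an assumption that replaces real embedding similarity.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List, Optional, Tuple, Union


@dataclass
class Document:
    """Stand-in for the Document type used by grag (assumption)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


Chunks = Union[List[Document], List[Tuple[Document, float]]]


class VectorDB(ABC):
    """Local mirror of the documented abstract interface."""

    @abstractmethod
    def add_docs(self, docs: List[Document], verbose: bool = True) -> None: ...

    @abstractmethod
    async def aadd_docs(self, docs: List[Document], verbose: bool = True) -> None: ...

    @abstractmethod
    def get_chunk(self, query: str, with_score: bool = False,
                  top_k: Optional[int] = None) -> Chunks: ...

    @abstractmethod
    async def aget_chunk(self, query: str, with_score: bool = False,
                         top_k: Optional[int] = None) -> Chunks: ...

    @abstractmethod
    def delete(self) -> None: ...


class InMemoryDB(VectorDB):
    """Toy client: ranks chunks by word overlap instead of embeddings."""

    def __init__(self, top_k: int = 2) -> None:
        self.top_k = top_k
        self._docs: List[Document] = []

    def add_docs(self, docs, verbose=True):
        self._docs.extend(docs)

    async def aadd_docs(self, docs, verbose=True):
        self.add_docs(docs, verbose)

    def get_chunk(self, query, with_score=False, top_k=None):
        k = self.top_k if top_k is None else top_k  # None falls back to self.top_k
        words = set(query.lower().split())
        scored = [(d, float(len(words & set(d.page_content.lower().split()))))
                  for d in self._docs]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        top = scored[:k]
        # with_score switches the return shape, as in the documented signature
        return top if with_score else [d for d, _ in top]

    async def aget_chunk(self, query, with_score=False, top_k=None):
        return self.get_chunk(query, with_score, top_k)

    def delete(self):
        self._docs.clear()


db = InMemoryDB(top_k=2)
db.add_docs([Document("cats purr"), Document("dogs bark loudly"),
             Document("cats and dogs")])
hits = db.get_chunk("cats and dogs", with_score=True, top_k=1)
async_hits = asyncio.run(db.aget_chunk("dogs bark"))
```

Here `hits` is a list of `(Document, score)` tuples because `with_score=True`, while `async_hits` is a plain list of Documents limited to the instance's `top_k`. A real client would delegate the async methods to non-blocking I/O rather than call the sync versions.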

Chroma Client

Class for Chroma vector database.

This module provides:

— ChromaClient

class grag.components.vectordb.chroma_client.ChromaClient(host: str = 'localhost', port: str | int = 8000, collection_name: str = 'grag', embedding_type: str = 'instructor-embedding', embedding_model: str = 'hkunlp/instructor-xl')

A class for connecting to a hosted Chroma Vectorstore collection.

grag.components.vectordb.chroma_client.host

str – IP address of the hosted Chroma Vectorstore

grag.components.vectordb.chroma_client.port

str or int – Port of the hosted Chroma Vectorstore

grag.components.vectordb.chroma_client.collection_name

str – Name of the collection in the Chroma Vectorstore; each ChromaClient connects to a single collection

grag.components.vectordb.chroma_client.embedding_type

str – Type of embedding used; supported values are ‘sentence-transformers’ and ‘instructor-embedding’

grag.components.vectordb.chroma_client.embedding_model

str – Model name of the embedding used; should correspond to the embedding_type

grag.components.vectordb.chroma_client.embedding_function

Embedding function, derived from embedding_type and embedding_model

grag.components.vectordb.chroma_client.client

chromadb.HttpClient – Chroma API client

grag.components.vectordb.chroma_client.collection

Chroma API for the collection

grag.components.vectordb.chroma_client.langchain_client

langchain_community.vectorstores.Chroma – LangChain wrapper for the Chroma collection
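As a rough sketch of how the constructor arguments fit together, the helper below collects them and rejects an unsupported embedding_type before any connection is attempted. The names `chroma_kwargs`, `connect` and `SUPPORTED_EMBEDDING_TYPES` are hypothetical and not part of grag; only the parameter names, defaults and the two supported embedding types come from the documentation above. The import is deferred so the block runs without grag installed, and `connect` itself needs grag plus a reachable Chroma server.

```python
SUPPORTED_EMBEDDING_TYPES = ("sentence-transformers", "instructor-embedding")


def chroma_kwargs(host="localhost", port=8000, collection_name="grag",
                  embedding_type="instructor-embedding",
                  embedding_model="hkunlp/instructor-xl") -> dict:
    """Collect ChromaClient arguments (hypothetical helper, not a grag API)."""
    if embedding_type not in SUPPORTED_EMBEDDING_TYPES:
        raise ValueError(
            f"embedding_type must be one of {SUPPORTED_EMBEDDING_TYPES}, "
            f"got {embedding_type!r}"
        )
    return {
        "host": host,
        "port": port,  # documented as str or int, so it is passed through as given
        "collection_name": collection_name,
        "embedding_type": embedding_type,
        "embedding_model": embedding_model,
    }


def connect(**overrides):
    """Open a ChromaClient; requires grag and a running Chroma server."""
    # Deferred import: nothing from grag is needed until a connection is made.
    from grag.components.vectordb.chroma_client import ChromaClient
    return ChromaClient(**chroma_kwargs(**overrides))
```

With a server running, `connect(collection_name="papers")` would return a client bound to that single collection. Assuming ChromaClient implements the VectorDB interface documented earlier, `get_chunk` and `add_docs` would then be available on it.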

DeepLake Client

Class for DeepLake vector database.

This module provides:

— DeepLakeClient

class grag.components.vectordb.deeplake_client.DeepLakeClient(store_path: str | Path = PosixPath('data/vectordb'), collection_name: str = 'grag', embedding_type: str = 'instructor-embedding', embedding_model: str = 'hkunlp/instructor-xl', read_only: bool = False)

A class for connecting to a DeepLake Vectorstore.

grag.components.vectordb.deeplake_client.store_path

str or Path – Path where the DeepLake Vectorstore is stored

grag.components.vectordb.deeplake_client.embedding_type

str – Type of embedding used; supported values are ‘sentence-transformers’ and ‘instructor-embedding’

grag.components.vectordb.deeplake_client.embedding_model

str – Model name of the embedding used; should correspond to the embedding_type

grag.components.vectordb.deeplake_client.embedding_function

Embedding function, derived from embedding_type and embedding_model

grag.components.vectordb.deeplake_client.client

deeplake.core.vectorstore.VectorStore – DeepLake API

grag.components.vectordb.deeplake_client.collection_name

str – Name of the collection where the vectors are stored

grag.components.vectordb.deeplake_client.langchain_client

langchain_community.vectorstores.DeepLake – LangChain wrapper for the DeepLake API

Module Contents