Getting Started: Vector Stores
A vector store is a type of database optimized for storing documents and their embeddings, and then fetching the documents most relevant to a particular query, i.e. those whose embeddings are most similar to the embedding of the query.
interface VectorStore {
  /**
   * Add more documents to an existing VectorStore
   */
  addDocuments(documents: Document[]): Promise<void>;

  /**
   * Search for the most similar documents to a query
   */
  similaritySearch(
    query: string,
    k?: number,
    filter?: object | undefined
  ): Promise<Document[]>;

  /**
   * Search for the most similar documents to a query,
   * and return their similarity score (k defaults to 4)
   */
  similaritySearchWithScore(
    query: string,
    k?: number,
    filter?: object | undefined
  ): Promise<[Document, number][]>;

  /**
   * Turn a VectorStore into a Retriever
   */
  asRetriever(k?: number): BaseRetriever;

  /**
   * Advanced: Add more documents to an existing VectorStore,
   * when you already have their embeddings
   */
  addVectors(vectors: number[][], documents: Document[]): Promise<void>;

  /**
   * Advanced: Search for the most similar documents to a query,
   * when you already have the embedding of the query
   */
  similaritySearchVectorWithScore(
    query: number[],
    k: number,
    filter?: object
  ): Promise<[Document, number][]>;
}
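The way the convenience methods relate to the "advanced" ones can be sketched with a toy in-memory store. This is only an illustration under stated assumptions, not the library's implementation: `Doc`, `ToyVectorStore`, `toyEmbed`, and `cosine` are hypothetical names, the letter-count "embedding" stands in for a real embedding model, and `filter` and `asRetriever` are left out for brevity.

```typescript
// Hypothetical stand-in for the Document type used by the interface.
interface Doc {
  pageContent: string;
  metadata: object;
}

// Hypothetical embedding: counts the letters a-z in the text.
// A real store would call an embedding model here.
function toyEmbed(text: string): number[] {
  const counts: number[] = [];
  for (let i = 0; i < 26; i++) counts.push(0);
  const lower = text.toLowerCase();
  for (let i = 0; i < lower.length; i++) {
    const idx = lower.charCodeAt(i) - 97;
    if (idx >= 0 && idx < 26) counts[idx] += 1;
  }
  return counts;
}

// Cosine similarity; returns 0 when either vector is all zeros.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

class ToyVectorStore {
  private vectors: number[][] = [];
  private docs: Doc[] = [];
  private embed: (text: string) => number[];

  constructor(embed: (text: string) => number[]) {
    this.embed = embed;
  }

  // Advanced path: the caller supplies precomputed embeddings.
  async addVectors(vectors: number[][], documents: Doc[]): Promise<void> {
    if (vectors.length !== documents.length) {
      throw new Error("vectors and documents must have the same length");
    }
    this.vectors.push(...vectors);
    this.docs.push(...documents);
  }

  // Convenience path: embed the documents, then delegate to addVectors.
  async addDocuments(documents: Doc[]): Promise<void> {
    const vectors = documents.map((d) => this.embed(d.pageContent));
    await this.addVectors(vectors, documents);
  }

  // Advanced path: search with a query that is already embedded.
  async similaritySearchVectorWithScore(
    query: number[],
    k: number
  ): Promise<[Doc, number][]> {
    const scored = this.docs.map(
      (doc, i): [Doc, number] => [doc, cosine(query, this.vectors[i])]
    );
    scored.sort((x, y) => y[1] - x[1]); // highest score first
    return scored.slice(0, k);
  }

  // Convenience path: embed the query, then delegate (k defaults to 4).
  async similaritySearchWithScore(
    query: string,
    k = 4
  ): Promise<[Doc, number][]> {
    return this.similaritySearchVectorWithScore(this.embed(query), k);
  }

  // Same search, with the scores dropped.
  async similaritySearch(query: string, k = 4): Promise<Doc[]> {
    const results = await this.similaritySearchWithScore(query, k);
    return results.map(([doc]) => doc);
  }
}
```

In this sketch, `new ToyVectorStore(toyEmbed)` followed by `addDocuments(...)` and `similaritySearch("some query", 2)` embeds the query once and returns the two highest-scoring documents. The string-based methods are thin wrappers around the vector-based ones.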
You can create a vector store from a list of Documents, or from a list of texts and their corresponding metadata. You can also create a vector store from an existing index. The signature of that method depends on the vector store you're using, so check the documentation of the vector store you're interested in.
abstract class BaseVectorStore implements VectorStore {
  static fromTexts(
    texts: string[],
    metadatas: object[] | object,
    embeddings: Embeddings,
    dbConfig: Record<string, any>
  ): Promise<VectorStore>;

  static fromDocuments(
    docs: Document[],
    embeddings: Embeddings,
    dbConfig: Record<string, any>
  ): Promise<VectorStore>;
}
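One plausible reading of the `metadatas: object[] | object` parameter is that an array supplies one metadata object per text, while a single object is shared by every text. The helper below illustrates that pairing. `Doc` and `textsToDocs` are hypothetical names introduced for this sketch, and the length check is an assumption rather than documented behavior.

```typescript
// Hypothetical stand-in for the Document type.
interface Doc {
  pageContent: string;
  metadata: object;
}

// Hypothetical helper: pairs each text with its metadata.
// An array is read as one metadata object per text; a single object is
// assumed to be shared by all texts.
function textsToDocs(texts: string[], metadatas: object[] | object): Doc[] {
  if (Array.isArray(metadatas) && metadatas.length !== texts.length) {
    throw new Error("expected one metadata object per text");
  }
  return texts.map((text, i) => ({
    pageContent: text,
    metadata: Array.isArray(metadatas) ? metadatas[i] : metadatas,
  }));
}

// Shared metadata: both documents get { source: "notes.txt" }.
console.log(textsToDocs(["first", "second"], { source: "notes.txt" }));

// Per-text metadata: each document gets its own object.
console.log(textsToDocs(["first", "second"], [{ id: 1 }, { id: 2 }]));
```

Under this reading, `fromTexts` amounts to building documents this way and then passing them to `fromDocuments` with the same `embeddings` and `dbConfig`.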
Which one to pick?
Here's a quick guide to help you pick the right vector store for your use case:
- If you're after something that can just run inside your Node.js application, in-memory, without any other servers to stand up, then go for HNSWLib or Faiss.
- If you're looking for something that can run in-memory in browser-like environments, then go for MemoryVectorStore.
- If you come from Python and were looking for something similar to FAISS, pick HNSWLib or Faiss.
- If you're looking for an open-source, full-featured vector database that you can run locally in a Docker container, then go for Chroma.
- If you're already using Supabase, look at the Supabase vector store so you can use the same Postgres database for your embeddings too.
- If you're looking for a production-ready vector store that you don't have to host yourself, then go for Pinecone.
- If you're already using SingleStore, or you need a distributed, high-performance database, consider the SingleStore vector store.
Vector Stores: Integrations
Memory
MemoryVectorStore is an ephemeral vector store that keeps embeddings in memory and does an exact, linear search for the most similar embeddings. The default similarity metric is cosine similarity, but it can be changed to any of the similarity metrics supported by ml-distance.
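An exact, linear search with a swappable metric can be sketched in a few lines. This is an illustration, not MemoryVectorStore's actual code: `linearSearch`, `cosineSimilarity`, and `negativeEuclidean` are hypothetical names, the metrics are handwritten here rather than taken from ml-distance, and the sketch assumes a metric returns higher values for more similar vectors.

```typescript
// A similarity metric: higher means more similar (an assumption of this sketch).
type Similarity = (a: number[], b: number[]) => number;

// Cosine similarity; returns 0 when either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Alternative metric: negated Euclidean distance, so that closer
// vectors still receive higher scores.
function negativeEuclidean(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return -Math.sqrt(sum);
}

// Exact search: score every stored vector, sort, keep the top k.
// Every query touches all n vectors; there is no index to build or maintain.
function linearSearch(
  query: number[],
  stored: number[][],
  k: number,
  similarity: Similarity = cosineSimilarity
): { index: number; score: number }[] {
  return stored
    .map((vector, index) => ({ index, score: similarity(query, vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const storedVectors = [
  [1, 0],
  [0, 1],
  [10, 1],
];
const queryVector = [1, 0.2];

// Cosine ignores magnitude: ranks index 2 first, then index 0.
console.log(linearSearch(queryVector, storedVectors, 2));

// Euclidean distance does not: ranks index 0 first, then index 1.
console.log(linearSearch(queryVector, storedVectors, 2, negativeEuclidean));
```

The two calls return different rankings for the same data, which is why the choice of metric matters. The linear scan stays exact at any size, but its cost grows with the number of stored vectors.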
Chroma
Chroma is an open-source Apache 2.0 embedding database.
Faiss
Only available on Node.js.
HNSWLib
Only available on Node.js.
Milvus
Milvus is a vector database built for embedding similarity search and AI applications.
MyScale
Only available on Node.js.
OpenSearch
Only available on Node.js.
Pinecone
Only available on Node.js.
Prisma
To augment existing models in a PostgreSQL database with vector search, LangChain supports using Prisma together with PostgreSQL and the pgvector Postgres extension.
Qdrant
Qdrant is a vector similarity search engine. It provides a production-ready service with a convenient API to store, search, and manage points, that is, vectors with an additional payload.
Redis
Redis is a fast, open-source, in-memory data store.
SingleStore
SingleStoreDB is a high-performance, distributed database system. It has long supported vector functions such as dot_product, which makes it well suited to AI applications that require text similarity matching.
Supabase
LangChain supports using a Supabase Postgres database as a vector store, using the pgvector Postgres extension. Refer to the Supabase blog post for more information.
Tigris
Tigris makes it easy to build AI applications with vector embeddings.
TypeORM
To enable vector search in a generic PostgreSQL database, LangChainJS supports using TypeORM with the pgvector Postgres extension.
Weaviate
Weaviate is an open-source vector database that stores both objects and vectors, allowing vector search to be combined with structured filtering. LangChain connects to Weaviate via the weaviate-ts-client package, the official TypeScript client for Weaviate.