pgvector#

pgvector is an extension for PostgreSQL that allows vector similarity searches to be integrated into SQL queries.
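To give a sense of what such a query looks like, the sketch below composes a nearest-neighbour statement using pgvector's `<->` (Euclidean distance) operator. The helper names, the table `items` and the column `embedding` are assumptions made for illustration; they are not part of vecworks, and the retriever documented below may generate its SQL differently.

```python
def to_vector_literal(vector):
    """Format a sequence of numbers as a pgvector text literal, e.g. '[1.0,2.0]'."""
    return "[" + ",".join(str(float(x)) for x in vector) + "]"


def nearest_neighbour_sql(table, column, top_k=10):
    """Build a query returning the top_k rows closest to a vector parameter.

    `table` and `column` are interpolated directly, so they must come from
    trusted code, never from user input. `%s` is a driver-side placeholder
    for the vector literal.
    """
    return (
        f"SELECT * FROM {table} "
        f"ORDER BY {column} <-> %s::vector "
        f"LIMIT {int(top_k)}"
    )


# Hypothetical table and column names.
sql = nearest_neighbour_sql("items", "embedding", top_k=5)
param = to_vector_literal([1, 2.5, 3])
```

The statement and its parameter would then be handed to a PostgreSQL driver, which substitutes the literal for the placeholder.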

Dependencies#

To interface with a pgvector-extended PostgreSQL server, additional dependencies need to be installed. You may do so using pip:

pip install vecworks[pgvector]

API#

class vecworks.retrievers.pgvector.pgvectorRetriever#

Retriever using pgvector as the backend.

Fundamentals#

__init__(url: str, table: str | Query, index: Index | list[Index], return_columns: list[str] | dict[str, str] | None = None, return_rank_as: str | None = None, ensemble_by: ENSEMBLERS | None = None, top_k: int = 10, authenticator: Authenticator | None = None, **kwargs)#

Initializes the retriever.

Parameters#

url

Connection URL of the PostgreSQL database to connect to.

Refer to the SQLAlchemy documentation for details on how to specify this URL.

table

Table against which similarity matching should be performed, defined either by the qualified name of the table or by a non-terminated SQL query. In the latter case, annotate the argument with Query.

index

Index or indices to query for similarity.

return_columns

Columns of the table specified by table to include in the results. If a dictionary is passed, the dictionary's values are treated as aliases, to be used when converting back to Python. If no columns are specified, all columns are returned.

return_rank_as

Name to assign to the return variable holding the similarity rank.

ensemble_by

Ensembler to use to combine results from similarity queries. Refer to vecworks.functions.ENSEMBLERS for the available options.

top_k

Maximum number of results to retrieve per vector.

authenticator

vecworks.auth.Authenticator to use for authenticating against the database.

kwargs

Any additional arguments to pass to the underlying Pypeworks Node.
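As a minimal sketch of how some of these arguments might be prepared, the snippet below composes a SQLAlchemy-style connection URL with percent-encoded credentials and a return_columns dictionary mapping column names to aliases. The helper build_postgres_url, the credentials, the table and column names, and the my_index placeholder are all assumptions made for illustration; the constructor call is left commented out because it requires vecworks and an Index configured for your data.

```python
from urllib.parse import quote_plus


def build_postgres_url(user, password, host, database, port=5432):
    """Compose a SQLAlchemy-style PostgreSQL URL, escaping special characters."""
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Hypothetical credentials and database name, for illustration only.
url = build_postgres_url("app", "p@ss/word", "localhost", "docs")

# Keys are column names in the table; values are the aliases used in Python.
return_columns = {"doc_id": "id", "body": "text"}

# Hypothetical call (requires vecworks and a configured Index):
# retriever = pgvectorRetriever(
#     url=url,
#     table="public.documents",
#     index=my_index,
#     return_columns=return_columns,
#     return_rank_as="rank",
#     top_k=5,
# )
```

Encoding the user name and password matters because characters such as `@` and `/` would otherwise be read as URL delimiters.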

Methods#

exec(*args, **kwargs)#

Search the index for content similar to the given input.

Parameters#

args

Input(s) to compare against the index for similarity, passed positionally.

kwargs

Input(s) to compare against the index for similarity, passed by name. Any named arguments that cannot be mapped to an index are passed on as keyword arguments, for use at the discretion of the retriever.
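The handling of named arguments described above can be pictured with the following sketch, which separates keyword arguments that match an index name from those that do not. This is an illustration of the described behaviour only, not the vecworks implementation; the function split_named_inputs and the index names "title" and "body" are hypothetical.

```python
def split_named_inputs(index_names, **kwargs):
    """Separate named inputs that match an index from pass-through options."""
    inputs = {k: v for k, v in kwargs.items() if k in index_names}
    extras = {k: v for k, v in kwargs.items() if k not in index_names}
    return inputs, extras


# "title" matches a (hypothetical) index; "lang" does not and is passed through.
inputs, extras = split_named_inputs(
    {"title", "body"},
    title="vector databases",
    lang="en",
)
```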

query(input: dict[str, Iterable[ndarray | sparray | spmatrix]], **kwargs) → dict[str, list[list[Any]]]#

Search the index for vectors similar to the given vector(s). If the input is not yet vectorized, call exec() instead.

Note

Any sub-class must re-implement this method to provide a fully functional retriever.

Parameters#

input

Vectorized input(s) indexed by name.

kwargs

Any other arguments passed.

class vecworks.retrievers.pgvector.Query#

Companion class to pgvectorRetriever, used to annotate the contents of the table argument as representing a SQL query.
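One way to picture why the annotation is useful: a retriever needs to know whether the table argument is a name it can use directly or a query it must embed as a subquery. The sketch below uses a hypothetical stand-in for Query (a plain str subclass) and a hypothetical helper source_clause; the real class and the way vecworks builds its SQL may differ. It also assumes that "non-terminated" means the query carries no trailing semicolon, which is what allows it to be wrapped in parentheses.

```python
class Query(str):
    """Hypothetical stand-in for vecworks.retrievers.pgvector.Query."""


def source_clause(table):
    """Return the text to place after FROM for a table name or a Query."""
    if isinstance(table, Query):
        # A non-terminated query can be wrapped as an aliased subquery.
        return f"({table}) AS source"
    # A plain string is treated as a (qualified) table name.
    return table


# Hypothetical table and column names.
by_name = source_clause("public.documents")
by_query = source_clause(
    Query("SELECT id, embedding FROM documents WHERE lang = 'en'")
)
```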