Metadata


vecs allows you to associate key-value pairs of metadata with indexes and ids in your collections. You can then add filters to queries that reference the metadata.
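As a minimal sketch of how metadata travels with a record and is then used in a query filter, the example below passes in a `collection` object that is assumed to behave like a vecs collection, exposing `upsert(records=...)` and `query(data=..., limit=..., filters=...)`. The record ids, vectors, and the helper name `upsert_and_query` are illustrative, not part of the original text.

```python
from typing import Any, Dict, List, Tuple

# Each record is (id, vector, metadata); the metadata dict holds the key-value pairs.
Record = Tuple[str, List[float], Dict[str, Any]]

records: List[Record] = [
    ("vec0", [0.1, 0.2, 0.3], {"year": 1990}),
    ("vec1", [0.7, 0.8, 0.9], {"year": 2020}),
]


def upsert_and_query(collection: Any, query_vector: List[float]) -> List[Any]:
    """Store the sample records, then query with a metadata filter.

    `collection` is assumed to expose vecs-style `upsert` and `query` methods.
    """
    collection.upsert(records=records)
    return collection.query(
        data=query_vector,
        limit=5,
        filters={"year": {"$eq": 2020}},  # only consider records whose year is 2020
    )
```

Passing the collection in as an argument keeps the sketch independent of any particular database connection.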

Types

Metadata is stored as binary JSON. As a result, allowed metadata types are drawn from JSON primitive types.

  • Boolean
  • String
  • Number

The technical limit of a metadata field associated with a vector is 1 GB. In practice, you should keep metadata fields as small as possible to maximize performance.
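One way to keep an eye on metadata size and types before storing a record is sketched below. The helper names `metadata_json_size` and `is_scalar_metadata` are hypothetical, and the byte count is for the JSON text, which only approximates the size of the binary JSON stored in the database.

```python
import json
from typing import Any, Dict


def metadata_json_size(metadata: Dict[str, Any]) -> int:
    """Return the size in bytes of the metadata encoded as compact UTF-8 JSON."""
    # allow_nan=False rejects NaN and Infinity, which are not valid JSON numbers.
    encoded = json.dumps(metadata, separators=(",", ":"), allow_nan=False)
    return len(encoded.encode("utf-8"))


def is_scalar_metadata(metadata: Dict[str, Any]) -> bool:
    """Check that every value is a Boolean, string, or number."""
    return all(isinstance(value, (bool, str, int, float)) for value in metadata.values())
```

A caller might log or reject records whose metadata exceeds a size budget chosen for the application.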

Metadata Query Language

The metadata query language is based loosely on MongoDB's selectors.

vecs currently supports a subset of those operators.

Comparison Operators

Comparison operators compare a provided value with a value stored in a metadata field of the vector store.

Operator    Description
$eq         Matches values that are equal to a specified value
$ne         Matches values that are not equal to a specified value
$gt         Matches values that are greater than a specified value
$gte        Matches values that are greater than or equal to a specified value
$lt         Matches values that are less than a specified value
$lte        Matches values that are less than or equal to a specified value
$in         Matches values that are contained in a list of specified scalar values
$contains   Matches values where a scalar is contained within an array metadata field
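The semantics in the table can be illustrated with a small in-memory matcher. This is only a sketch for reasoning about filters, not how vecs evaluates them. The function name `matches` is hypothetical, and treating a missing field as a non-match for every operator is an assumption.

```python
import operator
from typing import Any, Callable, Dict

# Map each comparison operator to a Python predicate: (stored value, provided value) -> bool.
_COMPARATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda stored, candidates: stored in candidates,
    "$contains": lambda stored, scalar: isinstance(stored, list) and scalar in stored,
}


def matches(metadata: Dict[str, Any], clause: Dict[str, Dict[str, Any]]) -> bool:
    """Evaluate a single-field clause such as {"year": {"$eq": 2020}}."""
    ((field, condition),) = clause.items()  # exactly one field per clause
    ((op, value),) = condition.items()      # exactly one operator per field
    if field not in metadata:
        return False  # assumption: a missing field never matches
    return _COMPARATORS[op](metadata[field], value)
```

For example, `matches({"year": 2020}, {"year": {"$gt": 2000}})` evaluates the stored value 2020 against the provided value 2000.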

Logical Operators

Logical operators compose other operators, and can be nested.

Operator    Description
$and        Joins query clauses with a logical AND; returns all documents that match the conditions of both clauses.
$or         Joins query clauses with a logical OR; returns all documents that match the conditions of either clause.
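Because logical operators take a list of clauses and can be nested, filters can be composed programmatically. The helpers `and_` and `or_` below are hypothetical conveniences, not part of vecs. They only build the plain dictionaries that the query language expects.

```python
from typing import Any, Dict

Filter = Dict[str, Any]


def and_(*clauses: Filter) -> Filter:
    """Combine clauses so that all of them must match."""
    return {"$and": list(clauses)}


def or_(*clauses: Filter) -> Filter:
    """Combine clauses so that at least one of them must match."""
    return {"$or": list(clauses)}


# A nested filter: priority customers whose record is from 2020 or grossed at least 5000.0.
nested: Filter = and_(
    {"is_priority_customer": {"$eq": True}},
    or_({"year": {"$eq": 2020}}, {"gross": {"$gte": 5000.0}}),
)
```

The resulting `nested` dictionary can be passed wherever a filter is accepted.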

Performance

For best performance, use scalar key-value pairs for metadata and prefer $eq, $and and $or filters where possible. Those variants are most consistently able to make use of indexes.
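Following that guidance, a `$in` filter over scalars can be expressed as a `$or` of `$eq` clauses. The helper `in_as_or_of_eq` below is a hypothetical sketch. Whether the rewritten form is actually faster depends on the indexes in place and should be measured.

```python
from typing import Any, Dict, Iterable


def in_as_or_of_eq(field: str, values: Iterable[Any]) -> Dict[str, Any]:
    """Rewrite {field: {"$in": values}} as an equivalent $or of $eq clauses."""
    return {"$or": [{field: {"$eq": value}} for value in values]}


priority_filter = in_as_or_of_eq("priority", ["enterprise", "pro"])
```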

Examples


year equals 2020

{"year": {"$eq": 2020}}

year equals 2020 or gross greater than or equal to 5000.0

{
    "$or": [
        {"year": {"$eq": 2020}},
        {"gross": {"$gte": 5000.0}}
    ]
}

last_name is less than "Brown" and is_priority_customer is true

{
    "$and": [
        {"last_name": {"$lt": "Brown"}},
        {"is_priority_customer": {"$eq": true}}
    ]
}

priority contained by ["enterprise", "pro"]

{
    "priority": {"$in": ["enterprise", "pro"]}
}

tags, an array, contains the string "important"

{
    "tags": {"$contains": "important"}
}