query-data

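After your data is indexed, you can start sending queries to Pinecone.

The Query operation searches the index using a query vector. It retrieves the IDs of the most similar vectors in the index, along with their similarity scores. Optionally, it can also include the result vectors' values and metadata. When you send a query, you specify the number of vectors to retrieve. Results are always ordered by similarity, from most similar to least similar.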

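Sending a query

When you send a query, you provide a vector and retrieve the top-k most similar vectors for that query. For example, the following sends a query vector and retrieves three matching vectors. The Python, JavaScript, and curl versions follow in that order.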

index.query(
  vector=[0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3],
  top_k=3,
  include_values=True
)

# Returns:
# {'matches': [{'id': 'C',
#               'score': -1.76717265e-07,
#               'values': [0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]},
#              {'id': 'B',
#               'score': 0.080000028,
#               'values': [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2]},
#              {'id': 'D',
#               'score': 0.0800001323,
#               'values': [0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4]}],
#  'namespace': ''}

const index = pinecone.index("example-index");
const queryResponse = await index.query({
  query: {
    vector: [0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3],
    topK: 3,
    includeValues: true,
  },
  namespace: "example-namespace",
});
// Returns:
// {'matches': [{'id': 'C',
//               'score': -1.76717265e-07,
//               'values': [0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]},
//              {'id': 'B',
//               'score': 0.080000028,
//               'values': [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2]},
//              {'id': 'D',
//               'score': 0.0800001323,
//               'values': [0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4]}],
//  'namespace': ''}

curl -i -X POST https://hello-pinecone-YOUR_PROJECT.svc.YOUR_ENVIRONMENT.pinecone.io/query \
  -H 'Api-Key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "vector":[0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3],
    "topK": 3,
    "includeValues": true
  }'

# Output:
# {
#  "matches":[
#      {
#       "id": "C",
#       "score": -1.76717265e-07,
#       "values": [0.3,0.3,0.3,0.3,0.3,0.3,0.3,0.3]
#      },
#      {
#       "id": "B",
#       "score": 0.080000028,
#       "values": [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2]
#      },
#      {
#       "id": "D",
#       "score": 0.0800001323,
#       "values": [0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4]
#      }
#  ],
#  "namespace": ""
# }
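
The scores in the sample output above are consistent with squared Euclidean distance between the query and each match: for B, eight components that each differ by 0.1 give 8 × 0.01 = 0.08. That is an inference from the numbers, not something this page states, and the metric in practice depends on how the index was created. Below is a minimal brute-force sketch of top-k ranking under that assumption. It uses a made-up in-memory store (vectors A and E are invented for illustration) and does not call the Pinecone client.

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance; smaller means more similar."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_query(vectors, query, top_k):
    """Score every stored vector against the query and keep the top_k closest."""
    scored = [
        {"id": vector_id, "score": squared_euclidean(query, values), "values": values}
        for vector_id, values in vectors.items()
    ]
    scored.sort(key=lambda match: match["score"])  # closest first
    return {"matches": scored[:top_k], "namespace": ""}


# Hypothetical index contents; only B, C and D appear in the page's output.
vectors = {
    "A": [0.1] * 8,
    "B": [0.2] * 8,
    "C": [0.3] * 8,
    "D": [0.4] * 8,
    "E": [0.5] * 8,
}

result = brute_force_query(vectors, query=[0.3] * 8, top_k=3)
for match in result["matches"]:
    print(match["id"], round(match["score"], 6))
```

A real index does not scan every vector like this; the sketch only shows what "ordered from most similar to least similar" means for the returned matches.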

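Depending on your data and your query, you may get fewer than top_k results. This happens when top_k is larger than the number of vectors that match the query.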
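
Because a query can return fewer matches than requested, calling code should not index into the result list blindly. The helper below is a small sketch that assumes the response is shaped like the dict shown in the sample output; `check_result_count` is a hypothetical name, and real client response objects may expose matches differently.

```python
def check_result_count(response, top_k):
    """Return (returned, missing) for a dict-shaped query response."""
    matches = response.get("matches", [])
    returned = len(matches)
    missing = max(top_k - returned, 0)
    return returned, missing


# Example response with only two matches, although three were requested.
response = {
    "matches": [
        {"id": "C", "score": 0.0},
        {"id": "B", "score": 0.08},
    ],
    "namespace": "",
}

returned, missing = check_result_count(response, top_k=3)
print(f"requested 3, received {returned}, short by {missing}")
```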

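Querying by namespace

You can organize the vectors added to an index into partitions, or "namespaces," to limit queries and other vector operations to a single namespace. For more information, see: Namespaces.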
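
To make the partitioning concrete, here is a toy in-memory model of namespaces. It is only an illustration of the behavior described above, not how Pinecone stores data; the store layout, the distance function, and the vector IDs are all assumptions made for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance, used here only as a stand-in metric."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def query_namespace(store, namespace, vector, top_k):
    """Rank only the vectors stored under the given namespace."""
    candidates = store.get(namespace, {})  # unknown namespace -> nothing to search
    ranked = sorted(
        candidates.items(),
        key=lambda item: squared_distance(vector, item[1]),
    )
    return [vector_id for vector_id, _ in ranked[:top_k]]


# Hypothetical index with two namespaces.
store = {
    "example-namespace": {"A": [0.1, 0.1], "B": [0.2, 0.2]},
    "other-namespace": {"C": [0.3, 0.3]},
}

hits = query_namespace(store, "example-namespace", [0.3, 0.3], top_k=3)
print(hits)
```

Vector C is an exact match for the query, but it lives in a different namespace, so it is never considered.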

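Using metadata filters in queries

You can add metadata to document embeddings in Pinecone, and then filter on that metadata when sending a query. Pinecone searches for similar vector embeddings only among the items that match the filter. For more information, see: Metadata filtering.

The Python, JavaScript, and curl versions follow in that order.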

index.query(
    vector=[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
    filter={
        "genre": {"$eq": "documentary"},
        "year": 2019
    },
    top_k=1,
    include_metadata=True
)

const index = pinecone.index("example-index")
const queryResponse = await index.query({
  query: {
    vector: [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
    topK: 1,
    includeMetadata: true,
    filter: {
      "genre": {"$eq": "documentary"}
    },
  }
})

curl -i -X POST https://YOUR_INDEX-YOUR_PROJECT.svc.YOUR_ENVIRONMENT.pinecone.io/query \
  -H 'Api-Key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "vector": [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
    "filter": {"genre": {"$in": ["comedy", "documentary", "drama"]}},
    "topK": 1,
    "includeMetadata": true
  }'
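
The filters above use three forms: `$eq`, `$in`, and a bare value (`"year": 2019`), which is treated as equality. The sketch below shows how such a filter narrows the candidate set before similarity ranking. It covers only these three forms, treats multiple fields as a logical AND, and uses invented metadata records; `matches_filter` is a hypothetical helper, not part of any client library.

```python
def matches_filter(metadata, flt):
    """Return True if a metadata dict satisfies every field condition in flt."""
    for field, condition in flt.items():
        value = metadata.get(field)
        if isinstance(condition, dict):
            for operator, operand in condition.items():
                if operator == "$eq":
                    if value != operand:
                        return False
                elif operator == "$in":
                    if value not in operand:
                        return False
                else:
                    raise ValueError(f"operator not covered by this sketch: {operator}")
        elif value != condition:  # bare value is shorthand for equality
            return False
    return True


# Hypothetical metadata attached to three vectors.
records = {
    "A": {"genre": "documentary", "year": 2019},
    "B": {"genre": "documentary", "year": 2020},
    "C": {"genre": "comedy", "year": 2019},
}

python_filter = {"genre": {"$eq": "documentary"}, "year": 2019}
curl_filter = {"genre": {"$in": ["comedy", "documentary", "drama"]}}

eligible_python = [rid for rid, meta in records.items() if matches_filter(meta, python_filter)]
eligible_curl = [rid for rid, meta in records.items() if matches_filter(meta, curl_filter)]
print(eligible_python, eligible_curl)
```

Only the eligible IDs would then be ranked by similarity, which is why a filtered query with a small candidate set can return fewer than top_k matches.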

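Querying vectors with sparse and dense values

When querying an index that contains sparse and dense vectors, use the query() operation with the sparse_vector parameter.

⚠️ Warning
The Update operation does not validate the existence of IDs within an index. If a non-existent ID is given, no changes are made and a 200 OK is returned.

Examples

The following example queries the index example-index with a sparse-dense vector.

The Python and curl versions follow in that order.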

query_response = index.query(
    namespace="example-namespace",
    top_k=10,
    vector=[0.1, 0.2, 0.3, 0.4],
    sparse_vector={
        'indices': [10, 45, 16],
        'values':  [0.5, 0.5, 0.2]
    }
)

curl --request POST \
     --url https://index_name-project_id.svc.environment.pinecone.io/query \
     --header 'Api-Key: YOUR_API_KEY' \
     --header 'accept: application/json' \
     --header 'content-type: application/json' \
     --data '
{
     "includeValues": false,
     "includeMetadata": false,
     "vector": [
          0.1,
          0.2,
          0.3,
          0.4
     ],
     "sparseVector": {
          "indices": [
               10,
               45,
               16
          ],
          "values": [
               0.5,
               0.5,
               0.2
          ]
     },
     "topK": 10,
     "namespace": "example-namespace"
}
'
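
The sparse_vector argument pairs each index with the value at the same position, so the two lists must stay aligned. The sketch below builds that structure from a plain mapping and shows that only overlapping indices contribute to a sparse dot product. Both helpers are hypothetical, sorting the indices is a choice made here for readability (the example above is unsorted), and using a dot product to compare sparse parts is an assumption rather than something this page states.

```python
def to_sparse_vector(weights):
    """Convert {index: weight} into the {'indices': [...], 'values': [...]} shape."""
    items = sorted((index, weight) for index, weight in weights.items() if weight != 0.0)
    return {
        "indices": [index for index, _ in items],
        "values": [weight for _, weight in items],
    }


def sparse_dot(a, b):
    """Dot product of two sparse vectors; indices missing from b count as zero."""
    b_lookup = dict(zip(b["indices"], b["values"]))
    return sum(value * b_lookup.get(index, 0.0) for index, value in zip(a["indices"], a["values"]))


# Same indices and values as the query example above, given as a mapping.
query_sparse = to_sparse_vector({10: 0.5, 45: 0.5, 16: 0.2})

# Hypothetical stored vector that shares only index 16 with the query.
stored_sparse = to_sparse_vector({16: 1.0, 99: 3.0})

overlap = sparse_dot(query_sparse, stored_sparse)
print(query_sparse, overlap)
```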

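Limits

Avoid returning vector data and metadata when top_k is greater than 1000. This means that queries with top_k over 1000 should not contain include_metadata=True or include_values=True. For more limits, see: Limits.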
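
One way to apply this guidance is to validate query parameters before sending them. The guard below is a sketch only: `check_query_params` and the constant name are hypothetical, and whether to raise an error or silently drop the payload flags is a design choice left to the caller.

```python
MAX_TOP_K_WITH_PAYLOAD = 1000  # threshold taken from the limit described above


def check_query_params(top_k, include_values=False, include_metadata=False):
    """Reject queries that request values or metadata with top_k over the threshold."""
    if top_k > MAX_TOP_K_WITH_PAYLOAD and (include_values or include_metadata):
        raise ValueError(
            f"top_k={top_k} exceeds {MAX_TOP_K_WITH_PAYLOAD}; "
            "do not request values or metadata for queries this large"
        )
    return {
        "top_k": top_k,
        "include_values": include_values,
        "include_metadata": include_metadata,
    }


# A large query is fine as long as it asks only for IDs and scores.
params = check_query_params(top_k=5000)
print(params)
```

The returned dict can be unpacked into a query call, for example `index.query(vector=..., **params)`.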

Updated 13 days ago
