How to prioritize fresh knowledge for Google Cloud Agentspace knowledge assistants?

This article is co-authored by Łukasz Olejniczak and Rafał Knapik @raknapik (rafal.knapik@allegro.com).

Agentspace knowledge assistants use a variety of modern search techniques to find information, including semantic search, full-text search, and keyword search. They also leverage other methods like exact match, synonyms, and data augmentation with both public and private knowledge graphs.

For this blog post, we’ll focus on semantic search. This method represents knowledge as numbers called embeddings, along with their corresponding metadata. You can visualize these embeddings using tools like Google’s open-source embedding projector (https://projector.tensorflow.org/). The default view on this page shows how different words are represented as embeddings.

In Agentspace datastores, these numbers represent more than just single words; they stand for entire chunks of text or complex data structures, such as the tasks we explored in a previous blog post.

Scanning through these massive “clouds” of data to find relevant items in a fraction of a second is a huge challenge. To solve this, researchers have developed advanced algorithms for vector search. A great example is Google ScaNN (https://research.google/blog/announcing-scann-efficient-vector-similarity-search/), which is designed to efficiently find similar embeddings. This is supported by scalable infrastructure like Vertex AI Vector Search on Google Cloud (https://cloud.google.com/vertex-ai/docs/vector-search/overview).

However, these powerful algorithms make a trade-off: they are optimized to quickly find excellent candidates for a given query, but they don’t guarantee that every single relevant item in the knowledge base will be found.

To visualize this, imagine a fan of green jelly beans:

A person asks the system for green jelly beans, and it quickly returns a number of results — say, 10, 20, or even 100. This happens in milliseconds, but it doesn’t return all of them. While the user will likely be happy with the fast, relevant results, someone expecting a complete list might be disappointed.

In contrast, if this were a relational database, you’d use a simple query like:

SELECT * FROM TABLE_OF_JELLYBEANS WHERE color = "GREEN"

This would guarantee that the result set includes every single item that matches the condition.
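To make the contrast concrete, here is a small, purely illustrative Python sketch. The jelly bean records, the `greenness` score standing in for embedding similarity, and both helper functions are made up for this example; none of them are part of Agentspace or of any vector search library. Note also that a real approximate search may not even return the exact best k items, so this sketch is the optimistic case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JellyBean:
    bean_id: int
    color: str
    greenness: float  # hypothetical stand-in for a similarity score


def top_k_similar(beans, k):
    """Search-style retrieval: return only the k best-scoring items."""
    return sorted(beans, key=lambda b: b.greenness, reverse=True)[:k]


def exhaustive_filter(beans, color):
    """Database-style retrieval: return every item matching the condition."""
    return [b for b in beans if b.color == color]


# Made-up data: 50 green beans (even ids) and 50 red beans (odd ids).
BEANS = [
    JellyBean(i, "GREEN", 0.9 - i / 1000) if i % 2 == 0 else JellyBean(i, "RED", 0.1)
    for i in range(100)
]

fast_results = top_k_similar(BEANS, k=10)
complete_results = exhaustive_filter(BEANS, "GREEN")
print(len(fast_results), "of", len(complete_results), "green beans returned by top-k")
```

The top-k call gives the jelly bean fan a handful of good answers quickly, while only the exhaustive filter returns the complete set.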

This raises an important question: can’t knowledge systems go beyond vector search and also filter results using available metadata?

The answer is that they very often do. Such filtering is typically applied either before or after semantic search. However, even with this additional filtering, there’s still no guarantee that the final set of results will include every single item from the knowledge database that satisfies the condition.
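The following sketch illustrates why neither filtering order restores completeness. It is a toy model under stated assumptions: the ticket records, the `score` values, and the functions `post_filter_search` and `pre_filter_search` are hypothetical and do not describe how Agentspace implements filtering internally.

```python
def top_k(items, k):
    """Return the k items with the highest (hypothetical) similarity score."""
    return sorted(items, key=lambda item: item["score"], reverse=True)[:k]


def post_filter_search(items, k, predicate):
    """Retrieve top-k candidates first, then drop those failing the filter."""
    return [item for item in top_k(items, k) if predicate(item)]


def pre_filter_search(items, k, predicate):
    """Apply the metadata filter first, then retrieve top-k among survivors."""
    return top_k([item for item in items if predicate(item)], k)


def is_open(ticket):
    return ticket["status"] == "open"


# Made-up tickets: every third one is open, scores decrease with the id.
TICKETS = [
    {"id": i, "status": "open" if i % 3 == 0 else "closed", "score": 1.0 - i * 0.01}
    for i in range(30)
]

all_open = [t for t in TICKETS if is_open(t)]
post = post_filter_search(TICKETS, k=5, predicate=is_open)
pre = pre_filter_search(TICKETS, k=5, predicate=is_open)
print(len(all_open), len(post), len(pre))
```

In this toy data set ten tickets match the filter, yet post-filtering keeps only the matching tickets that happened to be among the five candidates, and pre-filtering is still capped at five results.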

Given these characteristics, can we influence which items are prioritized in the search results? For instance, when a knowledge assistant is asked to list tasks or tickets, we’d want it to prioritize the most recent ones over older, archival items.

With Agentspace, the answer is yes. The mechanism that allows for this type of prioritization is called serving controls.

I previously wrote about this mechanism in a blog post titled “Agent Builder serving controls: Boosting. Semantic search with on the fly customizations. Part 1.” In that post, I demonstrated how to “boost” (prioritize) or “bury” (deprioritize) search results based on metadata attributes.

In this article, we will specifically focus on freshness. We’ll first show how to prioritize the most recent information in a custom datastore and then demonstrate that this same mechanism can be applied to any datastore, including native Agentspace datastores for systems like ServiceNow.

Serving controls are applied at the application level within Agentspace. This means you attach them to your specific Agentspace application. However, the controls are enforced on individual datastores, and the serving control’s definition must include the name of the datastore where it should affect the retrieval process.

The easiest way to create a new serving control is from the Google Cloud console, but in this blog post we will use the API.

The key to prioritizing by freshness is the presence of a date attribute. This attribute allows Agentspace to determine the age of any given item.

When we inspect the schema of our custom datastore, we can easily identify the date attribute to use for this purpose.

Boosting based on freshness works by measuring the time difference between the current time (now()) and the value of the specified date attribute for each item. This difference is then used to apply a boosting factor that modifies the item’s ranking in the search results.

This boosting factor is a value between -1 and +1. A positive value pushes the item higher in the results, while a negative value pushes it lower. You can use a single boosting factor for all items, or you can define multiple “bands,” with each band having its own boosting factor:

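# Freshness boost: the boost amount depends on the age of the item,
# measured from the date stored in the attribute named by
# freshness_attribute_name.
boot_spec = {
    "fieldName": freshness_attribute_name,
    "attributeType": "FRESHNESS",
    "interpolationType": "LINEAR",
    # Each control point maps an age (in days) to a boost amount.
    "controlPoints": [
        {"attributeValue": "1D", "boostAmount": 0.9},
        {"attributeValue": "3D", "boostAmount": 0.7},
        {"attributeValue": "7D", "boostAmount": 0.4},
        {"attributeValue": "30D", "boostAmount": 0.1},
    ],
}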

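Each control point in the spec above defines one boosting “band” that prioritizes recent items:

  • Items created within the last day receive a high boosting factor, 0.9.

  • Items created within the last three days receive a slightly lower factor, 0.7.

  • Items created within the last seven days receive 0.4.

  • Items created within the last 30 days receive only a small boost, 0.1.

Because interpolationType is set to LINEAR, an item whose age falls between two control points receives a boost interpolated between the two neighbouring values.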
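To get an intuition for how an item’s age translates into a boost, here is a minimal sketch of piecewise-linear interpolation over the control points used in this article. It is an illustration only: the functions `parse_days` and `interpolate_boost` are hypothetical helpers, the parser handles only day-based values such as "7D", and clamping to the first or last control point outside the defined range is an assumption of this sketch, not a documented statement about how Agentspace computes the final ranking.

```python
CONTROL_POINTS = [
    {"attributeValue": "1D", "boostAmount": 0.9},
    {"attributeValue": "3D", "boostAmount": 0.7},
    {"attributeValue": "7D", "boostAmount": 0.4},
    {"attributeValue": "30D", "boostAmount": 0.1},
]


def parse_days(value):
    """Parse a day-only duration such as "7D" into a number of days."""
    if not value.endswith("D"):
        raise ValueError(f"only day-based durations are supported here: {value!r}")
    return float(value[:-1])


def interpolate_boost(age_days, control_points):
    """Linearly interpolate a boost amount for an item of the given age."""
    points = sorted(
        (parse_days(p["attributeValue"]), p["boostAmount"]) for p in control_points
    )
    # Assumption: outside the defined range, clamp to the nearest control point.
    if age_days <= points[0][0]:
        return points[0][1]
    if age_days >= points[-1][0]:
        return points[-1][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= age_days <= x1:
            return y0 + (y1 - y0) * (age_days - x0) / (x1 - x0)
    raise RuntimeError("unreachable: age is within the control point range")


for age in (0.5, 2, 5, 60):
    print(f"age {age} days -> boost {interpolate_boost(age, CONTROL_POINTS):.2f}")
```

Under these assumptions, a two-day-old item lands halfway between the 1D and 3D control points, and anything older than 30 days keeps the smallest boost.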

To create this new serving control, you call the Agentspace (Discovery Engine) REST API. The control’s payload includes a conditions field, which determines when the rule becomes active. This can be based on specific query terms (e.g., “ticket”), but you can also leave it empty to apply the control to all queries.

payload = {
    "displayName": serving_control_name,
    "solutionType": "SOLUTION_TYPE_SEARCH",
    "useCases": ["SEARCH_USE_CASE_SEARCH"],
    "conditions": {
        # An empty list applies the control to every query. To restrict it, use
        # e.g. [{"value": query_value, "fullMatch": full_match}]
        "queryTerms": [],
    },
    "boostAction": {
        # "boost": boost_value,  # alternative: a single fixed boost
        "filter": "",  # optional filter expression (filter_string)
        "dataStore": data_store_path,
        "interpolationBoostSpec": boot_spec,
    },
}
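If you prefer to restrict the control to certain queries, the conditions part can be generated by a small helper. The function `build_conditions` below is a hypothetical convenience wrapper written for this article; only the `queryTerms`, `value`, and `fullMatch` field names come from the payload itself.

```python
def build_conditions(query_terms=None, full_match=False):
    """Build the "conditions" part of a serving control payload.

    With no query terms the list stays empty, so the control applies
    to all queries.
    """
    terms = [
        {"value": term, "fullMatch": full_match} for term in (query_terms or [])
    ]
    return {"queryTerms": terms}


# Applies to every query.
print(build_conditions())

# Applies only to queries involving the term "ticket".
print(build_conditions(["ticket"]))
```

The returned dictionary can be assigned to the "conditions" key of the payload in place of the literal value.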

The boostAction is the part of the payload where you specify which datastore the serving control should be applied to. With the payload configured, we are now ready to make an API call to Agentspace to create the new serving control for our application.

import json

import requests

# 1. Build the endpoint URL for the controls of the Agentspace application
url = (
    f"https://discoveryengine.googleapis.com/v1/projects/{project_id}/"
    f"locations/{location}/collections/default_collection/engines/{agentspace_app_id}/controls"
)

# 2. Set the request headers and query parameters
headers = {
    "Authorization": f"Bearer {access_token}",
    "Content-Type": "application/json",
    "X-Goog-User-Project": project_id,
}

params = {"controlId": serving_control_name}

# 3. Construct the request body (payload)
data_store_path = (
    f"projects/{project_id}/locations/{location}/collections/default_collection/"
    f"dataStores/{data_store_id}"
)

payload = {
    "displayName": serving_control_name,
    "solutionType": "SOLUTION_TYPE_SEARCH",
    "useCases": ["SEARCH_USE_CASE_SEARCH"],
    "conditions": {
        # An empty list applies the control to every query. To restrict it, use
        # e.g. [{"value": query_value, "fullMatch": full_match}]
        "queryTerms": [],
    },
    "boostAction": {
        # "boost": boost_value,  # alternative: a single fixed boost
        "filter": "",  # optional filter expression (filter_string)
        "dataStore": data_store_path,
        "interpolationBoostSpec": boot_spec,
    },
}

# Add optional active time range if specified
if start_time and end_time:
    payload["conditions"]["activeTimeRange"] = [
        {"startTime": start_time, "endTime": end_time}
    ]

# 4. Make the POST request
try:
    response = requests.post(
        url, headers=headers, params=params, data=json.dumps(payload)
    )
    # Raise an exception for HTTP error codes (4xx or 5xx)
    response.raise_for_status()
    print(f"✅ Control '{serving_control_name}' created successfully!")
    print(response.json())
except requests.exceptions.HTTPError as e:
    print(f"❌ HTTP Error: {e}")
    print(f"Response Body: {e.response.text}")
    raise

As a final step, make sure the serving control is enabled. You can always disable it later if needed.

Now, let’s put it to the test!

Alright, that’s a demonstration of how this works on a custom datastore. The next step is to show how to apply the same mechanism to a ServiceNow datastore.

The ServiceNow datastore schema includes a number of attributes that can be used. For freshness, we will use the sys_updated_on attribute.
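Since only the field name changes, the same boost spec can be produced for any datastore with a small factory function. This is a sketch under assumptions: `make_freshness_boost_spec` is a hypothetical helper written for this article, and the band values simply reuse the ones from the custom datastore example. The ServiceNow datastore path that goes into boostAction.dataStore is specific to your project and is not shown here.

```python
def make_freshness_boost_spec(field_name, bands):
    """Build a freshness boost spec.

    bands is a list of (duration, boost_amount) pairs, e.g. ("7D", 0.4).
    Boost amounts must lie between -1 and +1.
    """
    control_points = []
    for duration, amount in bands:
        if not -1.0 <= amount <= 1.0:
            raise ValueError(f"boost amount out of range [-1, 1]: {amount}")
        control_points.append({"attributeValue": duration, "boostAmount": amount})
    return {
        "fieldName": field_name,
        "attributeType": "FRESHNESS",
        "interpolationType": "LINEAR",
        "controlPoints": control_points,
    }


# Same bands as before, now keyed on the ServiceNow update timestamp.
servicenow_boost_spec = make_freshness_boost_spec(
    "sys_updated_on",
    [("1D", 0.9), ("3D", 0.7), ("7D", 0.4), ("30D", 0.1)],
)
print(servicenow_boost_spec)
```

The resulting dictionary plays the same role as the spec used for the custom datastore and can be passed as the interpolationBoostSpec of a new serving control.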

When we run the query “show me 10 articles from knowledge base, in table with update date and name” without a serving control that prioritizes freshness, quite old articles rise to the top of the list.

With the serving control applied, the same list looks quite different.

Let us know in the comments if this was helpful.

Summary:

In this blog post, we explored serving controls, a powerful Agentspace mechanism that allows you to customize and improve search results. These controls let you prioritize, or “boost,” items based on specific metadata attributes, such as a creation or update date.

This functionality matters because it helps the most recent and relevant information rise to the top of your search results, even if the underlying semantic search algorithm might not have ranked it highly. By defining a set of rules and applying a boosting factor based on an item’s age, you can dynamically influence the retrieval process.

Serving controls are highly versatile and can be applied to both custom datastores and native integrations like ServiceNow. By using an attribute like sys_updated_on, you can ensure that your knowledge assistant consistently delivers the most current information first.

Check my other blog post to learn how to connect your ADK agent to 100+ systems with GCP Integration Connectors:

Connect & Act: Google ADK Agents with GCP Integration Connectors to Perform Tasks Across 100+…


This article is authored by Lukasz Olejniczak — Customer Engineer at Google Cloud. The views expressed are those of the authors and don’t necessarily reflect those of Google.

Please clap for this article if you enjoyed reading it. For more about Google Cloud, data science, data engineering, and AI/ML, follow me on LinkedIn.