Hi @blackdiamond,
Welcome to Google Cloud Community!
Gemini doesn’t have a direct “Reference ID” parameter in its batch prompt request structure. The batch prompt API is designed to process multiple prompts efficiently; it has no built-in mechanism for tracking individual requests with a custom ID, and it primarily returns an array of responses in the same order as your prompts.
Your proposal of embedding request IDs in the prompt text is a viable approach.
If you include a unique identifier (such as a UUID or a custom ID) in the text part of the prompt and instruct the model to echo it back, the output will contain that ID, giving you a reliable way to link each response back to its original request. It’s relatively straightforward to implement on the request-creation side, and you can include other metadata in the text part as well, although that may influence the model’s processing.
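Here’s a minimal sketch of the request-creation side (plain Python, independent of any particular SDK; the prompt template and instruction wording are just illustrative):

```python
import uuid

def build_tagged_prompt(user_prompt: str) -> tuple[str, str]:
    """Wrap a prompt with a unique request ID the model is asked to echo back."""
    request_id = str(uuid.uuid4())
    tagged_prompt = (
        f"[REQUEST_ID: {request_id}]\n"
        "Repeat the REQUEST_ID line above verbatim as the first line of your answer, "
        "then respond to the following:\n\n"
        f"{user_prompt}"
    )
    return request_id, tagged_prompt

# Prepare a batch of prompts and remember which ID belongs to which request.
originals = ["Summarize document A.", "Summarize document B."]
id_to_prompt = {}
batch_prompts = []
for prompt in originals:
    request_id, tagged = build_tagged_prompt(prompt)
    id_to_prompt[request_id] = prompt
    batch_prompts.append(tagged)
```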
On the downside, injecting IDs into prompts can affect the model’s output. The impact is likely minor, but it may slightly influence the generated text if not handled carefully. You’ll also need to parse the output to extract the injected ID, which adds a post-processing step, and it assumes the model actually retains the ID in its output. Keep in mind that the added characters count against prompt length limits, and if you switch models later, the prompt-parsing logic could become a breaking change.
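On the parsing side, a sketch like this can cover extraction and cleanup (the `[REQUEST_ID: ...]` format matches the template above; adapt the regex to whatever format you settle on):

```python
import re

REQUEST_ID_PATTERN = re.compile(r"\[REQUEST_ID:\s*([0-9a-fA-F-]{36})\]")

def extract_request_id(response_text: str) -> str | None:
    """Pull the echoed request ID out of a response, or None if the model dropped it."""
    match = REQUEST_ID_PATTERN.search(response_text)
    return match.group(1) if match else None

def strip_request_id(response_text: str) -> str:
    """Remove the ID line so downstream consumers see only the real answer."""
    return REQUEST_ID_PATTERN.sub("", response_text, count=1).lstrip()
```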
Here are other approaches you may want to consider:
1. File Naming Conventions:
If your input media is stored in a location where you control file naming, you can encode the request ID there. For example, if your input files are in Cloud Storage, you can name a file something like request-id-123_image.jpg. This keeps the prompt clean and avoids introducing noise into the model input. However, it may require infrastructure changes if your files aren’t already named with IDs, and you’ll need to parse filenames when processing the output.
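A rough sketch of what that filename parsing could look like, assuming a naming scheme such as request-id-123_image.jpg (the scheme itself is up to you):

```python
import re

FILENAME_PATTERN = re.compile(r"^request-id-(?P<request_id>[^_]+)_(?P<original_name>.+)$")

def parse_request_filename(blob_name: str) -> tuple[str, str] | None:
    """Split 'request-id-123_image.jpg' into ('123', 'image.jpg'), or None if it doesn't match."""
    # Only look at the final path component of the Cloud Storage object name.
    filename = blob_name.rsplit("/", 1)[-1]
    match = FILENAME_PATTERN.match(filename)
    if not match:
        return None
    return match.group("request_id"), match.group("original_name")

print(parse_request_filename("uploads/request-id-123_image.jpg"))  # ('123', 'image.jpg')
```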
2. Database/Lookup Table:
Before making the batch requests, create a lookup table (e.g., in a database or datastore) that maps a sequential ID to the actual request payload. Use those sequential IDs in the prompts, then use the ID in each response to look up the original request. The advantage is that it’s robust and lets you store additional metadata beyond just the request contents. On the other hand, it requires an extra database or datastore and adds some overhead.
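As a sketch, the lookup table can start out as a simple dictionary persisted wherever you like (a real implementation might use Firestore, Cloud SQL, etc.; the short sequential IDs keep the prompt overhead small):

```python
import json

def build_lookup_and_prompts(requests: list[dict]) -> tuple[dict[int, dict], list[str]]:
    """Assign each request a short sequential ID and store the full payload for later lookup."""
    lookup: dict[int, dict] = {}
    prompts: list[str] = []
    for seq_id, request in enumerate(requests, start=1):
        lookup[seq_id] = request  # full payload plus any metadata you want to keep
        prompts.append(f"[ID {seq_id}] {request['prompt']}")
    return lookup, prompts

requests = [
    {"prompt": "Describe image A.", "customer": "acme", "source_file": "a.jpg"},
    {"prompt": "Describe image B.", "customer": "globex", "source_file": "b.jpg"},
]
lookup, prompts = build_lookup_and_prompts(requests)

# Persist the lookup table so the post-processing job can rejoin responses to requests.
with open("request_lookup.json", "w") as f:
    json.dump(lookup, f)
```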
3. Post-Processing with Error Handling:
Rely on the original ordering of requests and responses, and implement robust error handling and tracking around it. This is the simplest method in that it adds no extra logic to the prompts themselves. However, it can be error prone: partial failures will shift the ordering of the output, and mismatches can be tricky to debug.
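If you go this route, a defensive sketch looks something like the following (the shape of `responses` is hypothetical; adapt it to whatever your batch job actually returns):

```python
def map_responses_by_order(prompts: list[str], responses: list) -> list[dict]:
    """Pair each prompt with its response by position, flagging anything suspicious."""
    if len(prompts) != len(responses):
        # A length mismatch means the ordering can no longer be trusted; fail loudly.
        raise ValueError(
            f"Expected {len(prompts)} responses but got {len(responses)}; "
            "ordering-based mapping would be unreliable."
        )
    results = []
    for index, (prompt, response) in enumerate(zip(prompts, responses)):
        results.append({
            "index": index,
            "prompt": prompt,
            "response": response,
            "ok": response is not None,  # treat missing/None entries as failures to retry
        })
    return results
```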
Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.