Hi,
Has anyone been able to generate multiple responses per query? I'm fine-tuning a Gemini model and want it to return multiple responses for a single query when configured to do so. Is that not currently possible? The standard Gemini models seem to allow multiple responses via the "candidateCount" parameter, but this doesn't appear to work for a fine-tuned model. Is that correct?
https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference#parameters
Maybe this will change soon? The preview feature may allow it?
Thanks!
Hi @jsniff12,
Welcome to Google Cloud Community!
A recent release note, dated August 9, 2024, states that Gemini on Vertex AI supports multiple response candidates. While the candidateCount parameter is available for standard Gemini models, it's not yet supported for fine-tuned versions. This means you can't directly request multiple responses from a fine-tuned model. For details, see Generate content with the Gemini API.
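For reference, here's a minimal sketch of how multiple candidates can be requested from a standard (non-tuned) Gemini model using the Vertex AI Python SDK. The project ID, location, model name, and prompt below are placeholders, so adjust them for your setup:

```python
# Minimal sketch: requesting multiple candidates from a standard Gemini model
# on Vertex AI. The project ID, location, and model name are placeholders.
import vertexai
from vertexai.generative_models import GenerativeModel, GenerationConfig

vertexai.init(project="your-project-id", location="us-central1")

model = GenerativeModel("gemini-1.5-flash-002")  # a standard, non-tuned model

response = model.generate_content(
    "Suggest a name for a flower shop.",
    generation_config=GenerationConfig(candidate_count=2),  # ask for 2 candidates
)

# Each candidate is a separate response to the same prompt.
for i, candidate in enumerate(response.candidates):
    print(f"Candidate {i}: {candidate.content.parts[0].text}")
```

With a fine-tuned model, the same request currently returns only a single candidate, which is the limitation described above.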
Google is actively developing and updating its generative AI capabilities, so support for multiple responses from fine-tuned models might be added in a future update. You can stay informed by monitoring the Vertex AI release notes and the official documentation. In the meantime, I suggest filing this as a feature request. Please note that I can't provide any details or timelines at this moment.
I hope the above information is helpful.
This is very helpful, @ruthseki.
One more question: do you know if it's possible to return the probability or confidence score for a query's response? I'd like to set a threshold to catch low-confidence responses.