Hi everyone,
I’m building a production-grade virtual try-on bot where the goal is very strict:
- The original photo of the person must remain unchanged (especially face, body proportions, skin, hands)
- Only the clothing should change

I’ve tested Google Virtual Try-On via Vertex AI, and while results are sometimes visually acceptable, I’m facing critical stability issues that make it unusable for a real product:
- The photo quality deteriorates significantly, even after reducing the relevant parameter to 0.
- Face and facial features are often altered. Even with high-quality input photos, the model sometimes reshapes the face (eyes, nose, symmetry). This is a hard blocker: users immediately notice it.
- Incorrect handling of sleeves / arms. If the original model photo has bare arms and the garment has long sleeves, the output often removes the sleeves or blends them unnaturally into the bare skin.
- Overall inconsistency between runs. With similar inputs, results vary a lot, which makes it impossible to guarantee predictable output quality.
In my experience with NanoBanana and NanoBanana Pro, I haven’t been able to achieve any real stability so far.
The outputs are highly inconsistent: in many cases the model simply returns the original model image or the original garment image without applying any changes at all, and in other cases the garment is applied only partially or unpredictably.
Because of this, I haven’t yet found a way to configure NanoBanana / NanoBanana Pro for reliable, repeatable virtual try-on results.
Does anyone here have real-world experience building a stable virtual try-on pipeline using Google Virtual Try-On without degrading the original image quality (especially face and body preservation)?
Additionally, has anyone managed to achieve highly consistent results with NanoBanana or NanoBanana Pro?
If so, are there any prompts, configurations, or processing strategies that significantly improve stability and prevent cases where the model either returns the original image unchanged or applies the garment inconsistently?
The issues you’re seeing are common with current virtual try-on models. They often alter faces or misplace sleeves because they aren’t fully deterministic. The most reliable way to improve results is to carefully preprocess images, provide clear separation between body and clothing, and use post-processing to preserve the original face and proportions. True consistency without intervention is still limited with NanoBanana or similar models.
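To make the face-preservation post-processing idea concrete, here is a minimal sketch in plain Python (images represented as nested lists of RGB tuples; in practice you would use Pillow or NumPy, and a face detector to obtain the bounding box, which here is a hypothetical input): it copies the original face region back over the generated output, guaranteeing the face is pixel-identical to the source photo.

```python
def restore_face_region(original, generated, box):
    """Copy the face region from the original photo back into the generated
    try-on output, so the face is guaranteed unchanged.

    original, generated: same-size images as lists of rows of RGB tuples.
    box: (top, left, bottom, right), half-open on bottom/right.
    Returns a new image; neither input is mutated.
    """
    top, left, bottom, right = box
    result = [row[:] for row in generated]        # copy each row
    for y in range(top, bottom):
        result[y][left:right] = original[y][left:right]
    return result

# Tiny demo: 4x4 images, the "face" occupies the top-left 2x2 block.
orig = [[(255, 255, 255)] * 4 for _ in range(4)]  # original: all white
gen = [[(0, 0, 0)] * 4 for _ in range(4)]         # model output: all black
fixed = restore_face_region(orig, gen, (0, 0, 2, 2))
assert fixed[0][0] == (255, 255, 255)             # face region restored
assert fixed[3][3] == (0, 0, 0)                   # garment region untouched
```

A seam-blending step (e.g. feathering the box edges) would usually follow, but the hard guarantee of an untouched face comes from this copy.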
Hi, I’d love to be able to provide some feedback! In order to do so, would you be able to share a few examples where you’re seeing these consistency issues with the Virtual Try-On model? Can you please also include an example of the parameter you’re referring to so that I can help take a look?
Hello Katie, thanks for your answer.
I fixed the quality loss issue using upscaling. Facial distortion is minimal when the source photo is sufficiently high resolution, so that problem is essentially solved as well.
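That finding can be enforced with a simple pre-flight resolution gate before calling the model. The sketch below is an illustration, not part of any API; the 1024 px threshold is an assumption to tune against your own distortion tests, and the actual upscaling would be done with an upscaler or Pillow's LANCZOS resize.

```python
import math

MIN_SHORT_SIDE = 1024  # assumed threshold; tune against your own tests

def required_upscale(width, height, min_short_side=MIN_SHORT_SIDE):
    """Return the integer upscale factor needed so the image's shorter side
    reaches min_short_side (1 means the photo is already large enough)."""
    short = min(width, height)
    if short >= min_short_side:
        return 1
    return math.ceil(min_short_side / short)

print(required_upscale(2048, 1365))  # already large enough -> 1
print(required_upscale(640, 480))    # 480 * 3 = 1440 >= 1024 -> 3
```

Gating on the shorter side (rather than total pixel count) matches the failure mode: facial distortion correlates with the face simply being too small in the source photo.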
However, the issue with sleeves remains. I tested several examples where the original human photo had bare arms while the clothing items had long sleeves.
The Try-On model produced heavy distortions: in some cases it simply removed the sleeves altogether, and in others it generated artifacts such as a smooth, unnatural transition between the sleeve and the bare arm.
Thanks for providing these examples! Just to confirm, are you using the latest “virtual-try-on-001” model? I tested some of your examples using this notebook and saw improvements in maintaining clothing items with long sleeves. If that doesn’t work, feel free to let me know.
Hey Katie,
I’m working on a try-on app as well. I tried your example with virtual-try-on-001, but I got a 429 RESOURCE_EXHAUSTED error:
{'error': {'code': 429, 'message': 'Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: virtual-try-on-001. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/generative-ai/quotas-genai.', 'status': 'RESOURCE_EXHAUSTED'}}
I followed the link to request a quota increase, but I couldn’t find “virtual-try-on-001.” I only found virtual-try-on-exp and virtual-try-on-preview.
Is there any way I can access this model?
Thank you
Hi Katie,
I have been using the Virtual Try-On API and wanted to know if Google pushed an update recently, because I’ve noticed quite a bit of degradation in body proportions. Specifically, pretty much any body type I input results in a standard model body type (thin, tall, etc.). I wasn’t having this issue before. Is there a new parameter or something that would help produce more ‘honest’ outputs?
Hi, would you be able to try running the notebook again and let me know if that works for you? We had some changes roll out recently that should resolve this issue.
Hello, in order to better assist you would it be possible to provide some examples of the model and product images you’re using?
Thank you Katie – the model is working now. I’ve been testing with different outfits and have a question about the behavior.
In the attached examples:
- Business outfit: VTO changed both the clothing and the shoes (Image 1)
- Summer outfit: VTO only changed the clothing; the shoes stayed the same (Image 2)
Is this expected? Does VTO detect what’s in the garment image and apply everything visible, or is there a way to control which body region it modifies?
In my experience, it’s best to start with an image of a model that’s wearing the clothing item you’re attempting to replace. If you include multiple clothing items in a single image the model will attempt to replace all items. If you’d like to only replace a specific clothing item, you can use the following parameter in the RecontextImageConfig with the Gen AI SDK for Python:
http_options=HttpOptions(extra_body={'parameters': {'productsToReplace' : ['shoes']}})
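Since this parameter is not yet documented, a small helper that builds and validates the extra_body payload can make the call less error-prone. The sketch below only constructs the dictionary shown above; the commented SDK wiring is an untested assumption about the Gen AI SDK surface.

```python
def build_extra_body(products_to_replace):
    """Build the extra_body payload for HttpOptions, validating that each
    entry is a non-empty garment label such as 'shoes' or 't-shirt'."""
    if not products_to_replace:
        raise ValueError("provide at least one garment label to replace")
    for item in products_to_replace:
        if not isinstance(item, str) or not item.strip():
            raise ValueError(f"invalid garment label: {item!r}")
    return {"parameters": {"productsToReplace": list(products_to_replace)}}

# Wiring into the Gen AI SDK would look roughly like (untested sketch):
#   from google.genai.types import HttpOptions
#   http_options = HttpOptions(extra_body=build_extra_body(["shoes"]))
print(build_extra_body(["shoes"]))
# -> {'parameters': {'productsToReplace': ['shoes']}}
```

Failing fast on empty or malformed labels is cheaper than burning a quota-limited prediction request on a payload the model will ignore.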
Thank you for the explanation. I now have a better understanding of how the model works. We tested this using both the Python Gen AI SDK with HttpOptions(extra_body=...) and the REST API directly.
However, we couldn’t find productsToReplace documented anywhere — not in the Virtual Try-On API reference, the VirtualTryOnModelParams, or the Generate Virtual Try-On Images guide. Could you share any documentation on this parameter and its accepted values?
Thanks!
You can set the parameter to one or more items worn by the individual in the person_image that you’d like to replace with a new item. The parameter is an array of string values like ["t-shirt", "shoes"]. Hope this helps!
Hey Jikki_Jim,
In my experience automating unstable GenAI flows, you cannot always “prompt” your way out of random failures (hallucinations, face distortion). You often need an architectural fix rather than a configuration fix.
Instead of tweaking parameters endlessly, have you considered wrapping your VTO call in a Self-Healing Loop with a separate validator?
The Logic:
1. Generate: Call Vertex AI / NanoBanana.
2. Validate (The Guardrail): Use a deterministic library (like dlib or a FaceNet embedding) to compare the Original Face vs. Output Face.
3. Decision: If similarity_score < 0.95, discard and retry immediately (with a slightly different seed or noise).
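The loop above can be sketched generically. The generate and face_similarity callables are placeholders you would wire to your actual VTO call and to a real embedding comparison (dlib/FaceNet); the 0.95 threshold comes from the decision step, and the stubs in the demo stand in for the model.

```python
def self_healing_generate(generate, face_similarity, original_face,
                          threshold=0.95, max_attempts=5, base_seed=0):
    """Retry untrusted GenAI output until a deterministic validator accepts it.

    generate(seed) -> candidate image; face_similarity(a, b) -> float in [0, 1].
    Returns the first candidate whose face matches the original closely enough,
    or raises RuntimeError after max_attempts.
    """
    for attempt in range(max_attempts):
        seed = base_seed + attempt            # vary the seed on each retry
        candidate = generate(seed)
        if face_similarity(original_face, candidate) >= threshold:
            return candidate
    raise RuntimeError(f"no acceptable output in {max_attempts} attempts")

# Demo with stubs: this "model" distorts the face on odd seeds, so starting
# at base_seed=1 forces one rejected attempt before success.
def fake_generate(seed):
    return "good" if seed % 2 == 0 else "distorted"

def fake_similarity(original, candidate):
    return 0.99 if candidate == "good" else 0.60

result = self_healing_generate(fake_generate, fake_similarity, "face", base_seed=1)
assert result == "good"
```

Capping max_attempts matters in production: a persistent failure should fall through to a human review queue rather than burn quota forever.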
I treat GenAI outputs as “untrusted” by default. I detailed this “Self-Healing” approach for data extraction in my latest post (available via my profile), but the architectural principle is exactly the same for image pipelines: **Automate the quality control, don’t just automate the generation.**
See you!