How to improve Document AI model that detects O as zero

konradpsiuk · August 1, 2024, 6:10am

Hi

I’m training a model that detects a value that is a mix of letters and numbers. It is struggling to differentiate the letter O from zero and the letter I from one. I use more than 50 files for testing and training, but it still can’t detect them correctly. Is there any way to improve it?

jaia · August 5, 2024, 12:36pm

Hi,

Thank you for contacting Google Cloud Community!

I would suggest the following:

Acquire more diverse data samples, including different fonts, sizes, styles, and image qualities. This will help the model learn to recognize these characters in various conditions.
Apply techniques like rotation, scaling, noise addition, and cropping to increase the size of your dataset without collecting new data.
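
As a rough illustration of the augmentation step, here is a minimal sketch in plain Python. It assumes each page or field crop has already been loaded as a grayscale grid (a list of rows of 0 to 255 values); the function names, the parameter ranges, and the `augment` pipeline are hypothetical choices for this example and are not part of Document AI. Loading and saving real image files would need an imaging library, which is left out here.

```python
import math
import random


def rotate(img, degrees, fill=255):
    """Rotate about the image centre using nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    cos_a = math.cos(math.radians(degrees))
    sin_a = math.sin(math.radians(degrees))
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Map each output pixel back to the source pixel it came from.
            sx = cos_a * (x - cx) + sin_a * (y - cy) + cx
            sy = -sin_a * (x - cx) + cos_a * (y - cy) + cy
            ix, iy = round(sx), round(sy)
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = img[iy][ix]
    return out


def scale(img, factor):
    """Resize by `factor` using nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    new_h, new_w = max(1, round(h * factor)), max(1, round(w * factor))
    return [
        [img[min(h - 1, int(y / factor))][min(w - 1, int(x / factor))]
         for x in range(new_w)]
        for y in range(new_h)
    ]


def add_noise(img, sigma, rng):
    """Add Gaussian noise to every pixel, clamped to the 0..255 range."""
    return [
        [min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
        for row in img
    ]


def crop(img, margin):
    """Trim `margin` pixels from every edge."""
    h, w = len(img), len(img[0])
    return [row[margin:w - margin] for row in img[margin:h - margin]]


def augment(img, rng):
    """Produce one randomly perturbed copy of `img` (assumed ranges)."""
    out = rotate(img, rng.uniform(-3, 3))   # small tilt, as in a skewed scan
    out = scale(out, rng.uniform(0.9, 1.1))
    out = add_noise(out, sigma=8, rng=rng)
    return crop(out, margin=1)
```

Calling `augment(img, random.Random(0))` several times with different seeds would give several variants of one sample. Keeping the perturbations small is a deliberate choice in this sketch, since O/0 and I/1 differ only in fine shape details that heavy distortion could erase.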

Regards,
Jai Ade

jaia · August 13, 2024, 7:13am

Hello,

Thank you for your engagement on this issue. We haven’t heard back from you for some time now, so I’m going to close this issue, and it will no longer be monitored. However, if you have any new issues, please don’t hesitate to create a new one. We will be happy to assist you.

Regards,
Jai Ade
