How to improve a Document AI model that detects O as zero

Hi

I’m training a model that detects a value that is a mix of letters and numbers. It is struggling to differentiate O’s from zeros and I’s from ones. I use more than 50 files for testing and training, but it still can’t detect them correctly. Is there any way to improve it?

Hi,

Thank you for contacting Google Cloud Community!

I would suggest the following:

  1. Acquire more diverse data samples, including different fonts, sizes, styles, and image qualities. This will help the model learn to recognize these characters in various conditions.
  2. Apply data augmentation techniques such as rotation, scaling, noise addition, and cropping to increase the size of your dataset without collecting new data.
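
As a rough illustration of point 2, here is a minimal sketch of those four augmentations in plain Python. It assumes each sample is a grayscale image held as a list of pixel rows (0 = black, 255 = white). The function names (`add_noise`, `scale`, `rotate`, `crop`, `augment`) and the parameter ranges are hypothetical choices for this example, not part of Document AI. In practice you would likely use an image library and export the augmented files for upload to your dataset.

```python
import math
import random


def add_noise(img, amount, rng):
    """Shift every pixel by a random offset in [-amount, amount], clamped to 0..255."""
    return [[min(255, max(0, p + rng.randint(-amount, amount))) for p in row]
            for row in img]


def scale(img, factor):
    """Resize by `factor` using nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    new_h, new_w = max(1, round(h * factor)), max(1, round(w * factor))
    return [[img[min(h - 1, int(y / factor))][min(w - 1, int(x / factor))]
             for x in range(new_w)]
            for y in range(new_h)]


def rotate(img, degrees, fill=255):
    """Rotate about the image centre, keeping the canvas size.

    Pixels that map outside the source image are set to `fill`.
    """
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    angle = math.radians(degrees)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Inverse mapping: find the source pixel for each output pixel.
            sx = cos_a * (x - cx) + sin_a * (y - cy) + cx
            sy = -sin_a * (x - cx) + cos_a * (y - cy) + cy
            ix, iy = round(sx), round(sy)
            row.append(img[iy][ix] if 0 <= iy < h and 0 <= ix < w else fill)
        out.append(row)
    return out


def crop(img, margin):
    """Trim `margin` pixels from every side."""
    h, w = len(img), len(img[0])
    if 2 * margin >= h or 2 * margin >= w:
        raise ValueError("margin is too large for this image")
    return [row[margin:w - margin] for row in img[margin:h - margin]]


def augment(img, rng):
    """Return one randomly perturbed copy of `img` (ranges are illustrative)."""
    out = rotate(img, rng.uniform(-3.0, 3.0))   # small tilt only
    out = scale(out, rng.uniform(0.9, 1.1))     # up to 10% smaller or larger
    out = crop(out, rng.randint(0, 1))          # trim 0 or 1 pixel per side
    return add_noise(out, 10, rng)              # mild sensor-style noise
```

Calling `augment(img, random.Random(0))` several times with the same generator yields different variants of one labelled sample. The perturbations are kept small on purpose: a heavily rotated or cropped character can become harder to tell apart, which would work against the goal of separating O from 0 and I from 1.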

Regards,
Jai Ade

Hello,

Thank you for your engagement on this issue. We haven’t heard back from you for some time now, so I’m going to close this issue, and it will no longer be monitored. However, if you have any new issues, please don’t hesitate to create a new one. We will be happy to assist you.

Regards,
Jai Ade