Google Translation API: male/female name translation issue with the Ukrainian language

When translating names from Ukrainian (also seen in Polish and Hungarian), the returned translation is sometimes the name of the opposite gender. This affects names that have both a male and a female form in Ukrainian (e.g. Yaroslav – Yaroslava, Oleksandr – Oleksandra).

Example:

“Ярослава Федоренко” is translated as “Yaroslav Fedorenko” by Google Translation API. The correct expected output is “Yaroslava Fedorenko”.

Bing, DeepL, and www.reverso.net all provided the correct translation, i.e. “Yaroslava Fedorenko”.

Even in the Google Translate web interface, the text shown under the Ukrainian input (which I think is meant as a pronunciation guide) is correct, but the translation on the right side is wrong.

In Ukrainian, “Ярослав” (without the final “а”) is the male form and “Ярослава” is the female form. The Wikipedia page below explains this.

https://en.wikipedia.org/wiki/Ukrainian_name

Is there a way to get the correct translation of “Ярослава Федоренко” as “Yaroslava Fedorenko” from the Google API? If this is a bug in the API, where should we report it to get it fixed?

I have attached screenshots below.


Good day @rajsj,

Welcome to Google Cloud Community!

You can use a glossary for this case. A glossary is a custom dictionary that the Cloud Translation API uses to translate your domain-specific terminology consistently. It can be used for ambiguous words (e.g. “bat” may mean an animal or a piece of sports equipment, and a glossary lets you force the animal sense), for borrowed words, and for product names (e.g. “Google Translate” must always stay “Google Translate”). It can also be applied to names like the ones in your case. For more information about glossaries, see: https://cloud.google.com/translate/docs/advanced/glossary
Cloud Translation API currently supports two types of glossary: unidirectional glossaries and equivalent term sets. A unidirectional glossary specifies your preferred translation for a single pair of source and target languages, while an equivalent term set lists equivalent terms across multiple languages. For more information, see: https://cloud.google.com/translate/docs/advanced/glossary#create_a_glossary
Here is how you can create a glossary file: https://cloud.google.com/translate/docs/advanced/glossary#format-glossary
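As a rough illustration of a unidirectional glossary file for this case, here is a minimal sketch that writes name pairs as a two-column CSV (source term, then target term, with no header row). The names `NAME_PAIRS`, `build_glossary_csv`, and `write_glossary_file` are hypothetical, and the list of names is only an example; please check the format page linked above for the exact requirements.

```python
import csv
import io

# Hypothetical example pairs: Ukrainian source term -> preferred English rendering.
NAME_PAIRS = [
    ("Ярослава", "Yaroslava"),
    ("Ярослав", "Yaroslav"),
    ("Олександра", "Oleksandra"),
    ("Олександр", "Oleksandr"),
]


def build_glossary_csv(pairs):
    """Return CSV text with one 'source,target' row per pair and no header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for source_term, target_term in pairs:
        writer.writerow([source_term, target_term])
    return buffer.getvalue()


def write_glossary_file(path, pairs):
    """Write the glossary as a UTF-8 file, ready to be uploaded to Cloud Storage."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(build_glossary_csv(pairs))
```

Calling `write_glossary_file("names_uk_en.csv", NAME_PAIRS)` would produce a local file (the file name is an assumption) that you can then upload to a Cloud Storage bucket and use when creating the glossary resource.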
Before you start creating a glossary, please note the required permissions: https://cloud.google.com/translate/docs/advanced/glossary#before_you_begin
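Once a glossary resource exists, a translation request can reference it. The following is only a sketch, assuming the `google-cloud-translate` Python package (v3 client), a glossary that has already been created in your project, and working credentials. The helper names and the project, location, and glossary IDs are hypothetical placeholders.

```python
def build_glossary_request(project_id, location, glossary_id, text):
    """Build a translate_text request that points at an existing glossary."""
    parent = f"projects/{project_id}/locations/{location}"
    glossary = f"{parent}/glossaries/{glossary_id}"
    return {
        "parent": parent,
        "contents": [text],
        "mime_type": "text/plain",
        "source_language_code": "uk",
        "target_language_code": "en",
        "glossary_config": {"glossary": glossary},
    }


def translate_with_glossary(project_id, location, glossary_id, text):
    """Send the request and return the glossary-applied translations."""
    # Imported here so the request builder can be used without the library installed.
    from google.cloud import translate_v3 as translate

    client = translate.TranslationServiceClient()
    response = client.translate_text(
        request=build_glossary_request(project_id, location, glossary_id, text)
    )
    # glossary_translations holds the results with the glossary applied;
    # response.translations holds the default results without it.
    return [item.translated_text for item in response.glossary_translations]
```

For example, `translate_with_glossary("my-project", "us-central1", "uk-en-names", "Ярослава Федоренко")` would send the name through the glossary, where all three identifiers are placeholders for your own values. Comparing `glossary_translations` with the default `translations` in the response is a simple way to confirm whether the glossary entry was picked up.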
If you believe this is a bug, you can also report the issue using this link: https://cloud.google.com/support/docs/issue-trackers

Hope this helps!