
AI Translate: Bias? Sexist? Or this is the way it should be?

by Jecelyn Yeen, October 6th, 2017
It started with a post about stereotypes in Google Translate:

In short, when translators (Google Translate, Microsoft Translator, etc.) translate a sentence from a gender-neutral language (e.g. Turkish) into a language that is not gender neutral (e.g. English), they guess the gender. It is not a random guess, but a trained guess based on the data they learned from.

This behaviour caught my attention, so I ran a few rounds of tests with Malay and Chinese, because I am Malaysian Chinese. 🇲🇾 Malay is a gender-neutral language. Chinese is mixed: it can be gender neutral or not.

The tests

It’s a simple three-level test, run in both Google Translate and Microsoft Translator.

Level 1: No context provided.

Both Google Translate and Microsoft Translator return
  • she for nurse
  • he for programmer

Both Google and Microsoft return the same result.

Level 2: Context provided in the following sentence.

Same result as level 1.

Both Google and Microsoft return the same result.

Level 3: Context provided in the same sentence with a comma.

Google Translate is slightly smarter. It translates both sentences correctly, while Microsoft Translator gets the first sentence right but insists that Jecelyn is a “he”!

Google translates both sentences correctly

Microsoft translates the first sentence correctly, but it insists Jecelyn is a “he”…

Test with Chinese

In Chinese, the word “他” is gender neutral. It can refer to “he” or “she”. We do have another word, “她”, that refers specifically to a female “she”, but using it is not mandatory. If you can read Chinese, you can read the long discussion and history about these two words. I tested a simple sentence with the context in the same sentence, and this is the result:

In this case, both Google and Microsoft return male, even though the context is given in the same sentence. With “他”, the result is consistently male, with or without context in the same sentence. Only when I switch to the female-specific “她” is the result correct:

I ran the same tests with other occupations as well: doctor, teacher, scientist, graphic designer. You can sort of guess the result.

So what? Is this a problem?

Definition of problem

So the question is: Is this a problem? Is Artificial Intelligence (AI) gender biased? AI isn’t gender biased. It learns from data, and it is trained and designed to return a result. It returned a logical result based on the model, and we humans designed that model. I would say the translations are statistically correct. Based on popular surveys:
  • Programmer: The survey by the famous developer website Stack Overflow shows that 88.8% of the developers (programmers) who participated are men.

stack overflow
  • Nurse: Website shows that only 9.1% are men in a pool of 2,824,641 registered nurses.

Only 9.1% of registered nurses and 7.6% of licensed practical nurses are men.

We can say that the results make sense. At the end of the day, how often do you meet a female programmer? Even I don’t meet many female programmers. You probably won’t see a male nurse very often either. So, can we conclude that the translators are smart? They know their guesses are very likely to be right!

NO. For me, it is a problem. It’s an area that needs to be improved.

Statistically correct doesn’t mean it’s correct.
I believe that if we don’t see this as a problem, the issue won’t be fixed and there won’t be any improvement.
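
To make “statistically correct” concrete, here is a minimal sketch in Python. It assumes a hypothetical translator that always picks the majority pronoun for an occupation, and it treats the survey shares quoted earlier as if they were the true shares. Both are assumptions for illustration, not a description of how Google or Microsoft actually work, and the function names are made up.

```python
# Share of men per occupation, taken from the survey figures quoted above.
# Treating these as the true population shares is an assumption.
SHARE_MALE = {"programmer": 0.888, "nurse": 0.091}


def majority_pronoun(occupation: str) -> str:
    """Hypothetical translator rule: always pick the more common gender."""
    return "he" if SHARE_MALE[occupation] >= 0.5 else "she"


def overall_accuracy(occupation: str) -> float:
    """Fraction of people in the occupation for whom the guess is right."""
    p_male = SHARE_MALE[occupation]
    return max(p_male, 1.0 - p_male)


def error_rate_for(occupation: str, actual_pronoun: str) -> float:
    """Error rate for one group: 0.0 if the guess matches, else 1.0."""
    return 0.0 if majority_pronoun(occupation) == actual_pronoun else 1.0


for job in SHARE_MALE:
    print(job, majority_pronoun(job), round(overall_accuracy(job), 3))

# The guess is right for most people, but always wrong for the minority group.
print(error_rate_for("programmer", "she"))
print(error_rate_for("nurse", "he"))
```

Under these assumptions the rule is right for about 89% of programmers and about 91% of nurses overall, yet it is wrong every single time for a female programmer or a male nurse.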

Why do I think it’s a problem?

I am reading this book by Charles Wheelan (highly recommended). It’s a book about statistics.

In one chapter, he talks about statistical discrimination and rational discrimination.

He wrote, “Is it okay to discriminate if the data tell us that we’ll be right far more often than wrong?

It would be naive to think that gender, age, race, ethnicity, religion, and country of origin collectively tell us nothing about anything related to law enforcement. But what we can or should do with that kind of information is a philosophical and legal question, not a statistical one.

If we can build a model that identifies drug smugglers correctly 80 out of 100 times, what happens to the poor souls in the 20 percent, because our model is going to harass them over and over and over again.

For the elegance and prediction of probability, there is no substitute for thinking about the calculations we are doing and why we are doing that.
We can sometimes do the calculations correctly and end up blundering in a dangerous direction.”
Of course, we are only talking about translation here, not crime. It might not be as serious as the drug smuggler example. However, think about it: if AI in translation behaves this way, what about AI in other areas?

How to improve?

Seriously, I don’t know. I am not an AI or language expert. I thought of a few solutions, but none of them is good enough:
  • How about translating to “it”?
  • How about translating to “he/she”?
  • How about creating a new gender-neutral word such as “sheh”? And what about other languages?
  • Randomly return “he” or “she”?
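
The options above can be sketched as pronoun-selection strategies. This is only an illustration in Python. The function names, the `p_male` parameter, and the idea that a translator exposes such a probability are all assumptions of this sketch, not how any real translator works.

```python
import random


def pick_it() -> str:
    """Option 1: always use the neutral (but impersonal) 'it'."""
    return "it"


def pick_both() -> str:
    """Option 2: show both pronouns and let the reader decide."""
    return "he/she"


def pick_new_word() -> str:
    """Option 3: use an invented gender-neutral word such as 'sheh'."""
    return "sheh"


def pick_random(rng: random.Random, p_male: float = 0.5) -> str:
    """Option 4: return 'he' with probability p_male, otherwise 'she'."""
    return "he" if rng.random() < p_male else "she"


rng = random.Random(0)  # fixed seed so the demo is repeatable
samples = [pick_random(rng) for _ in range(1000)]
print(pick_it(), pick_both(), pick_new_word())
print("he:", samples.count("he"), "she:", samples.count("she"))
```

With `p_male=0.5` the random option is still wrong about half the time for any given person, so it spreads the mistakes evenly rather than removing them.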
Translation is hard. Translators have to take into account not only grammar but also context, subtext, implied meanings, cultural quirks, and a million other subjective factors, and then turn them into code.
I came across this article (published in 2013):


At that time, translating the English sentence “Men are men, and men should clean the kitchen” into German returned “Männer sind Männer, und Frauen sollten die Küche sauber”,

which means “Men are men and women should clean the kitchen.”

The translation is fixed now. That’s an improvement! As the translation process gets better, perhaps we will solve this “he” or “she” problem soon?


Side stories

I had a few discussions with my friends. They almost turned into a gender / language war.

Language

I shared the level 2 test result (context provided in the following sentence) with my friends. One of my friends replied with this message:

and sent me the level 3 Google Translate test result (thank you, I hadn’t thought of using a comma!), saying that I should use a comma instead of a full stop because:

  • AI gets confused by the context if it is split across two sentences.
  • It’s grammatically correct with a comma instead of a full stop.
My points are:
  • As a user, I do not care whether the AI gets confused by the context or not. What I want is the correct translation (and I didn’t get it when I used a full stop).
  • Using a full stop is definitely grammatically correct. It is probably better to use a comma in the previous example, but that doesn’t mean using a full stop is wrong (and it’s not!). To prove it further, let’s extend the sentences to:

I am pretty sure that this sentence is grammatically correct in Malay, but AI still got it wrong.
  • Microsoft Translator somehow still got the programmer part wrong even when I joined the sentences with a comma. I’ve given it a thumbs down.

Gender

After I shared that on Facebook, some people started to debate sexism. Here are a few interesting messages.

- Message 1

A friend sent me this with the caption “Sexist ai, man cannot be bra model?”

- Message 2

“If AI identify all males but actually 1 is a transgender. Does that make the AI sexist?”

- Message 3

“Most of the teacher are female, so translators are using ‘she’, make sense right?”

- Message 4

I was saying that I see it as a problem. My friend replied, “That’s the problem, you are seeing it as a problem!”

“Almost” means no wars started, heh. Starting a war isn’t something I’m interested in, and definitely not something I wanted to do.

Summary

What I would like to stress again is the rightness of translation and the consideration we put in when designing AI, systems, training models, whatever. Logically and statistically correct doesn’t mean it’s right. We need diversity. Especially in the age of .