In short, when translators (Google Translate, Microsoft Translator, etc.) translate a sentence from a gender-neutral language (e.g. Turkish) into a non-gender-neutral language (e.g. English), they make a guess on the gender. It is not really a random guess, but a fact-based, trained guess.
This behaviour caught my attention. So I did a few rounds of tests with the Malay and Chinese languages, because I am a Malaysian Chinese. 🇲🇾 Malay is a gender-neutral language. Chinese is mixed: it can be gender neutral, or not gender neutral.
Both Google and Microsoft return the same result.
Google translates both sentences correctly.
Microsoft translates the first sentence correctly, but it insists Jecelyn is a "he"…
In this case, the translators (both Google and Microsoft) return male, even when context is given in the same sentence. If I use "他", the result consistently comes back male, with or without context in the same sentence. Only when I switch to the female-specific "她" is the result correct:
I did the above tests with occupations as well: doctor, teacher, scientist, graphic designer. You can sorta guess the result.
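If you want to repeat these tests programmatically rather than through the web UI, here is a minimal sketch using Google's official google-cloud-translate Python client. The setup (credentials) and the Malay sentences are my assumptions for illustration, not the exact inputs from my tests.

```python
# A minimal sketch of the experiment above, using the official
# google-cloud-translate package (pip install google-cloud-translate).
# It assumes Google Cloud credentials are configured; the Malay
# sentences are illustrative, not my exact test inputs.
from google.cloud import translate_v2 as translate

client = translate.Client()

# "dia" is Malay's gender-neutral third-person pronoun.
sentences = [
    "Dia seorang doktor",        # ... is a doctor
    "Dia seorang jururawat",     # ... is a nurse
    "Dia seorang pengaturcara",  # ... is a programmer
]

for sentence in sentences:
    result = client.translate(
        sentence, source_language="ms", target_language="en"
    )
    print(f"{sentence!r} -> {result['translatedText']!r}")
```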
Definition of problem
So the question is: Is this a problem? Is Artificial Intelligence (AI) gender-biased? AI isn't gender-biased. It learns from data; it is trained and designed to return a result. It returned a logical result based on the model, and we humans designed that. I would say the translations are statistically correct, based on popular surveys:
The Stack Overflow Developer Survey shows that the vast majority of developer respondents are men.
Only 9.1% of registered nurses and 7.6% of licensed practical nurses are men.
We can say that the results make sense. At the end of the day, how often do you meet a female programmer? Even I don't meet many female programmers. You probably won't see a male nurse too frequently either. So, can we conclude that translators are smart? They know the probabilities say their guesses are very likely right!
NO. For me, it is a problem. It's an area that needs to be improved.
Statistically correct doesn't mean it's correct. I believe that if we don't see this as a problem, it will never be fixed, nor will there be any improvement.
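To make "statistically correct but still wrong" concrete, here is a toy calculation. The 92% share is a number I picked for illustration; the point is where the errors land, not the exact figure.

```python
# Toy illustration with an assumed number: if ~92% of programmers are
# men, a translator that always picks "he" for "programmer" scores well
# overall, yet is wrong every single time the programmer is a woman.
p_male = 0.92                 # assumed share of male programmers

overall_accuracy = p_male     # accuracy of always guessing "he"
accuracy_on_women = 0.0       # every female programmer is mistranslated

print(f"Overall accuracy of always guessing 'he': {overall_accuracy:.0%}")
print(f"Accuracy on female programmers: {accuracy_on_women:.0%}")
```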
In one chapter, he talks about statistical discrimination and rational discrimination.
He wrote, "Is it okay to discriminate if the data tell us that we'll be right far more often than wrong?
It would be naive to think that gender, age, race, ethnicity, religion, and country of origin collectively tell us nothing about anything related to law enforcement. But what we can or should do with that kind of information is a philosophical and legal question, not a statistical one.
If we can build a model that identifies drug smugglers correctly 80 out of 100 times, what happens to the poor souls in the 20 percent, because our model is going to harass them over and over and over again.
For all the elegance and precision of probability, there is no substitute for thinking about the calculations we are doing and why we are doing them.
We can sometimes do the calculations correctly and end up blundering in a dangerous direction."
Of course, we are just talking about translation now, not crime. It might not be as serious as the drug smuggler example. However, think about it: if AI in translation behaves this way, what about AI in other areas?
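The "poor souls in the 20 percent" hide some striking arithmetic. Here is a quick, entirely hypothetical back-of-the-envelope calculation; the smuggler rate and the error rates are assumptions I chose to show the base-rate effect, not figures from the book.

```python
# Hypothetical base-rate calculation. Assumptions: 1 in 10,000 travelers
# is a smuggler, the model catches 80% of smugglers, and it wrongly
# flags 20% of innocent travelers.
travelers = 1_000_000
smugglers = travelers // 10_000            # 100 smugglers
innocents = travelers - smugglers          # 999,900 innocents

true_positives = 0.80 * smugglers          # 80 smugglers caught
false_positives = 0.20 * innocents         # 199,980 innocents flagged

flagged = true_positives + false_positives
precision = true_positives / flagged
print(f"Travelers flagged: {flagged:,.0f}")
print(f"Flagged travelers who really are smugglers: {precision:.2%}")
# ~0.04%: a model can be "80% accurate" and still mostly harass innocents.
```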
Translation is hard. It's not only grammar that has to be taken into account: context, subtext, implied meanings, cultural quirks, and a million other subjective factors all have to be considered and then turned into code. I came across this article (published in 2013):
which means "Men are men and women should clean the kitchen."
The translation is fixed now. That's an improvement! As translation keeps getting better, maybe we will solve this "he"/"she" problem soon?
and sent me the level 3 Google Translate test result (thank you, I didn't think of using a comma!), and mentioned that I should use a comma instead of a full stop, because:
I am pretty sure that this sentence is grammatically correct in Malay, but the AI still got it wrong.
A friend sent me this with the caption "Sexist ai, man cannot be bra model?"
"If AI identify all males but actually 1 is a transgender. Does that make the AI sexist?"
"Most of the teachers are female, so translators are using 'she', make sense right?"
I was saying I see it as a problem. My friend replied, "That's the problem, you are seeing it as a problem!"
"Almost" means no wars were started, heh. Starting wars isn't something I'm interested in, and definitely not something I want to do.