He and she: What’s the real difference?

Thursday, July 17th, 2003

He and she: What’s the real difference? reports on a fairly simple computer program that analyzes text (looking at 50 different features) to determine whether the author is male or female:

This summer, a group of computer scientists (including Koppel, a professor at Israel’s Bar-Ilan University) are publishing two papers in which they describe the successful results of a gender-detection experiment. The scholars have developed a computer algorithm that can examine an anonymous text and determine, with accuracy rates of better than 80 percent, whether the author is male or female.

How does it work?

The odd thing is that the language differences the researchers discovered would seem, at first blush, to be rather benign. They pertain not to complex, “important” words, but to the seemingly quotidian parts of speech: the ifs, ands, and buts.

For example, Koppel’s group found that the single biggest difference is that women are far more likely than men to use personal pronouns: “I,” “you,” “she,” “myself,” “yourself,” and the like. Men, in contrast, are more likely to use determiners (“a,” “the,” “that,” and “these”) as well as cardinal numbers and quantifiers like “more” or “some.” As one of the papers published by Koppel’s group notes, men are also more likely to use “post-head noun modification with an of phrase” (phrases like “garden of roses”).
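
To make the idea of counting "quotidian" words concrete, here is a minimal sketch of how such features might be measured. It is not the researchers' algorithm: the word lists contain only the examples quoted above, the function name feature_rates is hypothetical, and the real system reportedly looks at about 50 features and then feeds them to a classifier, which this sketch does not attempt.

```python
import re

# Illustrative word lists built only from the examples quoted above.
# The researchers' actual feature set is not reproduced here.
PRONOUNS = {"i", "you", "she", "myself", "yourself"}
DETERMINERS = {"a", "the", "that", "these"}
QUANTIFIERS = {"more", "some"}


def feature_rates(text):
    """Return the share of tokens that fall in each word category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {
        "pronouns": sum(t in PRONOUNS for t in tokens) / total,
        "determiners": sum(t in DETERMINERS for t in tokens) / total,
        "quantifiers": sum(t in QUANTIFIERS for t in tokens) / total,
    }


sample = "I told you that she fixed the garden of roses myself"
rates = feature_rates(sample)
print(rates)
```

A plain word list cannot tell parts of speech apart: in the sample sentence, "that" is counted as a determiner even though it is not acting as one there. A serious attempt would need a part-of-speech tagger, and would also need some way to turn the rates into a male-or-female guess.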
