Hacker News

To use the authors' own framework, the epistemic failure of this paper is the mistake of identifying the failures of some individuals with a failure of an institution or a technology. The fact that harm can be created by some individuals means nothing more than that. There are pseudoscientists in ML, but to generalise that fact into an argument that there is something rotten at the heart of ML is just foolish. The authors also fail to understand that the idea of a causal theory as fundamental to science is extremely tenuous and has really only applied to physics for most of the history of science. The workings of plants, for example, were understood in almost complete ignorance of the mechanisms that caused them to behave in particular ways until relatively recently. This didn't impair the value of that observational and contingent knowledge. The comprehension of gravity as a field determining the inverse-square law was so brilliant and important that it has blinded us to the reality of so many other fields of knowledge, and to their legitimacy.


But the authors do point out intrinsic biases and failures of experimental design in most of the examples they mention:

* Inferring sexual orientation: Linking «self-reported sexual orientation labels» with «[...]scraped their data from social media profiles, claiming that training their classifiers on “self-taken, easily accessible digital facial images increases the ecological validity of our results.”[...]». Social media profile photos are by their very nature socially influenced, with open sexual orientation being an important cue to display.

* Personality psychology: Training and test datasets came from the same pool of «participants [who] self-reported personality characteristics by completing an online questionnaire and then uploaded several photographs». This heavily suggests that the participants knew, when choosing the photos, that this was a "personality type" experiment, and their personality traits may even have been made more salient by completing the questionnaire before uploading the photographs.

* “Abnormality” classification: General critique of lack of transparency as to how the true labels were determined.

* Lie detection: The ability to detect the facial differences between people following two different experimental instructions does not equate to lie detection.

* Criminality detection: At least they used official ID photographs instead of self-selection-biased photos like those in the first example... but consider this: what conclusions would the same model reach if it were fed official ID photos of US populations? The confounding factors of class and ethnicity are obvious.
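The personality-psychology flaw above (training and test photos drawn from the same participant pool) can be sketched in a few lines. This is a toy illustration with made-up participant IDs, not the study's data: when each participant contributes several photos and the split is done at the photo level, nearly every test photo has a same-person photo in the training set, so the model can succeed by recognising the person rather than the trait. Splitting at the participant level removes that leakage.

```python
# Toy demonstration of train/test leakage when splitting photos instead of
# participants. All IDs and counts here are hypothetical.
import random

random.seed(0)

# 20 participants, 5 photos each.
photos = [(pid, f"photo_{pid}_{i}") for pid in range(20) for i in range(5)]

# Naive photo-level split: shuffle all photos, take 80% for training.
shuffled = photos[:]
random.shuffle(shuffled)
cut = int(0.8 * len(shuffled))
train, test = shuffled[:cut], shuffled[cut:]

train_pids = {pid for pid, _ in train}
leaked = sum(1 for pid, _ in test if pid in train_pids)
print(f"photo-level split: {leaked}/{len(test)} test photos share a participant with training")

# Participant-level split: hold out whole participants instead.
held_out = set(random.sample(range(20), 4))
train2 = [p for p in photos if p[0] not in held_out]
test2 = [p for p in photos if p[0] in held_out]
leaked2 = sum(1 for pid, _ in test2 if pid in {q for q, _ in train2})
print(f"participant-level split: {leaked2}/{len(test2)} test photos share a participant with training")
```

In practice this grouping is what scikit-learn's `GroupShuffleSplit`/`GroupKFold` do; the point is that a high test accuracy under the photo-level split says nothing about generalisation to new people.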


These are examples and the experimental designs were a particular choice by the authors of those examples - they aren't intrinsic to ML or the ML community.


Hence the authors never claim to be talking about ML or the ML community. They are talking about "the harmful repercussions of ML-laundered junk science" and, in the section that I quoted in my comment above, they "review the details of several representative examples of physiognomic ML".


I didn't read the paper particularly thoroughly, but there is a real threat here. Governments love to "follow the science" when implementing authoritarian policies, and if there is a body of pseudoscientific literature, law enforcement will very likely use it as cover for the time-honoured tradition of eyeballing people and judging them based on how they look.

It is important to resist that dynamic at every level, so it is probably worth supporting the paper's authors in pointing it out. The risk of pseudoscience taking on a racial tinge and leaking out into the real world is always present.



