Erste Asset Management

Contra Big Data – When the algorithm knows too much


“If you have nothing to hide, you have nothing to be afraid of” is a popular argument whenever the private sphere is curtailed, whether by public surveillance, social media, or Big Data. What is at the core of this argument? After all, users of social media platforms decide for themselves which private information to release for public consumption. And why should I keep the fact that I like Harley Davidson and the Wu-Tang Clan a secret?

The scientist Michal Kosinski investigated this question at the University of Cambridge in 2012, i.e. whether so-called likes on Facebook say more about a person than the preferences they express directly.[1] To this end, he and his colleagues built a Facebook app that would create a personality profile for free if the user agreed to fill in a questionnaire and share their Facebook profile. Kosinski then looked for statistical relationships between the likes and the personality profiles thus compiled. The results were startling: on the basis of likes alone, it was possible to predict with 88% accuracy whether a man was homosexual or heterosexual. Political views could be modelled with 85% accuracy. On the basis of the interests mentioned earlier, for example, the user could be expected to be heterosexual (listens to Wu-Tang Clan) and of below-average intelligence (likes Harley Davidson).
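To make the idea of "statistical relationships between likes and traits" more tangible, here is a minimal sketch in Python. It is not the pipeline used in the study: the data are synthetic, the page names, like probabilities, and the hidden yes/no trait are all hypothetical, and a plain logistic regression stands in for whatever modelling the researchers actually applied. It only shows how a model can learn to guess a trait from nothing but a table of likes.

```python
import math
import random

# Hypothetical pages; the first three are made to correlate with the trait.
PAGES = ["page_a", "page_b", "page_c", "page_d", "page_e", "page_f"]
# Assumed probability of liking each page if the trait is present / absent.
P_LIKE_IF_TRAIT = [0.8, 0.8, 0.2, 0.5, 0.5, 0.5]
P_LIKE_IF_NOT = [0.2, 0.2, 0.8, 0.5, 0.5, 0.5]


def make_users(n, rng):
    """Generate synthetic users: a 0/1 like vector and a hidden 0/1 trait."""
    rows, labels = [], []
    for _ in range(n):
        trait = 1 if rng.random() < 0.5 else 0
        probs = P_LIKE_IF_TRAIT if trait else P_LIKE_IF_NOT
        rows.append([1 if rng.random() < p else 0 for p in probs])
        labels.append(trait)
    return rows, labels


def train_logistic(rows, labels, epochs=200, lr=0.5):
    """Fit logistic regression weights with batch gradient descent."""
    n, k = len(rows), len(rows[0])
    weights, bias = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for x, y in zip(rows, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted prob minus label
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the model considers the trait more likely than not."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if z > 0 else 0


rng = random.Random(0)  # fixed seed so the run is repeatable
train_rows, train_labels = make_users(400, rng)
test_rows, test_labels = make_users(200, rng)

weights, bias = train_logistic(train_rows, train_labels)
hits = sum(predict(weights, bias, x) == y for x, y in zip(test_rows, test_labels))
accuracy = hits / len(test_rows)

print("accuracy on unseen users:", accuracy)
for page, w in zip(PAGES, weights):
    print(page, round(w, 2))
```

The learned weights show which likes push the prediction in which direction. None of the users in this toy data set ever state the trait themselves; the model infers it from likes that look harmless when considered individually.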

In his follow-up research project in 2014, Kosinski compared the accuracy of a personality analysis based on Facebook likes with assessments made by people close to the subject.[2] The results were frightening: from 10 likes upwards, the computer can assess a person better than their work colleagues; from 70 likes upwards, better than their friends; from 150 likes upwards, better than their family members; and from 300 likes upwards, the computer beats even the subject’s own spouse. And that takes into account only a fraction of the digital footprint we leave every day! If the algorithm were also fed search queries, browser history, and online purchases, even more precise statements could be made.

Knowing that somebody owns a dog and sending that person advertisements for dog food does not seem shady. But knowing that someone is pregnant and sending her advertising for baby bottles and maternity clothes may constitute an unwanted breach of privacy with undesirable consequences for the person in question. Maybe the mother-to-be wanted to keep her pregnancy a secret for a little while longer? Manipulating elections by playing on fears, which social media allows to be targeted at potential voters according to their personality structure, is unacceptable. When the perfect delivery of a tailored message matters more than facts and political programmes, the very foundation of our democracy is under attack. In this context it seems almost grotesque that the most important former client of Cambridge Analytica, the President of the USA, uses the expression “fake news”.

Edward Snowden, who leaked information on how our digital footprint is analysed by governmental surveillance software, rebutted the aforementioned argument as follows: “Arguing that you don’t care about the right to privacy because you have nothing to hide is no different than saying you don’t care about free speech because you have nothing to say.”


[1] Kosinski et al. (2013): “Private traits and attributes are predictable from digital records of human behavior”

[2] Youyou, Kosinski and Stillwell (2015): “Computer-based personality judgments are more accurate than those made by humans”


Legal note:
Forecasts are not a reliable indicator of future performance.



Legal disclaimer

This document is an advertisement. Unless indicated otherwise, source: Erste Asset Management GmbH. Our languages of communication are German and English.

The prospectus for UCITS (including any amendments) is published in accordance with the provisions of the InvFG 2011 in the currently amended version. Information for Investors pursuant to § 21 AIFMG is prepared for the alternative investment funds (AIF) administered by Erste Asset Management GmbH pursuant to the provisions of the AIFMG in connection with the InvFG 2011. The fund prospectus, Information for Investors pursuant to § 21 AIFMG, and the Key Information Document can be viewed in their latest versions on the website in the mandatory publications section or obtained in their latest versions free of charge from the domicile of the management company and the domicile of the custodian bank. The exact date of the most recent publication of the fund prospectus, the languages in which the Key Information Document is available, and any additional locations where the documents can be obtained can be viewed on the website. A summary of investor rights is available in German and English on the website as well as at the domicile of the management company.

The management company can decide to revoke the arrangements it has made for the distribution of unit certificates abroad, taking into account the regulatory requirements.

Detailed information on the risks potentially associated with the investment can be found in the fund prospectus or Information for Investors pursuant to § 21 AIFMG of the respective fund. If the fund currency is a currency other than the investor's home currency, changes in the corresponding exchange rate may have a positive or negative impact on the value of the investment and on the amount of the costs incurred in the fund, converted into the investor's home currency.

Our analyses and conclusions are general in nature and do not take into account the individual needs of our investors in terms of earnings, taxation, and risk appetite. Past performance is not a reliable indicator of the future performance of a fund.