When we put together the program for our workshop “Big Data, Privacy and Surveillance in China: Regulations, Actors, and Debates”, we found it quite difficult to identify scholars who work on the matter from a critical perspective. Some we knew from conferences and through various networks. We then did the usual thing – Internet research, article review, and snowballing. A few more names came up, and in the end we managed to bring together a really interesting crowd. The experience, however, made me curious about what’s going on in big data research.
The charts below give a first impression of where we stand at the moment. The global research community has published about 4,250 articles on the topic “big data” in SCI-listed journals. “Big data” as a phenomenon and potentially new technological paradigm is younger than you might think. Essentially, research took off only in 2013, and thus knowledge accumulation at academic institutions is at a very early stage.
The distribution of research categories is in line with this observation. Around 80% of articles relate to basic research in computer science, engineering, and mathematics. Another 15% deal with applied big data in the natural sciences, such as nanotechnology and biomedicine. Only 5% of current research output emerges from other disciplines. Within this last batch, economics and management form the largest category, with about 4% of all articles. The last 1% of research output originates from the social sciences; the lion’s share is applied methods, followed by ethical reflections and applied ethical issues in the medical sector. These figures shed some light on our big data workshop: We are out early!
For obvious reasons, our research group is mainly interested in critical perspectives on big data in China. Again, the figures in the charts below suggest that this is a sensible choice. China is the second-largest publisher of big data research, outrun only by the US. Ten percent of all articles on big data originate from just five academic institutions, two of them in China and three in the US. Another interesting feature is the funding situation. In the US about 20% of articles mention financial support from government funds; in China it is more than 90%. In essence, big data research reflects an increasing political polarisation in which China and the US are eager to seize opportunities and gain a first-mover advantage. The US comfortably relies on its prestigious universities and the market to raise the funds needed. China’s academic elite is much smaller but steadily growing, and its main driver is government support. And where is Europe? Well, as usual: third place, divided, with market institutions and government organisations incapable of forging ahead. But make no mistake: in a world where macho-style “make it happen” policies trump or become the core, European apprehension becomes a virtue that we should cherish.
But before we start to postulate ethical principles that none of the successful technology innovators is going to abide by, it is useful to reflect critically on our own capacity to contribute to what will hopefully become a new field of critical big data studies. If our big data workshop is representative of where we stand, then we need to tackle at least three paradoxes:
- Firstly, most of us have no experience in using big data technologies or the underlying mathematical and statistical principles. Of course, you don’t have to be a craftsman in order to distinguish a good job from inferior performance. Yet a stronger representation of technological and statistical knowledge will be needed in order to reach those whom we would like to convince. Diversity, opposition, and friction are inevitable if we are not satisfied with drafting a manifesto of the converted.
- Secondly, most researchers look at laws and regulations, legal loopholes, and implementation failures. In China, however, the law isn’t exactly a good indicator of the constraints that keep big data research and corporate capabilities within an ethical frame. And it won’t get us very far in the US either, where commercial actors will lobby strongly for self-regulation. Thus we have to gain more insight into the decision-making processes, corporate research, and organisational cultures that shape big data products and services. This point is intimately connected to the last issue …
- Thirdly, the success of critical big data studies relies on advancing our understanding of how data is collected, stored, distributed, shared, sold, reconfigured, matched, and analysed. Most of these processes are surrounded by secrecy and concentrated in a few centres of power. Many crucial actors have neither an obligation to talk to us nor an interest in doing so. Thus we need to find innovative approaches to collect evidence and gradually push for transparency.
I think our big data workshop was a great beginning. It made us aware of the early stage we are in, the need for getting organised, and the benefit that our research can reap. Of course, there are many hurdles on the way towards having an impact – but that shouldn’t prevent us from making an effort.