The Oval Observer Foundation recently conducted a Dialogue on the quantification of governance in India on the 4th of February, which touched upon the need to improve data collection and big data analytics to enhance public service delivery in India. The importance and benefits of big data analysis are widely acknowledged – transparency and accountability, participatory governance, faster decision making, predictive analysis, efficient delivery of services, better targeting of social programs, increased productivity and reduced administrative costs. Whether it is by eliminating leakages in welfare programs, improving policing by tracking crime patterns, detecting fraud in financial transactions or simply holding the government to account by identifying policy gaps, data analytics has revolutionary potential. As a result, in recent years, there has been a significant push for open and public access to data all over the world, including in India.
India was a leader in this regard and became one of the first countries to embrace open data as a concept when it launched its open data initiative, Data Portal India, in 2013 to facilitate increased public access to datasets provided by various government ministries and departments at both the State and Central levels. The Portal functions as a single access point for administrative data and currently hosts over 7000 different datasets from all over the country. In addition, India has one of the most advanced RTI laws in the world. The ‘Open Data Barometer Report’ prepared by the World Wide Web Foundation and the Open Data Institute ranks India 34th among 77 countries, which indicates considerable room for improvement. The challenge now lies in improving demand for data from civil society, business and academia; increasing pressure on government to provide more reliable and regular data; and devising mechanisms to utilise the data legally and productively.
The Oval Observer Foundation, in collaboration with the University of Southampton, is delighted to convene a dialogue that will bring together relevant stakeholders from government, academia and the private sector to explore the opportunities and challenges involved in furthering big data analysis to improve public sector governance in India.
Some of the key topics that this dialogue will focus on include:
- Identifying sectors and thematic areas characterised by a lack of data collection/access.
- Methods to improve data collection and provide access to real time data.
- Challenges and Problems – lack of digital infrastructure, privacy concerns, overlap between government departments, inaccuracy of data and impact on marginalised communities.
Wendy Hall, Professor, University of Southampton
Dame Wendy Hall is Professor of Computer Science at the University of Southampton, UK and Executive Director of the Web Science Institute. One of the first computer scientists to undertake serious research in multimedia and hypermedia, she has been at its forefront ever since. The influence of her work has been significant in many areas including digital libraries, the development of the Semantic Web, and the emerging research discipline of Web Science. She co-founded the Web Science Research Initiative in 2006 and she is currently a Director of the Web Science Trust which has a global mission to support the development of research, education and thought leadership in Web Science. She has also held several senior positions in various science and technology commissions at government level.
Q&A with Professor Wendy Hall
1) What do you understand by the term Big Data? What is the relevance of Big Data in public policy and governance?
Big data is essentially a popular term used to describe extremely large and unwieldy datasets that are too complex for traditional data-processing tools to handle. It is a very relative term – today’s big data is tomorrow’s small data. As the processing power of computers grows, big data will not be ‘big’ anymore.
We live in an age of constant data generation. If such data can be analysed and interpreted using the proper technologies, it can change the way we do things. Climate science is a really good example of the effective use of big data – detailed data on thousands of variables from all over the world are fed into complex computer models that then help us understand the evolution of the climate and predict future patterns. The UK Government has highlighted big data as one of the ‘eight great technologies’ of the future on which investment will be focused. Most recently, the British government announced the launch of the Alan Turing Institute, which will bring together expertise from across the country for research into big data. However, researchers often have difficulty aligning their research interests with those mandated by the government – so the government needs to allow greater freedom and independence so that research goals are not compromised and can be fully realised.
2) You are one of the founders of ‘Web Science’. Could you explain exactly what Web Science refers to and talk a little about the Web Observatory that you launched in Bangalore recently?
The Web is a modern marvel, unparalleled in its sheer impact and rate of adoption. Web Science is essentially a new discipline that aims to understand the Web, how it evolves, and how it impacts human society. It is essential to do this in order to leverage the power of the Web and derive greater social benefit from it. The Web Science Research Initiative is a multi-disciplinary project which aims to study the social and technological implications of growing Web adoption. It brings together social scientists and computer scientists to understand and interpret data based on its nature, content, and context.
Part of this mission is a global network of ‘web observatories.’ Just as astronomical observatories stare at the sky, these web observatories form a system that gathers existing datasets on the web and creates new datasets to answer questions about the Web and its users. Unlike astronomical observations, however, the challenge is that the information changes by the mere act of observing it! The initiative is big data and open data rolled into one, providing distributed data analytics that views the globe as a single interconnected digital planet. The International Institute of Information Technology in Bangalore became the 15th university to create a Web Observatory and join a growing network of seamless data sharing and analysis.
3) What do you understand by the Internet of Things (IoT)? What are the pros and cons?
The Internet of Things is essentially the idea that all objects – mobiles, cars, kitchen appliances, household objects and so on – will be seamlessly connected to each other through the internet. It sounds like a wonderful idea, but it is actually completely the opposite of what Tim Berners-Lee envisioned when he created the Web on a single open protocol. The Web is a free commons to which everyone has equal and unrestricted access: anyone can start up a browser and access a webpage from anywhere. The Internet of Things (IoT), on the other hand, is probably going to result in a great degree of segmentation with multiple standalone devices. It is going to become like the mobile devices in our hands – each supplied by a different vendor, with a completely different set of protocols that restricts what we can do and what we can access. You could have five different light bulbs in a room, for example, none of which can communicate with the others because each runs on a different protocol – creating an unnecessarily complex experience. So, in that sense, proprietary standards and interoperability are going to be the key determinants of how successful the IoT ends up being.
4) What are the dangers of extensive data collection? Who should control the massive amounts of data being generated?
One can already see the dangers of extensive data collection – we have governments snooping on each other and on citizens, privacy rights being invaded and so on. The threat of terrorism is often used by governments as an excuse to justify more intrusive data collection on citizens. There is no doubt that some of it is legitimate, but at what cost? In the UK, citizen health data is already being sold to insurance companies by the government, and that is public knowledge. What are they going to do with all the additional data they are collecting now? It is thus a major concern that governments may misuse all this data.
There is no doubt that government cannot be allowed to control all data, as that would result in an Orwellian 1984-style dystopia. Government should only regulate the industry and allow open access to data for academics, commercial enterprises and civil society. And it is not just governments – private players having exclusive access to vital data can be equally dangerous. Already, there is a lack of data sharing within and between government, civil society and the private sector. Google, for example, holds enormous amounts of critical data – but no one else has access to it. It belongs to one player only.
5) What is the importance of social media to the big data industry?
I’m happy to be proved wrong on this, but I would contend that social media is one of the biggest sources of big data out there. Analysing social media data could have huge implications for public policy and governance. For example, many election campaigns are increasingly being fought on social media nowadays. The Obama campaign is a perfect example of that. Closer to home, the AAP victory in Delhi also stands testament to the effective use of social media. What this essentially means is that we are getting closer to a reality where control over social media gives you control over an election. Big data analytics will play a huge role in determining who has the resources to craft the best social media strategy, and therefore who wins an election. This has profound implications for democracy.
6) Privacy is always one of the foremost concerns when talking about big data. What do you see as the future of data privacy?
Nowadays, we sign huge 100-page contracts with governments, private players and others to control data access. Oftentimes, we don’t even understand what we are signing or who gets access to which data, since the entire process is so complex. Looking to the future, I believe we are moving towards a scenario where data access will be much easier to control because everyone will have a chip embedded in their body, allowing each person to decide individually who has access to their private data. That would solve many of the problems related to data privacy, though I am sure many people would not be comfortable with it!
Another aspect of data privacy is the EU’s ‘Right to be Forgotten’ ruling. It is an extremely interesting concept and it remains to be seen how it plays out internationally.