There is a huge amount of data in the public domain in India. Each government department has piles of data that it has been recording over the decades. Much of it is not digitized and is considered useless. There has been an increase in the quantity and quality of data collected over the years. Unfortunately, the majority of the population is not aware of the treasure that this data represents or how they can make use of it.
“It’s amazing how much data is out there. The question is how do we put it in a form that’s usable?” — Bill Ford
The problem lies in understanding that each data point can contribute to solving an important public problem. Leveraging the true potential of data can empower the government to run all of its schemes and services efficiently. In spite of having millions of data points collected from every corner of the country, the government is unable to make use of their potential.
The answers to most of these problems can be found in data. Tracking data and acting on its insights can detect problems at an early stage, and the data itself can point to solutions. Today in India, a number of good policies are drafted by the government, but they are often not implemented on the ground. The citizen is helpless, as he/she does not have the information to bring this to the government's attention.
For the majority of the population, data is difficult to understand, so it becomes something that is completely useless to them. In reality, it is all about sourcing specific information to start a conversation and influence the government to make the right decisions. Clean data is scarce in the first place, and so are tools to present it in a simple format. Citizen involvement will only become successful when the understanding of data is simplified.
Consider political integrity in India's electoral process. The public needs to know exactly whom to trust and vote for. Constant mudslinging and accusations fly during elections, killing the scope for a healthy debate. With access to the right political data, the public can make decisions purely on the basis of past performance, thereby improving the quality of candidates and increasing political accountability.
This experiment was the idea of Gokul who was attending the Devthon event for the first time. He was a Drupal Architect by profession and a keen observer of web trends/social media.
Gokul started with this analysis on Facebook. It says that if BJP and KJP had stayed together, their combined votes would have been greater than those of the Congress in 93 constituencies. What the analyst omits is whether BJP+KJP would actually have won all of these 93 seats.
Though the analyst doesn’t claim that BJP would have won all 93 seats, the framing leads the reader to believe so. So, out of curiosity, I wanted to see in how many constituencies the combined BJP+KJP votes would have been greater than the winning candidate's votes. We tried to find the data behind the Karnataka election results.
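The difference between the two questions can be sketched in a few lines of Python. This is only an illustration: the constituency names, party labels and vote counts below are made up, and the function names are hypothetical, not part of the original experiment.

```python
from typing import Dict, List, Sequence

# Made-up sample data: votes per party in each constituency.
# These are NOT real election results.
SAMPLE_RESULTS: Dict[str, Dict[str, int]] = {
    "Constituency A": {"INC": 50000, "BJP": 30000, "KJP": 25000, "JD(S)": 10000},
    "Constituency B": {"INC": 40000, "BJP": 25000, "KJP": 20000, "JD(S)": 60000},
    "Constituency C": {"INC": 55000, "BJP": 20000, "KJP": 15000, "JD(S)": 5000},
}


def combined_votes(votes: Dict[str, int], allies: Sequence[str]) -> int:
    """Sum the votes of the allied parties in one constituency."""
    return sum(votes.get(party, 0) for party in allies)


def seats_beating_party(
    results: Dict[str, Dict[str, int]], allies: Sequence[str], rival: str
) -> List[str]:
    """Seats where the allies' combined votes exceed one rival party's votes."""
    return [
        name
        for name, votes in results.items()
        if combined_votes(votes, allies) > votes.get(rival, 0)
    ]


def seats_beating_winner(
    results: Dict[str, Dict[str, int]], allies: Sequence[str]
) -> List[str]:
    """Seats where the allies' combined votes exceed the winning vote count."""
    return [
        name
        for name, votes in results.items()
        if combined_votes(votes, allies) > max(votes.values())
    ]


allies = ("BJP", "KJP")
beats_congress = seats_beating_party(SAMPLE_RESULTS, allies, "INC")
beats_winner = seats_beating_winner(SAMPLE_RESULTS, allies)
print(len(beats_congress), "seats ahead of the Congress")
print(len(beats_winner), "seats ahead of the actual winner")
```

In the made-up Constituency B, the combined vote beats the Congress but not the third party that actually won, which is exactly the gap the Facebook analysis leaves open. Note that this literal comparison also counts seats one of the allies already won; whether to exclude those is a choice the analysis would have to make.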
We were unable to find any open data in a consumable format, so we used the Election Commission website. It only allowed us to check the results by constituency and not by any other parameter. We looked at the URL naming patterns, and it turned out that there was a correlation between the results URL and the constituency code.
We took a list of all the constituency codes from the drop-down on the page, then used some Notepad++ macros to clean up the HTML mess and turn it into a PHP array. Once we had the array of constituency codes, we wrote a simple script to scrape the 223 URLs and get the relevant data.
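The steps above can be sketched roughly as follows. The original script was written in PHP and is not shown in this post, so this is a Python stand-in under stated assumptions: the URL pattern, the drop-down markup and all function names here are hypothetical, and the real Election Commission pages may look different.

```python
from html.parser import HTMLParser
from typing import Dict, List
from urllib.request import urlopen

# Hypothetical URL pattern; the real results URLs are not given in the post.
URL_TEMPLATE = "https://example.org/results/constituency-{code}.htm"

# Hypothetical drop-down markup standing in for the real page.
SAMPLE_DROPDOWN = """
<select id="constituency">
  <option value="">Select constituency</option>
  <option value="1">Constituency One</option>
  <option value="2">Constituency Two</option>
</select>
"""


class OptionValueParser(HTMLParser):
    """Collects the non-empty value attribute of every <option> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.values: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            value = dict(attrs).get("value")
            if value:  # skip the empty placeholder option
                self.values.append(value)


def extract_codes(dropdown_html: str) -> List[str]:
    """Pull the constituency codes out of the drop-down HTML."""
    parser = OptionValueParser()
    parser.feed(dropdown_html)
    return parser.values


def build_urls(codes: List[str]) -> List[str]:
    """Turn each constituency code into a results URL."""
    return [URL_TEMPLATE.format(code=code) for code in codes]


def fetch_all(urls: List[str]) -> Dict[str, str]:
    """Download every results page; parsing the pages is left out here."""
    pages = {}
    for url in urls:
        with urlopen(url) as response:
            pages[url] = response.read().decode("utf-8", errors="replace")
    return pages


codes = extract_codes(SAMPLE_DROPDOWN)
urls = build_urls(codes)
print(urls)
```

`fetch_all` is defined but not called, since the URLs here are placeholders. Parsing the drop-down directly replaces the manual Notepad++ clean-up step, which keeps the list of codes reproducible if the page changes.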
You can find the visualizations here.
With the right tools and visualization, data can be utilized to its maximum potential. Data is always there to tell a story. It is up to citizens to come up with different perspectives and insights to find the right correlations. It is all about creating value with data. Visualization tools need to improve 10x before they can be used by the common man.
High hopes are being placed on data to revolutionize the way citizens can influence and push for important legislation for the benefit of the community. The truth is that there is still plenty of inefficiency all around us. The opportunities to identify areas for improving productivity are almost endless. Now citizens can raise their voice with concrete data to back it up.
Democratization of data is an important step in leveraging the complete positive potential of data. Data has always been controlled by very few people, in the government and elsewhere. When there is equal access to data, there is a high chance of rational discussion that leads to positive steps towards development. For this, the government needs to collect and release data consistently for the use of citizens.
There is immense scope for improvement in the visualization and consumption of data. Citizens will start searching for data on topics like public transport, healthcare and education. They will need frameworks to record their findings and share them with the public. If there is confusion in the dissemination of data, there are chances of misuse. The real revolution will start when citizens begin using this data to influence national policy and solve international problems.
This experiment gave us a basic proof of concept of how this data can be made publicly accessible through APIs. The data can now be used to create analysis reports and visualizations of the results for the better understanding of citizens. The government can play a pivotal role in making this possible by releasing the next election results as open data.
To share your ideas and participate, check our Initiatives page.
Originally Published on April 05, 2016