This week, I finally tracked down all the data sets and sources we will be using for the Freedom House project and started the data cleaning process. Data cleaning and pre-processing are a crucial part of analytics: by a commonly cited estimate, data scientists spend about 80% of their time on them. It looks like I got a good taste of that. Here is what the data cleaning involves:
- Merge data before 2003 with the data after 2003 (this is messy since they are in different formats and aggregation levels)
- Merge data related to freedom in disputed territories (e.g. West Bank and Gaza)
- Find and merge population data going back to 1973 and data about electoral democracies
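To give a feel for the first step, here is a minimal sketch of bringing two periods to the same aggregation level before combining them. Everything in it is an assumption for illustration: the field names (`country`, `year`, `subcategory`, `score`), the made-up rows, and the idea that the newer data is split by subcategory and can simply be summed. The real files may need a different rule.

```python
from collections import defaultdict

# Hypothetical pre-2003 rows: one aggregate score per country and year.
old_rows = [
    {"country": "Exampleland", "year": 2001, "score": 40},
    {"country": "Exampleland", "year": 2002, "score": 42},
]

# Hypothetical post-2003 rows: one row per subcategory (finer aggregation level).
new_rows = [
    {"country": "Exampleland", "year": 2003, "subcategory": "A", "score": 20},
    {"country": "Exampleland", "year": 2003, "subcategory": "B", "score": 23},
]


def aggregate_to_country_year(rows):
    """Collapse subcategory rows into one row per (country, year).

    Summing is an assumed aggregation rule, used here only for illustration.
    """
    totals = defaultdict(int)
    for row in rows:
        totals[(row["country"], row["year"])] += row["score"]
    return [
        {"country": country, "year": year, "score": score}
        for (country, year), score in totals.items()
    ]


def merge_periods(old, new):
    """Bring the newer rows to the older format, then stack and sort both."""
    combined = old + aggregate_to_country_year(new)
    return sorted(combined, key=lambda r: (r["country"], r["year"]))


merged = merge_periods(old_rows, new_rows)
for row in merged:
    print(row)
```

The point of the sketch is the order of operations: first put both periods on the same level (one row per country and year), and only then combine them, so the merged series is comparable across the 2003 boundary.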
As for the MSI blog, we didn’t get any updates from the editor, which was, frankly speaking, very unprofessional. We have decided to reach out to BNE Intellinews, another news site in Europe, to see if they are interested in publishing my blog. Let’s see what happens!