The second-to-last week of the internship has been a whirlwind. The biggest thing we worked on this week was finalizing and presenting our intern group projects. As I mentioned in a previous post, we were all assigned to groups on the very first day of the program. Throughout the summer, we developed new ideas to pitch to various CNN executives, and the presentation took place on Wednesday.
Our presentation went as well as it could have, and that is all I can ask for. It was also fun to hear the other groups' pitches and see where their ideas overlapped with ours and where they differed.
More than anything, though, I have been working on the Impossible Task. As an update, I finally figured out how to parse the information I wanted from the generated XML file and write the results into a spreadsheet. What I am hung up on now is how to delete the rows whose content tells us the item should not be archived. Let me explain:
A “slug” is a name given to a media file. It helps us determine what the file is about. However, if the slug is “kill kill kill,” that means we no longer need that file. So, I am now trying to find these slugs in the spreadsheet and delete the corresponding rows. What remains is a list of items that we are very, very confident will need to be archived, which should make the lives of the selection archivists easier.
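For anyone curious what that could look like in code, here is a minimal sketch in Python using only the standard library. It is not my actual script: the XML layout (`<item>` elements with `<id>` and `<slug>` children), the sample slugs, and the helper names `parse_rows`, `is_killed`, `keep_archivable`, and `to_csv` are all made up for illustration. It also assumes that a killed slug should match regardless of capitalization or stray spaces, which may not hold for the real data.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical XML layout and sample data: the real export's tag names
# and contents are not shown in this post, so these are assumptions.
SAMPLE_XML = """<items>
  <item><id>101</id><slug>election night rally</slug></item>
  <item><id>102</id><slug>KILL KILL KILL</slug></item>
  <item><id>103</id><slug>hurricane landfall aerials</slug></item>
  <item><id>104</id><slug> kill kill kill </slug></item>
</items>"""

KILL_SLUG = "kill kill kill"


def parse_rows(xml_text):
    """Turn each <item> element into a dict with 'id' and 'slug' keys."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": item.findtext("id", default=""),
            "slug": item.findtext("slug", default=""),
        }
        for item in root.iter("item")
    ]


def is_killed(slug):
    """True when the slug equals the kill marker.

    Assumption: case and extra whitespace should be ignored.
    """
    return " ".join(slug.lower().split()) == KILL_SLUG


def keep_archivable(rows):
    """Build a new list without killed rows rather than deleting in place."""
    return [row for row in rows if not is_killed(row["slug"])]


def to_csv(rows):
    """Write the rows as CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "slug"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = parse_rows(SAMPLE_XML)
kept = keep_archivable(rows)
print(to_csv(kept))
```

Building a new list of rows to keep, instead of deleting rows one by one while looping over them, sidesteps the problem of rows shifting position mid-loop and being skipped.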
That is part of the reason why we have been referring to it as the Impossible Task. We are not sure if the code will work, or if it will end up helping the archivists at all. But I said "challenge accepted," and I want to make as much headway on it as possible before I leave next week. The next item on the list: what is a data frame, and can I do anything with it in relation to the Impossible Task?