So I started my thesis ‘officially’ yesterday, after we were all told, following the exam board meeting, that we could proceed to the masters. No actual results have been posted yet, but hopefully I did well!
Unofficially, I’ve been working on my thesis for about 2 weeks, just making the skeleton application. I’m making a program called Webscavator (an amalgamation of website and excavator) and it will be a web application that visualizes web browser history. It accepts CSV files from a couple of common web history analyzer programs (so far Web Historian, Pasco and Net Analysis) and will eventually visualize them with timelines, graphs, pictures etc.
I’ve made it very easy to add other web history programs: you just add a class for that program and implement a processRow() method, which takes in a row of the CSV file and returns the normalized output Webscavator expects. It was much easier to hard-code a converter for each web history program like this than to come up with a generic CSV parser, because each program produces very different files and a generic parser would get very messy. This also has the advantage that when I open source Webscavator after my thesis is done, other developers can quickly add converters for other web history programs, and even different kinds of parsers (i.e. for formats other than CSV).
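To give a rough idea of what a converter looks like, here is a minimal sketch. It is not the actual Webscavator code: the base class, the made-up "Example Tool" with its three columns, and the dictionary used as the normalized output are all assumptions for illustration. Only the idea of one class per program with a processRow() method comes from the design described above.

```python
import csv
import io
from datetime import datetime


class Converter:
    """Hypothetical base class: one subclass per web history program."""
    name = "generic"

    def processRow(self, row):
        """Turn one CSV row (a list of strings) into a normalized entry."""
        raise NotImplementedError

    def convert(self, csv_file):
        """Read a CSV file object and normalize every non-empty data row."""
        reader = csv.reader(csv_file)
        next(reader, None)  # skip the header row (assumed to be present)
        return [self.processRow(row) for row in reader if row]


class ExampleToolConverter(Converter):
    """Converter for a made-up tool with columns: URL, Title, Last Visited."""
    name = "Example Tool"

    def processRow(self, row):
        url, title, visited = row[0], row[1], row[2]
        # The normalized output is assumed here to be a plain dictionary.
        return {
            "url": url.strip(),
            "title": title.strip(),
            "visited": datetime.strptime(visited.strip(), "%Y-%m-%d %H:%M:%S"),
            "source": self.name,
        }


# Usage: feed the converter a small in-memory CSV file.
sample = io.StringIO(
    "URL,Title,Last Visited\n"
    "http://example.com/,Example,2010-06-01 09:30:00\n"
)
entries = ExampleToolConverter().convert(sample)
```

Adding support for another program would then mean writing one more subclass with its own processRow(), without touching the rest of the application.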
If you want a sneaky peek at Webscavator then email me and I’ll give you the username and password to view the current online beta version. If anyone is feeling creative, I’d love a logo for it! I’m thinking along the lines of Indiana Jones meets the Internet, and it’s also written in Python, so maybe an adventurer/snake theme?
Over the next week or so I will be investigating the inner workings of browser history files, i.e. index.dat files for Internet Explorer, places.sqlite for Firefox, global_history.dat for Opera, history for Chrome and history.plist for Safari. There are some very detailed papers on IE, Firefox and Safari, but no one has written an academic paper on the forensics of Chrome and Opera history. I so desperately want to get published, so is this an opportunity?? Expect a blog post about each one soon!