Algorithmic Criticism

Today I want to present a paper that made me think about Digital Humanities: “Algorithmic Criticism” by Stephen Ramsay.

Unlike most other papers, which focus only on new algorithms and new data, this one also asks how the two halves of the digital humanities can be combined. Ramsay wants to develop a form of criticism (normally a method of the humanities) that is based on algorithms.

He argues that even literary research could use approaches that are far more empirical, meaning you run an experiment and use quantitative measurements to support your claims. Another important point he makes is that computers might not yet be ready for that kind of analysis (the paper is from 2005, though), but he believes these methods will become available in the future.

One of his central points is that every critic reads a text with their own assumptions and “sees an aspect” (Wittgenstein) in it. The feminist reader sees a feminist aspect of the text, and in the same way the “algorithmic” reader sees the aspect the computer brings out, reading the text as transformed by a computer. At the end, the paper presents some research that applies tf-idf measures to the novel The Waves by Virginia Woolf.
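To make the tf-idf idea a bit more concrete, here is a minimal sketch in plain Python. It uses one common textbook variant (raw term frequency times log(N / df)), which is not necessarily the exact formula used in the paper, and the three toy "speakers" and their sentences are invented for illustration, not quoted from the novel.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return {doc_name: {term: score}} with score = tf * log(N / df).

    This is one common tf-idf variant, chosen here as an assumption.
    """
    # Term counts per document (naive whitespace tokenisation).
    counts = {name: Counter(text.lower().split()) for name, text in docs.items()}
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for c in counts.values():
        df.update(c.keys())
    return {
        name: {term: tf * math.log(n_docs / df[term]) for term, tf in c.items()}
        for name, c in counts.items()
    }

# Hypothetical toy data: invented sentences, one "document" per speaker.
toy_docs = {
    "speaker_a": "the sea the waves the light",
    "speaker_b": "the garden the light the birds",
    "speaker_c": "the city the streets the sea",
}

scores = tf_idf(toy_docs)
for name, term_scores in scores.items():
    best = max(term_scores, key=term_scores.get)
    print(name, best, round(term_scores[best], 3))
```

A word like "the" that every speaker uses gets a score of zero, while a word only one speaker uses is ranked highest for that speaker. That is the sense in which the measure picks out what is distinctive for each voice.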

I really like this idea of treating a machine's processing of a text as one particular way of reading it, comparable to a human reader, who is also neither perfectly effective nor free of bias. This is also good for NLP researchers, because it lets you admit that the judgement the computer gives is not free of bias either, for instance because it changes when you change the parameters of your algorithm.
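As a small illustration of how much a parameter choice can matter, the sketch below scores the same toy text twice with tf-idf, once with raw term counts and once with log-scaled ("sublinear") term counts. The documents and the function name top_term are made up for this example, and both weightings are just common textbook variants, not anything taken from the paper.

```python
import math
from collections import Counter

def top_term(docs, target, sublinear_tf):
    """Return the highest-scoring term of docs[target] under tf * log(N / df).

    If sublinear_tf is True, tf is replaced by 1 + log(tf).
    """
    counts = {name: Counter(text.split()) for name, text in docs.items()}
    df = Counter()
    for c in counts.values():
        df.update(c.keys())
    n_docs = len(docs)

    def weight(tf):
        return 1 + math.log(tf) if sublinear_tf else tf

    scores = {
        term: weight(tf) * math.log(n_docs / df[term])
        for term, tf in counts[target].items()
    }
    return max(scores, key=scores.get)

# Hypothetical toy documents, invented for illustration.
docs = {
    "d1": "sea sea sea sea sea moth",
    "d2": "sea garden",
    "d3": "city street",
}

print(top_term(docs, "d1", sublinear_tf=False))  # sea
print(top_term(docs, "d1", sublinear_tf=True))   # moth
```

Same text, same algorithm, one switch flipped, and the "most characteristic" word changes. Neither answer is the objective one.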

 

How to be a modern scientist, Google TensorFlow

In this post I want to share a few things that I came across over the last couple of weeks and that I think are worth sharing:

There is a new episode of Open Science Radio, a German podcast about Open Science and related topics. They also have some episodes in English. One thing they talk about is Jeffrey Leek, a researcher in (bio-)statistics who wrote a book about being a modern scientist, which you can get for free or for a donation. He also teaches Data Science classes via Coursera. I can also highly recommend episode 59 of Open Science Radio about OpenML, which I think is a very cool project.

Google is open-sourcing a tool for visualizing high-dimensional data as part of TensorFlow. The standard visualization shows word vectors. In my opinion this visualization is a little tricky, because points that appear close together in the three-dimensional view are not necessarily close in the real vector space with its couple of hundred dimensions. But it is still a nice tool for exploring how word vectors behave on a very large dataset that you do not even have to train yourself. You can also use the tool to plot other high-dimensional data.
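Here is a tiny sketch of why closeness in a three-dimensional view can mislead. As a simplifying assumption, the "projection" just keeps the first three coordinates of a 200-dimensional vector, which is a crude stand-in for whatever dimensionality reduction the tool actually applies. The vectors are artificial, not real word vectors.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def project(v):
    """Crude linear projection: keep only the first three coordinates."""
    return v[:3]

dims = 200
a = [0.0] * dims
# b matches a in the three visible coordinates but differs everywhere else.
b = [0.0] * 3 + [1.0] * (dims - 3)
# c differs from a only slightly, and only in a visible coordinate.
c = [0.5, 0.0, 0.0] + [0.0] * (dims - 3)

print("3D view:    a-b", euclidean(project(a), project(b)),
      " a-c", euclidean(project(a), project(c)))
print("full space: a-b", round(euclidean(a, b), 2),
      " a-c", euclidean(a, c))
```

In the 3D view b sits exactly on top of a, while c looks further away. In the full space it is the other way round: c is the near neighbour and b is far off.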