This post I want to share a few things that just came to me the last couple of weeks and think there are worth sharing:
There is a new episode on Open Science Radio. This is a German podcast about Open Science and other stuff that is related. They also have some episodes. One thing they talk about it is Jeffrey Leek, a researcher in (bio-)statistics who wrote a book about being a modern scientist, which you can get for free or for a donation. And he also teaches classes via Cursera in Data Science. I can also recommend a lot episode 59 of open science radio about OpenML, which I think is also a very cool project
Google is open sourcing a tool for visualization of high dimensional data, Tensor Flow. The standard visualization shows word vectors. In my opinion this visualization is a little tricky because stuff that appears to be close in this three-dimensional view is in the real vector space with a couple hundred dimensions not close. But it is still a nice tool in order to explore how word vectors behave on a very large dataset that you do not even have to train yourself. You can also use the tool for plotting other high dimensional data.