Semantic MediaWiki Conference (SMWCon) 2018

Semantic MediaWiki Conference (SMWCon) 2018 took place in Regensburg. In this post, I highlight some of the talks I liked or want to share my opinion on. Sorry if I did not mention yours; this is no critique, just due to my limited time ;)

The first day was all about business applications. It seemed to me that there were a lot of efforts to somehow standardize solutions for project management, technical documentation and other use cases. One remarkable thing was the project zoGewoon, where the company put a lot of effort into the design and usability of the system. It turned out to be very interactive and easy to use, which makes sense for the target group: people with disabilities looking for a place to live. Another cool thing was the extension VEForAll, which makes the VisualEditor work within Page Forms. This was not possible before and is a great advantage when it comes to usability, because Page Forms as well as the VisualEditor help a lot to make editing wiki pages easier.
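To illustrate, this is roughly how a form field could opt into the VisualEditor once VEForAll is installed (a sketch based on the Page Forms textarea input; the template and field names are made up, and the editor=visualeditor parameter is my understanding of how the integration is switched on):

```
{{{for template|Project}}}
{| class="formtable"
! Description:
| {{{field|Description|input type=textarea|editor=visualeditor}}}
|}
{{{end template}}}
{{{standard input|save}}}
```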

The keynote was about how language shapes perception. Marc van Hoof drew a line from Orwell’s dystopia of Newspeak to the way we organize knowledge in ontologies. He also argued for a user-centred way of naming and creating these ontologies, in order to make it easier for users to perceive information and link it to their everyday lives. This also leads to the concept of folksonomies, although my impression is that folksonomies are past their hype, but maybe they will come back…

On the second day my favorite talk was the presentation of the new features of Semantic MediaWiki 3.0. There were several cool things, like the improvements to the list and data table result formats. You can now also enter semantic queries directly in the search field. Karsten also visited the Wikimedia technical conference and said that MediaWiki core will be more open to the wishes of third parties, which is remarkable.
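For example, a query condition like the one below could now be typed straight into the search field (Category:City and the Population property are made-up examples; the syntax is standard SMW query syntax):

```
[[Category:City]] [[Population::>500000]]
```

This is the same condition you would otherwise use in an inline query:

```
{{#ask: [[Category:City]] [[Population::>500000]]
 |?Population
 |format=table
}}
```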

During the breaks, the big topics were of course the new features of SMW 3.0 and the migration to it. Another topic was (maybe because there were many people from companies and not so many from research) how to tell people about SMW (and then also sell it to them). There was a kind of consensus that people tend not to talk about wikis anymore, but about knowledge management systems. First, because people tend to think of Wikipedia when it comes to openness, but also because there are many ways of tweaking the input (thanks to forms) and the style (thanks to Tweeki and Chameleon), so you can customize the system very much and be a lot freer than only providing a clone of Wikipedia.

Viktor Schelling talked about WSForm, which might replace Page Forms and does some things very well, like providing templates on every page and not only on template pages. I am very excited to see their release and try it out.

Talking about graphs, Sebastian Schmid presented an improvement to the result formats that uses the mermaid library to display graphs, the basic principle of knowledge organisation in SMW. I am happy to see this applied, because there is a lot of graph data stored in SMWs out there.
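I have not tried the new format yet, but mermaid itself describes graphs in a compact text syntax like the one below (node names and edge labels are made-up examples), so presumably the result format translates query results into such a definition:

```mermaid
graph LR
  Regensburg -->|locatedIn| Bavaria
  Bavaria -->|partOf| Germany
```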

In the end, these were two nice days in the beautiful city of Regensburg, with a really nice conference dinner in the old city centre. Thanks also to the people at Gesinn.it and TechBase Regensburg for organizing the conference and providing the venue.

Algorithmic Criticism

Today I want to present a paper which made me think about the Digital Humanities. It is called “Algorithmic Criticism” by Stephen Ramsay.

Unlike most other papers, which only focus on new algorithms and new data, this one also focuses on methods for combining the two parts of the digital humanities. He wants to develop a criticism (normally a method of the humanities) that is based on algorithms.

He argues that even in literature research it could be possible to have approaches that are a lot more empirical, meaning you have an experiment and quantitative measurements to prove your claims. Another important point he makes is that computers might not be ready for that kind of analysis yet (the paper is from 2005, though), but he believes that these methods will become available in the future.

One of his central points is that every critic reads a text using their own assumptions and “sees an aspect” (Wittgenstein) in the text. So the feminist reader sees a feminist aspect of the text, and likewise the “algorithmic” reader can see the aspect of the computer, or can read the text as transformed by a computer. At the end, the paper presents some research applying tf-idf measures to the novel The Waves by Virginia Woolf.
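As a rough illustration of what such an algorithmic reading looks like in practice (this is a minimal sketch with placeholder text, not Ramsay’s exact procedure), you can compute tf-idf weights over parts of a novel and look at which words are most distinctive for each part:

```python
# Minimal sketch of an "algorithmic reading": tf-idf over parts of a text.
# The two snippets are placeholders; in a real analysis you would load
# the actual sections of the novel from files.
from sklearn.feature_extraction.text import TfidfVectorizer

sections = {
    "Part 1": "the sun had not yet risen and the sea was indistinguishable",
    "Part 2": "the waves broke and spread their waters swiftly over the shore",
}

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform(sections.values())
terms = vectorizer.get_feature_names_out()

# For each section, print the terms with the highest tf-idf weight,
# i.e. the words most distinctive for that part of the text.
for name, row in zip(sections, matrix.toarray()):
    top = sorted(zip(terms, row), key=lambda t: t[1], reverse=True)[:3]
    print(name, [term for term, _ in top])
```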

I really like this idea of having a machine read a text in a certain way and considering that reading similar to a human reader’s, which is also not completely objective and free of bias. This is also good for researchers in NLP, because it lets you admit that the judgement the computer gives is not free of bias either, for instance when you change the parameters of your algorithm.


How to be a modern scientist, Google Tensorflow

In this post I want to share a few things that came to my attention over the last couple of weeks and that I think are worth sharing:

There is a new episode of Open Science Radio, a German podcast about Open Science and related topics. One thing they talk about is Jeffrey Leek, a researcher in (bio-)statistics who wrote a book about being a modern scientist, which you can get for free or for a donation. He also teaches Data Science classes via Coursera. I can also highly recommend episode 59 of Open Science Radio about OpenML, which I think is a very cool project.

Google has open-sourced a tool for the visualization of high-dimensional data as part of TensorFlow, the Embedding Projector. The standard visualization shows word vectors. In my opinion this visualization is a little tricky, because things that appear to be close in the three-dimensional view are not necessarily close in the real vector space with a couple of hundred dimensions. But it is still a nice tool for exploring how word vectors behave on a very large dataset that you do not even have to train yourself. You can also use the tool for plotting other high-dimensional data.
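If you want to load your own data, the standalone projector at projector.tensorflow.org accepts plain tab-separated files: one with the vectors and an optional one with the labels. A minimal sketch (the random vectors here just stand in for real embeddings):

```python
# Minimal sketch: export high-dimensional vectors as TSV files that can be
# loaded into the Embedding Projector via its "Load" button.
# The random vectors are placeholders for real embeddings.
import numpy as np

words = ["king", "queen", "apple", "banana"]
vectors = np.random.rand(len(words), 300)  # e.g. 300-dimensional embeddings

# vectors.tsv: one vector per line, dimensions separated by tabs
with open("vectors.tsv", "w") as f:
    for vec in vectors:
        f.write("\t".join(f"{x:.6f}" for x in vec) + "\n")

# metadata.tsv: one label per line (no header row for a single column)
with open("metadata.tsv", "w") as f:
    for word in words:
        f.write(word + "\n")
```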