Categories
General

Why political discussions on Twitter don’t work

Twitter and social media are everywhere, and it seems that political discussions are mainly driven by Twitter and similar platforms. But let’s look at who is actually on Twitter. In Germany, only 8% of people have a Twitter account, and this group is probably skewed towards people who are young, have higher education and/or work in media jobs. In the US this looks different: there are close to 50 million active users, so roughly one in six Americans uses Twitter.

Still, the claim goes, it is now possible to talk directly to your politicians and tell them what YOU think and what is important, so politics comes down to the “ordinary people”. After looking at who is on Twitter, it is evident that this is not true, at least for Germany. Furthermore, I think that some discussions, especially complex ones, are better not held online at all. There are many reasons for this; I think three are most relevant:

  • Audience
  • Trolling
  • Attention

Audience

Everyone working in sales or marketing knows that defining a target audience is really hard: Who do you want to sell your product to? Who should use your software? These questions are not easy, and on an open social network they get even harder. You do not know whether your political views about migration are read by white supremacists or by Antifa people, which leads to constant misunderstanding. Sometimes this is intended, but often it probably is not. The same joke might make you a funny guy in some circles of friends and a Nazi in others, or some people might consider you left-wing while you think your opinion is quite moderate or even conservative. A nice piece about this is this article about PewDiePie, who became a symbol for white supremacists, although you probably have to admit that PewDiePie himself does not really know what he believes.

Trolling

Trolling is also a big problem. People troll because they want to annoy others, maybe just for fun, and yes, some might even get paid to do it. I personally respect trolling as a social mechanism that can be played very well. See this great presentation on the topic. The issue is not new either: there have been trolls since the internet started. So what is the problem now? When I first used the internet, discussion mainly happened in forums. When you were annoyed by the trolls, you could ban them, and some forums also created places where trolls could do their trolling without annoying other people. Another rule was not to feed them, and just ignoring them helped quite well. What is different now? First, on social media you cannot just open hashtags or sub-forums where the trolls can live. They stay in your newsfeed. Yes, you can block them manually, but this is very time-consuming. Second, trolls get fed no matter what. The old mechanism of not feeding them seems to scale very badly: the trolls always find someone who is obsessed with their trolling, and so they win. Sadly, the mechanisms of attention work very well for trolls, too:

Attention

We all know that Twitter and other services sell our attention to their ad buyers. (Just read the book The Attention Merchants by Tim Wu if you want to know more.) Because of this, these platforms make more money the more time we spend there. So they have an incentive to keep us on the platform as long as possible, and probably also to show opinions that are more controversial, maybe polemic, but not focused on a good discussion. Seen from the other side, it also becomes clear what you have to do if you want a lot of followers: produce content that the services rank high and show often to people. There you are: you produce polemic, polarizing content in order to get ranked higher by Twitter and reach a bigger audience.

Summing up, we see why so few people I know personally do political stuff on Twitter. I personally enjoy the discussions about information science, user experience or data science on Twitter a lot, but this is mainly because the audience is clear and there are close to no trolls. Therefore Twitter is a great professional tool for me, but not a digital place to have political discussions. It might make sense for the media not to see it as one either.


Minimalism/Frugalism

I would not consider myself a big minimalist, but after living for two months on stuff I could fit into one suitcase and watching some videos about it, I feel confident enough to talk about my experiences. I also try to keep a fairly minimalist lifestyle back in Germany.

First, what does it mean? Basically, if you click through YouTube, you find people showing how they got rid of stuff (mostly clothes), room tours through quite empty flats and people telling how they feel better than before.

I like this approach of owning less and therefore not having stuff around that I need to care about because it takes up space, needs cleaning or, with electronic devices, needs updating. This saves you a lot of time, and you can concentrate on things (or people) that are important to you rather than on something that is not important to you (anymore). It really has the potential to make your life easier.

But I do not agree with how people get rid of things. I think it is sometimes smarter to keep things than to give them away now and buy them back later, at least from an economic point of view. One pattern I noticed was that people at the beginning talk a lot about how great their life is with less stuff, but after a while they see that they might need more again, and so minimalism helps them find a good balance of how much to own.

While clicking through YouTube, I found the interviews most interesting where people give a review after living this lifestyle for a certain time. Many ended up owning more again than right after they had cleared everything out. I do not know how much stuff I own, but in general I like to keep it small, and every once in a while I try to get rid of stuff I do not need anymore. Still, I think it makes most sense not to buy everything in the first place.

Good links to start are this video in German as well as Minimalismus-Podcast and The Minimalists.

Frugalism

Which brings me to a trend you often see when you watch these videos: Frugalism or FIRE, which means Financial Independence, Retire Early. The idea behind it is to save money while you are young, invest it in the stock market, get rich and not work anymore afterwards. The good point, again, is to see what you really need and then think about whether you really have to spend money on it. The bad thing is that you still have to earn a lot of money, which might be possible for college graduates in their twenties, but not for everyone. For instance, if you need 15,000€ a year, you need to have 25x that sum, because the idea is that your stocks rise in value by 4% a year. So you can take out 15,000€ a year while your money in total does not change. So you need to have 15,000€ x 25 = 375,000€, which is a lot.
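
The 25x rule is just the yearly spending divided by the yearly withdrawal rate. A minimal Python sketch of that arithmetic; the function name and the default of 4% are only illustrative assumptions taken from the example above:

```python
def required_capital(annual_spending, withdrawal_rate=0.04):
    """Capital needed so that taking out `withdrawal_rate` per year covers the spending."""
    if withdrawal_rate <= 0:
        raise ValueError("withdrawal_rate must be positive")
    return annual_spending / withdrawal_rate


# The example from the text: 15,000 a year at a 4% withdrawal rate.
print(round(required_capital(15_000)))
```

A lower assumed rate makes the target grow quickly: at 3% the same spending already needs about 500,000€.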

The FIRE folks argue that you can build up this amount by investing in the stock market and getting a good return there. They calculate with 8% a year, which seems quite realistic, but you still need to earn quite well to have money to invest in the first place.
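
To get a feeling for how long this takes, here is a small Python sketch that counts the years until regular savings plus returns reach a target. It assumes one deposit at the end of each year and a constant return, which real markets do not deliver; the function name and the savings figure in the note below are made up for illustration:

```python
def years_to_target(target, yearly_saving, annual_return=0.08, max_years=200):
    """Count full years until end-of-year deposits plus returns reach `target`.

    Returns None if the target is not reached within `max_years`.
    """
    balance = 0.0
    for year in range(1, max_years + 1):
        # Last year's balance grows by the return, then this year's saving is added.
        balance = balance * (1 + annual_return) + yearly_saving
        if balance >= target:
            return year
    return None
```

With 10,000€ saved every year and a steady 8%, `years_to_target(375_000, 10_000)` gives 19 years, which shows why you need to earn quite well for this plan.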

What am I taking from it? The idea of getting an overview of your finances as well as of the stuff you own is really great. Many people do not even track their spending, which is actually quite easy: there are a lot of apps out there where you can add your spending, define budgets or even synchronize directly with your bank account. A good video to start with is How to be a financial minimalist.

Summing up, I think both concepts are quite good. You see many people taking them to extremes, but this is not necessary. Integrating at least a little bit of frugality and minimalism into your life might help you spend less time on things you don’t like and make your life easier. Still, a good guide for how much to keep is Marie Kondo’s question: “Does it spark joy?”


Installing Nextcloud on a Raspberry Pi using snap

Nextcloud offers some great functions when it comes to sharing data in your local network. I mainly needed a tool to sync a calendar on my network with my devices. I was also thinking about lighter software like Baikal, but installing most of the alternatives seemed more difficult to me than installing Nextcloud, so I chose it.

I used the snap package of Nextcloud, so I did not need to do a lot of server configuration and it was ready to start right away. Have a look at the snap installation.

Connecting calendar clients

I connected an Android client, Evolution and an iOS client. Connecting the Android device and Evolution is pretty easy: it just works if you create a new calendar using the provided link. For Android you have to use the software DAVx5 to establish the connection.

On iOS the connection only works with HTTPS enabled (I did not find another way to set it up; if you know one, please share). Therefore you need to set up HTTPS using snap (if you run into problems on your Pi, maybe have a look here). After this you can simply create a new CalDAV connection on your iOS device. In my case it also worked directly with the base URL (e.g. https://yourdomain.com/nextcloud) without the other addresses the manual wants you to use.


My Audio setup

In this article I will give a little bit of background on my music system and why I use these particular components.

First, the speakers: I use speakers that accept input via USB. This has the great advantage that I can use whatever computer I want to push the music signal to the speakers. (Side note: I am using speakers by Nubert, which also have digital and analogue inputs for other sources.) My music comes directly from a Raspberry Pi with Mopidy and Raspotify. These two tools allow me to play basically all the music I have on hard disk or to stream it from the internet. I also wrote a small program that shows the title and artist on a small display, and I added some buttons so I can pause the track and turn the Pi off.
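
To illustrate how a display program like this can get title and artist, here is a minimal Python sketch that asks Mopidy for the current track over its HTTP JSON-RPC interface. This is not my original code; it assumes that Mopidy's HTTP frontend is enabled on port 6680, and the function names and the fallback texts are made up for the example:

```python
import json
import urllib.request


def format_track(track):
    """Turn a Mopidy track dict into one line of text for a small display."""
    if not track:
        return "(nothing playing)"
    title = track.get("name") or "Unknown title"
    artists = ", ".join(
        artist["name"] for artist in track.get("artists", []) if artist.get("name")
    )
    return f"{artists} - {title}" if artists else title


def fetch_current_track(host="localhost", port=6680):
    """Ask Mopidy's JSON-RPC endpoint for the track that is currently playing."""
    payload = json.dumps(
        {"jsonrpc": "2.0", "id": 1, "method": "core.playback.get_current_track"}
    ).encode("utf-8")
    request = urllib.request.Request(
        f"http://{host}:{port}/mopidy/rpc",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        # "result" is null (None) when nothing is playing.
        return json.load(response).get("result")
```

Calling `format_track(fetch_current_track())` in a loop and writing the string to the display would be the rest of such a program; the display part depends on the hardware and is left out here.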

The USB connection is really great since it reduces the number of cables a lot. Before this, I used an external sound card, which also worked well but needed more space. Another option would be HiFiBerry, but I do not have any experience with its sound quality.

Why did I use this setup and not directly buy speakers from Sonos or a similar system? I like to keep my system as easy to repair and change as possible. My speakers will probably last longer than my Raspberry Pi, so I want to be able to change the way music gets to the speakers. Also, I want to be able to change my streaming service. All this might or might not be possible with out-of-the-box systems.

And finally, the raspberry pi and mopidy are open systems, where you can add your own code in order to improve your setup.


Raspotify – Turn your pi into a Spotify server

As some of you know, I really like to use a Raspberry Pi as a music server. Therefore I want to introduce a nice tool: Raspotify. This little program turns your Raspberry Pi into a Spotify device, just like the ones you see when listening on your computer or on systems like Sonos. The really cool thing is that it just works out of the box and you can use your phone as a remote control. Right now I actually prefer it over Mopidy-Spotify, which has some problems since Spotify is blocking the API more and more. For instance, it is no longer possible to load playlists.

I do not know if this will improve in the future or if Spotify is nudging us to use their own service. I would prefer a system that works directly with Mopidy and still bundles everything, but we will see.

The installation procedure of Raspotify is as easy as it can get: you only need to type one command and it does everything. But remember to change the settings (see the paragraph Configuration on the project page) to get higher-quality playback.


Digital Humanities at Hochschule Darmstadt

In the summer semester, Professor Rittberger and I will be giving a class about digital humanities at Hochschule Darmstadt in the information science major. Here is our syllabus; I will try to upload the slides as well (but in German).

We want to give a broad overview of what Digital Humanities means. There are also other classes dealing with text mining, so we do not focus so much on it (there are four lessons about it, though).

  1. Introduction to Digital Humanities: What are DH, what can we do with digital methods
  2. Research methods: Qualitative and quantitative methods in social sciences, hermeneutics, virtual research environments
  3. Law, ethics: Basic understanding what law means and what problems it can cause. This leads to data management and open data
  4. XML: Basics about XML, why DTDs are useful, standards like TEI (2 sessions), XML regarding ontologies
  5. Editions and digitalization: What are editions, how can we create them digitally? How do we digitize content?
  6. Basics of Text analysis: Distant Reading, Google n-grams, how new methods in text analysis can help in research
  7. Named-Entity-Recognition: We chose this problem of NLP to give an overview of what can be done using new technology and also to compare approaches from computer science like machine learning with approaches from information science and semantic web
  8. Topic Modelling: Basic introduction and practical usage with R
  9. Network analysis: Basics of network analysis and how to use it for instance for plays. Tool: Gephi
  10. Geoinformation: How can we code geographical data, how can we use it in DH?
  11. 3D-Modelling: What new approaches are there using 3D-Modelling, how can we use it in DH? Tool: Blender

 


Semantic MediaWiki Conference (SMWCon) Fall 2017

Last week there was the SMWCon, the European conference on Semantic MediaWiki. It was held in Rotterdam. Our venue was directly in the zoo, in the middle of the aquarium. So we could watch sharks and turtles during the talks!!

But there were also very interesting talks. You can find information on most of the talks here. I will describe some of the talks that were interesting for me because they dealt with stuff I might use in the future.

The keynote on the first day dealt with fire fighters and their problem with information overload. Fire fighters face the same problem as everyone else: there is a lot of information, it is hard to find, and it comes in different formats (GPS, information systems, paper copies). But fire fighters only have limited time until they reach the burning building, where they have to act and cannot lose even more time reading documentation. So they need the right information at the right time, which is quite difficult.

He also stressed that machine learning and reasoning over knowledge are nice, but fire fighters in particular sometimes face completely new cases: the world and technology keep changing, so you will still run into new obstacles. One example is a burning car with an electric engine.

Karsten then introduced the new stuff they develop for SMW 3.0, which will be a major release. He also stressed that the software needs better documentation, something I also encountered when I tried to introduce new people to SMW. But this is a problem in lots of Open Source projects: People like to code, but not to write documentation. This also shows that for Open Source projects you do not only need coders ;))

Tobias introduced annotation tools for images, text and videos. This was developed as part of our projects and we would be excited to see use-cases and of course feedback to the extensions.

The keynote on Friday was about a project called slidewiki. This is basically a wiki that allows you to create and re-use presentation slides, annotate them, link them to topics and so on. It is really cool because other projects like Slideshare do not allow this and also do not allow forking of slides.

The second talk was by Cindy Cicalese, who works for the Wikimedia Foundation. She announced that she will be advocating more for third-party developers in the project management of MediaWiki. You can go to their site to see what they want to do. In short:

  • They want to rework content revisions so that a wiki page can have more than one slot, a functionality that right now you can only get with SMW and PageForms
  • They also tackle the installation, updating and maintaining of wikis. This is actually a very important topic that basically everyone in the community faces. We normally do not have just one wiki, we have a lot more, and setting up and updating every single one is cumbersome.
  • They want to introduce a roadmap to make the development of MediaWiki more predictable. This would also help third parties because we can tell our customers whether a certain feature will be implemented soon or not

After that, Alexander Gesinn introduced a pre-configured virtual machine. This might be nice for people who only want to try out SMW; in production use you still face the problems with maintenance. He named three things every enterprise wiki has to have (and I agree with him):

  • Semantics (SMW + PageForms)
  • VisualEditor to not torture users with Wikisyntax
  • a responsive skin to have a nice-looking wiki on mobile devices

Remco also gave a talk on Thursday about a similar topic, which he called wiki product lines. A product line is similar to what you see in industry, where you have different TVs that are all basically the same and only the screen size changes between the products. He explained from a slightly more theoretical standpoint where he sees potential. To me this looks like a problem that will be tackled, and we might have some (hopefully) completely free and documented solutions for it.

At the end of the day there was also a workshop organized by my colleague Lia and me. We agreed to set up a page on the SMW wiki where we collect projects and how we might use the stuff in the future.

Overall, it was a nice conference and I got to know many nice people. Also thanks to the organizers 😉


How to make your own Raspberry Pi Musicbox

This is not about NLP, but I think it is worth sharing, so here we go 😉

I really like the project PiMusicbox. The Raspberry Pi is just the perfect device to host a music server, especially the model B1, which is a little bit slow when it comes to video playback anyway. But there are a few drawbacks: updates are quite hard and you cannot easily customize it. So I set up the system in a different way, directly from scratch on Raspbian. In the next steps I show you how.

Thoughts before you start

This is what comes to my mind when someone asks me whether she should take the normal version of PiMusicbox or mine.

Pros

  • You can update all the time. PiMusicbox you have to reinstall with every new release.
  • You can customize it as you want. The normal PiMusicbox does not provide Podcasts or Files.

Cons

  • Configuration is done manually. You should be able to connect to your pi via ssh. Alternatively use the Websettings package.
  • You do not have an integrated shutdown button. But you can use RaspiCheck.
  • It is slower. In PiMusicbox a few tweaks are done to improve booting of the system. I did not do that.

Download Raspbian minimal image

You can get this directly from the website of Raspbian: https://www.raspberrypi.org/downloads/raspbian/

Install mopidy + run as system service

In order to start it when the system boots, you only have to type (source)

sudo systemctl enable mopidy

Install add-ons

I used Spotify, Podcast and Files, because these are all I use with the Raspi. You can get an overview in the Mopidy documentation. Generally speaking, you can install all extensions either via pip or via apt, depending on which repository they are in. This is sometimes a little bit confusing, but you’ll find everything, I am sure 😉 The extensions for playback are listed there as well (the Mopidy documentation calls them backend extensions).

Also, you need an extension for a HTML-frontend. I used the one made for the PiMusicbox. An overview can be found here.

Mounting your external storage automatically

For this I used usbmount, which is a small program that just mounts external storage devices automatically. This can of course be done via scripts as well, but I did not want to mess around with scripts, so I used this approach.

Configurations

If you run Mopidy as a system service, its config file is at /etc/mopidy/mopidy.conf, not in your user’s directory. To shorten this paragraph I just paste my config file here and make some comments:

[core]
cache_dir = /var/cache/mopidy
config_dir = /etc/mopidy
data_dir = /var/lib/mopidy

[logging]
config_file = /etc/mopidy/logging.conf
debug_file = /var/log/mopidy/mopidy-debug.log

[local]
data_dir = /var/lib/mopidy/local
media_dir = /var/lib/mopidy/media

[m3u]
playlists_dir = /var/lib/mopidy/playlists

[musicbox_webclient]
enabled = true

[spotify]
username = # your username
password =  # your password
bitrate = 320 # better sound quality

[http]
hostname=0.0.0.0 # VERY important. Otherwise you cannot reach it from outside

[mpd]
hostname=0.0.0.0 # VERY important. Otherwise you cannot reach it from outside

[podcast]
enabled = true #only need to activate it. 
browse_root = #Path where your .opml-file with all your podcasts is

[file]
enabled = true
media_dirs = /media/usb #this is where your external storage is mounted via usbmount
show_dotfiles = false
follow_symlinks = false
metadata_timeout = 1000
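
Since the hostname lines are the ones that are easiest to forget, here is a small Python sketch that checks a config text for them. It uses Python's standard configparser and not Mopidy's own config loader, and it assumes that a missing hostname means the frontend only listens on 127.0.0.1; the function name is made up for this example:

```python
import configparser

FRONTENDS = ("http", "mpd")
ASSUMED_DEFAULT_HOST = "127.0.0.1"  # assumption: used when no hostname is set
LOCAL_ONLY_HOSTS = ("127.0.0.1", "localhost", "::1")


def local_only_frontends(config_text):
    """Return the frontend sections that are not reachable from other machines."""
    parser = configparser.ConfigParser(
        inline_comment_prefixes=("#",), interpolation=None
    )
    parser.read_string(config_text)
    result = []
    for section in FRONTENDS:
        host = ASSUMED_DEFAULT_HOST
        if parser.has_section(section):
            host = parser.get(section, "hostname", fallback=ASSUMED_DEFAULT_HOST)
        if host in LOCAL_ONLY_HOSTS:
            result.append(section)
    return result
```

Reading /etc/mopidy/mopidy.conf into a string and passing it to this function should give an empty list for the config above.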

I hope this tutorial helped you.

Bonus: Setting up SMB share

If you have a hard disk connected to your Raspi, you can easily share the files across your whole network. I used the tutorial by putokaz to install and configure it. I just configured it like the share for the torrent files at the end of the tutorial, but this is up to you.


Named Entity Recognition

Named Entities are nouns that – simply speaking – refer to something specific in the real world. An example would be the noun Los Angeles, which refers to a city in the US, unlike the noun apple, which describes a fruit. For tasks in information retrieval it is very useful to know whether a noun refers to a named entity, because finding named entities is a common task in search, for example if you want to make a trip to Los Angeles. In that case people do not want to find information about the two words los and angeles, but information about this particular city.

So how do you recognize them? There are several techniques that are used and combined. One way is to analyze parts of speech and try to detect when a certain pattern occurs, for example two nouns in a row (like USB device). Another way is to look in the sentences for keywords that may point to named entities and then analyze whether there are named entities in this part of the sentence. For instance, in patent retrieval new inventions are described in a way that does not make clear what they really are, in order to make the patent claims broader. For instance, a floppy disc drive can be

In the end you can combine these techniques with machine learning: you mark the named entities in a data set and let an algorithm learn which of the nouns are named entities.
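
The part-of-speech pattern mentioned above can be sketched in a few lines of Python. This is only a toy illustration and not a full NER system: it assumes the text is already tagged with Penn Treebank style tags (where all noun tags start with NN), and the function name is made up:

```python
def noun_sequence_candidates(tagged_tokens, min_length=2):
    """Collect runs of at least `min_length` consecutive nouns as entity candidates."""
    candidates = []
    current = []
    for word, tag in tagged_tokens:
        if tag.startswith("NN"):
            current.append(word)
            continue
        # A non-noun ends the current run.
        if len(current) >= min_length:
            candidates.append(" ".join(current))
        current = []
    if len(current) >= min_length:
        candidates.append(" ".join(current))
    return candidates


tagged = [
    ("I", "PRP"), ("flew", "VBD"), ("to", "TO"), ("Los", "NNP"), ("Angeles", "NNP"),
    ("with", "IN"), ("a", "DT"), ("USB", "NNP"), ("device", "NN"), (".", "."),
]
print(noun_sequence_candidates(tagged))
```

The pattern finds Los Angeles as well as USB device, which shows why it only produces candidates: a second step, for example a trained classifier, still has to decide which of them are named entities.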

Another difficulty is the mapping of named entities. For example, you have a text about politics in Germany that talks about the chancellor of Germany. You can use this information, but you still do not know whether this means Angela Merkel or one of her predecessors. You will need more information to figure out whom the text is talking about, like the date when the article was written. Another awesome example is Java, which is an island and a programming language. There is also a book that plays with this ambiguity. It is named Java ist auch eine Insel – Java is also an island.
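
The chancellor example can be sketched as a lookup that uses the article date as extra context. The table below is a deliberately short, hand-made sample with only two entries, and the function name is made up; a real system would take such data from a knowledge base:

```python
from datetime import date

# (name, first day in office, day the successor took over)
CHANCELLORS = [
    ("Gerhard Schröder", date(1998, 10, 27), date(2005, 11, 22)),
    ("Angela Merkel", date(2005, 11, 22), date(2021, 12, 8)),
]


def resolve_chancellor(article_date):
    """Map the mention 'the chancellor of Germany' to a person via the article date."""
    for name, start, end in CHANCELLORS:
        if start <= article_date < end:
            return name
    return None  # date not covered by the sample table
```

An article from May 2003 is resolved to Gerhard Schröder and one from 2017 to Angela Merkel, while a date outside the table returns None instead of a guess.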

You can find more information about this topic for example at Marrero et al. and more general information at Wikipedia.


Query Clarity

In IR you have a query, and from this query you get a result. But how good is this result? One way to measure this is to calculate the clarity of the result. Generally speaking, clarity tells you how much the found results differ: you look at the result set and try to find out how much the words in the found documents differ. Query clarity can thus tell you how much ambiguity there is in your query.

Of course there are different ways to calculate the query clarity. The basic model is the one introduced by Cronen-Townsend et al. Others are the Improved Clarity Score by Hauff et al. and the Simplified Clarity Score by He and Ounis.
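
As an illustration, here is a minimal Python sketch of the Simplified Clarity Score as I understand it: the sum over all query terms of P(w|Q) * log2(P(w|Q) / P(w|C)), where P(w|Q) is the relative frequency of the term in the query and P(w|C) its relative frequency in the whole collection. The function name is made up, and skipping query terms that never occur in the collection is my own simplification:

```python
import math
from collections import Counter


def simplified_clarity_score(query_terms, collection_term_counts):
    """Sum of P(w|Q) * log2(P(w|Q) / P(w|C)) over the distinct query terms."""
    total_tokens = sum(collection_term_counts.values())
    query_counts = Counter(query_terms)
    query_length = len(query_terms)
    score = 0.0
    for term, count in query_counts.items():
        collection_count = collection_term_counts.get(term, 0)
        if collection_count == 0:
            continue  # simplification: terms unseen in the collection are skipped
        p_query = count / query_length
        p_collection = collection_count / total_tokens
        score += p_query * math.log2(p_query / p_collection)
    return score


# Toy collection statistics, invented for this example.
collection = Counter({"the": 50, "of": 40, "java": 5, "island": 5})
```

On this toy collection a query with the rare term java gets a higher score than a query with the very frequent term the, which matches the intuition that specific queries are clearer.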