Categories
Science

Fellowship Free Knowledge

I am happy to announce that I have been accepted as a fellow for free knowledge and open science (Fellow-Programm Freies Wissen), which is sponsored by the Wikimedia Foundation. The program includes some funding as well as mentoring and opportunities to network with other people who are enthusiastic about open science.

Due to corona, the event took place online. On the first day we had a nice presentation by Judith Simon about ethics in computer science, especially in artificial intelligence and machine learning. The second day was all about our projects. We talked to our mentors, and the people from Wikimedia gave us an overview of the program and of open science in general.

You can find my project on the project page. It is about improving our project Schularchive (school archives). Our focus is threefold:

  1. We want to promote the platform more in order to attract more users from the history of education research community as well as archivists at schools.
  2. We want to improve the platform based on user feedback.
  3. We want to dive deeper into the question of how data stored in our wiki can be connected with Wikidata and, ultimately, which data should live in our wiki and which in Wikidata.

If you want to connect or use the platform for your teaching, contact us via the platform or via our Twitter account: @Schularchive.

We also did a little networking and other fellows recommended interesting pages I want to share:

Categories
Science

Collaborative open analysis in a qualitative research environment

Together with some colleagues I recently published a paper about the use of a virtual research environment for teaching the qualitative method of objective hermeneutics. It is a follow-up to the paper SMW Based VRE for Addressing Multi-Layered Data Analysis, which my colleagues published in 2017 and in which they presented the virtual research environment (VRE) and its anticipated use cases. This time we evaluate the actual usage of the VRE, based on questionnaires filled in by the students working with it. We see the main potential in guiding students through the research process as well as in making the research traceable, which also connects to principles of open science. The paper also discusses the pedagogical limits of this work, since students mentioned being more distracted when working from home than when meeting in person. The analysis was done pre-corona, so this might have changed by now.

I also think this research is quite interesting when considering that a lot of teaching is done online now. If you want to try out the VRE, please contact me.

Categories
General

Sustainable Software

When we talk about free software, the argument is often that it is more sustainable than proprietary software because everyone can edit the code, and even if your company goes bankrupt someone else can take over and go on coding. In reality, in many open source projects there is only one person doing most of the development work, and there is also the risk of abandonware: software that used to be maintained, but whose maintainer has moved on and does other things now. Often no one else takes over the code, for reasons such as bad documentation, complex code or a lack of skills. So in the end the whole program is rewritten from scratch in the next project (I see this quite a bit in science). Luckily there are institutions like the Software Sustainability Institute tackling some of these issues, especially the technical ones, but I want to put more emphasis on the social issues.

So the question we actually have to ask is: how do we make software more sustainable? I see one crucial point that holds for software as well as for any other voluntary work (be it sports clubs or cultural/political groups): how easily do you attract people, and how easy is it to participate in your project? Often only one person is working on a project. If this person stops, the whole project goes down. So what should happen? I think there are three areas, more social than technical, in which many projects might need some improvement:

  1. Community building: it should be easy to join your project. Connect and network with others, and show that the atmosphere in which you do things is a nice one. Treat people who report bugs nicely, talk to people and show that you are a person or group of people that is fun to work with. Remember, people often do this in their free time.
  2. Documentation: make it easy on the technical side to join your project. It should not be easier to rewrite the whole software than to work on the existing code. If you are a political group or similar, document what you do, what you did and why you did it. If you write code, do the same. Also track decisions; you do not want other people to make the same mistakes again.
  3. Financing: yes, financing. How do you expect people to work on things when they still have to pay rent? If you want to keep your project running, it is important to put it on stable ground. This does not mean that you sell out or try to get rich by ripping off your users. It means that you think about whether you want to spend a reasonable amount of your own time on the project, or support someone else in spending a reasonable amount of their time on it, and about putting some money into this. In software this also touches on licenses (another boring topic, I know, but there is help for that as well).

Summing up, I think we need to talk more about these things when developing free software. I know that these are not the tasks most programmers are good at, but they are skills we might acquire in the future, or we can attract people who already have them to our projects. I also want to show that if you are not a programmer, you can still do very important work in this area.

And even if you do not want to become active in open source software development, there are a lot of clubs, sports teams and political groups that will be happy to make use of your input and expertise.

Categories
Science

Open Science Barcamp 2020

This year, I again attended the Open Science Barcamp in Berlin. Due to corona, there were fewer people than last year, but the experience was still really cool. It is always nice to meet people and chat about open science. In all sessions there were pads where people could add their notes. There are also interviews on Open Science Radio.

The day started with the ignition talk by Birgit Schmidt, who works at the University of Göttingen, State and University Library. You can also get the slides. She summarized the current state of open access science publishing, put it into the bigger picture and connected the topic with issues of funding as well as open peer review.

I attended four sessions: one about findability of research software, one about diamond open access and two about digital humanities. It seemed to me that this year the barcamp was more focused on certain topics, which may be either because there were fewer participants due to the beginning of the corona crisis or because the people attending were more focused on their topics.

Findability of research software is in my opinion a very interesting topic. For an information scientist, software is not findable just because it is on GitHub. On GitHub there are no identifiers, no keywords, and often it is also not clear whether the software is still maintained or works in a current environment. Therefore I can easily relate to the summary we found in the pad: Research software is often not formally published at all (even though it is available, e.g. via GitHub), or published in specific software journals (which are not common in all disciplines). This is a problem on two levels:

  1. Existing software cannot be adequately found and people work on the same issues without being able to build on pre-existing work.
  2. It is difficult to get proper credit for your research software and link it to the existing reputation system (that is very much focussed on reputation by formal publication in a journal).
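To make the three missing pieces (identifier, keywords, maintenance status) a bit more concrete, here is a minimal sketch of a machine-readable metadata record for a piece of research software. This was not part of the session; the field names follow the CodeMeta vocabulary as far as I know it, and the tool name, repository URL and DOI are purely hypothetical placeholders. The helper functions are my own illustration and not part of any library.

```python
import json

# Fields that the text above names as missing on a plain code hosting page.
FINDABILITY_FIELDS = ("identifier", "keywords", "developmentStatus")


def build_software_record(name, repository, identifier, keywords, status):
    """Return a small CodeMeta-style metadata record as a plain dict.

    The keys are assumed to match CodeMeta terms; all values are
    supplied by the caller.
    """
    return {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "codeRepository": repository,
        "identifier": identifier,
        "keywords": sorted(keywords),
        "developmentStatus": status,
    }


def missing_findability_fields(record):
    """List the findability fields that are absent or empty in a record."""
    return [field for field in FINDABILITY_FIELDS if not record.get(field)]


if __name__ == "__main__":
    # Hypothetical example values, not a real tool or DOI.
    record = build_software_record(
        name="example-tool",
        repository="https://example.org/example-tool",
        identifier="https://doi.org/10.0000/example",
        keywords=["qualitative research", "history of education"],
        status="active",
    )
    print(json.dumps(record, indent=2))
    print("missing:", missing_findability_fields(record))
```

A file like this, stored next to the code, would at least give search services something to index, although it does not solve the credit problem by itself.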

Diamond open access was new to me. Basically it means that you try to keep the licenses of the articles in your own hands and run the whole publishing process within the community in order to get rid of big journals. So the only infrastructure you have to provide is a publication system. For this, one system in particular exists: OJS (Open Journal Systems), which is free software and runs on a server. I really liked this approach because it tackles some problems that still exist with open access nowadays, like publication fees and the fact that publishers take your intellectual property away from you. The downside, of course, is the cost of the infrastructure: I do not have a clear number, but some effort needs to be put into hosting and providing the system, so you also need (public) money or great effort from within the community in order to run these systems. There are actually some projects doing this, even at DIPF, and I think it will be interesting to see what happens to these projects in the future.

The workshops about digital humanities were sometimes a little bit challenging. We started with several discussions about what the problems with open research in digital humanities might be, and we also have to acknowledge that other fields (especially in the natural sciences) are ahead of the humanities. This led to interesting discussions in the workshops, but the problem remains that most of the people attending the Open Science Barcamp have a background in natural sciences or engineering, where open science is far more established than in the humanities.

I think in digital humanities there are actually two things happening: First, a lot of people want to make their research more open (I can see this when I talk to people during my dissertation). Second, we are also in the middle of the digitization of the field, so a lot of things are being tried out as well as researched. I would also argue that it is not true that there is little open science going on in DH. Just think about all the projects digitizing old writings or the corpora created in linguistics. We see a lot of these processes, and I think it is really interesting to be part of them now and to see what will and will not be possible in the future.

Summing up, it was a great event like last year, and thanks a lot to the organizers.

Categories
Science

FAIR Software

Some of you might have heard about the FAIR principles for data. Since the paper was published in 2016, they have become the state of the art in data sharing. But data is not all that is needed to make research more transparent. Software is another very important part.

To tackle this topic, the German National Library of Science and Technology hosted a workshop on making software more FAIR as well. There have been various posts about it, and you can also see the complete sessions and the exercises online.

I actually liked the workshop a lot, and it is worth having a look at the sessions. It also showed that there are still certain limits. For instance, there are no real repositories for scientific software with a search interface that can be narrowed down by scientific criteria. I also know that people are working on knowledge graphs, but right now there is often no good way to link data, software and published results. I liked the approach of Zenodo, which provides an easy way to reference software and get a DOI for it, but there is not much metadata available about the software.

The workshop involved a lot of hands-on sessions. The overall approach was based on The Carpentries, especially Library Carpentry, a workshop format that is completely open, so everyone can work with it and use it for their own workshops.

I learned a lot and thanks very much to the organizers.