Citizen Science in research about schoolbooks

As some of you know, we started the project Interlinking Pictura some time ago. We also published an article about it, “Bridging citizen science and open educational resources”, at OpenSym 2018. This was a rather technical article about the project and how we wanted to integrate citizen science. Later, some colleagues asked if we wanted to write an article about our citizen science project for a non-technical audience.

In our new publication, we present the potential of citizen science in the field of history of education. It does not go deeply into our research, but gives an overview of projects in the field and of what we tried and learned. So it might be a good starting point for your own citizen science project in digital humanities. The article is titled “Potenziale von Citizen Science in der historischen Schulbuchforschung. Das Beispiel Interlinking Pictura”.


From project to infrastructure – the Schularchive-Wiki project funded by the Fellow Program Free Knowledge

The continuation of the Schularchive-Wiki within the framework of the Fellow-Program Free Knowledge.

I wrote this text to reflect on the Fellow Program together with my mentor Maximilian Heimstädt. It was first published on the blog of Wikimedia Germany.

The Project

As part of the Fellow Program Free Knowledge, the project Schularchive (school archives) was funded. This project aims to collect historical sources for school research and make them available to all interested parties and researchers. Such sources are often very difficult to find, and often only through personal contacts. Schularchive wants to solve this problem and thus also contribute to more equal opportunities within research.

We see the project Schularchive primarily as an infrastructure on which researchers can document their research sources and thus create synergies in the search for sources, in the spirit of Open Science. In addition, we help to make the holdings in state, regional, and school archives more widely known.

At the beginning of the Fellow Program, the platform was already available as a prototype. Within the framework of the Fellow Program Free Knowledge, we carried out various activities. These served primarily to raise awareness of the project and to involve more people:

  • Two workshops (Fall 2020 and Spring 2021)
  • Establishment of a Twitter channel @Schularchive.
  • Importing data from state archives. Contact with the various state archives was also made through the workshops. Currently, only data from the state archives of Baden-Württemberg are on the platform; in the next few weeks, data from Bavaria will also be added, after which other archives will be requested.
  • The link to data from Wikidata was further expanded. By now, a lot of data such as pictures, founding dates and websites of schools is taken over from Wikidata. The link to other sources such as the German National Library is currently being pursued further as part of a student project.


After half a year on Twitter, we have about 60 followers. This is not very many, but also not too bad, since the community is relatively small. However, the project is already very well known within the community of historical educational research, and through the workshops we were able to establish contacts with state archives and school archivists. In addition, we presented the project at various academic conferences. The workshops proved to be very useful for imparting knowledge and getting feedback, but also as an advertisement for the project that attracted new interested people. The goal of establishing more contacts with schools proved difficult due to the COVID-19 pandemic.

In principle, we would also like to see a deeper integration and further development of the platform in the historical education research community. During the workshops, useful suggestions always came up, which we gladly took up. It would also be nice to build up the community permanently. At the same time, it became clear to us that this would be difficult, since most of the potential contributors are also involved in other projects. A good possibility is therefore to market Schularchive more as a tool for Open Science in research projects in order to get more people on the platform.

Future Plans

After the end of the Fellow Program Free Knowledge, we would like to write a proposal to further advance the project. In addition, through networking, we were also able to convince various other research teams to use our platform for their research, so that indirect funding is also possible through it. This two-pronged approach makes sense, as not all research proposals are approved and in the case of infrastructures, often only large projects are funded. However, the community of historical education research is not that large. At the same time, we do not think it makes much sense to make such an infrastructure the responsibility of a few people within the institutions. This is because there is also the risk that a lot of knowledge will be lost when people change or that the platform will die completely. It is also important to us that the connection to the community remains firmly established, which is ensured by the Library for Research on the History of Education of the DIPF and the department “Historical Educational Research” of the Ruhr University Bochum.

Lessons Learned

The project showed us that three points in particular are very important in order to perpetuate open science projects:

  1. Sustainability: Try to locate the infrastructure within an institution that has an interest in the project. In addition, funding through other projects is useful.
  2. Involvement in the subject community: Involve your project in the subject community and enable easy participation, for example by providing OER on your project.
  3. Create publicity: Make your project better known. We used Twitter, workshops and social contacts for this, but there are many more options. This will make you better known and show that the project is important. You will also find people who want to participate and maybe even raise money for you.

Evaluation of German open data portals

In recent years, many cities and states in Germany have built portals where they publish data, ranging from how they spend money to an overview of the trees in their city. This open government data is becoming more and more important. First, transparency of public institutions is always important, and second, data-driven services might be able to put this data to good use.

In this publication we evaluated how well the open data portals of the largest German cities and states are doing with the focus on educational data. Our findings are:

  • Most portals already use open licences
  • But many portals do not use machine-readable formats like CSV; they rely on PDF instead
  • Not all portals use metadata standards
  • Many cities do not share the projects being created with their data
  • There is not one portal which is best in all categories
  • Many cities do not have open data portals yet
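To make the point about machine-readable formats a bit more concrete, here is a minimal sketch of how one could measure it for a single portal. This is not the method from our publication: the set of formats counted as machine-readable and the sample listing are assumptions made up for illustration.

```python
# Assumption: a hand-picked set of formats treated as machine-readable.
MACHINE_READABLE = {"csv", "json", "xml", "geojson", "rdf"}


def share_machine_readable(formats):
    """Return the fraction of files whose format counts as machine-readable."""
    if not formats:
        return 0.0
    # Normalize entries like "PDF" or ".JSON" to "pdf" and "json".
    normalized = [f.strip().lower().lstrip(".") for f in formats]
    hits = sum(1 for f in normalized if f in MACHINE_READABLE)
    return hits / len(normalized)


# Hypothetical listing of file formats offered by one portal.
sample = ["PDF", "csv", "pdf", ".JSON"]
print(share_machine_readable(sample))
```

A portal that only offers PDF files would get a share of 0.0 here, no matter how open its licences are.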

We presented this at ISI 2021.


Fellowship Free Knowledge

I am happy to announce that I got accepted as a fellow for free knowledge and open science (Fellow-Programm Freies Wissen), which is sponsored by Wikimedia Foundation. The program includes some money as well as mentoring and opportunities to network with other people who are enthusiastic about open science.

Due to corona, the event was online. On the first day we had a nice presentation by Judith Simon about ethics in computer science, especially in artificial intelligence and machine learning. The second day was all about our projects. We talked to our mentors, and the people from Wikimedia gave us an overview of the program and of open science in general.

You can find my project at the project page. It will be about improving our project Schularchive (school archives). Our focus is three-fold:

  1. We want to promote the platform more to attract more users within the history of education research community as well as archivists at schools.
  2. We want to improve the platform using feedback from users.
  3. We want to dive deeper into the question of how data stored in our wiki can be connected with Wikidata and, ultimately, which data should be in our wiki and which data should be in Wikidata.

If you want to connect or use the platform for your teaching, contact us via the platform or via our Twitter account: @Schularchive.

We also did a little networking, and other fellows recommended some interesting pages.


Collaborative open analysis in a qualitative research environment

Together with some colleagues I recently published a paper about the use of a virtual research environment for teaching the qualitative method objective hermeneutics. It is a follow-up to the paper “SMW Based VRE for Addressing Multi-Layered Data Analysis”, which my colleagues published in 2017 and in which they presented the virtual research environment (VRE) and anticipated use cases. This time we evaluate the usage of the VRE. We did this using questionnaires for the students working with the VRE. We see the main potential in guiding students through the research process as well as in tracing the research, which also connects to principles of open science. The paper also discusses the pedagogical boundaries of this work, since students mentioned being more distracted when working from home than when meeting in person. The analysis was done pre-corona, so this might have changed now.

I also think this research is quite interesting when considering that a lot of teaching is done online now. If you want to try out the VRE, please contact me.


Sustainable Software

When we talk about free software, the argument is often that it is more sustainable than proprietary software, because everyone can edit the code, and even if your company goes bankrupt someone else can take over and go on coding. In reality, in many open source projects there is only one person doing most of the development work, and there is also the risk of abandonware: software that used to be maintained, but whose maintainer has moved on and does other stuff now. Still, no one else takes over the code, for reasons like bad documentation, complex code or lacking skills. So in the end the whole program is rewritten from scratch in the next project (I see this quite a bit in science). Luckily there are institutions like the Software Sustainability Institute tackling some of these issues, especially the technical ones, but I want to put more emphasis on social issues.

So the question we actually have to ask is: how do we make software more sustainable? I see one crucial point that is true for software as well as any other voluntary work (be it sports clubs or cultural/political groups): how easily do you attract people and how easy is it to participate in your project? Often, it is only one person working on a project. If this person stops, the whole project goes down. So what should happen? I think there are three levels, more social than technical, on which many projects might need some improvement:

  1. Community building: it should be easy to join your project. Connect and network with others, and show that the atmosphere in which you do things is nice. Treat people reporting bugs nicely, talk to people and show that you are a person or group of persons it is fun to work with. Remember, people often do this in their free time.
  2. Documentation: Make it easy on the technical side to join your project. It should not be easier to rewrite the whole software than to work on the existing code. If you are a political group or similar, also document what you do, what you did and why you did it. If you write code, do the same. Also track decisions; you do not want other people to make the same mistakes again.
  3. Financing: yes, financing. How do you expect people to work on things when they still have to pay rent? Therefore it is important to put your project on stable ground if you want to keep it running. This does not mean that you sell out or try to get rich by ripping off your users. It means that you think about whether you want to spend a reasonable amount of your time on the project, or support someone else in spending a reasonable amount of their time on it, and think about putting some money into this. In software this also touches on licenses (another boring topic, I know, but there is also help).

Summing up, I think we need to talk more about these things when developing free software. I also know that these are not the tasks most programmers are good at, but they might be skills to acquire in the future, or we might attract people who have these skills to our projects or software. I also want to show that if you are not a programmer, you can still do very important work in this area.

And even if you do not want to become active in open source software development, there are a lot of clubs, sports teams and political groups that will be happy to use your input and expertise.


Open Science Barcamp 2020

This year, I again attended the barcamp open science in Berlin. Due to corona, there were fewer people than last year, but the experience was still really cool. It is always nice to meet people and chat about open science. In all sessions there were pads where people could add their notes. There are also interviews on Open Science Radio.

The day started with the ignition talk by Birgit Schmidt, who works at the University of Göttingen, State and University Library. You can also get the slides. She summarized the current state of open access science publishing, put it into the bigger picture and connected the topic with issues around funding as well as open peer review.

I attended four sessions: one about findability of research software, one about diamond open access and two about digital humanities. It seemed to me that this year the barcamp was more focused on certain topics, which may be either because there were fewer participants due to the beginning of the corona crisis or because the people attending were more focused on their topics.

Findability of research software is in my opinion a very interesting topic. For an information scientist, software is not findable just because it is on GitHub. On GitHub there are no identifiers and no keywords, and often it is also not clear whether the software is still maintained or works in a current environment. Therefore I can easily relate to the summary we found in the pad: Research software is often not formally published at all (even though it is available, e.g. via GitHub), or published in specific software journals (which are not common in all disciplines). This is a problem on two levels:

  1. Existing software cannot be adequately found and people work on the same issues without being able to build on pre-existing work.
  2. It is difficult to get proper credit for your research software and link it to the existing reputation system (that is very much focussed on reputation by formal publication in a journal).
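One way to attach the missing identifier, keywords and maintenance status to a piece of software is a small metadata record next to the code. The following is only a sketch, assuming the CodeMeta vocabulary as the target format; the tool name, the identifier and the repository URL are placeholders I made up, not real records.

```python
import json


def describe_software(name, identifier, keywords, status, repository):
    """Build a minimal CodeMeta-style description of a piece of research software."""
    return {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "identifier": identifier,        # e.g. a DOI assigned by a repository
        "keywords": keywords,            # helps others search by topic
        "developmentStatus": status,     # signals whether it is still maintained
        "codeRepository": repository,
    }


# All values below are hypothetical placeholders.
record = describe_software(
    name="example-tool",
    identifier="doi:10.0000/placeholder",
    keywords=["history of education", "digital humanities"],
    status="active",
    repository="https://example.org/example-tool",
)
print(json.dumps(record, indent=2))
```

Such a file does not make software findable on its own, but it gives search services and repositories something to index.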

Diamond open access was new to me. Basically it means that you try to keep the licenses of the articles in your own hands and run the whole publishing process within the community in order to get rid of the big journals. So the only infrastructure you have to provide yourself is a publication system. For this, there is one system in particular: OJS (Open Journal Systems), which is free software and runs on a server. I really liked this approach because it tackles some problems that still exist with open access today, like publication fees and the fact that publishers take your intellectual property away from you. The downside, of course, is the cost of the infrastructure: I do not have a clear number, but some effort has to be put into hosting and providing the system, so you also need (public) money or great efforts from within the community in order to run these systems. There are actually some projects doing this, even at DIPF, and I think it will be interesting to see what happens to these projects in the future.

The workshops about digital humanities were sometimes a little bit challenging. We started with several discussions about what the problems might be when it comes to open research in digital humanities, and we also have to acknowledge that other fields (especially in the natural sciences) are ahead of the humanities. This led to interesting discussions in the workshops, but the problem remains that most of the people attending the open science barcamp have a background in the natural sciences or engineering, where open science is far more established than in the humanities.

I think in digital humanities there are actually two things happening: First, a lot of people want to make their research more open (I can see this when I talk to people during my dissertation). Second, we are also in the middle of the digitization of the field, so a lot of stuff is being tried out as well as researched. I would also argue it is not true that there is not much open science going on in DH. Just think about all the projects to digitize old writings or the corpora created in linguistics. We see a lot of these processes, and I think it is really interesting to be part of them now and see what will and will not be possible in the future.

Summing up, it was a great event like last year, and thanks a lot to the organizers.


FAIR Software

Some of you might have heard about the FAIR principles for data. Since the paper was published in 2016, they have become the state of the art in data sharing. But data is not all that is needed to make research more transparent. Software is another very important part.

Tackling this topic, the German National Library of Science and Technology hosted a workshop on making software more FAIR as well. There have been various posts, and you can also see the complete sessions and the exercises online.

I actually liked the workshop a lot and it is worth having a look at the sessions. It also showed that there are still certain boundaries. For instance, there are no real repositories for scientific software with a search interface that can be narrowed down by scientific criteria. I also know that people are working on knowledge graphs, but right now there is often no good way to link data, software and published results. I liked the approach of Zenodo, which provides an easy way to reference software and get a DOI for it, but not much metadata about the software is available there.

The workshop involved a lot of hands-on sessions. The overall principle was based on The Carpentries, especially Library Carpentry, which is a workshop format that is completely open, so everyone can work with it and use it for their own workshops.

I learned a lot and thanks very much to the organizers.