How to FAIR? - Make your data Findable, Accessible, Interoperable and Reusable


FAIR stands for Findable, Accessible, Interoperable and Reusable. Implementing the FAIR principles for data can be a challenge though. In this post I want to dive into how to do it in practice.

The FAIR principles are at the core of many current initiatives in research and beyond. For example, the German National Research Data Infrastructure (NFDI) consortia are working on making research data (and sometimes software) FAIR. There is even a dedicated FAIR Data Spaces project. But what does FAIR mean for you?

The University of Mannheim has put together a nice collection of resources on FAIR data that I recommend checking out: https://github.com/UB-Mannheim/FAIR-Data-Week. In the following I will use these and my own experience to give you some guidance on how to get started.

"Both humans and machines should be able to find, access, interoperate (with) and reuse both data and metadata."

- slides by Renat Shigapov, University of Mannheim

Generally for all FAIR principles

The FAIR principles try to help you answer the most common questions people have about data.

By the way, "data" in this context can mean many different things: regular tabular data sets, of course, but also images and other research materials.

How to get started

Disclaimer: I am not a librarian. This is a data scientist's take on the FAIR principles. If I got something wrong, please let me know!

Store your data somewhere that makes sense. If you can make your data openly available, general-purpose data platforms such as Zenodo will do. Field-specific or institutional platforms/repositories are good options too.

View the images of this post on Zenodo!

If you cannot make your data openly available, you can usually still make the metadata available. Metadata is information about your data such as the author(s), how to cite it, what the data set contains, and so on.

Making your data known in the community increases not only your chance of creating an impact with your work but also your work's FAIRness. You can do so by publishing a data paper or otherwise sharing more info with the community (social media, podcasts, conferences, ...).

And then additionally...

F for Findable

(Three ingredients: data, metadata and infrastructure)

  • Attach a DOI to your data. Many data platforms (e.g. Zenodo) make that really easy for you.
  • Provide rich machine-readable metadata. If you upload your data to a good data platform, it will ask you for the most relevant metadata anyway, so it's easy to do things right.
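
To make "machine-readable metadata" a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the field names are not taken from any particular platform or standard, and the DOI is a made-up placeholder. The only real convention used is that a DOI can be turned into a link by putting https://doi.org/ in front of it.

```python
import json

# All values below are made-up placeholders, not a real data set.
# The field names are illustrative, not a specific metadata standard.
metadata = {
    "title": "Example survey data",
    "creators": [{"name": "Doe, Jane"}],
    "description": "Tabular survey responses collected for a hypothetical study.",
    "keywords": ["survey", "example"],
    "publication_date": "2023-07-01",
    "doi": "10.1234/example.5678",  # hypothetical DOI
}


def doi_to_url(doi: str) -> str:
    """Turn a bare DOI into a resolvable https://doi.org link."""
    return "https://doi.org/" + doi.strip()


# Serialise to JSON so that machines (not only humans) can read the record.
metadata_json = json.dumps(metadata, indent=2, ensure_ascii=False)
print(metadata_json)
print(doi_to_url(metadata["doi"]))
```

In practice a data platform will usually generate a record like this for you from its upload form; the sketch only shows what "structured and machine-readable" looks like.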

A for Accessible

(FAIR is not the same as Open 👉 the point is to provide the exact conditions of accessibility)

  • Explain how someone can access your data, whether directly through a data platform or through an application that is evaluated by a data use and access committee.

I for Interoperable

  • Use common data formats. For tabular data that could, for example, be CSV, for images JPEG. What's best in your community might be decided through a community standard.
  • Use words that others will understand, or define them. For example, if the column names in your table are not self-explanatory, explain them.
  • Provide context for your data. Is it connected with other data or papers? You can also add your metadata to public knowledge graphs, e.g. Wikidata.
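
One way to explain column names is to ship a small data dictionary next to the table. The following is a rough sketch using only Python's standard library; the table, the column names and the descriptions are all invented for illustration, and the helper functions `to_csv` and `undocumented_columns` are my own hypothetical names.

```python
import csv
import io

# Hypothetical table with terse column names.
rows = [
    {"id": 1, "age_y": 34, "bmi": 22.5},
    {"id": 2, "age_y": 51, "bmi": 27.1},
]

# Data dictionary: one entry per column, explaining meaning and unit.
data_dictionary = [
    {"column": "id", "description": "Participant identifier", "unit": ""},
    {"column": "age_y", "description": "Age at enrolment", "unit": "years"},
    {"column": "bmi", "description": "Body mass index", "unit": "kg/m^2"},
]


def to_csv(records: list[dict]) -> str:
    """Write a list of dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def undocumented_columns(records: list[dict], dictionary: list[dict]) -> list[str]:
    """Return column names that have no entry in the data dictionary."""
    documented = {entry["column"] for entry in dictionary}
    return [name for name in records[0] if name not in documented]


data_csv = to_csv(rows)
dictionary_csv = to_csv(data_dictionary)
print(data_csv)
print(dictionary_csv)
print(undocumented_columns(rows, data_dictionary))
```

The last line is a tiny sanity check: if it prints anything other than an empty list, some column is still unexplained.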

R for Reusable

  • Include rich machine-readable metadata according to the community standards.
  • Attach a license to your data (the license is part of the metadata) that makes clear what others can do with your data. You can, for example, use the Creative Commons license CC-BY (more on how to choose a license).
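
Since the license is part of the metadata, you can check for it the same way you would check for any other field. Below is a minimal sketch, assuming a hypothetical list of required fields (your community standard may ask for different ones) and a placeholder record. The license is given as the SPDX identifier CC-BY-4.0 together with the Creative Commons URL for that license.

```python
# Hypothetical list of required fields; a community standard may differ.
REQUIRED_FIELDS = ("title", "creators", "description", "license")


def missing_fields(metadata: dict) -> list[str]:
    """Return the required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]


# Placeholder record, not a real data set.
record = {
    "title": "Example survey data",
    "creators": ["Doe, Jane"],
    "description": "Placeholder description.",
    "license": {
        "id": "CC-BY-4.0",
        "url": "https://creativecommons.org/licenses/by/4.0/",
    },
}

print(missing_fields(record))

# The same record without a license fails the check.
without_license = {key: value for key, value in record.items() if key != "license"}
print(missing_fields(without_license))
```

A check like this does not replace reading the license, but it does catch the common case of a data set that ships without one.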

Need help?

If you are a researcher and stuck on what to do, get started by thinking about whether and how you can publish your data in a data repository. Libraries at research institutions are usually a good point of contact if you need support.


In other news...

Workshop: Open Replicable Research

The German Region of the International Biometric Society is organising a free 2-half-day workshop on the topic ‘Open Replicable Research’ in Großhadern on the 5th and 6th of October in cooperation with the LMU Open Science Center.

The two keynote speakers are Ulrich Dirnagl, Founding Director of the QUEST Center for Responsible Research at the Berlin Institute of Health, and Ioana Cristea, meta-researcher at the Department of General Psychology at the University of Padova. They will talk about "Can statistics save preclinical research?" and "Data sharing", respectively.

The organising committee welcomes contributions from participants, including early career researchers. Just send your abstract (maximum 300 words) by August 15th, 2023 to: wsev@ibe.med.uni-muenchen.de

Train-the-Trainer pilot of the Digital Research Academy

There are some seats left in the in-person training. Wanna join us?

I hope you enjoyed this post.

All the best and happy weekend,

Heidi


P.S. If you're enjoying this newsletter, please consider supporting my work by leaving a tip.

My newsletters are licensed under CC-BY 4.0

Dr. Heidi Seibold

All things open and reproducible data science.
