The Love Data Week will take place again from February 10–14, 2025 with a diverse and international program under the motto "Whose Data Is It, Anyway?". Save the dates and look forward with us to the local program at the UoC.
Preliminary program (registration to be activated soon):
Monday, 10.02.2025
I Don't Have Any Data at All (Ich habe gar keine Daten) | Felix Rau
10.02.2025, 11 am - 12 noon
Speaker
Felix Rau | Data Center for the Humanities (DCH)
Content
We encounter data and data management in a wide variety of guises and under different names across disciplines. Even where representations of the research object play a less central role in the research process and among the research outputs, they still need to be managed effectively and reliably throughout the entire process. In this session, we will look at data management in disciplines "without data".
Prerequisites
none
Language
DE
Format
Zoom workshop
Time
11 am - 12 noon
Registration
Zoom-Link
Computational Reproducibility (ReproducibiliTea Collab Session) | Dr. Mark Ellison
10.02.2025, 2 pm - 6 pm
Speaker
Dr. Mark Ellison | Institute for Linguistics
Content
This workshop introduces participants to Docker. Computer systems change from month to month, and even "long-term support" Ubuntu Linux releases are superseded every two years. How can we keep our software usable if the computer's operating system is always in flux? And how can our code be used on the many different operating systems that our collaborators, users, and would-be reproducers might have chosen? The answer is Docker. It is software that defines a container, which behaves much like a lightweight virtual machine. When you create software that runs inside a container (e.g. R or Python code with various packages), it will run on any computer that can run the Docker software. While Docker itself needs to change to fit the operating system, your container does not. Consequently, Docker is a useful tool for keeping your scripts (and their output) reproducible over a longer period than they otherwise would be.
Prerequisites
Participants should bring a laptop and have the Docker Desktop app installed before the workshop. The app can be downloaded from https://www.docker.com/.
Tuesday, 11.02.2025
How to write a DMP? | Jasmin Schenk
11.02.2025, 9 am - 12 noon
Speaker
Jasmin Schenk | C³RDM
Content
The workshop will focus on the benefits of data management planning as a means of good practice in RDM. We will look at the relevant components of a DMP and discuss examples as well as the characteristics of different templates and tools.
Prerequisites
Basic knowledge of RDM concepts is required.
Language
EN
Format
Zoom workshop with registration (max. 25 participants)
Live Demo - LSTT: The lightweight sample tracking tool for consistent tracking of biosamples | Dr. Laura Godfrey
11.02.2025, 11-11:45 am
Speakers
Jonas Gassenschmidt, Frederik Voigt | Institute of Medical Bioinformatics and Systems Medicine (IBSM), DKFZ partner site Freiburg - DKTK & University Hospital Freiburg and Albert-Ludwigs-University, Germany
Dr. Laura Godfrey | Institute for Computational Cancer Biology (ICCB), Cancer Research Center Cologne Essen (CCCE) University Hospital and University of Cologne, Germany & Bridge Institute of Experimental Tumor Therapy (BIT), Division of Solid Tumor Translational Oncology (DKTK), University Hospital Essen and University of Duisburg-Essen, Germany
Lukas Heine | Institute for AI in Medicine, Cancer Research Center Cologne Essen (CCCE), University Hospital Essen and University of Duisburg-Essen, Germany
Dr. Florian Heyl | German Cancer Research Center (DKFZ), Division of Computational Genomics and Systems Genetics (B260) and The German Human Genome-Phenome Archive (GHGA)
Content
As biomedical research projects grow more complex, careful tracking and documentation of biosamples is a fundamental part of every project. Despite technological advances, Excel spreadsheets remain the de facto standard for these tasks and are continuously circulated among collaborators, repositories, and academic journals. However, these spreadsheets are susceptible to data loss, security vulnerabilities, and incomplete metadata, and they lack functionality to support bioinformatic data processing. To address these challenges, we introduce a versatile, scalable, web-based Sample Tracking Tool built with the Django framework. Our focus was on developing a user-friendly instrument that requires minimal programming expertise, making it accessible to small teams without a dedicated data steward. The tool features a customizable interface for data entry, along with a dashboard providing real-time insights into sample status and project progress. Furthermore, the LSTT makes it straightforward to create and export sample sheets suited to bioinformatic data analysis and processing, linking samples with subsequent bioinformatic data generation and thus streamlining the workflow of biomedical research projects.
Prerequisites
none
Language
EN
Format
Live presentation
Time
11 am - 11:45 am
Registration
Zoom-Link
Electronic Lab Notebooks: Supporting Research Data Management and Good Scientific Practice (Elektronische Laborbücher. Unterstützung von Forschungsdatenmanagement und guter wissenschaftlicher Praxis) | Birte Lindstädt
11.02.2025, 1-1:45 pm
Speaker
Birte Lindstädt | ZB MED
Content
The paper lab notebook is increasingly being replaced by a digital version. This offers advantages for the overall research data management workflow for digital data as well as for compliance with the principles of good scientific practice. This short introduction explains what an electronic lab notebook is, which systems exist, and how academic libraries can provide support. Questions and discussion after the talk are welcome.
Prerequisites
none
Language
DE
Format
Lecture and discussion
Time
1 pm - 1:45 pm
Registration
Zoom Link
Dinge mit Daten Love Data Week Special: From Excel to Python (Von Excel nach Python) | Dr. Denis Arnold
11.02.2025, 2-4 pm
Speaker
Dr. Denis Arnold | USB & C³RDM
Content
Managing data in Excel is common in many fields. In this code-along, we will look at real datasets and import them into Python with pandas. To take part, you need a computer with Excel, Numbers, or one of the many OpenOffice variants, as well as Python.
Prerequisites
Installation of Python and Visual Studio Code. Instructions and materials are available in EduLabs.
Language
DE
Format
In-person workshop
Time & Place
2 pm - 4 pm, University and City Library, Room 4.006
Registration
No registration required (max. 20 participants)
Wednesday, 12.02.2025
Beautiful Data: Data visualization basics | Dr. Emilia Kmiotek-Meier
12.02.2025, 10 am - 11:30 am
Speaker
Dr. Emilia Kmiotek-Meier | Institute of Sociology and Social Psychology, International Office
Content
In this interactive session, you will learn how to appropriately visualise data from both technical and design perspectives. We will also explore tools for data visualisation. There will be time for questions and discussion.
Prerequisites
none
Language
EN
Format
Zoom workshop with registration
Time
10 am - 11:30 am
Registration
Zoom-Link
Reproducing linguistics studies from the authors' original data | Dr. Elen Le Foll, Dr. Poppy Siahaan, Rose Hörsting & Gina Reinhard
12.02.2025, 1 pm - 2:30 pm
Speakers
Dr. Elen Le Foll | Romanisches Seminar and DCH
Dr. Poppy Siahaan | Institut für Sprachen und Kulturen der islamisch geprägten Welt
Rose Hörsting & Gina Reinhard | Institute for Linguistics
Content
While data sharing undoubtedly helps to improve the reproducibility of published research, this session shows that it is far from a silver bullet. We report on our attempts to reproduce the results of recent quantitative linguistics publications using the authors' original data. The challenges we encountered highlight recurring reproducibility issues and set the stage for a lively discussion on the (lack of) incentives, recurring problems, and potential solutions for improving the reproducibility of quantitative research in the humanities. This session is organised in collaboration with ReproducibiliTea in the HumaniTeas (sign up for the newsletter to find out more).
Prerequisites
none
Language
EN
Format
Zoom workshop with registration (max. 25 participants)
Time
1 pm - 2:30 pm
Registration
Zoom-Link
Bring Your Laptop and Let's Practice with cBioPortal for Cancer Genomics Analysis & Visualization | Dr. Deya Alzoubi
12.02.2025, 2 pm - 4:30 pm
Speaker
Dr. Deya Alzoubi | ITCC HPC & Visualization | SFB1530
Content
The workshop focuses on using cBioPortal as a tool for cancer genomics analysis and visualization. Participants will explore key features of the platform on their own laptops, practice analyzing datasets, and discuss examples to understand its application in research, gaining insights into effective data interpretation and visualization techniques.
Agenda:
14:00 - 14:30 Short introduction to cBioPortal
14:30 - 15:00 Guided tour – Cancer studies: Exploring the data of Brain Lower Grade Glioma (TCGA, PanCancer Atlas)
15:00 - 16:30 Free exploration of the platform with support from our team
Thursday, 13.02.2025
ImpoRting data in R: Why and how | Dr. Elen Le Foll
13.02.2025, 9 am - 10:30 am
Speaker
Dr. Elen Le Foll | Romanisches Seminar and DCH
Content
For many R novices, importing data is often a struggle, yet textbooks and (online) courses rarely explain the process in any detail. This beginner-level workshop, however, is entirely devoted to exploring different ways of importing data in all kinds of formats into R. To participate in the hands-on activities, you will need to have installed R and RStudio on your computer before the workshop.
Prerequisites
To set things up correctly, please follow the detailed instructions here: https://elenlefoll.github.io/RstatsTextbook/4_InstallingR.html.
Language
EN
Format
Zoom workshop with registration (max. 25 participants)
Time
9 am - 10:30 am
Registration
Zoom-Link
WoRking with time seRies data in R | Dr. Denis Arnold
13.02.2025, 11 am - 12:30 pm
Speaker
Dr. Denis Arnold | USB & C³RDM
Content
The University and City Library of Cologne keeps track of how many visitors are inside the public part of the building. In this course, we will explore this data with R and see how to handle dates and times with functions from base R. We will also look into creating factors from the output of the months and weekdays functions in a reproducible way, to prepare our data set for use with other packages such as ggplot2.
Prerequisites
Language
EN
Format
Zoom workshop with registration (max. 25 participants)
Time
11 am - 12:30 pm
Registration
Zoom-Link
Beautiful & useful: Data visualization with R (ggplot) | Dr. Emilia Kmiotek-Meier
13.02.2025, 2 pm - 5 pm
Speaker
Dr. Emilia Kmiotek-Meier | Institute of Sociology and Social Psychology, International Office
Teaching Assistants: Jenny Esser & Lukas Frommelt | Institute of Sociology and Social Psychology
Content
Data, data, data. Lots of numbers, lots of spreadsheets. But how do you make sense of it all? Good data visualisation can help! In this workshop, we will focus on data visualisation, an important skill when working with data. We will guide you through ggplot2, the R package for data visualisation. By the end, you will be familiar with the basics of data visualisation for exploring and presenting data.
Prerequisites
Confident handling of R (e.g. importing data, tidyverse)
Language
EN
Format
Zoom workshop with registration (max. 25 participants)
Time
2 pm - 5 pm
Registration
Zoom-Link
Open Science in Medical Research: Boost Your Impact and Meet Current Requirements | Katja Restel
13.02.2025
Speaker
Katja Restel | OSCC
Content
This workshop aims to show how Open Science practices can enhance the impact, visibility, and credibility of research. It introduces the concept of, and tools for, preregistration and builds an understanding of journal and funder requirements for Open Science. These first glimpses into Open Science, together with tools and tips, are meant to empower participants to (at least) think about integrating Open Science into their projects.
Prerequisites
none
Language
EN
Format
Zoom workshop with registration
Time
Registration
Zoom-Link
Friday, 14.02.2025
RDM Meets AI: Navigating Opportunities and Challenges | Dr. Hajira Jabeen
14.02.2025, 1 pm - 5 pm
Speakers
Dr. Hajira Jabeen | Team lead: Artificial Intelligence for Research Data Management (AI4RDM) at the Institute of Biomedical Informatics (BI-K)
Prof. Dr. Oya Beyan | Head of the Institute of Biomedical Informatics (BI-K) and co-lead of MeDIC
Prof. Dr. Konrad Förstner | Professor for Data and Information Literacy at ZB MED - Information Center for Life Sciences and TH Köln
Prof. Dr. Sören Auer | Director TIB, Head of Research Group Data Science and Digital Libraries
Prof. Dr. Dietrich Rebholz-Schuhmann | Scientific Director of ZB MED – Information Center for Life Sciences
Content
This discussion will explore how AI, particularly large language models (LLMs), can address the growing challenges in research data management (RDM) in today’s data-intensive era. Researchers face issues like time-consuming metadata creation, data inconsistencies, and a lack of standardized tools. AI-driven solutions, such as automated metadata extraction, annotation, and generation, can improve data reusability, accessibility, and findability. We will discuss how AI enhances FAIR metrics, data quality, and harmonization, as well as the integration of knowledge graphs (KGs) for better data interoperability. Additionally, LLMs improve search capabilities and provide AI-driven question answering to help researchers navigate complex datasets. While AI offers significant opportunities, challenges such as accuracy, privacy, and integration with existing systems must be addressed. This presentation will highlight how AI can revolutionize RDM and streamline data management practices.
In the panel discussion, experts from various NFDI (National Research Data Infrastructure) initiatives will delve into the practical applications and challenges of AI in RDM. They will share insights on how AI is being implemented across different research domains, discuss the potential for collaboration in developing standardized tools, and address key challenges such as data privacy, integration with existing systems, and ensuring AI accuracy. The discussion will also explore future trends, opportunities for enhancing data interoperability, and the role of knowledge graphs in improving RDM practices.