Programme

Hybrid Workshop at the Max Planck Institute for Legal History and Legal Theory (mpilhlt), 04 November 2025

All times are Frankfurt local time, that is, CET (UTC+1).

Tuesday 04 November 2025

Onboarding

09:00-09:15 Arrival/Registration

09:15-09:45 Christian Boulanger/Andreas Wagner (mpilhlt): Welcome and Upshot from RefExtract2023, State of the Discussion

09:45-10:00 Coffee Break

Research presentations

10:00-12:30

  1. Hiba Arnaout (TU Darmstadt): In-depth Research Impact Summarization through Fine-Grained Temporal Citation Analysis

  2. Yurui Zhu/Matteo Romanello (Odoma): Benchmarking Large Language Models on Reference Extraction and Parsing in the Social Sciences and Humanities

  3. Sofía Aguilar Valdez (Saarland University): How Scientific Ideas Evolve

  4. Open Discussion and Ad-Hoc Presentation of Research

12:30-13:30 Lunch

Datasets, Infrastructure and Interoperability

13:30-15:30

  1. Angelo Di Iorio/Matteo Guenci/Marta Soricetti*/Silvio Peroni/Lorenzo Paolini*/Ivan Heibi (University of Bologna): Citation Extractor and Classifier: Pipeline and Datasets (*presenting)

  2. Tamara Heck/Christoph Schindler/Verena Weimer/Philipp Mayr/Ahsan Shahid (DIPF/GESIS): Open Citation Data for Educational Research

  3. Christian Boulanger/Andreas Wagner (mpilhlt): Datasets in the Legal Theory Knowledge Graph Project

  4. Interoperability Roundtable: Open Discussion on Data Models and Data Formats

15:30-16:00 Coffee Break

Tools, Workflows and Pipelines

16:00-17:30

  1. Raphael Schlattmann/Malte Vogl (mpigea)/Aleksandra Kaye (TU Berlin/mpigea): LLM-Based Knowledge Graph Extraction Pipeline

  2. Luca Foppiano (ScienciaLAB): Training the Grobid Reference Extraction Models

  3. Christian Boulanger/Andreas Wagner (mpilhlt): Annotation Tools for Machine Learning: PDF-TEI Editor (for LLamore & Grobid), Prodigy, TEI-Publisher

17:30-18:30 Takeaways, Way Forward, Closing

19:00 Dinner (self-paid)

Restaurant Zur Stalburg, Glauburgstraße 80, 60318 Frankfurt am Main