
OSTrails at EGI2025: Advancing FAIR and Reproducible Science in Big Data Environments

At the recent EGI2025 conference, held from June 2nd to June 6th at the stunning Palacio de la Magdalena in Santander (Spain), OSTrails made a strong impression by showcasing practical approaches to enabling FAIR, open, and reproducible science within data-intensive research environments — a participation that held a pleasant surprise for the project!

Key takeaways from the event

  • There is strong demand for tools that reduce complexity, not add to it.
  • Researchers need help with reproducibility, and OSTrails offers a clear path forward.
  • Interoperability is top of mind, and OSTrails is well-positioned to contribute with its cross-domain, composable approach.
  • Also: People do read posters—and sometimes even give you awards for them. 

What was the main message of the OSTrails contribution?

OSTrails puts researchers at the centre by supporting every stage of the data lifecycle, from planning and metadata creation to FAIRness assessment, publication, and reuse. The OSTrails poster at EGI2025 illustrated how OSTrails' modular tools, including machine-actionable DMPs, FAIR assessment pipelines, and Scientific Knowledge Graphs, can be seamlessly integrated into existing workflows. Rather than adding more burden, OSTrails aims to make researchers’ lives easier, their data more reusable, and their science more reproducible. The research community embraced the contribution, awarding it the Best Poster Award at EGI2025, a clear endorsement of OSTrails’ commitment to delivering concrete, impactful solutions for the scientific community.

"And the best poster award goes to... Anca Hienola"

Why was the event important for OSTrails?

Presenting at EGI2025 gave OSTrails visibility in one of the communities that build and run the infrastructure behind European science. It was a key moment to showcase how OSTrails is bridging the gap between high-level FAIR/Open Science policy and the realities of researchers’ daily work, and how our tools complement existing infrastructural services.

Why was the event important for Open Science in general?

Open Science isn't just about openness - it's about making science reproducible, reusable, and efficient. OSTrails helps move the conversation from ideals to implementation by offering tools that embed FAIRness and automation directly into research workflows. This contributes to a more mature and usable Open Science ecosystem.

Impressions

The EGI2025 atmosphere was genuinely collaborative, and OSTrails stood out as a project that's not just “another project,” but a set of researcher-facing solutions. Attendees were particularly pleased by how OSTrails brings together technical components (like metadata enrichment and assessment pipelines) in a way that’s actually usable.

Discussions around the OSTrails poster presentation.

Conclusions from attending the event

OSTrails is clearly aligned with the current needs of both researchers and infrastructure providers. The event confirmed that our emphasis on automation, modularity, and usability is not only relevant but urgently needed. It also opened doors for future collaborations, particularly in integrating OSTrails tools into the broader EOSC ecosystem and other e-infrastructure frameworks.

"Open Science isn't something you just comply with—it should make your life as a researcher easier. OSTrails is building the tools to make that actually happen."

-Anca Hienola, OSTrails project partner and Best Poster Award winner at EGI2025


Is the Future… Now?! Reflections from our SURF Research Day 2025 workshop on machine actionable DMPs.

On May 20th, OSTrails partners took part in SURF Research Day 2025 with a session titled “Is the future… now?! Exploring machine actionable DMPs with OSTrails-NL & ARGOS.”  Jointly organised and delivered by Dutch Pilot partners Eileen Waegemaekers (SURF) and Andrew S. Hoffman (CWTS), together with Leiden University colleague Céline Richard, and Elli Papadopoulou (Athena RC / OpenAIRE), the workshop brought together more than 30 Dutch research data professionals. The session explored how machine actionable Data Management Plans (maDMPs) can support integrated, FAIR-aligned workflows across their organisations. 

Eileen Waegemaekers (SURF) introducing the workshop.

The session took stock of last year’s discussion “What’s behind the Data Management Plan of the future?” which raised thoughtful questions about the administrative burden of DMPs, their usefulness to different stakeholders, and the potential role of automation. Building on those conversations, this year the Dutch pilot moved from vision to implementation.

Exploring the Dutch landscape of DMP workflows

The session followed a hybrid format, combining short presentations with interactive activities to engage participants and a tool demonstration. It provided an opportunity to connect broader goals around interoperability and FAIRness with the practical realities of managing data-related workflows in Dutch institutions. In doing so, it highlighted how OSTrails is working to:

  • Enable FAIR assessment of plans and associated research outputs.
  • Offer a framework for institutions to connect planning with execution and reporting.

As part of the Dutch pilot, we showcased the ARGOS DMP platform and the Leiden University blueprint to demonstrate how data management planning can become a practical, integrated part of research workflows, rather than a disconnected administrative task.

Andrew S. Hoffman (CWTS) presenting Leiden Social Sciences example.

Following the Leiden Social Sciences example, participants were invited to sketch out how data management, ethics, and privacy processes are currently handled in their affiliated institutions. These hand-drawn workflow diagrams sparked conversation about fragmentation, duplicated efforts, and where bottlenecks tend to appear.

Digging beneath the surface

As we moved into the discussion, it became clear that many institutions are still figuring out how to bring different processes together. People shared that data management, ethics, and privacy are often handled separately, by different teams, in different systems, with little coordination. That fragmentation creates extra work, confusion, and often leaves researchers without a clear path.

There was a real interest in how maDMP tools might help address this, but also some hesitation. The idea of automating parts of planning is appealing, especially for reducing administrative burden, but it also raised questions. How do you build something structured without making it rigid? Can a plan be both machine-actionable and flexible enough to work across disciplines and contexts?


The discussions also revolved around the use of the DMP Common Standard, which many recognised as valuable, though not a current priority. Institutions currently tend to focus on tailoring internal tools to meet immediate needs, often placing machine actionability and interoperability in the background. This highlights clear opportunities for “under-the-hood” interoperability that can enable both researcher workflows and open science workflows.
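To make the "structured but not rigid" question concrete, here is a minimal sketch of a DMP expressed as structured data, with a check that only insists on a small core and ignores everything else. The field names are recalled from the JSON serialisation of the RDA DMP Common Standard and should be verified against the specification. The `REQUIRED_FIELDS` subset, the `missing_fields` helper, the `x_local_ethics_ref` extension key, and all sample values are hypothetical and are not part of ARGOS or any OSTrails tool.

```python
import json

# Illustrative subset only (assumption): the real standard defines its own
# required fields, which should be taken from the specification itself.
REQUIRED_FIELDS = ("title", "dmp_id", "contact", "dataset")


def missing_fields(madmp):
    """Return the core fields that are absent from the plan's 'dmp' object.

    Unknown extra keys are ignored, so institutions can add local fields
    without breaking the check.
    """
    dmp = madmp.get("dmp", {})
    return [field for field in REQUIRED_FIELDS if field not in dmp]


# Hypothetical plan: structured core fields plus one local extension key.
plan = json.loads("""
{
  "dmp": {
    "title": "Example plan for an interview study",
    "dmp_id": {"identifier": "https://doi.org/10.1234/example", "type": "doi"},
    "dataset": [{"title": "Interview transcripts", "personal_data": "yes"}],
    "x_local_ethics_ref": "ETH-2025-001"
  }
}
""")

gaps = missing_fields(plan)
print(gaps)
```

Because the check looks only at a shared core, a discipline or institution can extend the plan freely while other systems still find the fields they rely on.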

Looking ahead

This dialogue doesn’t end here. The Dutch pilot within the OSTrails project is ongoing, and we are looking forward to working more closely with institutions that want to explore how tools like ARGOS can support coordination across data management, privacy, and ethics. Future activities will include more testing, follow-up sessions, and opportunities to feed back into the development of templates and workflows that reflect local needs.

We are also continuing these exchanges through the OSTrails mentorship programme, where institutions can learn from one another and experiment with shared tools and frameworks in a more supported setting.

A big thanks to everyone who joined us!  

Netherlands, National Pilots


Thematic Pilot Interview: Language Resources

Read the Interview with the CLARIN Thematic Pilot to discover the latest updates on OSTrails pilot studies. Explore their progress in integrating open science principles and advancing research assessment. This month we had the pleasure of speaking with Daan Broeder and Menzo Windhouwer, of CLARIN ERIC. 

  - Menzo Windhouwer & Daan Broeder

"Each science cluster can benefit from seamless information exchange through the Scientific Knowledge Graph Interoperability Framework, which cross-connects repositories, databases, catalogues, knowledge graphs and Linked Open Data collections."

 

-Can you briefly introduce your organisation? How do they contribute to EOSC?  

CLARIN is the European infrastructure whose core business is to provide access to language resources and tools for processing them. CLARIN was one of the first ERICs to be established and nowadays its network spans 26 countries (25 in Europe, plus South Africa). All these distributed resources and expertise will be made available to the wider EOSC ecosystem under the emerging model of service federation via one or more CLARIN nodes. CLARIN is one of the research infrastructures in the Social Science and Humanities Open Cluster (SSHOC).

Over the years, Daan Broeder and Menzo Windhouwer have been working for various institutions, all of which have been deeply involved in the development of the CLARIN infrastructure and its embedding in the European context.

Menzo is currently based at the Humanities Cluster of the Royal Netherlands Academy of Arts and Sciences (KNAW-HuC). Several institutes in the domain of Social Sciences and Humanities (SSH) are part of KNAW-HuC, including the Meertens Institute and the Huygens Institute. Both are also CLARIN centres.

Daan is currently based at the CLARIN ERIC central office, which coordinates the CLARIN research infrastructure. In the OSTrails project, CLARIN ERIC represents the SSHOC cluster and participates in the project board meetings as the representative of the five science clusters.

-What are you most excited about in OSTrails? What are you looking forward to?   

Within CLARIN we are looking forward to making the FAIR principles more tangible to our communities. What does it mean for a dataset to be FAIR? What kind of positive impact will that have for a researcher? And will it spark the willingness to spend the required effort to make resources FAIR? And if so, will FAIR become the norm because funders enforce it, or because the researchers see why it matters?

CLARIN has its own flexible metadata standard: CMDI. It can handle the many types of datasets and modalities in the language domain, e.g. raw text collections, annotated corpora, lexicons, speech recordings, field work on endangered languages, all in a multitude of languages. However, due to the flexibility of CMDI, metadata schema proliferation is an inherent challenge. CLARIN addresses this via a shared semantic overlay: the CLARIN Concept Registry. The proliferation issues could have been overcome if an extendable common core set of metadata profiles had been available for the community from the start. The Scientific Knowledge Graph Interoperability Framework (SKG-IF), initiated by a working group of the Research Data Alliance (RDA) and now further developed within OSTrails, gives us a fresh start for implementing such a strategy. CLARIN is looking forward to seeing how the SKG approach pans out. We plan to develop and test it together with the SSHOC partners in OSTrails. This joint work builds on the existing collaboration on the EOSC entry point for the SSH cluster: the SSH Open Marketplace.

-How is planning, tracking and assessing research being realised in your scientific domain?

In the CLARIN infrastructure, FAIR was part of the design avant la lettre and what is now called FAIR assessment has been implemented through technical certification of the CLARIN centres. Enabling proper citation has also been on the agenda for over a decade. CLARIN was one of the first RIs to require proper PIDs for resources and it offers the Virtual Collection Registry (VCR), a tool that enables users to build virtual collections distributed across repositories and domains.

At country level things are partly dependent on national circumstances. In the Netherlands DMPs are still paper trails. Tracking happens in a disconnected way. Assessment of datasets is slowly taking off in communities like CLARIAH (a collaboration of the national humanities infrastructures CLARIN and DARIAH), ODISSEI (the Dutch social sciences infrastructure) and NDE (network of Dutch cultural heritage institutions). These initiatives are creating FAIR Implementation Profiles and are eager to make tools available for assessing whether the datasets produced by the communities actually match these profiles.

-Can you provide some details on your pilot's main actors, services and priorities? How will your pilot adopt the results of OSTrails?

CLARIN’s thematic pilot centres around the central catalogue and discovery platform that is used within CLARIN: the Virtual Language Observatory (VLO). This catalogue is based on a weekly harvest of the OAI-PMH providers of the CLARIN centres and other relevant providers. The centres provide their CMDI metadata, from which a dozen common facets are extracted using the shared semantic overlay.
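As a rough illustration of what one page of an OAI-PMH harvest involves, the sketch below builds a `ListRecords` request URL and reads record identifiers and the resumption token from a response. It is not the VLO harvester: the endpoint `https://example.org/oai`, the metadata prefix `cmdi`, the helper names, and the sample response are assumptions made for the example. Only the `ListRecords` verb, the `metadataPrefix` and `resumptionToken` arguments, and the XML namespace come from the OAI-PMH protocol itself.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix=None, resumption_token=None):
    """Build a ListRecords request URL.

    In OAI-PMH, resumptionToken is an exclusive argument: when continuing
    a harvest, it is sent instead of metadataPrefix.
    """
    params = {"verb": "ListRecords"}
    if resumption_token:
        params["resumptionToken"] = resumption_token
    else:
        params["metadataPrefix"] = metadata_prefix
    return base_url + "?" + urllib.parse.urlencode(params)


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Hypothetical, heavily trimmed response used instead of a network call.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:rec1</identifier></header></record>
    <record><header><identifier>oai:example.org:rec2</identifier></header></record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

first_url = list_records_url("https://example.org/oai", metadata_prefix="cmdi")
identifiers, token = parse_page(SAMPLE)
next_url = list_records_url("https://example.org/oai", resumption_token=token)
print(first_url, identifiers, next_url, sep="\n")
```

A full harvester would repeat the request with each returned token until none is left, and would then pass the metadata of every record on to facet extraction.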

Already in 2017, a pilot was implemented for making this joint metadata space available as RDF: CMD2RDF.  In CLARIN’s thematic OSTrails pilot, CMD2RDF will be refreshed by making it deliver RDF that is compliant with the SKG-IF data model via the API developed by OSTrails and RDA.  Some entity types will have to be added explicitly to the semantic overlay, e.g. persons, services and projects, including entities from SKGs to be developed by other OSTrails partners, such as CESSDA. For the linking of entities, we will use Lenticular Lens: an alignment tool developed at KNAW-HuC.

This alignment should enable researchers to use the SKG federation to connect a CLARIN dataset to related entities in the CESSDA SKG. This could for example help discover datasets from different domains, which can be useful for interdisciplinary research, e.g. investigations into the influence of socio-economic status on language use. The alignment will also enhance the findability of resources available in the SKG that will become available for the SSH Open Marketplace.

In addition, we will provide FAIR assessment information for the CLARIN datasets and guidance to both researchers (How can the FAIRness of this dataset facilitate the research process?) and providers (How can you improve the FAIRness of this dataset?).

-Ongoing activities and Next Steps? 

Currently we are actively involved in the design of the SKG-IF data model and the API development. We are also processing the outcomes of the face-to-face hackathon in Athens (March 2025), and as the actual pilot implementation is gradually coming closer, some of the tooling is now being prepared:

  • CMD2RDF is adjusted to deal with the latest developments in the CMDI ecosystem and to take advantage of state-of-the-art RDF facilities, such as RDF*;
  • Lenticular Lens is generalized to take input data from any SPARQL-based triple store and will be tested with the SKG-IF data model;
  • The FAIR assessment tool pyFAT is extended with guidance fitting OSTrails developments.

These action lines should enable us to get a first version of the pilot going, as soon as the OSTrails Interoperability Frameworks are available.

 

Thematic Pilots, Pilot Interview


OSTrails at PISA 2025: Advancing the Role of Grey Literature in Open Science

OSTrails participated in the PISA 2025 conference with a poster presentation titled “Making Grey Literature FAIR: OSTrails and the Power of Scientific Knowledge Graphs.” The session provided a valuable opportunity to engage with peers and showcase how OSTrails addresses key challenges in managing and integrating grey literature.

 

Key Takeaways from the Event

  • Grey literature is a major part of research output across disciplines.
  • It remains undervalued due to outdated perceptions and low visibility. 
  • Making grey literature FAIR is essential for discoverability and reuse. 
  • Key barriers include lack of metadata standards, PIDs, and infrastructure. 
  • Integration into platforms for Scientific Knowledge Graphs is crucial. 
  • Research assessments must evolve to value diverse outputs. 

 

The Central Message of the Conference

The conference strongly underscored the evolving and increasingly vital role of grey literature (or "grey resources") in contemporary scientific communication. It made clear that the landscape of knowledge dissemination is undergoing a significant transformation, moving beyond the traditional emphasis on peer-reviewed journal articles. Grey literature, including datasets, software, protocols, technical documentation, and project reports, is now recognised as a legitimate and essential component of the research ecosystem.

The event challenged longstanding misconceptions and outdated perceptions surrounding grey literature. Key themes included the necessity for such resources to be available in open access, appropriately networked, and sustainably maintained to ensure long-term value and usability.

Why This Is Important for OSTrails

By utilising the Plan-Track-Assess (PTA) Framework, designed to enable seamless interoperability between research planning, tracking, and assessment services, OSTrails advances, inter alia, the discoverability and reusability of diverse research outcomes, including grey literature produced across the broader research ecosystem by research performing organisations, funders, and infrastructures. One key example of this innovation is the transformation of Data Management Plans (DMPs) from static documents into dynamic, machine-actionable resources that are linked to research outputs and integrated into repositories and Scientific Knowledge Graphs (SKGs).

Beyond DMPs, OSTrails also aims to make a broad range of grey outputs—such as datasets, software, and reports—FAIR and more visible. This is achieved through the use of structured metadata, Persistent Identifiers (PIDs), and integration into SKGs. By working with repositories, catalogues, and institutional databases as entry points into these networks, OSTrails is helping to embed grey literature within the wider research ecosystem.

Participation in this conference provided a valuable platform to demonstrate how OSTrails’ standards-based integration approach can significantly improve the visibility, discoverability, and strategic value of grey resources.

Why This Is Important for Open Science More Broadly

The conference held considerable relevance for the wider Open Science (OS) movement. It emphasised the importance of embracing the full range of research outputs, not just conventional journal publications. The integration of grey literature, including datasets, software, and protocols, is fundamental to the OS principles of transparency, reproducibility, and accessibility.

Ensuring that grey literature is FAIR is key to unlocking the value of large volumes of scientific work that have traditionally remained underutilised or overlooked. The event also spotlighted the role of scientific libraries and other research infrastructure in supporting the broader sharing and preservation of diverse research materials.

Overcoming technical and cultural barriers—such as the lack of standardised metadata, missing PIDs, and poor indexing of grey literature—is crucial to creating an interoperable Open Science environment, especially in the context of initiatives like EOSC.

Importantly, the event drew attention to the need to reform research assessment practices. Grey literature's underrecognition is symptomatic of broader systemic issues, with commercial journal articles continuing to dominate despite the scientific value of other outputs. Valuing the full diversity of scholarly contributions is essential for a more inclusive and impactful research culture.

Insights from the Conference

The impressions from the conference, particularly from the panel session, suggest a community deeply engaged with the inherent value and challenges of grey literature. There was a strong sense that grey literature is vital, especially in specific domains and developing countries, but faces significant hurdles due to historical perceptions (e.g., perceived lack of peer review or lower credibility), lack of standardised practices (metadata, citation), poor discoverability (not in commercial databases), infrastructure limitations (especially concerning digital libraries and OCR), and lack of PIDs. The discussions highlighted that while some grey literature is peer-reviewed, the perception and lack of indexing hinder its recognition. There was a clear call for moving beyond the "grey" label itself, perhaps referring to these as "resources" due to their varied formats.


Claudio Atzori (CNR) presenting the OSTrails poster

Notable Reflections from OSTrails

A striking moment during the panel session came when one participant provocatively asked, “How do we protect science from grey literature?”—reflecting concerns about quality control, inconsistent practices, and limited visibility.

Yet, a more constructive response emerged: “How do we protect grey literature from the misinformation we see in today’s society?” This shift in perspective reframes grey literature not as a risk, but as a valuable and vulnerable resource that must be safeguarded through better practices, robust infrastructure, and responsible stewardship.

Conclusions

OSTrails' participation at the PISA 2025 conference confirmed the strong alignment between the project and the broader goals of the research community in advancing the role of grey literature in contemporary scientific communication. The discussions underscored that making grey literature FAIR is not merely a technical challenge, but also a cultural and institutional one. OSTrails actively supports this effort by providing practical tools, standards, and frameworks—including machine-actionable DMPs, integration with SKGs, and consistent FAIR assessments.

The conference provided validation of OSTrails’ approach and strengthened its connection with key stakeholders tackling these challenges. Moving forward, continued efforts in standardisation, infrastructure development, and policy advocacy will be essential to ensuring grey literature achieves its rightful place in the scientific knowledge ecosystem.

—Written by Claudio Atzori (CNR)

Community Event


OSTrails at the 2nd Austrian Library Congress 2025

From 26 to 28 March, 2025, the Austria Center in Vienna hosted the 2nd Austrian Library Congress under the banner “Libraries: democratic – diverse – sustainable”. The event brought together a wide-ranging community of information professionals, researchers, policymakers, and infrastructure providers, highlighting the multifaceted role of libraries in the age of digital transformation. Hot topics included the growing impact of artificial intelligence in the sector, the future of research communication, inclusivity & accessibility, open access, and, of course, Open Science.

OSTrails was proud to contribute to this dynamic exchange of ideas. Represented by Daniel Spichtinger (University of Vienna), OSTrails presented its vision and early implementation steps toward a more integrated and FAIR-aligned research data ecosystem. His talk, "Improving digital research data management: the OSTrails project", was part of a series of forward-looking contributions tackling the transformation of scholarly infrastructure.

Tackling Fragmentation in Research Data Management

At the core of OSTrails is the recognition of the inefficiencies in current research data management (RDM) practices across Europe. While the FAIR principles are widely accepted as the de facto standard for good RDM, practical implementation remains uneven and often siloed. OSTrails aims to address this fragmentation in data management through its Plan-Track-Assess (PTA) framework:

  • Plan: Increase the efficacy of DMPs and reach more researcher-centric, educative, and integrated “machine actionable” DMPs (maDMPs).
  • Track: Establish an open, interoperable and high quality SKG ecosystem of different types of research products, their relationships and metrics for evaluation.
  • Assess: Deliver modular and extendable FAIR tests, to make metrics “machine actionable”, complemented by user guidance.

Pilots: Testing in the Real World

One of OSTrails' strengths lies in its broad and diverse pilot structure. The project encompasses 15 national, 8 thematic, and one Horizon Europe pilot, each tailored to the specific needs and infrastructures of its research community. A survey conducted among the national pilots in the first year of the project showcased their different local requirements, needs and priorities. Consequently, the pilots address a number of different use cases related to the PTA framework. The Austrian pilot, in which TU Graz, TU Wien and the University of Vienna (including AUSSDA and PHAIDRA) participate, will extend DMP tools, support researchers in the creation of discipline-specific DMPs, and check digital objects against the FAIR principles. These efforts directly align with national needs and policies, while contributing to the broader European Open Science framework envisioned by the European Open Science Cloud (EOSC).

Libraries as Connectors and Enablers

A recurring theme throughout the congress was the centrality of libraries in shaping future research practices—especially around data stewardship, digital literacy, and inclusivity. OSTrails underscores this by highlighting the role libraries play in supporting machine-actionable DMPs, developing community-specific metadata standards, and embedding FAIR assessment tools into everyday workflows.

As institutions that sit at the intersection of research and infrastructure, libraries are uniquely positioned to support—and in many cases lead—the adoption of OSTrails outputs. The project actively collaborates with library-based services and works to lower adoption barriers through open resources and transparent interoperability protocols.

Looking Ahead

Although the OSTrails pilots officially only kick off in July 2025, many have already started preparatory work. As OSTrails continues to scale and refine its technical components, the project remains committed to a collaborative, community-driven approach. Tools and insights developed through the pilots will be made openly available via the OSTrails Commons.

For more information on the project and its tools, visit:

—Written by Dominik Denk (UNIVIE)

Community Event
