Thematic Pilot Interview: Language Resources

Read the Interview with the CLARIN Thematic Pilot to discover the latest updates on OSTrails pilot studies. Explore their progress in integrating open science principles and advancing research assessment. This month we had the pleasure of speaking with Daan Broeder and Menzo Windhouwer, of CLARIN ERIC. 

  - Menzo Windhouwer & Daan Broeder

"Each science cluster can benefit from seamless information exchange through the Scientific Knowledge Graph Interoperability Framework, which cross-connects repositories, databases, catalogues, knowledge graphs and Linked Open Data collections."

 

-Can you briefly introduce your organisation? How does it contribute to EOSC?

CLARIN is the European infrastructure whose core business is to provide access to language resources and tools for processing them. CLARIN was one of the first ERICs to be established and nowadays its network spans 26 countries (25 in Europe, plus South Africa). All these distributed resources and expertise will be made available to the wider EOSC ecosystem under the emerging model of service federation via one or more CLARIN nodes. CLARIN is one of the research infrastructures in the Social Science and Humanities Open Cluster (SSHOC).

Over the years, Daan Broeder and Menzo Windhouwer have been working for various institutions, all of which have been deeply involved in the development of the CLARIN infrastructure and its embedding in the European context.

Menzo is currently based at the Humanities Cluster of the Royal Netherlands Academy of Arts and Sciences (KNAW-HuC). Several institutes in the domain of Social Sciences and Humanities (SSH) are part of KNAW-HuC, including the Meertens Institute and the Huygens Institute. Both are also CLARIN centres.

Daan is currently based at the CLARIN ERIC central office, which coordinates the CLARIN research infrastructure. In the OSTrails project, CLARIN ERIC represents the SSHOC cluster and participates in the project board meetings as representative for the five science clusters.

-What are you most excited about in OSTrails? What are you looking forward to?   

Within CLARIN we are looking forward to making the FAIR principles more tangible to our communities. What does it mean for a dataset to be FAIR? What kind of positive impact will that have for a researcher? And will it spark the willingness to spend the required effort to make resources FAIR? And if so, will FAIR become the norm because funders enforce it, or because the researchers see why it matters?

CLARIN has its own flexible metadata standard: CMDI. It can handle the many types of datasets and modalities in the language domain, e.g. raw text collections, annotated corpora, lexicons, speech recordings, field work on endangered languages, all in a multitude of languages. However, due to the flexibility of CMDI, metadata schema proliferation is an inherent challenge. CLARIN addresses this via a shared semantic overlay: the CLARIN Concept Registry. The proliferation issues could have been overcome if an extendable common core set of metadata profiles had been available for the community from the start. The Scientific Knowledge Graph Interoperability Framework (SKG-IF), initiated by a working group of the Research Data Alliance (RDA) and now further developed within OSTrails, gives us a fresh start for implementing such a strategy. CLARIN is looking forward to seeing the SKG approach panning out. We have planned to develop and test it working together with the SSHOC partners in OSTrails. This joint work is building on the existing collaboration on the EOSC entry point for the SSH cluster: the SSH Open Marketplace.

-How is planning, tracking and assessing research being realised in your scientific domain?

In the CLARIN infrastructure, FAIR was part of the design avant la lettre, and what is now called FAIR assessment has been implemented through technical certification of the CLARIN centres. Enabling proper citation has also been on the agenda for over a decade. CLARIN was one of the first RIs to require proper PIDs for resources, and it offers the Virtual Collection Registry (VCR), a tool that enables users to build virtual collections distributed across repositories and domains.

At country level things are partly dependent on national circumstances. In the Netherlands DMPs are still paper trails. Tracking happens in a disconnected way. Assessment of datasets is slowly taking off, driven by communities like CLARIAH (a collaboration of the national humanities infrastructures CLARIN and DARIAH), ODISSEI (the Dutch social sciences infrastructure) and NDE (network of Dutch cultural heritage institutions). These initiatives are creating FAIR Implementation Profiles and are eager to make tools available for assessing whether the datasets produced by the communities actually match these profiles.

-Can you provide some details on your pilot's main actors, services and priorities? How will your pilot adopt the results of OSTrails?

CLARIN’s thematic pilot centres around the central catalogue and discovery platform that is used within CLARIN: the Virtual Language Observatory (VLO). This catalogue is based on a weekly harvest of the OAI-PMH providers of the CLARIN centres and other relevant providers. The centres provide their CMDI metadata, from which a dozen common facets are extracted using the shared semantic overlay.
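As an illustration of the harvesting step described above, here is a minimal sketch of how one page of an OAI-PMH ListRecords response could be requested and parsed. It is not the VLO's actual harvester. The endpoint `https://example.org/oai`, the record identifiers and the `cmdi` metadata prefix are assumptions made for the example, and the sample response is trimmed to the elements the code reads. No network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of the OAI-PMH 2.0 response envelope.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="cmdi", resumption_token=None):
    """Build a ListRecords request URL (the 'cmdi' prefix is an assumption)."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it replaces metadataPrefix on follow-up requests.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (record identifiers, resumption token or None) for one page."""
    root = ET.fromstring(xml_text)
    ids = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An empty or missing token means the list is complete.
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# Hypothetical, heavily trimmed response used instead of a live endpoint.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:corpus-1</identifier></header></record>
    <record><header><identifier>oai:example.org:lexicon-7</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    ids, token = parse_page(SAMPLE)
    print(ids, token)
    print(list_records_url("https://example.org/oai", resumption_token=token))
```

A real harvester would repeat the request with each returned token until none is left, and would then read the CMDI payload of every record to extract the facets.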

Already in 2017, a pilot was implemented for making this joint metadata space available as RDF: CMD2RDF.  In CLARIN’s thematic OSTrails pilot, CMD2RDF will be refreshed by making it deliver RDF that is compliant with the SKG-IF data model via the API developed by OSTrails and RDA.  Some entity types will have to be added explicitly to the semantic overlay, e.g. persons, services and projects, including entities from SKGs to be developed by other OSTrails partners, such as CESSDA. For the linking of entities, we will use Lenticular Lens: an alignment tool developed at KNAW-HuC.

This alignment should enable researchers to use the SKG federation to connect a CLARIN dataset to related entities in the CESSDA SKG. This could for example help discover datasets from different domains, which can be useful for interdisciplinary research, e.g. investigations into the influence of socio-economic status on language use. The alignment will also enhance the findability of resources available in the SKG that will become available for the SSH Open Marketplace.

In addition, we will provide FAIR assessment information for the CLARIN datasets and guidance to both researchers (How can the FAIRness of this dataset facilitate the research process?) and providers (How can you improve the FAIRness of this dataset?).

-Ongoing activities and Next Steps? 

Currently we are actively involved with the design of the SKG-IF data model and the API development. We are also processing the outcomes of the face-to-face hackathon in Athens (March 2025), and as the actual pilot implementation is gradually coming closer, some of the tooling is now being prepared:

  • CMD2RDF is being adjusted to deal with the latest developments in the CMDI ecosystem and to take advantage of state-of-the-art RDF facilities, such as RDF*;
  • Lenticular Lens is being generalised to take input data from any SPARQL-based triple store and will be tested with the SKG-IF data model;
  • The FAIR assessment tool pyFAT is being extended with guidance fitting OSTrails developments.

These action lines should enable us to get a first version of the pilot going, as soon as the OSTrails Interoperability Frameworks are available.

 

Thematic Pilots, Pilot Interview


OSTrails at PISA 2025: Advancing the Role of Grey Literature in Open Science

OSTrails participated in the PISA 2025 conference with a poster presentation titled “Making Grey Literature FAIR: OSTrails and the Power of Scientific Knowledge Graphs.” The session provided a valuable opportunity to engage with peers and showcase how OSTrails addresses key challenges in managing and integrating grey literature.

 

Key Takeaways from the Event

  • Grey literature is a major part of research output across disciplines.
  • It remains undervalued due to outdated perceptions and low visibility. 
  • Making grey literature FAIR is essential for discoverability and reuse. 
  • Key barriers include lack of metadata standards, PIDs, and infrastructure. 
  • Integration into platforms for Scientific Knowledge Graphs is crucial. 
  • Research assessments must evolve to value diverse outputs. 

 

The Central Message of the Conference

The conference strongly underscored the evolving and increasingly vital role of grey literature (or "grey resources") in contemporary scientific communication. It made clear that the landscape of knowledge dissemination is undergoing a significant transformation, moving beyond the traditional emphasis on peer-reviewed journal articles. Grey literature, including datasets, software, protocols, technical documentation, and project reports, is now recognised as a legitimate and essential component of the research ecosystem.

The event challenged longstanding misconceptions and outdated perceptions surrounding grey literature. Key themes included the necessity for such resources to be available in open access, appropriately networked, and sustainably maintained to ensure long-term value and usability.

Why This Is Important for OSTrails

By utilising the Plan-Track-Assess (PTA) Framework, designed to enable seamless interoperability between research planning, tracking, and assessment services, OSTrails advances, inter alia, the discoverability and reusability of diverse research outcomes, including grey literature produced across the broader research ecosystem by research performing organisations, funders, and infrastructures. One key example of this innovation is the transformation of Data Management Plans (DMPs) from static documents into dynamic, machine-actionable resources that are linked to research outputs and integrated into repositories and Scientific Knowledge Graphs (SKGs).

Beyond DMPs, OSTrails also aims to make a broad range of grey outputs—such as datasets, software, and reports—FAIR and more visible. This is achieved through the use of structured metadata, Persistent Identifiers (PIDs), and integration into SKGs. By working with repositories, catalogues, and institutional databases as entry points into these networks, OSTrails is helping to embed grey literature within the wider research ecosystem.

Participation in this conference provided a valuable platform to demonstrate how OSTrails’ standards-based integration approach can significantly improve the visibility, discoverability, and strategic value of grey resources.

Why This Is Important for Open Science More Broadly

The conference held considerable relevance for the wider Open Science (OS) movement. It emphasised the importance of embracing the full range of research outputs, not just conventional journal publications. The integration of grey literature, including datasets, software, and protocols, is fundamental to the OS principles of transparency, reproducibility, and accessibility.

Ensuring that grey literature is FAIR is key to unlocking the value of large volumes of scientific work that have traditionally remained underutilised or overlooked. The event also spotlighted the role of scientific libraries and other research infrastructure in supporting the broader sharing and preservation of diverse research materials.

Overcoming technical and cultural barriers—such as the lack of standardised metadata, missing PIDs, and poor indexing of grey literature—is crucial to creating an interoperable Open Science environment, especially in the context of initiatives like EOSC.

Importantly, the event drew attention to the need to reform research assessment practices. Grey literature's underrecognition is symptomatic of broader systemic issues, with commercial journal articles continuing to dominate despite the scientific value of other outputs. Valuing the full diversity of scholarly contributions is essential for a more inclusive and impactful research culture.

Insights from the Conference

The impressions from the conference, particularly from the panel session, suggest a community deeply engaged with the inherent value and challenges of grey literature. There was a strong sense that grey literature is vital, especially in specific domains and developing countries, but faces significant hurdles due to historical perceptions (e.g., perceived lack of peer review or lower credibility), lack of standardised practices (metadata, citation), poor discoverability (not in commercial databases), infrastructure limitations (especially concerning digital libraries and OCR), and lack of PIDs. The discussions highlighted that while some grey literature is peer-reviewed, the perception and lack of indexing hinder its recognition. There was a clear call for moving beyond the "grey" label itself, perhaps referring to these as "resources" due to their varied formats.

Claudio Atzori (CNR) presenting OSTrails poster

Notable Reflections from OSTrails

A striking moment during the panel session came when one participant provocatively asked, “How do we protect science from grey literature?”—reflecting concerns about quality control, inconsistent practices, and limited visibility.

Yet, a more constructive response emerged: “How do we protect grey literature from the misinformation we see in today’s society?” This shift in perspective reframes grey literature not as a risk, but as a valuable and vulnerable resource that must be safeguarded through better practices, robust infrastructure, and responsible stewardship.

Conclusions

OSTrails' participation at the PISA 2025 conference confirmed the strong alignment between the project and the broader goals of the research community in advancing the role of grey literature in contemporary scientific communication. The discussions underscored that making grey literature FAIR is not merely a technical challenge, but also a cultural and institutional one. OSTrails actively supports this effort by providing practical tools, standards, and frameworks—including machine-actionable DMPs, integration with SKGs, and consistent FAIR assessments.

The conference provided validation of OSTrails’ approach and strengthened its connection with key stakeholders tackling these challenges. Moving forward, continued efforts in standardisation, infrastructure development, and policy advocacy will be essential to ensuring grey literature achieves its rightful place in the scientific knowledge ecosystem.

—Written by Claudio Atzori (CNR)

Community Event


OSTrails at the 2nd Austrian Library Congress 2025

From 26 to 28 March, 2025, the Austria Center in Vienna hosted the 2nd Austrian Library Congress under the banner “Libraries: democratic – diverse – sustainable”. The event brought together a wide-ranging community of information professionals, researchers, policymakers, and infrastructure providers, highlighting the multifaceted role of libraries in the age of digital transformation. Hot topics included the growing impact of artificial intelligence in the sector, the future of research communication, inclusivity & accessibility, open access, and, of course, Open Science.

OSTrails was proud to contribute to this dynamic exchange of ideas. Represented by Daniel Spichtinger (University of Vienna), OSTrails presented its vision and early implementation steps toward a more integrated and FAIR-aligned research data ecosystem. His talk, "Improving digital research data management: the OSTrails project", was part of a series of forward-looking contributions tackling the transformation of scholarly infrastructure.

Tackling Fragmentation in Research Data Management

At the core of OSTrails is the recognition of the inefficiencies in current research data management (RDM) practices across Europe. While the FAIR principles are widely accepted as the de facto standard for good RDM, practical implementation remains uneven and often siloed. OSTrails aims to address this fragmentation through its Plan-Track-Assess (PTA) framework:

  • Plan: Increase the efficacy of DMPs and reach more researcher-centric, educative, and integrated “machine actionable” DMPs (maDMPs).
  • Track: Establish an open, interoperable and high quality SKG ecosystem of different types of research products, their relationships and metrics for evaluation.
  • Assess: Deliver modular and extendable FAIR tests, to make metrics “machine actionable”, complemented by user guidance.

Pilots: Testing in the Real World

One of OSTrails' strengths lies in its broad and diverse pilot structure. The project encompasses 15 national, 8 thematic, and one Horizon Europe pilot, tailored to the specific needs and infrastructures of their research communities. A survey conducted among the national pilots in the first year of the project showcased their different local requirements, needs and priorities. Consequently, the pilots address a number of different use cases related to the PTA framework.

The Austrian pilot, in which TU Graz, TU Wien and the University of Vienna (including AUSSDA and PHAIDRA) participate, will extend DMP tools, support researchers in the creation of discipline-specific DMPs, and check digital objects against the FAIR principles. These efforts directly align with national needs and policies, while contributing to the broader European Open Science framework envisioned by the European Open Science Cloud (EOSC).

Libraries as Connectors and Enablers

A recurring theme throughout the congress was the centrality of libraries in shaping future research practices—especially around data stewardship, digital literacy, and inclusivity. OSTrails underscores this by highlighting the role libraries play in supporting machine-actionable DMPs, developing community-specific metadata standards, and embedding FAIR assessment tools into everyday workflows.

As institutions that sit at the intersection of research and infrastructure, libraries are uniquely positioned to support—and in many cases lead—the adoption of OSTrails outputs. The project actively collaborates with library-based services and works to lower adoption barriers through open resources and transparent interoperability protocols.

Looking Ahead

Although the OSTrails pilots officially only kick off in July 2025, many have already started preparatory work. As OSTrails continues to scale and refine its technical components, the project remains committed to a collaborative, community-driven approach. Tools and insights developed through the pilots will be made openly available via the OSTrails Commons.

For more information on the project and its tools, visit:

—Written by Dominik Denk (UNIVIE)

Community Event


Highlights from the OSTrails Hackathons in Athens

On 12 March 2025, OSTrails hosted a series of high-impact hackathons in Athens, held just prior to its General Assembly. These full-day events convened developers and domain experts from across scientific clusters and research infrastructure communities to collaboratively advance Data Management Planning (DMP) Platforms, Scientific Knowledge Graphs (SKG), and FAIR Assessment Tools, that are vital for effective research data management and sharing.

The hackathons brought together experts and service providers from well-established research data management platforms, both within the consortium and beyond, to build on the Plan-Track-Assess (PTA) Framework (the Reference Architecture and Pathways presented in project deliverables D1.4 and D1.1), and test key components in practice.

DMP-IF Hackathon: Laying the Groundwork for a Common maDMP API 

The DMP Hackathon gathered developers and service providers, including  platform providers from the consortium and several from outside Europe. The event focused on advancing machine-actionable Data Management Plans (maDMPs) by fostering collaboration around two key objectives: 1) Maintaining the Research Data Alliance DMP Common Standard (RDA DMP CS), 2) Initiating work on a Common maDMP API specification. 

As outlined in the OSTrails Architecture, the DMP-IF aims to meet funder and community needs by extending the RDA DMP CS data model, while enabling real-time communication between systems through the introduction of new Application Programming Interfaces (APIs). The discussions centred around improving date handling, identifier usage, and specification governance, resulting in concrete proposals to be submitted to the RDA Working Group for inclusion in the standard. In parallel, participants reviewed existing APIs and user requirements, drafted a shared API design for maDMPs, and agreed to continue development through close collaboration within the newly established RDA group.
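To make the date-handling and identifier topics more concrete, the following is a minimal sketch of what a machine-actionable DMP record might look like when built in code. The field names (`dmp`, `created`, `modified`, `dmp_id`, `dataset`, `dataset_id`) follow our reading of the RDA DMP Common Standard and should be checked against the published specification. The record is a small subset rather than a complete valid document, the DOIs are placeholders, and `build_madmp` is a hypothetical helper, not part of any OSTrails tool or of the hackathon outcomes.

```python
import json
from datetime import datetime, timezone


def build_madmp(title, dmp_id, created, datasets):
    """Assemble a small maDMP-style record (illustrative subset only).

    `created` must be a timezone-aware datetime; `datasets` is a list of
    (title, identifier) pairs.
    """
    # Normalise to UTC and serialise as ISO 8601, so that every system
    # exchanging the record reads the same instant.
    stamp = created.astimezone(timezone.utc).isoformat(timespec="seconds")
    return {
        "dmp": {
            "title": title,
            "created": stamp,
            "modified": stamp,
            # Identifiers carry an explicit type next to the value.
            "dmp_id": {"identifier": dmp_id, "type": "doi"},
            "dataset": [
                {
                    "title": ds_title,
                    "dataset_id": {"identifier": ds_id, "type": "doi"},
                }
                for ds_title, ds_id in datasets
            ],
        }
    }


if __name__ == "__main__":
    record = build_madmp(
        "Example plan",
        "https://doi.org/10.1234/example-dmp",  # placeholder DOI
        datetime(2025, 3, 12, 9, 0, tzinfo=timezone.utc),
        [("Speech corpus", "https://doi.org/10.1234/example-data")],
    )
    print(json.dumps(record, indent=2))
```

Keeping timestamps in one normalised format and pairing each identifier with its type is what lets another system, such as a repository or an SKG, link the plan to its datasets without manual interpretation.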

Developers exchanging ideas at the DMP IF Hackathon

SKG-IF Hackathon: Advancing Interoperability for Scientific Knowledge Graphs 

The SKG-IF Hackathon brought together providers of SKG services and infrastructure onboarded in OSTrails to advance interoperability through hands-on experimentation with the SKG-IF OpenAPI specification and metadata model. The session focused on mapping institutional data to the SKG-IF model and exploring its capacity to accommodate diverse research outputs.

Building on the RDA SKG-IF core data model, OSTrails aims to enhance the framework with a flexible extension mechanism to support domain-specific entities, such as instruments and provenance information, while supporting seamless discovery and integration of graph data with the introduction of a new API. The hackathon focused on validating the SKG-IF model through real-world data mapping and collected detailed feedback via the GitHub issue tracker. Major discussion topics were: 1) Gaps in the SKG-IF data model, such as missing fields, insufficient documentation, and the need for greater extensibility; 2) Issues in the OpenAPI specification, including unclear documentation and undefined field requirements.

The model’s scope was also extended to support a broader range of research products by introducing new product types aligned with community needs: literature, research data, and research software. Hackathon contributions were fed directly into the roadmap of the RDA SKG-IF Working Group, supporting the finalisation of the specification. 

 Hands-on collaboration during the SKG-IF Hackathon

FAIR-IF Hackathon: Making FAIR Assessments More Interoperable 

Running in parallel, the FAIR-IF Hackathon brought together developers from various FAIR-related tools, including those onboarded in the project as well as several external platforms. The focus was on aligning assessment services and harmonising API formats to improve interoperability across FAIR tools.

In the first part of the hackathon, participants discussed how key components of the FAIR-IF, such as benchmarks, could help ensure consistent outcomes with minimal or no manual curation. They also emphasised the need to harmonise APIs through standards such as OpenAPI and highlighted transparency and record provenance as essential for trust and reproducibility.

The second part of the hackathon was hands-on, building on the earlier discussions. It focused on API functionality and tool alignment, showing that the proposed common API structure provided a solid foundation for implementation, integration, and mapping across existing tools—advancing interoperability within the FAIR-IF ecosystem.

Insights from the FAIR-IF Hackathon

Looking Ahead 

These hackathons not only advanced technical developments but also reaffirmed OSTrails' commitment to open collaboration across diverse research domains and settings. By bringing together experts from across Europe and beyond, OSTrails is setting the stage for truly interoperable research infrastructures.

Learn more about the OSTrails Architecture and Interoperability Frameworks by exploring our blog, reading the documentation, and watching the Interoperability Webinar Series.

Stay tuned for upcoming events and learn more about future OSTrails hackathons by visiting our page: OSTrails Hackathons.

-Written by Tassos Stavropoulos (OpenAIRE)

OSTrails Event, Hackathon


OSTrails First Public Webinar: Checked!

On 15 April 2024, OSTrails hosted its first public webinar, bringing together over 100 participants from across the research community. The session introduced the project’s goals, early results, and ways to get involved in shaping how research planning, tracking, and assessment can be improved.

"This first webinar was an important milestone for us. After months of work, we were finally able to share early results and open the door for others to get involved."
— Elli Papadopoulou, Athena Research Center / OSTrails Deputy Coordinator

Plan-Track-Assess (PTA) Framework: Addressing the Silos in Research Data Management

Research today relies on many separate systems. The same information is often re-entered in different tools, and outputs are difficult to follow or assess. OSTrails addresses these issues by connecting workflows, reducing duplication, and supporting reuse and visibility of research outputs.

The project builds a unified framework that:

  • Assists researchers in reducing repetitive tasks and improving data management.
  • Supports institutions in ensuring compliance and facilitating data reuse.
  • Enables funders and policymakers to obtain consistent and reliable metrics on FAIR and Open Science practices.

Elli Papadopoulou presenting the OSTrails toolkit.

Two Pilots, One Message: This Works!

Two pilot initiatives were featured during the webinar to show what the PTA Framework looks like in action for different organisations:

  • In the Dutch National Pilot, led by CWTS and SURF, institutions are working with dynamic Data Management Plans (DMPs) embedded in their systems, cutting repetition and improving coordination across teams. The pilot highlights the Netherlands’ diverse and decentralised research data management (RDM) landscape, and the need for machine-actionable (ma)DMP tooling that meets both broad Open Science goals and basic local administrative needs. The pilot focuses on aligning stakeholder interests at national and institutional levels—supporting researchers with domain-specific templates and administrators with integrated workflows that include RDM, privacy, and ethics reviews. It also explores different technical pathways to publish and connect maDMPs with other research outputs, including the use of Research Activity Identifiers (RAiDs) and links to local repositories or Zenodo.

Andrew Hoffman (CWTS) showcasing the Dutch National Pilot.

  • In Photon and Neutron Science, researchers at ESRF are combining DMPs, metadata services, and FAIR assessment tools to better describe and evaluate their datasets.

Assessing and sharing datasets from ESFRIs using the PTA Framework, as presented by Renaud Duyme.

Those are only two examples of the twenty-four use cases through which OSTrails is testing and adapting the PTA Framework to streamline and automate processes.

From Design to Adoption: Supporting the People Who Make It Happen

The webinar also launched the OSTrails Mentorship Programme, which provides support for those already working on improving research workflows by helping them apply OSTrails tools in practice: IT staff, research support professionals, and policy officers.

Insights into the OSTrails training roadmap by Pedro Principe (UMinho).

Community Discussions

The discussion during the webinar showed strong interest in the work OSTrails is doing. Participants highlighted shared challenges like scattered tools, manual processes, and inconsistent planning, and welcomed the focus on making systems work better together. The event opened space for future collaboration and real-world application.

There were also many questions about how to get involved, especially through the OSTrails Mentorship Programme. The team shared resources on training opportunities, the mentorship call for mentees, and other upcoming events. As one participant put it:

“This is a great initiative—thank you for this mentoring programme.”

Check out the full webinar here.

Webinar, OSTrails Event
