Postdoctoral position OT-17811
Postdoc on ontology mapping: everything other than matching ontologies
Montpellier
INRAE presentation
University of Montpellier (UM) is a French research-intensive university covering most science and technology fields. It has around 43,000 students and 4,618 staff members. The Laboratory of Informatics, Robotics, and Microelectronics of Montpellier (LIRMM) is a 350-person cross-faculty research entity of the UM and the National Center for Scientific Research (CNRS) working on AI, knowledge engineering, bioinformatics, integrated, mobile and communicating systems, algorithms, human-machine interaction, robotics, databases, distributed systems and more. Several research groups have leading expertise in knowledge engineering, web, text mining and ontologies.
The French National Research Institute for Agriculture, Food, and Environment (INRAE) is a major player in research and innovation. It is a community of 12,000 people with 272 research, experimental research, and support units located in 18 regional centres throughout France. Internationally, INRAE is among the top research organisations in the agricultural and food sciences, plant and animal sciences, as well as in ecology and environmental science.
Work environment, missions and activities
Standard vocabularies and ontologies, or more broadly semantic artefacts, are key elements to achieve data interoperability. With the growing number of semantic artefacts and their increasingly diverse uses, the problems of interconnecting these artefacts are becoming more acute. As with any other type of data, we need a strategy to manage mappings and ensure they are available, following the FAIR principles, in relevant mapping repositories where they can be curated, integrated and served for reuse. Recent initiatives such as the SEMAF study, the SSSOM format and the activities within the FAIR-IMPACT project have revealed an enormous need for mapping and crosswalk engineering that is not currently addressed and that goes well beyond automatically matching ontologies (e.g., OAEI).
At University of Montpellier (LIRMM) and INRAE (MISTEA), we develop ontology repositories whose goal is in part to host and serve multiple semantic resources as well as the mappings between them. We collaborate with the OntoPortal Alliance to synchronize our efforts and share technology development. We develop and maintain AgroPortal, a reference repository for semantic artefacts in agri-food. We are offering a postdoc or researcher position to study all aspects of ontology alignment: you will design and specify a state-of-the-art framework for mapping extraction, generation, validation, evaluation, storage and retrieval by adopting a complete semantic web and linked open data approach and engaging the community for curation.
Semantic artefacts (a broader term that includes ontologies, terminologies, taxonomies, thesauri, vocabularies, metadata schemas and standards) are both heterogeneous and overlapping in terms of coverage, as they are mostly designed independently, by different developers, and following diverse modeling principles and patterns. To achieve interoperability and integration, one solution is to identify or generate mappings, or correspondences (the term crosswalk is used for metadata schemas), between different artefacts of the same domain or used to represent the same type of information. This process is known in the semantic web domain as ontology matching or ontology alignment. Building algorithms to identify these mappings is itself a scientific challenge. Surprisingly, there seems to be a gap between the state-of-the-art results obtained in automatically generating mappings at each edition of the Ontology Alignment Evaluation Initiative (OAEI, http://oaei.ontologymatching.org) and the day-to-day reality of ontology developers. Tools are often hard to reuse, and results cannot be easily reproduced outside of the benchmarking effort; existing mappings are not uniformly described or not shared or available; mapping quality and provenance are always in doubt; multiple mappings cause conflicts. Furthermore, tools to map semantic artefacts vary widely in their functionality. Before the SSSOM initiative, which is still in its infancy, there was no recognized standard way to represent mappings (with provenance information) and no shared repository to merge, store and retrieve them.
We have identified several important aspects related to mappings when building a mapping repository. Indeed, ontology repositories shall include mapping repositories and support the representation, extraction, harvesting, generation, validation, merging, evaluation, visualization, analysis, storage and retrieval of mappings between the ontologies they host and other ones.
Our work will be to design and specify a mapping repository that supports all these aspects. Our current ontology repositories already offer some mapping capabilities but hardly support the whole lifecycle needed to make mappings FAIR objects and ease their reuse. In doing so, we will build on the SEMAF proposal (https://doi.org/10.5281/zenodo.4651421) for a flexible semantic mapping framework, possibly by making the mapping repositories inside the semantic artefact catalogues “SEMAF-compliant”, and we will ensure compliance with the SSSOM format and tooling (https://mapping-commons.github.io/sssom).
We will design our framework within AgroPortal, an ontology repository for agri-food (http://agroportal.lirmm.fr). The main objective of the AgroPortal project is to develop and support a reference ontology repository for agri-food. It offers the community a robust and reliable advanced prototype service that features ontology hosting, search, versioning, visualization and commenting, services for semantically annotating data with the ontologies, as well as storing and exploiting ontology mappings, all within a semantic web compliant infrastructure. This repository is based on the OntoPortal technology originally developed by BMIR at Stanford University for the NCBO BioPortal (https://bioportal.bioontology.org), the reference and most comprehensive repository of biomedical ontologies and terminologies. Several research organizations have now joined the OntoPortal Alliance (https://ontoportal.org) to co-develop, maintain and disseminate a shared, generic technology to develop ontology repositories. Please refer to our GitHub repositories and documentation:
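To give candidates a concrete idea of what a mapping with provenance can look like, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the design of the future framework: the column names follow our reading of the SSSOM slots (subject_id, predicate_id, object_id, mapping_justification, confidence, author_id), the identifiers EX1:..., EX2:... and the ORCID are made up, and the helper names to_tsv and find_conflicts are hypothetical.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)
class Mapping:
    """One mapping record; field names are assumed to mirror SSSOM slots."""
    subject_id: str
    predicate_id: str
    object_id: str
    mapping_justification: str
    confidence: float
    author_id: str = ""


def to_tsv(mappings):
    """Serialize mappings as a tab-separated table with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=[f.name for f in fields(Mapping)],
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    for m in mappings:
        writer.writerow(asdict(m))
    return buf.getvalue()


def find_conflicts(mappings):
    """Return the (subject, object) pairs asserted with more than one predicate."""
    seen = {}
    for m in mappings:
        seen.setdefault((m.subject_id, m.object_id), set()).add(m.predicate_id)
    return {pair: preds for pair, preds in seen.items() if len(preds) > 1}


# Hypothetical sample data: all identifiers below are invented for illustration.
SAMPLE = [
    Mapping("EX1:0001", "skos:exactMatch", "EX2:0042",
            "semapv:LexicalMatching", 0.9, "orcid:0000-0000-0000-0000"),
    Mapping("EX1:0001", "skos:broadMatch", "EX2:0042",
            "semapv:ManualMappingCuration", 0.7),
    Mapping("EX1:0002", "skos:closeMatch", "EX2:0050",
            "semapv:LexicalMatching", 0.6),
]

if __name__ == "__main__":
    print(to_tsv(SAMPLE))
    print(find_conflicts(SAMPLE))
```

In this sketch, the first two sample records relate the same pair of entities with two different predicates, which is the kind of conflict a mapping repository would have to surface for curation.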
- AgroPortal: https://github.com/agroportal
- OntoPortal: https://ontoportal.github.io/documentation
The postdoc mission will be to:
- Design and specify an ontology alignment framework to represent, extract, harvest, generate, validate, merge, evaluate, visualize, analyze, store and retrieve semantic artefact mappings.
- Supervise the implementation (working with a developer) of the designed framework.
- Consolidate the content of AgroPortal mapping repository by aligning ontologies to other relevant ones and to a selected hub thesaurus. Release mappings as linked open data within the mapping repository.
- Evaluate and publish the results.
- Participate in the SSSOM initiative: discussion, specification, compliant tooling, etc.
- Work with partners and users on generating and curating mappings thanks to the framework developed.
Training and skills
We are looking for a motivated postdoc or experienced researcher. You will work with a group (3-4 people) at LIRMM & MISTEA in both a national and international context. The candidate must hold a PhD in Informatics / Computer science and must have experience in the semantic web area and in using ontologies. The candidate should match most of the following criteria:
- High motivation for scientific research.
- Experience with semantic web technologies, especially JSON-LD/RDF/OWL/SKOS/SPARQL.
- Data science and management expertise (open data, FAIR principles).
- Excellent technical skills to conduct experiments.
- Knowledge of ontology mapping / alignment issues and tools (e.g., OAEI).
- Possible experience in the agri-food domain.
- Excellent remote working capabilities (emails, trackers, collaborative tools, etc.).
- Excellent aptitude to work with others and engage external users.
- Excellent writing skills and publication motivation.
- Excellent spoken and written English (everything will be done in English).
- Basic knowledge of French with the objective to learn the language during the contract.
- Willingness to travel internationally (collaboration with Stanford and in Europe).
- Autonomy and initiative, with the ability to make decisions (management, functional, technical) and justify choices.
- Friendly person to join a small research team in Montpellier.
Quality of life at INRAE
By joining our teams, you benefit from (depending on the type of contract and its duration):
- up to 30 days of annual leave + 15 days of "Reduction of Working Time" (for a full-time position);
- parenting support: CESU childcare, leisure services;
- skills development systems: training, career advice;
- social support: advice and listening, social assistance and loans;
- holiday and leisure services: holiday vouchers, accommodation at preferential rates;
- sports and cultural activities;
- collective catering.
How to apply
Applications for this position must be submitted solely via this form: https://forms.gle/BhENJNeayRQX1LuXA
Remote and face-to-face interviews will be organized.
Documents (PDF) required are:
- a curriculum vitae describing your education and experience;
- a motivation letter describing your interest in the position and how you match the expected profile;
- copies of your highest diploma and/or other relevant certificates and/or school grades;
- names and contact details of referees.