LLMs for specifying data sharing policies (M/F)

Job details

General information

Parent organization

The CEA is a major player in research, serving citizens, the economy and the State.

It provides concrete solutions to their needs in four main areas: energy transition, digital transition, technologies for the medicine of the future, and defense and security, built on a foundation of fundamental research. For more than 75 years, the CEA has been committed to the scientific, technological and industrial sovereignty of France and Europe, for a better-controlled and safer present and future.

Located at the heart of regions equipped with very large research infrastructures, the CEA has a wide range of academic and industrial partners in France, in Europe and internationally.

The CEA's 20,000 employees share three fundamental values:

• A sense of responsibility
• Cooperation
• Curiosity
  

Reference

2024-33440  

Position description

Field

Engineering sciences

Contract

Fixed-term contract (CDD)

Job title

LLMs for specifying data sharing policies (M/F)

Position status

Cadre (executive/managerial status)

Contract duration (in months)

18

Job description

Developing physical or digital systems is a complex process involving both technical and human challenges. The first step is to give shape to ideas by drafting specifications for the system to come. Usually written in natural language by business analysts, these documents are the key that binds all stakeholders for the duration of the project, making it easier to share and understand what needs to be done. Requirements engineering proposes various techniques (reviews, modeling, formalization, etc.) to regulate this process and improve the quality (consistency, completeness, etc.) of the documents produced, with the aim of detecting and correcting defects before the system is implemented.

 

In the field of requirements engineering, the recent arrival of large language models (LLMs) has the potential to be a “game changer”. We propose to support the analyst by working on specifications for data sharing. The idea is to model data sharing policies in ODRL (Open Digital Rights Language) from natural-language text. The tool will exploit a transformer-based LLM (such as ChatGPT or Llama) combined with rigorous analysis and advisory methods. It will propose options for rewriting requirements in controlled languages inspired by the INCOSE or EARS standards, analyze the results produced by the LLM, and provide an audit of the quality of the resulting model.
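
To make the target representation concrete, here is a minimal Python sketch of what an ODRL policy derived from a sentence such as "The hospital may use the dataset until 1 January 2025" could look like in JSON-LD, together with a basic completeness check. The `example.com` identifiers, the function names and the completeness heuristic are illustrative assumptions, not part of the project; only the ODRL terms (`Set`, `permission`, `target`, `action`, `assignee`, `constraint`, `dateTime`, `lt`) come from the W3C ODRL vocabulary.

```python
import json

ODRL_CONTEXT = "http://www.w3.org/ns/odrl.jsonld"


def make_policy(uid, target, action, assignee, not_after):
    """Build a minimal ODRL Set policy with one time-limited permission."""
    return {
        "@context": ODRL_CONTEXT,
        "@type": "Set",
        "uid": uid,
        "permission": [{
            "target": target,
            "action": action,
            "assignee": assignee,
            # The permission only holds before the given date.
            "constraint": [{
                "leftOperand": "dateTime",
                "operator": "lt",
                "rightOperand": {"@value": not_after, "@type": "xsd:date"},
            }],
        }],
    }


def missing_fields(policy):
    """Hypothetical, simplified audit: list obvious gaps in a policy."""
    problems = []
    if "uid" not in policy:
        problems.append("policy lacks 'uid'")
    rules = (policy.get("permission", [])
             + policy.get("prohibition", [])
             + policy.get("obligation", []))
    if not rules:
        problems.append("policy has no rule")
    for i, rule in enumerate(rules):
        for key in ("target", "action"):
            if not rule.get(key):
                problems.append(f"rule {i} lacks '{key}'")
    return problems


policy = make_policy(
    uid="http://example.com/policy/1",          # illustrative identifier
    target="http://example.com/dataset/42",     # illustrative identifier
    action="use",
    assignee="http://example.com/party/hospital",
    not_after="2025-01-01",
)
print(json.dumps(policy, indent=2))
print(missing_fields(policy))
```

A check of this kind could be run on every policy an LLM produces, so that structurally incomplete outputs are rejected before any deeper semantic audit.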

 

More specifically, LLMs are particularly promising for the following uses:

  • Automatically transforming unstructured requirements into requirements formatted in structured models such as EARS or user stories.
  • Classifying requirements: behavioral, non-functional, etc.
  • Flagging ambiguities, inconsistencies or potential violations on the basis of predefined validation heuristics.
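
As an illustration of the kind of predefined validation heuristic mentioned above, the following Python sketch recognizes the five basic EARS sentence templates by their leading keyword and flags a few vague terms. The regular expressions are a deliberate simplification (real EARS also allows combined patterns), and the list of vague terms is an assumption chosen for the example, not an established checklist.

```python
import re

# Basic EARS templates, checked in order (simplified keyword matching).
EARS_PATTERNS = [
    ("unwanted behavior", re.compile(r"^if\b.+\bthen\b.+\bshall\b", re.I)),
    ("event-driven", re.compile(r"^when\b.+\bshall\b", re.I)),
    ("state-driven", re.compile(r"^while\b.+\bshall\b", re.I)),
    ("optional feature", re.compile(r"^where\b.+\bshall\b", re.I)),
    ("ubiquitous", re.compile(r"^the\b.+\bshall\b", re.I)),
]

# Illustrative list of vague wording; a real tool would use a curated lexicon.
VAGUE_TERMS = {"appropriate", "adequate", "as needed", "fast",
               "user-friendly", "if possible", "etc"}


def ears_pattern(requirement):
    """Return the name of the matching EARS template, or None."""
    text = requirement.strip()
    for name, pattern in EARS_PATTERNS:
        if pattern.search(text):
            return name
    return None


def vague_terms(requirement):
    """Return the vague terms found in the requirement, sorted."""
    lowered = requirement.lower()
    return sorted(
        term for term in VAGUE_TERMS
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    )


examples = [
    "When consent is withdrawn, the platform shall delete the shared dataset.",
    "The platform shall log every data access.",
    "Data should be shared quickly with partners.",
]
for req in examples:
    print(ears_pattern(req), vague_terms(req), "|", req)
```

Requirements for which `ears_pattern` returns `None` are natural candidates for an LLM-assisted rewrite, and the same checker can then be applied to the rewritten text to verify that it actually conforms.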

LLMs also have limitations that need to be taken into account in the context of requirements engineering: hallucination, non-determinism, algorithmic biases and limited generalization.

 

As part of the laboratory's “Intelligent Requirements” team, the candidate will:

  • Determine schemas or a controlled language to represent the ODRL model.
  • Determine the effectiveness of different techniques and formalisms, such as NLP or BLEU-inspired metrics, to avoid hallucinations during rewriting.
  • Analyze, manage or generate training data for LLMs.
  • Configure and pilot one or more LLMs using the most effective techniques for improving the consistency and completeness of data-sharing policies.
  • Develop the software tools required for the above tasks.
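
One way to picture the BLEU-inspired check mentioned in the tasks above is to measure how much of a rewritten requirement is actually supported by the original text. The Python sketch below computes a clipped n-gram precision (the "modified precision" component of BLEU, with a single reference and no brevity penalty) and lists tokens of the rewrite that do not appear in the source. It is a simplified illustration under these assumptions, not the full BLEU metric and not the method chosen by the team; the function names and the `allowed` set of template words are hypothetical.

```python
import re
from collections import Counter


def tokens(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(seq, n):
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def ngram_precision(source, rewrite, n=1):
    """Share of the rewrite's n-grams also found in the source (clipped counts)."""
    cand = Counter(ngrams(tokens(rewrite), n))
    ref = Counter(ngrams(tokens(source), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


def unsupported_tokens(source, rewrite, allowed=frozenset()):
    """Tokens of the rewrite that are neither in the source nor explicitly allowed."""
    src = set(tokens(source))
    return [t for t in tokens(rewrite) if t not in src and t not in allowed]


source = "Partners can read the sensor data for research until the end of 2025."
rewrite = ("The partner shall be permitted to read the sensor data "
           "for research until 2025 and to sell it.")
# Words introduced by the controlled-language template rather than by the LLM.
template_words = frozenset({"shall", "be", "permitted", "to", "partner", "and", "it"})

print(ngram_precision(source, rewrite, n=1))
print(unsupported_tokens(source, rewrite, allowed=template_words))  # ['sell']
```

Here the rewrite introduces a permission to sell that the source never granted, and the token-level check surfaces it. A purely lexical score of this kind misses paraphrases and negations, so it would only be a first filter ahead of semantic analysis.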

Candidate profile

Doctorate

Knowledge of Java, Python, Eclipse EMF, Node.js, React

Job location

Site

Saclay

Location

France, Ile-de-France, Hauts-de-Seine (92)

City

91120 PALAISEAU

Candidate criteria

Languages

English (fluent)

Recommended education

PhD

Requester

Position available from

07/10/2024