Internship - Trustworthy Deep Learning: VLM-based (CLIP) OoD Detection M/F

Vacancy details

General information

Organisation

The French Alternative Energies and Atomic Energy Commission (CEA) is a key player in research, development and innovation in four main areas:
• defence and security,
• nuclear energy (fission and fusion),
• technological research for industry,
• fundamental research in the physical sciences and life sciences.

Drawing on its widely acknowledged expertise, and thanks to its 16,000 technicians, engineers, researchers and staff, the CEA actively participates in collaborative projects with a large number of academic and industrial partners.

The CEA is established in ten centers spread throughout France.
  

Reference

2024-34494  

Position description

Category

Information system

Contract

Internship

Job title

Internship - Trustworthy Deep Learning: VLM-based (CLIP) OoD Detection M/F

Subject

CLIP-based OoD Detection with Post-hoc Methods

Contract duration (months)

6

Job description

Context
The LIST Institute at CEA Tech (CEA’s technological research division) dedicates its activities to driving innovation in intelligent digital systems. Its specialized R&D programs pursue technological developments of excellence in critical industry sectors, in partnership with key industrial and academic actors.

Within the LIST Institute, at the heart of the Paris-Saclay Campus (Essonne), the Embedded and Autonomous Systems Design Laboratory (LSEA) works on methods and tools for the design & development of trustworthy autonomous systems that incorporate AI-based components. In particular, the LSEA’s Trustworthy Deep Learning (TDL) team conducts research on confidence (uncertainty) representation and monitoring in deep neural networks (DNNs) for computer vision tasks and automated robots.

Mission

The detection of out-of-distribution (OoD) samples is crucial for deploying machine learning (ML) models in real-world scenarios. OoD samples pose a challenge to ML models because they are not represented in the training data and can naturally arise during deployment (i.e., a distribution shift), increasing the risk of wrong predictions. Consequently, OoD detection is essential in safety-critical domains, such as healthcare or automated vehicles, where trustworthy models are required.

To address the OoD detection task, previous works have focused on proposing post-hoc confidence scores for fully supervised settings using a single data modality (e.g., images and image classification tasks). However, the advent of vision-language models (VLMs), represented by CLIP, has accelerated the field of computer vision and enabled zero-shot and few-shot learning schemes for different tasks. In this regard, a new paradigm has emerged in which CLIP is used for OoD detection with confidence scores that leverage visual features and textual concepts, leaving existing post-hoc confidence scores applicable mainly to situations where CLIP is fine-tuned with more data.
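
As a rough illustration of the two families of scores mentioned above, the sketch below computes two widely used post-hoc scores from classifier logits (maximum softmax probability and an energy-style log-sum-exp score) and one CLIP-style score based on image-text cosine similarities. It is a minimal sketch in plain Python, not the team's method: the function names, the temperature value and the toy embeddings are all assumptions made for illustration, and a real pipeline would operate on PyTorch tensors produced by a CLIP model.

```python
import math

def softmax(values, temperature=1.0):
    # Numerically stable softmax over a list of floats.
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def msp_score(logits):
    # Maximum softmax probability: higher means "more in-distribution".
    return max(softmax(logits))

def energy_score(logits, temperature=1.0):
    # T * logsumexp(logits / T): higher means "more in-distribution".
    scaled = [v / temperature for v in logits]
    peak = max(scaled)
    return temperature * (peak + math.log(sum(math.exp(v - peak) for v in scaled)))

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def clip_style_score(image_emb, text_embs, temperature=0.01):
    # Hypothetical zero-shot score: softmax over the cosine similarities
    # between one image embedding and one text embedding per class concept.
    # The temperature here is an assumed value, not taken from the posting.
    sims = [cosine(image_emb, t) for t in text_embs]
    return max(softmax(sims, temperature))

# Toy usage: an image aligned with one concept scores higher than an
# image that is equally similar to both concepts.
concepts = [[1.0, 0.0], [0.0, 1.0]]
confident = clip_style_score([1.0, 0.0], concepts)
ambiguous = clip_style_score([1.0, 1.0], concepts)
```

In all three cases a sample would be flagged as OoD when its score falls below a threshold chosen on held-out in-distribution data.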

Interestingly, recent works have shown that fine-tuning CLIP tends to improve classification accuracy but does not necessarily improve OoD detection performance when using post-hoc methods. A plausible hypothesis for this effect is that fine-tuning procedures may destroy CLIP’s rich vision-language representations. Therefore, in this internship, we seek to explore strategies to increase CLIP’s robustness under fine-tuning so that existing and new post-hoc confidence measures can detect OoD samples without a decrease in detection performance.

Internship Objectives

  • Study state-of-the-art methods for fine-tuning CLIP and augmenting its robustness.
  • Evaluate the performance of post-hoc confidence scores for OoD detection on fine-tuned CLIP models that employ robustness augmentation methods.
  • Extract/inspect internal visual-language features of CLIP, and design a CLIP-based post-hoc confidence score for OoD detection.
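
For the evaluation objective above, one common way to summarize how well a confidence score separates in-distribution (ID) from OoD samples is the AUROC. The sketch below is only an illustration under assumed conventions (higher score means more in-distribution, ties count as half); the function name and the toy scores are hypothetical, and the posting does not specify which metrics the team uses.

```python
def auroc(id_scores, ood_scores):
    # Probability that a randomly chosen ID sample receives a higher
    # confidence score than a randomly chosen OoD sample (ties count 0.5).
    # Quadratic-time pairwise version, adequate for small illustrations.
    wins = 0.0
    for s_id in id_scores:
        for s_ood in ood_scores:
            if s_id > s_ood:
                wins += 1.0
            elif s_id == s_ood:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))

# Toy usage with made-up confidence scores.
perfect = auroc([0.9, 0.8], [0.1, 0.2])   # every ID score beats every OoD score
partial = auroc([0.9, 0.3], [0.5, 0.1])   # one of four pairs is misordered
```

An AUROC of 1.0 means the score ranks every ID sample above every OoD sample, while 0.5 corresponds to chance-level separation.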

 

Methods / Means

Python, PyTorch, VLMs-CLIP

Applicant Profile

What do we expect from you?

  • You are a second-year Master's student (M2 – France).
  • You are proficient in Python and PyTorch.
  • You have computer vision and deep learning skills, including VLMs (CLIP).

 

In line with CEA's commitment to integrating people with disabilities, this job is open to all.

Position location

Site

Other

Job location

France, Ile-de-France, Essonne (91)

Location

Palaiseau

Candidate criteria

Languages

English (Fluent)

Prepared diploma

Bac+5 - Master 2

Recommended training

Computer Science

PhD opportunity

Yes

Requester

Position start date

29/11/2024