Multimodal medical vision-language foundation model for healthcare reasoning

Palaiseau

Institut Polytechnique de Paris Télécom Paris

Medical

Published on April 19

Job description

Topic description

This PhD project aims to construct a large-scale, longitudinal, multimodal dataset enriched with strong grounding signals, and to develop a compact yet scalable medical vision-language model (VLM) whose internal structure aligns closely with physician workflows.
The research will be organized around two tightly coupled thrusts. The first focuses on dataset construction, involving the collection and harmonization of de-identified Vietnamese hospital data across X-ray, CT, PET, MRI, and clinical reports, complemented by carefully curated public datasets. The second focuses on methodology, starting from clinically competitive, moderate-size backbone models in the spirit of LLaVA-Med, and decomposing the system into interactive expert modules for retrieval, localization, segmentation, quantification, masking, gating, verification, and generation.
The expected outcome is a clinically grounded research framework capable of supporting report generation, medical visual question answering (VQA), localization, interpretation, and decision support. Crucially, this framework provides a realistic pathway from compact, domain-specific modeling toward larger multimodal healthcare reasoning systems, ensuring both practical applicability and clinical relevance throughout the course of the PhD.

Thesis start date: 01/10/

Funding category

Other public funding

Funding further details

IPP or member-school competition; E4H doctoral contract; Hi!Paris doctoral contract; ANR AI half-grant
