Fabien Baradel

Senior Scientist & Team Lead
3D Humans, NAVER LABS Europe

I lead a research team advancing human-centric 3D vision to enable machines to perceive, understand and simulate humans in real-world environments.

Publications

	Multi-HMR 2: Multi-Person Camera-Centric Human Detection, Mesh Recovery and Tracking Guénolé Fiche, Philippe Weinzaepfel, Romain Brégier, Fabien Baradel ECCV, 2026 Multi-HMR 2 detects humans and recovers their 3D meshes, placed in the scene, along with camera parameters. It also outputs per-human features that allow online tracking in videos, despite being trained only on still images.
	Anny-Fit: All-Age Human Mesh Recovery Laura Bravo Sánchez, Matthieu Armando, Romain Brégier, Grégory Rogez, Serena Yeung-Levy, Fabien Baradel CVPR Findings Track, 2026 An optimization-based framework for recovering multi-person 3D human meshes of all ages directly in camera space by integrating semantic, depth, keypoint, and VLM-based expert cues.
	Human Mesh Modeling for Anny Body Romain Brégier, Guénolé Fiche, Laura Bravo Sánchez, Thomas Lucas, Matthieu Armando, Philippe Weinzaepfel, Grégory Rogez, Fabien Baradel ECCV, 2026 Anny is a differentiable, open-source human body model under the Apache 2.0 license, spanning the full lifespan from babies to elders.
	CondiMen: Conditional Multi-Person Mesh Recovery Romain Brégier, Fabien Baradel, Thomas Lucas, Salma Galaaoui, Matthieu Armando, Philippe Weinzaepfel, Grégory Rogez CVPR'W, 2025 A Bayesian method for multi-person human mesh recovery, modeling ambiguities in 3D pose and shape while enabling uncertainty handling and multi-view integration.
	Multi-HMR: Multi-Person Whole-Body Human Mesh Recovery in a Single Shot Fabien Baradel, Matthieu Armando, Salma Galaaoui, Romain Brégier, Philippe Weinzaepfel, Grégory Rogez, Thomas Lucas ECCV, 2024 A simple single-shot model for multi-person 3D human mesh recovery from a single RGB image, leveraging a vision transformer. We won the ROBIN challenge @CVPR'24 with Multi-HMR.
	Cross-view and Cross-pose Completion for 3D Human Understanding Matthieu Armando, Salma Galaaoui, Fabien Baradel, Thomas Lucas, Vincent Leroy, Romain Brégier, Philippe Weinzaepfel, Grégory Rogez CVPR, 2024 A self-supervised pre-training strategy for human-centric 3D vision.
	Purposer: Putting Human Motion Generation in Context Nicolas Ugrinovic, Thomas Lucas, Fabien Baradel, Philippe Weinzaepfel, Grégory Rogez, Francesc Moreno-Noguer 3DV, 2024 A method able to generate realistic-looking motions that interact with virtual scenes.
	SHOWMe: Benchmarking Object-agnostic Hand-Object 3D Reconstruction Anilkumar Swamy, Vincent Leroy, Philippe Weinzaepfel, Fabien Baradel, Salma Galaaoui, Romain Brégier, Matthieu Armando, Jean-Sebastien Franco, Grégory Rogez ACVR workshop ICCV, 2023 A new dataset of high-quality textured meshes of a hand holding an object.
	PoseGPT: Quantization-based 3D Human Motion Generation and Forecasting Thomas Lucas, Fabien Baradel, Philippe Weinzaepfel, Grégory Rogez ECCV, 2022 PoseGPT generates a human motion conditioned on an action label, a duration and, optionally, an observed past human motion. We learn to quantize human motion into a discrete latent space and train a GPT-like model to sequentially predict the next discrete latent indices.
	PoseBERT: A Generic Transformer Module for Temporal 3D Human Modeling Fabien Baradel, Romain Brégier, Thibault Groueix, Philippe Weinzaepfel, Yannis Kalantidis, Grégory Rogez TPAMI, 2022 We propose a generic transformer model for temporal modeling of human and hand shape. We apply this model to different tasks such as pose estimation and future pose prediction. PoseBERT is able to denoise and interpolate, which is very important for deploying pose estimation in the wild.
	Filtered-CoPhy: Unsupervised Learning of Counterfactual Physics in Pixel Space Steeven Janny, Fabien Baradel, Natalia Neverova, Madiha Nadri, Greg Mori, Christian Wolf ICLR, 2022 We propose a model, learned in an unsupervised manner, that is able to perform counterfactual predictions in pixel space.
	Leveraging MoCap Data for Human Mesh Recovery Fabien Baradel, Thibault Groueix, Philippe Weinzaepfel, Romain Brégier, Yannis Kalantidis, Grégory Rogez 3DV, 2021 We show that MoCap data can be used to improve image-based and video-based human mesh recovery methods. We propose a video-based transformer model called PoseBERT, which is trained on synthetic data only.
	CoPhy: Counterfactual Learning of Physical Dynamics Fabien Baradel, Natalia Neverova, Julien Mille, Greg Mori, Christian Wolf ICLR, 2020 (Spotlight presentation) We introduce a new problem of counterfactual learning of object mechanics from visual input and a benchmark called CoPhy.
	Learning Video Representations using Contrastive Bidirectional Transformer Chen Sun, Fabien Baradel, Kevin Murphy, Cordelia Schmid arXiv preprint, 2019 Self-supervised video representation by leveraging ASR and long videos via noise contrastive estimation.
	Object Level Visual Reasoning in Videos Fabien Baradel, Natalia Neverova, Christian Wolf, Julien Mille, Greg Mori ECCV, 2018 A model capable of learning to reason about semantically meaningful spatio-temporal interactions in videos.
	Human Activity Recognition with Pose-driven Attention to RGB Fabien Baradel, Christian Wolf, Julien Mille BMVC, 2018 Human activity recognition using skeleton data and RGB. We propose a network able to focus on relevant parts of the RGB stream given deep features extracted from the pose stream.
	Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points Fabien Baradel, Christian Wolf, Julien Mille, Graham Taylor CVPR, 2018 We propose a new method for human action recognition relying on RGB data only. A visual attention module extracts glimpses within each frame. The resulting local descriptors are soft-assigned to distributed workers, which then classify the video.
	Human Action Recognition: Pose-based Attention draws focus to Hands Fabien Baradel, Christian Wolf, Julien Mille ICCV, Workshop "Hands in Action", 2017 A new spatio-temporal attention-based mechanism for human action recognition, able to automatically attend to the most important hands and detect the most discriminative moments in an action.
	Discrepancy-based networks for unsupervised domain adaptation: a comparative study Gabriela Csurka, Fabien Baradel, Boris Chidlovskii, Stephane Clinchant ICCV, Workshop "Task-CV", 2017 We introduce a new dataset for Domain Adaptation and present a comparison between shallow and deep methods based on Maximum Mean Discrepancy.
	Pose-conditioned Spatio-Temporal Attention for Human Action Recognition Fabien Baradel, Christian Wolf, Julien Mille arXiv preprint, 2017 We introduce an attention-based mechanism around hands on RGB videos conditioned on features extracted from human 3D pose.
PhD Thesis
	Structured Deep Learning for Video Analysis Fabien Baradel Université de Lyon - INSA Lyon, 2020 Runner-up thesis prize - AFRIF
