Fabien Baradel
I am a Research Scientist and Project Lead at NAVER LABS Europe in Grenoble, France, working on human-centric 3D vision.
I did my PhD at INSA Lyon, advised by Christian Wolf and Julien Mille.
I also spent time at Google, Simon Fraser University, and the University of Guelph during my PhD.
I received my Engineer's degree (MSc) from ENSAI.
Email / CV / Scholar / GitHub / LinkedIn / Twitter
|
|
Research
|
|
Multi-HMR: Multi-Person Whole-Body Human Mesh Recovery in a Single Shot
Fabien Baradel*,
Matthieu Armando,
Salma Galaaoui,
Romain Brégier,
Philippe Weinzaepfel,
Grégory Rogez,
Thomas Lucas*
ECCV, 2024  
PDF
/
arXiv
/
code
/
demo
/
winner of ROBIN challenge @CVPR'24
A simple yet effective model for multi-person whole-body 3D human pose estimation, running in real time on a GPU and reaching SotA results.
|
|
Cross-view and Cross-pose Completion for 3D Human Understanding
Matthieu Armando,
Salma Galaaoui,
Fabien Baradel,
Thomas Lucas,
Vincent Leroy,
Romain Brégier,
Philippe Weinzaepfel,
Grégory Rogez
CVPR, 2024  
PDF
/
arXiv
/
bibtex
A self-supervised pre-training strategy for human-centric 3D vision.
|
|
Purposer: Putting Human Motion Generation in Context
Nicolas Ugrinovic,
Thomas Lucas,
Fabien Baradel,
Philippe Weinzaepfel,
Grégory Rogez,
Francesc Moreno-Noguer
3DV, 2024  
PDF
/
arXiv
/
bibtex
A method able to generate realistic-looking motions that interact with virtual scenes.
|
|
SHOWMe: Benchmarking Object-agnostic Hand-Object 3D Reconstruction
Anilkumar Swamy,
Vincent Leroy,
Philippe Weinzaepfel,
Fabien Baradel,
Salma Galaaoui,
Romain Brégier,
Matthieu Armando,
Jean-Sebastien Franco,
Grégory Rogez
ACVR workshop ICCV, 2023  
PDF
/
project page
/
arXiv
/
bibtex
A new dataset of high-quality textured meshes of hands holding objects.
|
|
PoseGPT: Quantization-based 3D Human Motion Generation and Forecasting
Thomas Lucas*,
Fabien Baradel*,
Philippe Weinzaepfel,
Grégory Rogez
ECCV, 2022  
PDF
/
code
/
arXiv
/
bibtex
PoseGPT generates a human motion conditioned on an action label, a duration, and optionally an observed past human motion.
We learn to quantize human motion into a discrete latent space and train a GPT-like model to sequentially predict the next discrete latent indices.
|
|
PoseBERT: A Generic Transformer Module for Temporal 3D Human Modeling
Fabien Baradel,
Romain Brégier,
Thibault Groueix,
Philippe Weinzaepfel,
Yannis Kalantidis,
Grégory Rogez
TPAMI, 2022  
arXiv
/
bibtex
We propose a generic transformer model for temporal modeling of human and hand shape.
We apply this model to different tasks such as pose estimation and future pose prediction.
PoseBERT is able to denoise and interpolate, which is very important for deploying pose estimation in the wild.
|
|
Filtered-CoPhy: Unsupervised Learning of Counterfactual Physics in Pixel Space
Steeven Janny,
Fabien Baradel,
Natalia Neverova,
Madiha Nadri,
Greg Mori,
Christian Wolf
ICLR, 2022
PDF
/
OpenReview
/
Project Page
We propose a model, learned in an unsupervised manner, that is able to perform counterfactual predictions in pixel space.
|
|
Leveraging MoCap Data for Human Mesh Recovery
Fabien Baradel*,
Thibault Groueix*,
Philippe Weinzaepfel,
Romain Brégier,
Yannis Kalantidis,
Grégory Rogez
3DV, 2021  
PDF
/
arXiv
/
Video-short
/
Video-long
/
bibtex
We show that MoCap data can be used to improve image-based and video-based human mesh recovery methods.
We propose a video-based transformer model called PoseBERT which is trained on synthetic data only.
|
|
CoPhy: Counterfactual Learning of Physical Dynamics
Fabien Baradel,
Natalia Neverova,
Julien Mille,
Greg Mori,
Christian Wolf
ICLR, 2020 (Spotlight presentation)
PDF
/
arXiv
/
Code-Dataset
/
Video
/
bibtex
We introduce a new problem of counterfactual learning of object mechanics from visual input
and a benchmark called CoPhy.
|
|
Learning Video Representations using Contrastive Bidirectional Transformer
Chen Sun,
Fabien Baradel,
Kevin Murphy,
Cordelia Schmid
arXiv preprint, 2019
PDF
/
arXiv
/
bibtex
Self-supervised video representation learning that leverages ASR and long videos via noise contrastive estimation.
|
|
Object Level Visual Reasoning in Videos
Fabien Baradel,
Natalia Neverova,
Christian Wolf,
Julien Mille,
Greg Mori
ECCV, 2018
Project page
/
PDF
/
arXiv
/
video
/
bibtex
/
Code
/
Complementary Mask Data
/
Poster
A model capable of learning to reason about semantically meaningful spatio-temporal
interactions in videos.
|
|
Human Activity Recognition with Pose-driven Attention to RGB
Fabien Baradel,
Christian Wolf,
Julien Mille
BMVC, 2018
PDF
/
bibtex
/
Poster
Human activity recognition using skeleton data and RGB. We propose a network able to focus on
relevant parts of the RGB stream given deep features extracted from the pose stream.
|
|
Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points
Fabien Baradel,
Christian Wolf,
Julien Mille,
Graham Taylor
CVPR, 2018
PDF
/
arXiv
/
project page
/
video
/
bibtex
/
CVPR Daily
/
Code
/
Poster
We propose a new method for human action recognition relying on RGB data only.
A visual attention module is able to extract glimpses within each frame.
The resulting local descriptors are soft-assigned to distributed workers, which finally classify the video.
|
|
Human Action Recognition: Pose-based Attention draws focus to Hands
Fabien Baradel,
Christian Wolf,
Julien Mille
ICCV, Workshop "Hands in Action", 2017
PDF
/
bibtex
/
Poster
A new spatio-temporal attention-based mechanism for human action recognition, able to
automatically attend to the most important hands and detect the most discriminative
moments in an action.
|
|
Discrepancy-based networks for unsupervised domain adaptation: a comparative study
Gabriela Csurka,
Fabien Baradel,
Boris Chidlovskii,
Stephane Clinchant
ICCV, Workshop "Task-CV", 2017
PDF
/
bibtex
We introduce a new dataset for domain adaptation and present a comparison between shallow and
deep methods based on Maximum Mean Discrepancy.
|
|
Pose-conditioned Spatio-Temporal Attention for Human Action Recognition
Fabien Baradel,
Christian Wolf,
Julien Mille
arXiv preprint, 2017
arXiv
/
PDF
/
project page
/
video
/
bibtex
We introduce an attention-based mechanism around hands on RGB videos, conditioned on features
extracted from human 3D pose.
|
PhD Thesis
|
|
Structured Deep Learning for Video Analysis
Fabien Baradel
Université de Lyon - INSA Lyon, 2020
Runner-up thesis prize - AFRIF
PDF
/
video
/
slides-pdf
/
slides-pptx
/
bibtex
|
|