ICPR 2022 tutorial: Deep Learning Models for Weakly-Supervised Object Localization and Segmentation

Date: August 21st, 2022. Time: 9am-1pm.

Location: Montreal, Quebec, Canada.

Registration: Click here.

Slides: [here]

Tutorial Abstract

Deep learning models, and in particular CNNs, provide state-of-the-art performance in many visual recognition applications, such as image classification, object localization and detection, and segmentation. However, they remain complex models with millions of parameters that typically require supervised end-to-end training on large annotated datasets. Weakly-supervised learning (WSL) has recently emerged as an appealing approach to mitigate the cost and burden of annotating large datasets by exploiting data with limited or coarse labels. WSL is particularly beneficial in object localization and semantic segmentation tasks, where it avoids the costly process of annotating bounding boxes and dense pixel-wise labels. Weakly-supervised object localization (WSOL) and weakly-supervised semantic segmentation (WSSS) methods have drawn much attention because they rely on less costly annotations, such as image-class labels.

This tutorial provides a detailed review of recent progress in deep WSOL/WSSS models and their applications in pattern recognition and computer vision. It will cover the fundamental principles, recent trends and developments, key challenges, and insights on these powerful models. In addition, the tutorial is accompanied by various examples where deep WSOL/WSSS models are applied to solve real-world problems in, e.g., visual recognition and medical image analysis.

Tutorial Outline

Part 1: Introduction

Weakly-supervised learning (motivations and definitions from pattern recognition and computer vision)
Focus of this tutorial: weakly-supervised object localization (WSOL) and weakly-supervised semantic segmentation (WSSS) in general

Part 2: Review of WSOL Methods

WSOL literature: bottom-up and top-down methods
Pause [30 min]
Case studies:
- F-CAM for improved interpolation
- Transformer-based models

Part 3: Review of WSSS Methods

Part 4: Applications of WSOL / WSSS

Application to medical imaging: histology
Application to video processing
Application to metric learning
Applications to medical image segmentation

Part 5: Key challenges and future directions

Speakers Introduction

Soufiane Belharbi

Soufiane Belharbi: Post-doc. LIVIA, Dept. of Systems Engineering, École de technologie supérieure, Montreal, Canada.

Soufiane Belharbi received his Ph.D. in Computer Science in 2018 from the Institut National des Sciences Appliquées de Rouen Normandie (INSA Rouen Normandie), in the LITIS laboratory. During that time, he conducted research on the regularization of neural networks through representation learning, with a particular focus on learning scenarios where only a few training samples are available. Since 2018, he has held a post-doc position at the LIVIA laboratory, École de technologie supérieure, Montreal, in collaboration with the McCaffrey laboratory and GCRC at McGill University. His main focus is weakly-supervised learning for localization, and he has an interest in medical imaging applications. He publishes in, and regularly reviews for, top-tier machine learning and computer vision conferences and journals.

Eric Granger

Eric Granger: Professor, LIVIA, Dept. of Systems Engineering, École de technologie supérieure, Montreal, Canada.

Eric Granger, Ph.D., is the FRSQ Research Co-Chair in AI in Digital Health and Life Sciences, and the ETS-Distech Industrial Research Co-Chair on Embedded Neural Networks for Intelligent Connected Buildings. He is also a Professor in the Dept. of Systems Engineering at École de technologie supérieure (ETS), and Director of the Laboratoire d’imagerie, de vision et d’intelligence artificielle (LIVIA). His research expertise includes machine learning, pattern recognition, and computer vision, in particular domain adaptation and weakly-supervised learning, with applications in affective computing, biometrics, face analysis and recognition, medical image analysis, and video analytics and surveillance. To date, Dr. Granger has authored over 200 peer-reviewed papers and supervised or co-supervised over 65 postdocs and graduate students. He is an associate editor for Elsevier Pattern Recognition, Springer Nature Computer Science, and the EURASIP Journal on Image and Video Processing, and a regular contributor (submissions, reviews and organization) to top-tier conferences and journals in his field of expertise.

Ismail Ben Ayed

Ismail Ben Ayed: Professor, LIVIA, Dept. of Systems Engineering, École de technologie supérieure, Montreal, Canada.

Ismail Ben Ayed is currently a Full Professor at ETS Montreal, where he holds a research Chair on Artificial Intelligence in Medical Imaging. His interests are in computer vision, machine learning, optimization and medical image analysis algorithms, with a recent focus on weakly-supervised and few-shot learning. Ismail has authored over 100 fully peer-reviewed papers, mostly published in the top venues of those areas, along with 2 books and 7 US patents. In recent years, he has given over 30 invited talks, including 5 tutorials at flagship conferences (MICCAI and ISBI). His research has been covered in several visible media outlets, such as Radio Canada (CBC), Quebec Science Magazine and Canal du Savoir. His research team has received several recent distinctions, such as the MIDL’21 best paper award and several top-ranking positions in internationally visible contests. Ismail has served as Program Chair for MIDL’20 and regularly as Area Chair for the MICCAI and MIDL conferences. He also serves regularly as a reviewer for the main scientific journals of his field, and has been selected several times among the top reviewers of prestigious conferences (such as CVPR’21, NeurIPS’20 and CVPR’15).

ICPR 2022 tutorial