Description
Recent advances in autonomous driving perception increasingly rely on multi-view,
multi-sensor fusion to recover dense and semantically consistent 3D scene representations.
While LiDAR upsampling has been explored, it remains an open problem: it is often treated as a
standalone task without leveraging complementary visual cues or learning consistent geometric
priors across modalities. This thesis proposes a cross-modal transformer framework designed to
systematically explore camera-LiDAR fusion strategies for enhancing LiDAR upsampling.
Rather than committing to a single paradigm, we will investigate multiple fusion approaches,
spanning range-view, frustum-based, and latent feature alignment schemes, to determine how
multi-view image backbones can best inform the upsampling of sparse LiDAR signals. A
transformer-based bridge module will be developed to connect image and LiDAR
representations, potentially through attention or geometry-aware correlation mechanisms.
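To make the intended bridge module concrete, the sketch below shows one plausible
instantiation in PyTorch: sparse LiDAR tokens (e.g. embedded range-view pixels) query
flattened multi-view image tokens through cross-attention. All names, dimensions, and the
block layout are illustrative assumptions, not a fixed design decision of the thesis.

    import torch
    import torch.nn as nn

    class CrossModalBridge(nn.Module):
        # Hypothetical bridge: LiDAR tokens query image tokens via cross-attention.
        def __init__(self, dim=256, num_heads=8):
            super().__init__()
            self.norm_q = nn.LayerNorm(dim)   # normalize LiDAR queries
            self.norm_kv = nn.LayerNorm(dim)  # normalize image keys/values
            self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.norm_ffn = nn.LayerNorm(dim)
            self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                     nn.Linear(4 * dim, dim))

        def forward(self, lidar_tokens, image_tokens):
            # lidar_tokens: (B, N, dim), e.g. embedded range-view pixels
            # image_tokens: (B, M, dim), flattened multi-view image features
            q = self.norm_q(lidar_tokens)
            kv = self.norm_kv(image_tokens)
            fused, _ = self.attn(q, kv, kv)        # LiDAR attends to image features
            x = lidar_tokens + fused               # residual connection
            return x + self.ffn(self.norm_ffn(x))  # position-wise refinement

    # Example shapes: 2 scenes, 4096 LiDAR tokens, 6 cameras x 700 patches each.
    bridge = CrossModalBridge(dim=256)
    fused = bridge(torch.randn(2, 4096, 256), torch.randn(2, 6 * 700, 256))

A geometry-aware variant could additionally bias the attention weights with projective
camera-LiDAR correspondences, which is one of the design questions the thesis leaves open.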
Through experiments on autonomous driving benchmarks, we will evaluate how cross-modal
supervision and feature sharing improve spatial coherence, realism, and reconstruction fidelity.
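As one concrete notion of reconstruction fidelity, the symmetric Chamfer distance between an
upsampled cloud and a denser reference scan is a common choice; the sketch below assumes this
metric for illustration, since the proposal does not fix a particular one.

    import torch

    def chamfer_distance(pred, gt):
        # pred: (B, N, 3) upsampled points; gt: (B, M, 3) dense reference scan.
        d = torch.cdist(pred, gt)  # (B, N, M) pairwise Euclidean distances
        # Symmetric average of nearest-neighbor distances in both directions.
        return d.min(dim=2).values.mean(dim=1) + d.min(dim=1).values.mean(dim=1)

    # Example: compare a 4x-upsampled cloud against a denser ground truth.
    loss = chamfer_distance(torch.randn(2, 4096, 3), torch.randn(2, 16384, 3))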
The study aims to establish a principled understanding of which fusion paradigms are most
effective for realistic LiDAR upsampling.
Supervisor
Prof. Dr. Vasileios Belagiannis