Muheng Li

I am currently a joint doctoral student at the Paul Scherrer Institut (PSI) and ETH Zurich, supervised by Prof. Antony Lomax and Dr. Ye Zhang. Previously, I received an M.S. degree from the Department of Automation at Tsinghua University, where I was advised by Prof. Jiwen Lu and Prof. Jianjiang Feng. I also received my bachelor's degree from the Department of Engineering Physics at Tsinghua University.

I work at PSI’s Center for Proton Therapy, at the intersection of AI and radiation therapy. My research focuses on the challenges of 4D data analysis and modeling.

Organ motion during therapy sessions produces intricate 4D data. My earlier research concentrated on computer vision and deep learning, specifically generative 3D vision and multi-modal video learning. Building on this background, I aim to advance methods for 4D data analysis and thereby improve the precision and effectiveness of proton therapy for cancer treatment.

Email / Google Scholar / Github

News

2023-03: One paper on generative 3D modeling (Diffusion-SDF) was accepted to CVPR 2023.

2022-03: One paper on vision-language video understanding (Bridge-Prompt) was accepted to CVPR 2022.

Publications

* indicates equal contribution

Diffusion-SDF: Text-to-Shape via Voxelized Diffusion
Muheng Li, Yueqi Duan, Jie Zhou, Jiwen Lu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[Paper] [Code]
We propose a new generative 3D modeling framework called Diffusion-SDF for the challenging task of text-to-shape synthesis. We propose an SDF autoencoder together with the Voxelized Diffusion model to learn and generate representations for voxelized signed distance fields (SDFs) of 3D shapes. We also extend our approach to further text-to-shape tasks, including text-conditioned shape completion and manipulation.

Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos
Muheng Li, Lei Chen, Yueqi Duan, Zhilan Hu, Jianjiang Feng, Jie Zhou, Jiwen Lu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[Paper] [Code]
We propose a vision-language prompt-based framework, Bridge-Prompt (Br-Prompt), to model the semantics across multiple adjacent correlated actions, so that it simultaneously exploits both out-of-context and contextual information from a series of ordinal actions in instructional videos.

Uncertainty-Aware Representation Learning for Action Segmentation
Lei Chen, Muheng Li, Yueqi Duan, Jie Zhou, Jiwen Lu
International Joint Conference on Artificial Intelligence (IJCAI), 2022
[Paper] [Code] (to come)
We propose an Uncertainty-Aware Representation Learning (UARL) method for action segmentation. Specifically, we design UARL to exploit the transitional expression between two action periods through uncertainty learning.

Order-Constrained Representation Learning for Instructional Video Prediction
Muheng Li, Lei Chen, Jiwen Lu, Jianjiang Feng, Jie Zhou
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2022
[Paper] [Code] (to come)
We propose a weakly-supervised approach called Order-Constrained Representation Learning (OCRL), together with a special contrastive loss function called StepNCE, to predict future actions from instructional videos by observing incomplete steps of actions.

Honors and Awards

2022 First Class Scholarship for Graduate Students, Tsinghua University

2019 Science and Innovation Scholarship, Tsinghua University

2018 National Scholarship (top 5%), Tsinghua University

2017 Philip K H Wong Foundation Scholarships (top 15%), Tsinghua University

Others

Conference Reviewer: CVPR, ICME, VCIP

Programming Skills: Python, Matlab, C/C++, Java
