Bridging Physics & Artificial Intelligence

I am a final-year PhD candidate in Physics (Medical AI) at PSI & ETH Zurich, specializing in generative models and physics-informed deep learning for medical imaging and treatment planning.

I develop and evaluate deep generative and representation-learning models for high-dimensional imaging and spatiotemporal data. Using Diffusion Models, Flow Matching, Implicit Neural Representations, and Transformer architectures, I build end-to-end ML pipelines for tasks such as high-fidelity image synthesis, real-time motion tracking, and adaptive dose prediction in clinical workflows.

Muheng Li Profile
PyTorch & Lightning
Python / C++ / CUDA
Generative Models (Diffusion/Flows)
Geometric DL (NeRF/SDF)
Medical Physics (Monte Carlo)
HPC / Docker / Git

About

Affiliation & Supervision

I am embedded within the Center for Proton Therapy (CPT). My doctoral research is supervised by Prof. Antony Lomax and Dr. Ye Zhang, and is supported by the SNSF project EPIC-4DAPT (Grant No. 212855).

Education

Before moving to Switzerland, I received my M.S. (2023) and B.S. (2020) degrees from Tsinghua University. During my master's in the Department of Automation, I focused on 3D generative vision and multimodal video understanding under the guidance of Prof. Jiwen Lu and Prof. Jianjiang Feng.

Latest Updates

Jul 2025
Paper Accepted: Proof-of-concept study on Direct MRI-based Proton Dose Calculation accepted to Physics and Imaging in Radiation Oncology (phiRO).
May 2025
Paper Accepted: Work on Diffusion Schrödinger Bridge Models (DSBM) for high-quality MR-to-CT synthesis accepted to Medical Physics.
Jul 2024
Awarded: 1st Place Rising Star Award (Best Paper) at ICCR 2024 in Lyon for the NGP-DIR work.
Mar 2023
CVPR 2023: Diffusion-SDF accepted to CVPR 2023.
Mar 2022
CVPR 2022: Bridge-Prompt accepted to CVPR 2022.

Selected Engineering Projects

MIND Framework
Real-Time / Transformer / phiRO 2025

MIND: Neural Dose Engine

A proof-of-concept neural engine that predicts individual proton pencil beam dose distributions directly from MRI. A Transformer-based model operating on beam’s-eye-view patches achieves Monte Carlo-comparable accuracy (median 99.8% gamma pass rate at 1 mm/1%) while reducing computation time to about 3 ms per beam (a 600x speedup over FRED MC on GPU).
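
For readers unfamiliar with the gamma pass rate quoted above, the following is a minimal 1D sketch of how such a metric can be computed. It is not the evaluation code used in the study: clinical gamma analysis is done on 3D dose grids with interpolation, and the function name, the global normalization, and the 10% low-dose cutoff are assumptions made here for illustration.

```python
import numpy as np

def gamma_pass_rate_1d(ref, evl, spacing_mm, dta_mm=1.0, dd_frac=0.01, cutoff_frac=0.1):
    """Global 1D gamma pass rate via brute-force search (illustrative only).

    ref, evl:    reference and evaluated dose profiles on the same grid.
    dta_mm:      distance-to-agreement criterion (1 mm here).
    dd_frac:     dose-difference criterion as a fraction of the max reference dose (1% here).
    cutoff_frac: reference points below this fraction of the max dose are ignored (assumption).
    """
    ref = np.asarray(ref, dtype=float)
    evl = np.asarray(evl, dtype=float)
    x = np.arange(ref.size) * spacing_mm
    dd_abs = dd_frac * ref.max()  # global dose-difference tolerance

    # Rows index reference points, columns index evaluated points.
    dist2 = ((x[:, None] - x[None, :]) / dta_mm) ** 2
    dose2 = ((evl[None, :] - ref[:, None]) / dd_abs) ** 2
    gamma = np.sqrt((dist2 + dose2).min(axis=1))

    mask = ref >= cutoff_frac * ref.max()
    return float((gamma[mask] <= 1.0).mean())

# Usage: a Gaussian profile compared against itself and against a rescaled copy.
x = np.arange(51) * 1.0
profile = np.exp(-((x - 25.0) ** 2) / 50.0)
print(gamma_pass_rate_1d(profile, profile, spacing_mm=1.0))
print(gamma_pass_rate_1d(profile, 1.5 * profile, spacing_mm=1.0))
```

A point passes when some evaluated point lies within the combined distance and dose tolerance, so the pass rate is the fraction of reference points with gamma at most 1.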

DSBM Generative AI
Generative AI / Diffusion Models / Medical Physics 2025

Diffusion Schrödinger Bridges

A DSBM framework for high-fidelity MR-to-CT synthesis that models an entropic optimal transport (Schrödinger bridge) between MR and CT distributions. Unlike standard diffusion models starting from pure noise, DSBM leverages MRI as a structural prior to preserve geometric fidelity and dosimetric accuracy.
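
The entropic optimal transport problem mentioned above has a small discrete analogue that Sinkhorn iterations solve. The sketch below illustrates only that discrete problem on toy histograms. It is not the DSBM training procedure, which learns a bridge between image distributions, and the function name and parameter values are assumptions chosen for the example.

```python
import numpy as np

def sinkhorn_plan(a, b, cost, eps=0.5, n_iter=200):
    """Entropic OT between two discrete distributions via Sinkhorn scaling.

    a, b: source and target probability vectors.
    cost: (len(a), len(b)) cost matrix.
    eps:  entropic regularization strength (larger means a more diffuse plan).
    Returns the transport plan, whose marginals approximate a and b.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    K = np.exp(-np.asarray(cost, dtype=float) / eps)  # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (K @ v)    # rescale rows to match the source marginal
        v = b / (K.T @ u)  # rescale columns to match the target marginal
    return u[:, None] * K * v[None, :]

# Usage: transport a uniform histogram onto a skewed one under squared-distance cost.
x = np.linspace(0.0, 1.0, 5)
cost = (x[:, None] - x[None, :]) ** 2
plan = sinkhorn_plan(np.full(5, 0.2), [0.1, 0.1, 0.2, 0.3, 0.3], cost)
print(plan.sum(axis=1), plan.sum(axis=0))
```

The coupling starts from one given distribution and ends at the other, which mirrors the idea of starting from MR rather than from pure noise.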

NGP Registration
Computer Vision / Neural Fields / ICCR Best Paper 2024

Instant Neural Motion Tracking

Adapts Instant Neural Graphics Primitives (NGP) and multi-resolution hash encoding to deformable medical image registration. Enables on-the-fly motion extraction with sub-second inference for 4D dose accumulation, addressing motion management bottlenecks.
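
To give a sense of what multi-resolution hash encoding does, here is a minimal 2D NumPy sketch of the lookup step. The feature tables are plain arrays rather than trained parameters, the function name and default resolutions are assumptions for illustration, and the actual registration work operates on 3D volumes with a learned deformation field.

```python
import numpy as np

# Per-dimension hash multipliers as published for Instant-NGP (2D case shown).
PRIMES = (1, 2654435761)

def hash_encode_2d(xy, tables, base_res=4, growth=2.0):
    """Encode points in [0, 1]^2 with a multi-resolution hash grid.

    xy:     (N, 2) coordinates in [0, 1].
    tables: list of (T, F) feature tables, one per resolution level.
    Returns (N, L * F) concatenated, bilinearly interpolated features.
    """
    xy = np.asarray(xy, dtype=np.float64)
    feats = []
    for level, table in enumerate(tables):
        res = int(base_res * growth ** level)  # grid resolution at this level
        pos = xy * res
        lo = np.floor(pos).astype(np.int64)    # lower-left cell corner
        w = pos - lo                           # fractional offset inside the cell
        out = np.zeros((xy.shape[0], table.shape[1]))
        for dx in (0, 1):
            for dy in (0, 1):
                cx = lo[:, 0] + dx
                cy = lo[:, 1] + dy
                # Spatial hash: XOR of coordinate-times-prime, modulo table size.
                idx = ((cx * PRIMES[0]) ^ (cy * PRIMES[1])) % table.shape[0]
                wx = w[:, 0] if dx else 1.0 - w[:, 0]
                wy = w[:, 1] if dy else 1.0 - w[:, 1]
                out += (wx * wy)[:, None] * table[idx]
        feats.append(out)
    return np.concatenate(feats, axis=1)

# Usage: three levels, 16 hash slots each, 2 features per slot.
rng = np.random.default_rng(0)
tables = [rng.normal(size=(16, 2)) for _ in range(3)]
print(hash_encode_2d(rng.random((5, 2)), tables).shape)
```

In a full pipeline, the concatenated features would feed a small MLP, and both the tables and the MLP would be optimized jointly.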

CPT-4DMR
4D MRI / Neural Fields / Preprint

CPT-4DMR: Continuous 4D-MRI

A continuous function representation f(x, y, z, t) for 4D-MRI that eliminates binning artifacts and enables high-quality volumetric MRI reconstruction at arbitrary respiratory phases, critical for tracking irregular breathing motion.
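
The idea of a continuous f(x, y, z, t) can be sketched with a generic coordinate network. The example below is an untrained toy model with Fourier features and random weights. The architecture, class name, and sizes are assumptions made for illustration and do not describe the CPT-4DMR model itself.

```python
import numpy as np

def fourier_features(coords, n_freqs=4):
    """Map (N, 4) coordinates (x, y, z, t) to sin/cos features at octave-spaced frequencies."""
    coords = np.asarray(coords, dtype=float)
    freqs = 2.0 ** np.arange(n_freqs) * np.pi   # shape (n_freqs,)
    angles = coords[:, :, None] * freqs         # shape (N, 4, n_freqs)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(len(coords), -1)       # shape (N, 4 * 2 * n_freqs)

class TinyField:
    """Untrained two-layer MLP standing in for f(x, y, z, t) -> intensity."""

    def __init__(self, n_freqs=4, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = 4 * 2 * n_freqs
        self.n_freqs = n_freqs
        self.w1 = rng.normal(0.0, 1.0 / np.sqrt(in_dim), (in_dim, hidden))
        self.w2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, 1))

    def __call__(self, coords):
        h = np.maximum(fourier_features(coords, self.n_freqs) @ self.w1, 0.0)  # ReLU
        return (h @ self.w2)[:, 0]

# Usage: query the same spatial location at two arbitrary time points.
field = TinyField()
queries = np.array([[0.1, 0.2, 0.3, 0.00],
                    [0.1, 0.2, 0.3, 0.37]])
print(field(queries))
```

Because time is just another continuous input, no respiratory bins are needed: any phase can be queried directly once the network has been fitted.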

Diffusion SDF
CVPR 2023 / 3D Vision / Generative

Diffusion-SDF: Text-to-3D

A generative framework that combines an SDF autoencoder with voxelized diffusion to synthesize high-quality 3D shapes from text descriptions. It also extends to text-conditioned shape completion and manipulation.
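
As a quick illustration of the shape representation involved, a signed distance function (SDF) sampled on a voxel grid stores, at each voxel, the distance to the nearest surface, negative inside and positive outside. The sketch below builds such a grid for a sphere. The function name and grid settings are assumptions for the example and are unrelated to the Diffusion-SDF codebase.

```python
import numpy as np

def sphere_sdf_grid(res=33, radius=0.5):
    """Sample the signed distance to a sphere on a res^3 grid over [-1, 1]^3."""
    axis = np.linspace(-1.0, 1.0, res)
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    return np.sqrt(x**2 + y**2 + z**2) - radius  # < 0 inside, > 0 outside

# Usage: the zero level set of the grid is the sphere surface.
sdf = sphere_sdf_grid()
inside = sdf < 0
print(sdf.shape, inside.sum())
```

A generative model over such grids produces shapes whose surfaces can then be extracted from the zero level set.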

Bridge-Prompt
CVPR 2022 / Video Understanding / Vision-Language

Bridge-Prompt

A vision-language prompt-based framework for ordinal action understanding in instructional videos. Models semantics across adjacent correlated actions to exploit both out-of-context and contextual information.

Selected Publications

* indicates equal contribution.

A proof-of-concept study of direct magnetic resonance imaging-based proton dose calculation for brain tumors via neural networks with Monte Carlo-comparable accuracy
Muheng Li, Carla Winterhalter, Xia Li, Sairos Safai, Antony Lomax, Ye Zhang
Physics and Imaging in Radiation Oncology (phiRO), 2025

Presents the MIND neural dose engine that predicts individual proton pencil beam dose distributions directly from MR images, achieving a median 99.8% gamma pass rate at 1 mm/1% while reducing computation to about 3 ms per beam.

Neural network-driven direct CBCT-based dose calculation for head-and-neck proton treatment planning
Muheng Li, Evangelia Choulilitsa, Lisa Fankhauser, Francesca Albertini, Antony Lomax, Ye Zhang
Physics in Medicine & Biology (PMB), 2025

Develops an xLSTM-based CBCT neural dose engine (CBCT-NN) that predicts proton dose directly on CBCT images, achieving 95.1 ± 2.7% gamma pass rates at 2 mm/2% and computing complete head-and-neck plans in under 3 minutes. This enables adaptive workflows without synthetic CT or complex correction pipelines.

Diffusion Schrödinger bridge models for high-quality MR-to-CT synthesis for proton treatment planning
Muheng Li, Xia Li, Sairos Safai, Antony Lomax, Ye Zhang
Medical Physics, 2025

Introduces DSBM, modeling an entropic optimal transport (Schrödinger bridge) between MR and CT distributions to achieve superior geometric fidelity and dosimetric accuracy in MR-only proton therapy.

Neural graphics primitives-based deformable image registration for on-the-fly motion extraction
Xia Li, Fabian Zhang, Muheng Li, Damien Weber, Antony Lomax, Joachim Buhmann, Ye Zhang
International Conference on the Use of Computers in Radiation Therapy (ICCR), 2024

Adapts Instant Neural Graphics Primitives and multi-resolution hash encoding to enable on-the-fly motion extraction with sub-second inference for accurate 4D dose accumulation.

CPT-4DMR: Continuous sPatial-Temporal representation for 4D-MRI reconstruction
Xinyang Wu*, Muheng Li*, Xia Li*, Orso Pusterla, Sairos Safai, Philippe C. Cattin, Antony Lomax, Ye Zhang
Preprint, 2025

Proposes a continuous spatio-temporal representation f(x, y, z, t) for 4D-MRI, enabling reconstruction at arbitrary respiratory phases and reducing binning artifacts.

Diffusion-SDF: Text-to-shape via voxelized diffusion
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023

Proposes Diffusion-SDF, combining an SDF autoencoder with voxelized diffusion for text-to-shape synthesis, and extends it to text-conditioned shape completion and manipulation.

Bridge-Prompt: Towards ordinal action understanding in instructional videos
Muheng Li, Lei Chen, Yueqi Duan, Zhilan Hu, Jianjiang Feng, Jie Zhou, Jiwen Lu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022

Introduces Bridge-Prompt, a prompt-based framework for ordinal action understanding that models semantics across multiple adjacent correlated actions in instructional videos.

Uncertainty-aware representation learning for action segmentation
International Joint Conference on Artificial Intelligence (IJCAI), 2022

Proposes an uncertainty-aware representation learning framework to capture transitional expressions between action periods, improving robustness in action segmentation.

Order-constrained representation learning for instructional video prediction
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2022

Presents an Order-Constrained Representation Learning (OCRL) approach with a StepNCE contrastive loss to predict future actions from partially observed instructional videos in a weakly-supervised manner.