Homepage

I am a master's student at Peking University.

Yuanwei Li

M.S. Student
Peking University

yuanweili(at)stu.pku.edu.cn

About Me

I am Yuanwei Li (李远威), a third-year M.S. student at the Medical Intelligence Lab (MILab) at Peking University, supervised by Assistant Professor Yanye Lu (卢闫晔). My research focuses on weakly supervised learning, vision foundation models, and microscopy image segmentation, with particular emphasis on nuclei segmentation in histopathology and mitochondria segmentation in electron microscopy.

Currently, I am a full-time MLLM Algorithm Engineer at the Search Algorithm Department within Tencent’s WeChat Group (WXG). My research interests focus on the post-training and distributed training of Multimodal Large Language Models (MLLMs), as well as the application of Agentic Reinforcement Learning in search scenarios.

Education

Sep. 2023 - Present

Peking University

Institute of Medical Technology
M.S. Student
Sep. 2019 - Jul. 2023

Chengdu University of Technology

College of Computer Science and Cyber Security
B.S. Student

Experience

Jul. 2025 - Present

Tencent WXG Search Algorithm Department

MLLM Algorithm Engineer (Full-time via Intern Conversion) | Distributed Training (Megatron), Agentic RL
Mar. 2025 - Jun. 2025

ByteDance TikTok Platform Governance Algorithm Department

MLLM Algorithm Engineer (Intern) | Post-training (SFT/GRPO)
Sep. 2024 - Feb. 2025

Xiaomi Automobile Intelligent Cockpit Algorithm Department

MLLM Algorithm Engineer (Intern) | Post-training (SFT/DPO)

🏆 Honors & Awards

National Scholarship

2022
National Scholarship for Encouragement (×2)

2021 & 2020

News

2025

I joined the Search Algorithm Department within Tencent’s WeChat Group (WXG). My work covers the post-training of MLLMs and the optimization of large-scale distributed training frameworks.

Jul 01

I joined the Platform Governance Algorithm Department at ByteDance TikTok E-commerce, where my work focused on the post-training of MLLMs.

Mar 01

2024

I joined the Intelligent Cockpit Algorithm Department at Xiaomi Automobile, where I specialized in post-training methods for MLLMs applied to sentry intelligence.

Nov 01

Selected Publications

Geometry-as-context: Modulating Explicit 3D in Scene-consistent Video Generation to Geometry Context

JiaKui Hu, Jialun Liu, Liying Yang, Xinliang Zhang, Kaiwen Li, Shuang Zeng, Yuanwei Li, Haibin Huang, Chi Zhang, Yanye Lu

Computer Vision and Pattern Recognition (CVPR) 2026 CCF-A

Achieving scene-consistent video generation from camera trajectories remains a challenge due to the cumulative errors inherent in traditional reconstruction-based pipelines, which often rely on non-differentiable operators and disjointed inpainting steps. To overcome these limitations, this research introduces Geometry-as-Context (GaC), a novel framework that reformulates the iterative reconstruction and rendering process into a fully differentiable, autoregressive video generation task. By interleaving geometry and RGB frames within a single sequence, GaC enables end-to-end optimization of geometry estimation and novel-view synthesis. We further propose a camera-gated attention mechanism to effectively modulate self-attention using camera poses, alongside a geometry dropout strategy to streamline inference. Experimental results demonstrate that GaC significantly outperforms existing baselines, delivering superior 3D consistency, high-fidelity rendering, and robust generalization even in scenarios with large camera dynamics. This work provides a scalable solution for applications requiring seamless 3D experiences, such as AR/VR and embodied intelligence.

[Code] [DOI]

Points-Supervised Fundus Vessel Segmentation via Shape Priors and Contrastive Learning

Kaiwen Li, Hangzhou He, Shuang Zeng, Xinliang Zhang, Yuanwei Li, Lei Zhu, Yanye Lu

IEEE Transactions on Medical Imaging (TMI) 2025 CAS Zone 1 (Top)

Accurate segmentation of fundus vessels is essential for diagnosing cardiovascular and ophthalmic diseases, yet fully supervised methods depend on laborious, pixel-wise manual annotations. To address the trade-off between annotation cost and model performance, this research pioneers the application of point annotations, a sparse and efficient labeling strategy, to fundus vessel segmentation. We propose the Points-based Vessel segmentation Network (PVN), a novel framework designed to learn robust representations from limited supervision. Key technical contributions include the integration of Point Activation Maps (PAM) with learned shape priors to provide soft supervision, effectively mitigating the noise often found in generated pseudo-labels. Additionally, a unique pixels-and-regions-mixed contrastive learning method is introduced to enhance feature discrimination between vessels and the background. Experimental results across multiple datasets demonstrate that PVN significantly outperforms existing point-supervised methods, achieving excellent segmentation accuracy using only 1% of annotated pixels. This work offers a flexible, scalable solution that drastically reduces the labeling burden for medical imaging tasks.

[Code] [DOI]

Current status, application, and challenges of the interpretability of generative adversarial network models

Sulin Wang, Chengqiang Zhao, Lingling Huang, Yuanwei Li, Ruochen Li

Computational Intelligence 2022 SCI Zone 3

While Generative Adversarial Networks (GANs) have revolutionized unsupervised learning and image generation, their opaque internal mechanisms pose significant challenges regarding reliability, controllability, and security. This research addresses the critical need for transparency by systematically exploring the interpretability of GANs. We propose a multi-dimensional analysis framework that investigates the model from an "inside-to-outside" perspective, linking internal network structures and feature extraction processes to the causal relationships inherent in the output results. The study evaluates the validity and robustness of interpretable GANs, specifically diagnosing weaknesses in high-stakes applications such as medical diagnosis and military detection. By analyzing the transition from representation learning to mechanism analysis, this work aims to mitigate prediction risks and enhance the control of semantic attributes. Furthermore, we identify current limitations and outline pivotal future challenges, providing a theoretical foundation for designing more secure and explainable generative architectures.

[DOI]

Comparing Convolutional Neural Network and Machine Learning Models in Landslide Susceptibility Mapping: A Case Study in Wenchuan County

Sikui Zhang, Lin Bai, Yuanwei Li, Weile Li, Mingli Xie

Frontiers in Environmental Science 2022 SCI Zone 2

This research addresses the critical challenge of precision in Landslide Susceptibility Mapping (LSM) by introducing a high-level Convolutional Neural Network (CNN) framework. While machine learning has advanced LSM, comparative studies between deep learning and traditional methods remain limited. To fill this gap, we conducted a systematic evaluation in the seismically active Wenchuan region, utilizing 11 environmental and engineering causative factors alongside a detailed inventory of earthquake-induced landslides. The study benchmarks the proposed CNN model against established algorithms, specifically Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF). Experimental results indicate that the CNN-based approach significantly outperforms conventional methods, achieving a success-rate curve (SRC) of 93.14% and a prediction-rate curve (PRC) of 91.81%. This work demonstrates the superior goodness-of-fit and predictive capability of deep learning architectures, offering a more robust solution for quantitative disaster risk assessment.

[Code] [DOI]
