I am Yuanwei Li (李远威) from Peking University. I'm a third-year M.S. student at the Medical Intelligence Lab (MILab), supervised by Assistant Professor Yanye Lu (卢闫晔). My research focuses on weakly supervised learning, vision foundation models, and microscopy image segmentation, with particular emphasis on nuclei segmentation in histopathology and mitochondria segmentation in electron microscopy.
Currently, I am a full-time MLLM Algorithm Engineer at the Search Algorithm Department within Tencent’s WeChat Group (WXG). My research interests focus on the post-training and distributed training of Multimodal Large Language Models (MLLMs), as well as the application of Agentic Reinforcement Learning in search scenarios.

JiaKui Hu, Jialun Liu, Liying Yang, Xinliang Zhang, Kaiwen Li, Shuang Zeng, Yuanwei Li, Haibin Huang, Chi Zhang, Yanye Lu
Computer Vision and Pattern Recognition (CVPR) 2026 CCF-A
Achieving scene-consistent video generation from camera trajectories remains a challenge due to the cumulative errors inherent in traditional reconstruction-based pipelines, which often rely on non-differentiable operators and disjointed inpainting steps. To overcome these limitations, this research introduces Geometry-as-Context (GaC), a novel framework that reformulates the iterative reconstruction and rendering process into a fully differentiable, autoregressive video generation task. By interleaving geometry and RGB frames within a single sequence, GaC enables end-to-end optimization of geometry estimation and novel-view synthesis. We further propose a camera-gated attention mechanism to effectively modulate self-attention using camera poses, alongside a geometry dropout strategy to streamline inference. Experimental results demonstrate that GaC significantly outperforms existing baselines, delivering superior 3D consistency, high-fidelity rendering, and robust generalization even in scenarios with large camera dynamics. This work provides a scalable solution for applications requiring seamless 3D experiences, such as AR/VR and embodied intelligence.

Kaiwen Li, Hangzhou He, Shuang Zeng, Xinliang Zhang, Yuanwei Li, Lei Zhu, Yanye Lu
IEEE Transactions on Medical Imaging (TMI) 2025 CAS Zone 1 (Top)
Accurate segmentation of fundus vessels is essential for diagnosing cardiovascular and ophthalmic diseases, yet fully supervised methods depend on laborious, pixel-wise manual annotations. To address the trade-off between annotation cost and model performance, this research pioneers the application of point annotations, a sparse and efficient labeling strategy, to fundus vessel segmentation. We propose the Points-based Vessel segmentation Network (PVN), a novel framework designed to learn robust representations from limited supervision. Key technical contributions include the integration of Point Activation Maps (PAM) with learned shape priors to provide soft supervision, effectively mitigating the noise often found in generated pseudo-labels. Additionally, a unique pixels-and-regions-mixed contrastive learning method is introduced to enhance feature discrimination between vessels and the background. Experimental results across multiple datasets demonstrate that PVN significantly outperforms existing point-supervised methods, achieving excellent segmentation accuracy using only 1% of annotated pixels. This work offers a flexible, scalable solution that drastically reduces the labeling burden for medical imaging tasks.

Sulin Wang, Chengqiang Zhao, Lingling Huang, Yuanwei Li, Ruochen Li
Computational Intelligence 2022 SCI Zone 3
While Generative Adversarial Networks (GANs) have revolutionized unsupervised learning and image generation, their opaque internal mechanisms pose significant challenges regarding reliability, controllability, and security. This research addresses the critical need for transparency by systematically exploring the interpretability of GANs. We propose a multi-dimensional analysis framework that investigates the model from an "inside-to-outside" perspective, linking internal network structures and feature extraction processes to the causal relationships inherent in the output results. The study evaluates the validity and robustness of interpretable GANs, specifically diagnosing weaknesses in high-stakes applications such as medical diagnosis and military detection. By analyzing the transition from representation learning to mechanism analysis, this work aims to mitigate prediction risks and enhance the control of semantic attributes. Furthermore, we identify current limitations and outline pivotal future challenges, providing a theoretical foundation for designing more secure and explainable generative architectures.

Sikui Zhang, Lin Bai, Yuanwei Li, Weile Li, Mingli Xie
Frontiers in Environmental Science 2022 SCI Zone 2
This research addresses the critical challenge of precision in Landslide Susceptibility Mapping (LSM) by introducing a high-level Convolutional Neural Network (CNN) framework. While machine learning has advanced LSM, comparative studies between deep learning and traditional methods remain limited. To fill this gap, we conducted a systematic evaluation in the seismically active Wenchuan region, utilizing 11 environmental and engineering causative factors alongside a detailed inventory of earthquake-induced landslides. The study benchmarks the proposed CNN model against established algorithms, specifically Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF). Experimental results indicate that the CNN-based approach significantly outperforms conventional methods, achieving a success-rate curve (SRC) of 93.14% and a prediction-rate curve (PRC) of 91.81%. This work demonstrates the superior goodness-of-fit and predictive capability of deep learning architectures, offering a more robust solution for quantitative disaster risk assessment.