Mengyao Lyu

I am currently a Ph.D. candidate at Tsinghua University, advised by Prof. Guiguang Ding. Prior to that, I was fortunate to work with Prof. Xiangzhi Bai and Prof. Hu Han. My research focuses on enhancing data efficiency and improving data explainability for computer vision algorithms.

Latest Publications 🙋🏻

mmSSR: Harvesting Rich, Scalable and Transferable Multi-Modal Data for Instruction Fine-Tuning
Mengyao Lyu, Yan Li, Huasong Zhong, Wenhao Yang, Hui Chen, Jungong Han, Guiguang Ding, Zhenheng Yang
arXiv 2025

Mitigating Hallucinations in Multi-modal Large Language Models via Image Token Attention-Guided Decoding
Xinhao Xu, Hui Chen, Mengyao Lyu, Sicheng Zhao, Yizhe Xiong, Zijia Lin, Jungong Han, Guiguang Ding
NAACL 2025

Towards Realistic Hierarchical Object Detection: Problem, Benchmark and Solution
Juexiao Feng, Yuhong Yang, Mengyao Lyu, Tianxiang Hao, Yi-Jie Huang, Yanchun Xie, Yaqian Li, Jungong Han, Liuyu Xiang, Guiguang Ding
T-CSVT 2025

Learn from the Learnt: Source-Free Active Domain Adaptation via Contrastive Sampling and Visual Persistence
Mengyao Lyu*, Tianxiang Hao*, Xinhao Xu, Hui Chen, Zijia Lin, Jungong Han, Guiguang Ding
ECCV 2024

One-dimensional Adapter to Rule Them All: Concepts, Diffusion Models and Erasing Applications
Mengyao Lyu*, Yuhong Yang*, Haiwen Hong, Hui Chen, Xuan Jin, Yuan He, Hui Xue, Jungong Han, Guiguang Ding
CVPR 2024 (Highlight, 2.8% acceptance rate)

Box-Level Active Detection
Mengyao Lyu, Jundong Zhou, Hui Chen, Yijie Huang, Dongdong Yu, Yaqian Li, Yandong Guo, Yuchen Guo, Liuyu Xiang, Guiguang Ding
CVPR 2023 (Highlight, 2.5% acceptance rate)

EXPERIENCE

  • 2024.07 ~ 2025.02

    ByteDance

    Research on Data Selection for MLLMs
    • Designed the first efficient data selection algorithm for million-scale MLLM SFT data, providing rich, customizable, and diverse vision-language capabilities.
    • Achieved 99.1% of full performance using only 30% of the MLLM data, as validated across 14 benchmarks and over 10 experimental settings.
  • 2023.06 ~ 2024.03

    Alibaba

    Research on Controllable, Safe, and Fair Diffusion Generation
    • Precisely erased concepts from diffusion models while leaving safe concepts unchanged.
    • Showed that the resulting concept erasures support training-free transfer and multi-concept customization.
    • Achieved SOTA results across ∼40 concepts, 7 diffusion models, and 4 erasing applications.
  • 2021.06 ~ 2022.11

    OPPO Research (Tsinghua-OPPO JCFDT)

    Research on Active Learning Algorithms for Data Closed-Loop in Object Detection
    • Proposed and implemented a novel active learning algorithm for object detection to improve data and training efficiency.
    • Reimplemented 10+ active detection baselines and SOTAs within a unified codebase for a fair evaluation.
    • Achieved SOTA results on the public VOC and COCO datasets and on private OPPO datasets.
  • 2018.07 ~ 2018.08

    Horizon Robotics

    Research on Long-tailed Perception for Advanced Driver-Assistance Systems
    • Developed conditional generative adversarial networks to synthesize data for different road signs.
    • Achieved a 44% improvement in the accuracy of the road sign classification task.