Research Intern & Undergraduate Student
Passionate research intern exploring the frontiers of multimodal intelligence and security
I am an undergraduate student at Beijing University of Technology and a research intern at the Hong Kong University of Science and Technology (Guangzhou). My current research applies interpretability methods to multimodal systems, embodied intelligence, and AI for Security. Through this work, I hope we can not only use AI models but also understand why they make certain decisions, which helps us build more transparent, white-box AI systems.
Testing MLLMs from a security perspective and observing substantial model biases, using interpretability techniques such as Logit Lens together with statistical metrics computed over the test results.
Implementing numerous Vision-Language-Action (VLA) model baselines and discovering significant vulnerabilities to jailbreak strategies such as prompt injection.
Exploring a range of MLLMs, conducting safety testing and fine-tuning from different perspectives, and identifying inherent security issues in current model training processes.
Academic foundation in artificial intelligence and computer science
Twice awarded the First-Class Scholarship at Beijing University of Technology, received the annual Outstanding Student award once, and achieved scores of 90+ in multiple theoretical and practical courses: Advanced Mathematics (99), Deep Learning (97), Analog Electronic Technology (92), and Complex Functions (92).
Collaborative research at leading institutions
Supervisor: Prof. Renjing Xu
Engaged in advanced research projects focusing on AI security, multimodal systems, and adversarial machine learning. Contributing to cutting-edge publications and developing novel approaches to AI robustness and security challenges.
Participated in multiple CCF-A paper projects, gradually building practical research and academic writing skills through the experience.
IJCAI 2025 Workshop
Investigates novel attack vectors in multimodal AI systems through typographic visual prompts, revealing critical security vulnerabilities and proposing defensive mechanisms.
ACM Multimedia Conference 2025
Explores the dual nature of transfer attacks in multimodal systems, analyzing both malicious applications and potential benefits for robustness testing and model improvement.
Experience our cutting-edge audio-based jailbreak testing framework. This interactive demo showcases the capabilities of our benchmark system for evaluating multimodal AI security vulnerabilities through audio inputs.
Explore various audio attack scenarios, test different AI models, and understand the importance of audio security in multimodal systems.
Technical and professional skills developed through research and practice
Proficient in modern web technologies including HTML5, CSS3, JavaScript, and various frameworks. Experience in building responsive, interactive web applications and research presentation platforms.
Advanced proficiency in Python for research, including deep learning frameworks such as PyTorch and TensorFlow, data analysis with NumPy and Pandas, and scientific computing.
Explaining interpretability papers and demonstrating implementations of their algorithms to fellow students on Bilibili, with highly positive audience feedback.
Recognition for academic excellence and research contributions
Received a Best Paper Nomination at the IJCAI 2025 Workshop for research on typographic visual prompt injection threats in multimodal AI systems.
Maintained outstanding academic performance with a GPA of 88.36/100 in the competitive AI program at Beijing University of Technology.
Selected for a prestigious research internship at HKUST(GZ) under the supervision of Prof. Renjing Xu, focusing on advanced AI security research.