My research focuses on scalable, efficient, and reliable learning for agentic AI systems, particularly in the context of large language models (LLMs), diffusion models, and other generative models. I explore methods that enhance inference-time performance, shift the computational efficiency curve, and ensure robust generalization and reliable learning across diverse applications.
Test-Time Scaling & Adaptation
I study test-time scaling laws to enhance AI performance dynamically at inference. My work explores adaptive inference strategies and leverages test-time training to generate and exploit high-quality data on the fly. I also optimize reasoning workflows that improve model accuracy through self-refinement, where a model iteratively critiques and revises its own outputs.
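To make the self-refinement idea concrete, here is a minimal sketch of a generate-critique-refine loop; the `generate`, `critique`, and `refine` callables are hypothetical stand-ins for model queries, not a description of any particular system of mine.

```python
from typing import Callable

def self_refine(
    prompt: str,
    generate: Callable[[str], str],           # produces an initial draft
    critique: Callable[[str, str], str],      # returns feedback on a draft
    refine: Callable[[str, str, str], str],   # revises a draft given feedback
    max_rounds: int = 3,
    stop_token: str = "LGTM",                 # hypothetical "draft is fine" signal
) -> str:
    """Iteratively improve a model output using the model's own feedback."""
    draft = generate(prompt)
    for _ in range(max_rounds):
        feedback = critique(prompt, draft)
        if stop_token in feedback:  # critic judges the draft acceptable; stop early
            break
        draft = refine(prompt, draft, feedback)
    return draft
```

The loop trades extra inference-time compute (more critique and revision calls) for higher answer quality, which is the basic currency of test-time scaling.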
Shifting the Computational Curve
To optimize the efficiency-performance tradeoff in AI systems, my research focuses on accelerating models and enabling dynamic computation. This includes model compression and distillation, as well as conditional computation methods that spend compute only where inputs demand it. I also explore system-algorithm co-design for large-model inference and training, along with hardware-algorithm co-design to develop energy-efficient neural architectures and to better match models to AI accelerators.
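As one illustration of the distillation side, the sketch below implements the standard soft-label distillation loss of Hinton et al. (2015), blending a temperature-scaled KL term against teacher logits with the usual cross-entropy; it assumes PyTorch and is a minimal example, not a statement of my own method.

```python
import torch
import torch.nn.functional as F

def distillation_loss(
    student_logits: torch.Tensor,  # (batch, classes)
    teacher_logits: torch.Tensor,  # (batch, classes)
    labels: torch.Tensor,          # (batch,) ground-truth class indices
    temperature: float = 4.0,      # softens both distributions
    alpha: float = 0.5,            # weight on the distillation term
) -> torch.Tensor:
    """Soft-label distillation: KL to the teacher plus cross-entropy to labels."""
    # Temperature-softened distributions; the T^2 factor restores gradient scale.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * temperature**2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce
```

A smaller student trained with this loss inherits the teacher's dark knowledge (relative probabilities over wrong classes), which is one lever for shifting the computational curve.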
Reliable Learning & Robust Generalization
Beyond efficiency, I work toward trustworthy AI through robustness, uncertainty quantification, and adaptive learning. My research includes uncertainty-aware techniques that let models recognize and communicate the uncertainty of their predictions, improving reliability in deployment. I also investigate adaptive learning strategies that allow AI systems to refine their knowledge efficiently from new data with minimal supervision.
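As a simple illustration of uncertainty-aware prediction, the sketch below computes the predictive entropy of a deep ensemble, one common uncertainty-quantification baseline; `models` is a hypothetical list of trained classifiers, and the technique shown is a standard method rather than a specific contribution of mine.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def ensemble_uncertainty(models, x: torch.Tensor):
    """Predictive mean and entropy from a deep ensemble of classifiers."""
    # Stack each member's class probabilities: shape (members, batch, classes).
    probs = torch.stack([F.softmax(m(x), dim=-1) for m in models])
    mean_probs = probs.mean(dim=0)  # the ensemble's averaged prediction
    # Predictive entropy: high values flag inputs the ensemble is unsure about,
    # which a deployed system can abstain on or route to a stronger model.
    entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)
    return mean_probs, entropy
```

Exposing a calibrated uncertainty signal like this entropy is what allows a system to know when not to trust its own answer, which is the practical core of reliable deployment.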