Yifan Sui

I am a 5th-year Ph.D. candidate at Shanghai Jiao Tong University (since 2021), advised by Prof. Jianxun Li. I also work closely with Prof. Hao Wang from Stevens Institute of Technology. I am currently a research intern in the System Group at Microsoft Research Asia (since June 2025).

My research interests lie in Serverless Computing, Efficient LLM Systems, and Cloud Systems.

Research

My research focuses on re-architecting the serverless paradigm for state-of-the-art AI workflows. Serverless computing matters for AI because it scales resources on demand, which eliminates idle GPU waste and reduces costs. However, current serverless platforms are not yet well suited to AI workloads. My work addresses this gap across the AI stack, from efficient model inference to LLM serving to agentic workflows.

Selected Publications

See Google Scholar for the full list.

Act While Thinking: Accelerating LLM Agent Serving via Speculative Tool Execution
Yifan Sui, Han Zhao, Rui Ma, Hao Wang, Zhiyuan He, Jianxun Li, Yuqing Yang
Under Submission, 2026
[paper]
xLoRA: Faster and Cheaper LoRA LLM Serving with Serverless Computing
Yifan Sui, Hao Wang, Hanfei Yu, Yitao Hu, Chen Chen, Jianxun Li
Under Submission, 2025
[paper] [arXiv]
Accelerating ML Inference via Opportunistic Pre-Loading on Serverless Clusters
Yifan Sui, Hanfei Yu, Yitao Hu, Jianxun Li, Hao Wang
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2025
[paper]
Pre-Warming Is Not Enough: Accelerating Serverless Inference With Opportunistic Pre-Loading (Best Paper Award)
Yifan Sui, Hanfei Yu, Yitao Hu, Jianxun Li, Hao Wang
ACM Symposium on Cloud Computing (SoCC), 2024
[paper]

Education

Ph.D. Candidate, Shanghai Jiao Tong University, 2021 - Present
B.S., Beijing University of Posts and Telecommunications, 2017 - 2021