The last 5 uploaded publications
LEANN: A Low-Storage Vector Index
Yichuan Wang, Zhifei Li, Shu Liu, Yongji Wu, Ziming Mao, Yilong Zhao, Yan Xiao, Zhiying Xu, Yang Zhou, Ion Stoica, Sewon Min, Matei Zaharia, Joseph E. Gonzalez (2025). LEANN: A Low-Storage Vector Index. DOI: https://doi.org/10.48550/arxiv.2506.08276.
Preprint, 21 days ago
Accelerating Large-Scale Reasoning Model Inference with Sparse Self-Speculative Decoding
Yilong Zhao, Jiaming Tang, Kan Zhu, Zihao Ye, Chi-Chih Chang, Chih‐Jen Lin, Jongseok Park, Guangxuan Xiao, Mohamed S. Abdelfattah, Mingyu Gao, Baris Kasikci, Song Han, Ion Stoica (2025). Accelerating Large-Scale Reasoning Model Inference with Sparse Self-Speculative Decoding. DOI: https://doi.org/10.48550/arxiv.2512.01278.
Preprint, 21 days ago
Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity
Haocheng Xi, Shuo Yang, Yilong Zhao, Chenfeng Xu, Muyang Li, Xiuyu Li, Yujun Lin, Han Cai, Jintao Zhang, Dacheng Li, Chen Jian-fei, Ion Stoica, Kurt Keutzer, Song Han (2025). Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity. DOI: https://doi.org/10.48550/arxiv.2502.01776.
Preprint, 21 days ago
An Extensible Software Transport Layer for GPU Networking
Yang Zhou, Zhongjie Chen, Ziming Mao, ChonLam Lao, Shuo Yang, Pravein Govindan Kannan, Jiaqi Gao, Yilong Zhao, Yongji Wu, Kaichao You, Fengyuan Ren, Zhiying Xu, Costin Raiciu, Ion Stoica (2025). An Extensible Software Transport Layer for GPU Networking. DOI: https://doi.org/10.48550/arxiv.2504.17307.
Preprint, 21 days ago
BlendServe: Optimizing Offline Inference for Auto-regressive Large Models with Resource-aware Batching
Yilong Zhao, Shuo Yang, Kan Zhu, Lianmin Zheng, Baris Kasikci, Yang Zhou, Jiarong Xing, Ion Stoica (2024). BlendServe: Optimizing Offline Inference for Auto-regressive Large Models with Resource-aware Batching. DOI: https://doi.org/10.48550/arxiv.2411.16102.
Preprint, 21 days ago