The last 5 uploaded publications
Pie: Pooling CPU Memory for LLM Inference
Yi Xu, Ziming Mao, Xiangxi Mo, Shu Liu, Ion Stoica (2024). Pie: Pooling CPU Memory for LLM Inference. DOI: https://doi.org/10.48550/arxiv.2411.09317.
Preprint, 115 days ago
MoE-Lightning: High-Throughput MoE Inference on Memory-constrained GPUs
Shiyi Cao, Shu Liu, Tyler Griggs, Peter Schafhalter, Xiaoxuan Liu, Ying Sheng, Joseph E. Gonzalez, Matei Zaharia, Ion Stoica (2024). MoE-Lightning: High-Throughput MoE Inference on Memory-constrained GPUs. DOI: https://doi.org/10.48550/arxiv.2411.11217.
Preprint, 115 days ago