The last 5 uploaded publications
TurboSpec: Closed-loop Speculation Control System for Optimizing LLM Serving Goodput
Xiaoxuan Liu, Cade Daniel, Langxiang Hu, Woosuk Kwon, Zhuohan Li, Xiangxi Mo, Alvin Cheung, Zhijie Deng, Ion Stoica, Hao Zhang (2024). TurboSpec: Closed-loop Speculation Control System for Optimizing LLM Serving Goodput. DOI: https://doi.org/10.48550/arxiv.2406.14066.
Preprint, 10 days ago
Efficient Memory Management for Large Language Model Serving with PagedAttention
Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph E. Gonzalez, Hao Zhang, Ion Stoica (2023). Efficient Memory Management for Large Language Model Serving with PagedAttention. DOI: https://doi.org/10.1145/3600006.3613165.
Article, 10 days ago
Efficient Memory Management for Large Language Model Serving with PagedAttention
Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph E. Gonzalez, Hao Zhang, Ion Stoica (2023). Efficient Memory Management for Large Language Model Serving with PagedAttention. DOI: https://doi.org/10.48550/arxiv.2309.06180.
Preprint, 10 days ago