publications
Please find the complete list here. Recent selected publications are shown below.
2024
- Amanda: Unified Instrumentation Framework for Deep Neural Networks. In Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS’24, 2024
- Aceso: Efficient Parallel DNN Training through Iterative Bottleneck Alleviation. In Proceedings of the Nineteenth European Conference on Computer Systems, EuroSys, 2024
- Tessel: Boosting Distributed DNN Execution with Flexible Schedule Search. In 30th International Symposium on High-Performance Computer Architecture, HPCA, 2024
2023
- VBASE: Unifying Online Vector Similarity Search and Relational Queries via Relaxed Monotonicity. In 17th USENIX Symposium on Operating Systems Design and Implementation, OSDI, 2023
- Cocktailer: Analyzing and Optimizing Dynamic Control Flow in Deep Learning. In 17th USENIX Symposium on Operating Systems Design and Implementation, OSDI, 2023
- Welder: Scheduling Deep Learning Memory Access via Tile-graph. In 17th USENIX Symposium on Operating Systems Design and Implementation, OSDI, 2023
- Optimizing Dynamic Neural Networks with Brainstorm. In 17th USENIX Symposium on Operating Systems Design and Implementation, OSDI, 2023
- PIT: Optimization of Dynamic Sparse Deep Learning Models via Permutation Invariant Transformation. In Proceedings of the 29th Symposium on Operating Systems Principles, SOSP, 2023
- SPFresh: Incremental In-Place Update for Billion-Scale Vector Search. In Proceedings of the 29th Symposium on Operating Systems Principles, SOSP, 2023
- SiloD: A Co-design of Caching and Scheduling for Deep Learning Clusters. In Proceedings of the Eighteenth European Conference on Computer Systems, EuroSys, 2023
- ElasticFlow: An Elastic Serverless Training Platform for Distributed Deep Learning. In Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, ASPLOS, 2023
- OliVe: Accelerating Large Language Models via Hardware-friendly Outlier-Victim Pair Quantization. In Proceedings of the 50th Annual International Symposium on Computer Architecture, ISCA, 2023
- On Modular Learning of Distributed Systems for Predicting End-to-End Latency. In 20th USENIX Symposium on Networked Systems Design and Implementation, NSDI, 2023
- Model-enhanced Vector Index. In Advances in Neural Information Processing Systems, NeurIPS, 2023
2022
- SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation. In The Tenth International Conference on Learning Representations, ICLR, 2022
- ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization. In 55th IEEE/ACM International Symposium on Microarchitecture, MICRO, 2022
- ROLLER: Fast and Efficient Tensor Compilation for Deep Learning. In 16th USENIX Symposium on Operating Systems Design and Implementation, OSDI, 2022
- PilotFish: Harvesting Free Cycles of Cloud Gaming with Deep Learning Training. In 2022 USENIX Annual Technical Conference, USENIX ATC, 2022
- SparTA: Deep-Learning Model Sparsity via Tensor-with-Sparsity-Attribute. In 16th USENIX Symposium on Operating Systems Design and Implementation, OSDI, 2022
2020
- Capuchin: Tensor-based GPU Memory Management for Deep Learning. In ASPLOS ’20: Architectural Support for Programming Languages and Operating Systems, ASPLOS, 2020
- HiveD: Sharing a GPU Cluster for Deep Learning with Guarantees. In 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI, 2020
- Rammer: Enabling Holistic Deep Learning Compiler Optimizations with rTasks. In 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI, 2020
- Retiarii: A Deep Learning Exploratory-Training Framework. In 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI, 2020
2019
- Analysis of Large-Scale Multi-Tenant GPU Clusters for DNN Training Workloads. In 2019 USENIX Annual Technical Conference, USENIX ATC, 2019
2018
- Gandiva: Introspective Cluster Scheduling for Deep Learning. In 13th USENIX Symposium on Operating Systems Design and Implementation, OSDI, 2018