Fan Yang

Systems researcher, Research Manager of SRG@MSR-Asia


personal email: yang DOT fan AT 163 DOT com

work email: fanyang AT microsoft DOT com

My name is Fan Yang (杨凡 in Chinese). I am a systems researcher and the research manager of the Systems Research Group (SRG) at Microsoft Research Asia (MSR-Asia). I joined MSR-Asia after receiving my bachelor’s and doctoral degrees in Computer Science from Nanjing University.

My research passion lies in computer systems. My recent focus is on exploring the fundamental principles of systems for Artificial Intelligence (AI). I was among the first to discover and advocate now well-known design principles for AI systems, including the tile abstraction for AI compilers and relaxed monotonicity for vector stores. Some techniques and solutions derived from these principles have been open-sourced and adopted by Microsoft products such as Azure, M365, and Bing, and the corresponding research results have appeared in top systems conferences such as OSDI and SOSP. Some open-source projects, such as OpenPAI and NNI, have even incubated new businesses. More recently, I have become passionate about the co-design of AI algorithms and systems, which I believe will define the next chapter of AI. In the past, I worked on large-scale systems, such as graph systems. I co-developed GraM, a high-performance graph engine that set a new speed record for trillion-scale graph analytics.

As a researcher, I also engage in public academic service, including serving on the program committees of ASPLOS (2022), EuroSys (2023, 2025), and ChinaSys (19th).

selected publications

  1. RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
    Di Liu, and 13 more authors
    ArXiv, 2024
  2. Uncovering Nested Data Parallelism and Data Reuse in DNN Computation with FractalTensor
    Siran Liu, and 7 more authors
    In SOSP, 2024
  3. Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
    Marah Abdin, and 83 more authors
    ArXiv, 2024. (Applying LongRoPE to Phi-3)
  4. nnScaler: Constraint-Guided Parallelization Plan Generation for Deep Learning Training
    Zhiqi Lin, and 13 more authors
    In 18th USENIX Symposium on Operating Systems Design and Implementation, OSDI, 2024
  5. Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
    Chaofan Lin, and 6 more authors
    In 18th USENIX Symposium on Operating Systems Design and Implementation, OSDI, 2024
  6. VBASE: Unifying Online Vector Similarity Search and Relational Queries via Relaxed Monotonicity
    Qianxi Zhang, and 11 more authors
    In 17th USENIX Symposium on Operating Systems Design and Implementation, OSDI, 2023
  7. ROLLER: Fast and Efficient Tensor Compilation for Deep Learning
    Hongyu Zhu, and 14 more authors
    In 16th USENIX Symposium on Operating Systems Design and Implementation, OSDI, 2022