Vector store

Vector stores are an important class of storage system in the AI era. I believe they not only serve as a cache for retrieving computation results from neural networks, but are also becoming (or will become) a fundamental component at the core of neural models. For example, we found that attention, a fundamental mechanism in Transformer-like neural architectures, can be viewed as vector index traversal (Liu et al., 2024). This view makes sparse attention much cheaper to compute, because each query attends only to the keys that the index retrieves rather than to every token in the context. For LLMs with a long context window, the benefit can be one or more orders of magnitude.

Here are some of our thoughts on vector stores:

  • Integrating vector indices with relational databases using relaxed monotonicity (Zhang et al., 2023).
  • Updating a vector index incrementally (Xu et al., 2023).
  • Vector indices can be dense or sparse. Rather than being served by separate solutions, they can be unified under one generic design (Chen et al., 2024).
  • Attention can be recast as vector retrieval, which makes sparse attention significantly more efficient (Liu et al., 2024).
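
To make the last point concrete, here is a minimal sketch of attention computed over only the top-k keys retrieved for a query. It is an illustration under stated assumptions, not the algorithm from Liu et al. (2024): the function names are hypothetical, and a brute-force inner-product scan stands in for a real vector index. In practice an approximate index would replace the scan, so that retrieval itself avoids touching every key.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values, indices):
    """Scaled dot-product attention restricted to the given key indices."""
    indices = list(indices)
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, keys[i]) * scale for i in indices])
    dim = len(values[0])
    return [
        sum(w * values[i][d] for w, i in zip(weights, indices))
        for d in range(dim)
    ]


def full_attention(query, keys, values):
    """Dense attention: every key participates."""
    return attend(query, keys, values, range(len(keys)))


def retrieve_top_k(query, keys, k):
    """Hypothetical stand-in for a vector index: exact top-k by inner product.

    This scan is O(n); it only marks where an index lookup would go.
    """
    ranked = sorted(
        range(len(keys)), key=lambda i: dot(query, keys[i]), reverse=True
    )
    return ranked[:k]


def retrieval_attention(query, keys, values, k):
    """Sparse attention: attend only to the k retrieved keys."""
    return attend(query, keys, values, retrieve_top_k(query, keys, k))


if __name__ == "__main__":
    # Toy data (made up for illustration): two keys align strongly with the query.
    query = [4.0, 0.0]
    keys = [[5.0, 0.0], [4.0, 1.0], [-3.0, 0.0], [0.0, 2.0], [-1.0, -1.0], [-4.0, 2.0]]
    values = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [9.0, 9.0], [7.0, 7.0], [3.0, 3.0]]
    print("full  :", full_attention(query, keys, values))
    print("top-2 :", retrieval_attention(query, keys, values, k=2))
```

The sketch relies on the softmax concentrating most of its weight on the few keys with the largest inner products. When that holds, as in the toy data above, the top-k result stays close to the dense result while touching far fewer keys. How well this carries over to real models depends on how sparse their attention actually is.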

References

2024

  1. ENLSP
    RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
    Di Liu, and 13 more authors
    In NeurIPS Efficient Natural Language and Speech Processing workshop, NeurIPS ENLSP-IV, 2024
  2. WWW
    OneSparse: A Unified System for Multi-index Vector Search
    Yaoqi Chen, and 16 more authors
    In Companion Proceedings of the ACM Web Conference 2024, Singapore, Singapore, 2024

2023

  1. OSDI
    VBASE: Unifying Online Vector Similarity Search and Relational Queries via Relaxed Monotonicity
    Qianxi Zhang, and 11 more authors
    In 17th USENIX Symposium on Operating Systems Design and Implementation, OSDI, 2023
  2. SOSP
    SPFresh: Incremental In-Place Update for Billion-Scale Vector Search
    Yuming Xu, and 11 more authors
    In Proceedings of the 29th Symposium on Operating Systems Principles, SOSP, 2023