Vector store
A vector store is an important storage system in the AI era. I believe it not only serves as a cache for retrieving precomputed results from neural networks, but is also becoming (or will become) a fundamental component at the core of neural models. For example, we found that attention, a fundamental mechanism in Transformer-like neural architectures, can be viewed as vector index traversal (Liu et al., 2024). This makes the computation of sparse attention much more efficient; for LLMs with a long context window, the speedup can reach one or more orders of magnitude.
Here are some of our thoughts on vector stores.
- Vector indices can be integrated with relational databases using relaxed monotonicity. (Zhang et al., 2023).
- A vector index can be updated incrementally. (Xu et al., 2023).
- Vector indices can be dense or sparse. Instead of being served by separate solutions, the two can be unified under one generic design. (Chen et al., 2024).
- Attention can be reformulated as vector retrieval, making sparse attention significantly more efficient. (Liu et al., 2024).
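To make the attention-as-retrieval view concrete, here is a minimal toy sketch (my own illustration, not the implementation from Liu et al., 2024): a query attends only to the top-k keys ranked by query-key similarity, which is precisely the lookup that a vector index accelerates. With k much smaller than the context length, the softmax and value aggregation touch only k rows instead of all of them.

```python
import numpy as np

def topk_sparse_attention(q, K, V, k):
    """Approximate single-query attention by attending only to the
    k keys most similar to q (as if retrieved from a vector index).

    q: (d,) query vector; K, V: (n, d) key and value matrices.
    """
    # Similarity search step: score all keys, keep the top-k indices.
    # A vector index would return these without scanning every key.
    scores = K @ q / np.sqrt(q.shape[0])          # (n,) scaled dot products
    idx = np.argpartition(scores, -k)[-k:]        # indices of the top-k keys

    # Softmax restricted to the retrieved keys (numerically stabilized).
    w = np.exp(scores[idx] - scores[idx].max())
    w /= w.sum()

    # Aggregate only the retrieved values.
    return w @ V[idx]                             # (d,) attention output
```

Setting k equal to the context length recovers exact dense attention, so the approximation error is controlled entirely by how well the retrieval step finds the high-scoring keys.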
References
2024
- RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval. In NeurIPS Efficient Natural Language and Speech Processing Workshop (ENLSP-IV), 2024.