Vector Database

What is a vector database?
A vector database is a specialized storage system designed to store, index, and search vector embeddings.
**How does it work?**
1. Vectorization
Raw data (text, images, audio) is processed by a machine learning model to create a vector embedding: a long list of numbers that acts as a digital fingerprint of the item's features.
2. Indexing
The database organizes these vectors in a multi-dimensional space, where related items are positioned mathematically closer to each other.
3. Similarity Search
When a query is made, it is converted to a vector, and the database then identifies its nearest neighbours.
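
The three steps above can be sketched in a few lines of Python. This is only a toy illustration, not how a production vector database is built: the `embed` function is a stand-in that counts words from a small hypothetical vocabulary (`VOCAB`) instead of calling a real ML model, and `TinyVectorStore` is a made-up class that scans every stored vector rather than using a real index.

```python
import math
from collections import Counter

# Hypothetical fixed vocabulary; a real system would use a trained embedding model.
VOCAB = ["cat", "dog", "pet", "car", "engine", "road"]

def embed(text):
    """Step 1 (toy vectorization): count vocabulary words, then L2-normalize."""
    counts = Counter(text.lower().split())
    vec = [float(counts[w]) for w in VOCAB]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

def cosine(a, b):
    """Cosine similarity of two already-normalized vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))

class TinyVectorStore:
    """Step 2 (toy indexing): keep every vector in a list, no real index."""
    def __init__(self):
        self.items = []  # list of (text, vector) pairs

    def add(self, text):
        self.items.append((text, embed(text)))

    def search(self, query, k=2):
        """Step 3: embed the query and return the k most similar texts."""
        q = embed(query)
        scored = [(cosine(q, vec), text) for text, vec in self.items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]

store = TinyVectorStore()
for doc in ["the cat is a pet", "the dog is a pet",
            "the car has an engine", "the road is long"]:
    store.add(doc)

results = store.search("my pet cat", k=2)
print(results)
```

The query shares both "pet" and "cat" with the first document and only "pet" with the second, so those two rank highest. A real vector database replaces the full scan in `search` with an approximate nearest-neighbour index so that queries stay fast over millions of vectors.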

Flow depicting the use of vector embedding
Why a Vector Database?
With so many databases already available (MySQL, Oracle, NoSQL stores, InfluxDB), a natural question is why a vector database is needed.
The choice of database depends on the structure of the data we want to store.
Here, the data is unstructured: text, images, audio. Such data is represented as high-dimensional vector embeddings, so we need a database that can handle them.
Another reason is that a vector database enables semantic (similarity) search.
| Feature | SQL/NoSQL | Vector Database |
| --- | --- | --- |
| Data Types | Structured (integers, strings, dates) | Embeddings of unstructured data (text, audio, images) |
| Search Type | Exact matches / logical queries | Similarity / contextual searches |
| Indexing | B-trees, hash maps | e.g. Hierarchical Navigable Small World (HNSW) |
| Use Cases | Transactions, inventory, CRM | Recommendations, chatbots, image retrieval |
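
To make the "Search Type" row concrete, here is a minimal sketch contrasting an exact-match lookup with a nearest-neighbour lookup. The product names, the 3-dimensional vectors, and the query vector are all hand-written assumptions for illustration; in practice the vectors would come from an embedding model and have hundreds of dimensions.

```python
import math

# Hypothetical 3-dimensional embeddings, hand-written for illustration only.
products = {
    "running shoes": [0.9, 0.1, 0.0],
    "trail sneakers": [0.8, 0.2, 0.1],
    "coffee maker": [0.0, 0.1, 0.9],
}

def exact_match(name):
    """Exact-match lookup: returns the key only if it matches exactly."""
    return name if name in products else None

def nearest(query_vec):
    """Similarity lookup: returns the key whose vector is closest (Euclidean)."""
    return min(products, key=lambda name: math.dist(products[name], query_vec))

# Assume a model mapped the query "jogging footwear" to this vector.
query_vec = [0.88, 0.1, 0.02]

exact_result = exact_match("jogging footwear")  # no key matches this string
similar_result = nearest(query_vec)             # closest stored vector wins

print(exact_result, similar_result)
```

The exact-match lookup finds nothing because no stored key equals the query string, while the similarity lookup still returns the closest item in vector space.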
Vector databases commonly used with popular AI models
| Model | Vector Databases |
| --- | --- |
| OpenAI (ChatGPT) | Pinecone, Milvus, Weaviate and Qdrant |
| Gemini | Vertex AI Vector Search, AlloyDB and pgvector |
| DeepSeek | Milvus, Qdrant and ChromaDB |
What’s next: which vector database to choose, and a comparison of popular vector databases.




