AWS recently announced the general availability of S3 Vectors, a cloud object storage service with native support for storing and querying vector data. With the GA release, the company increases per-index capacity forty-fold to 2 billion vectors and introduces sub-100ms query latencies.
The service launched in preview this past July, and the company reports that users have since created over 250,000 vector indexes and ingested more than 40 billion vectors. The preview capped indexes at 50 million vectors; Sébastien Stormacq, principal developer advocate at AWS, writes:
You can now store and search across up to 2 billion vectors in a single index… This means you can consolidate your entire vector dataset into a single index, eliminating the need to shard across multiple smaller indexes or to implement complex query federation logic.
In addition, the GA release improves query performance: infrequent queries return results in under one second, while frequent queries achieve latencies of 100ms or less, which benefits interactive applications such as conversational AI. According to the company, up to 100 search results can now be retrieved per query, providing richer context for retrieval-augmented generation (RAG) applications. Finally, write performance now supports up to 1,000 PUT transactions per second for single-vector updates, enabling higher throughput with small batch sizes and immediate searchability of new data from multiple concurrent sources.
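To illustrate how the larger per-query result cap might be used in a RAG pipeline, the sketch below assembles a request for the S3 Vectors `query_vectors` API. The bucket and index names are hypothetical, and the parameter shapes here are assumptions that should be checked against the current AWS SDK documentation.

```python
# Sketch: building a query against an S3 Vectors index to gather RAG context.
# Parameter names follow the s3vectors query_vectors API; verify against the
# SDK docs before relying on them.

def build_query_request(embedding, k=100,
                        bucket="my-vector-bucket", index="docs-index"):
    """Assemble keyword arguments for a query_vectors call.
    At GA, up to 100 results can be returned per query."""
    if not 1 <= k <= 100:
        raise ValueError("topK must be between 1 and 100")
    return {
        "vectorBucketName": bucket,
        "indexName": index,
        "queryVector": {"float32": list(embedding)},
        "topK": k,
        "returnMetadata": True,   # include stored chunk text for the prompt
        "returnDistance": True,
    }

# Usage (requires boto3 and AWS credentials):
# client = boto3.client("s3vectors")
# resp = client.query_vectors(**build_query_request(query_embedding))
# context = [v["metadata"].get("text", "") for v in resp["vectors"]]

request = build_query_request([0.1, 0.2, 0.3], k=10)
```

Fetching up to 100 neighbors in a single call avoids the multi-query fan-out that smaller result caps would otherwise force on RAG applications.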
The company also states that two key integrations launched during the preview are now generally available: S3 Vectors as a vector storage engine for Amazon Bedrock Knowledge Bases, and the integration with Amazon OpenSearch Service, which lets users keep S3 Vectors as their vector storage layer while relying on OpenSearch for search and analytics.
Jalaj Nautiyal, a developer, writes in a LinkedIn post:
S3 Vectors moves vector search from a Compute-First problem to a Storage-First solution. The “Serverless” Shift: You no longer manage clusters, pods, or shards. You treat vectors like any other object in S3. Scale: Store billions of vectors.
Cost: Reduce total ownership costs by up to 90%. You pay for S3 storage (cheap) + query fees. No idle compute costs.
In addition, he writes:
For 80% of internal RAG applications and autonomous agents, you probably don’t need the Ferrari of vector databases. You just need a reliable, infinite trunk. S3 just became that trunk.
Currently, S3 Vectors is available in 14 AWS regions, up from 5 during the preview. Pricing depends on three dimensions:
- PUT pricing is based on the logical GB of vectors uploaded, where each vector comprises its logical vector data, metadata, and key.
- Storage pricing is based on the total logical storage across all indexes.
- Query pricing includes a per-API-request charge plus a $/TB charge based on index size (excluding non-filterable metadata).
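To make the PUT and storage dimensions concrete, the back-of-the-envelope estimate below computes the logical size of a dataset, assuming float32 embeddings (4 bytes per dimension); the metadata and key sizes are illustrative placeholders, and actual charges follow the pricing page.

```python
# Back-of-the-envelope logical-size estimate for S3 Vectors billing.
# Assumptions: float32 embeddings (4 bytes per dimension); the metadata and
# key byte counts below are illustrative, not measured values.

FLOAT32_BYTES = 4

def logical_vector_bytes(dimensions, metadata_bytes, key_bytes):
    """Logical size of one vector: vector data + metadata + key."""
    return dimensions * FLOAT32_BYTES + metadata_bytes + key_bytes

def logical_gb(num_vectors, dimensions, metadata_bytes=512, key_bytes=64):
    """Total logical GB for a dataset, the unit PUT and storage pricing use."""
    per_vector = logical_vector_bytes(dimensions, metadata_bytes, key_bytes)
    return num_vectors * per_vector / 1e9

# Example: 10 million 1024-dimension vectors with ~512 B of metadata each.
size_gb = logical_gb(10_000_000, 1024)  # ~46.7 logical GB
```

Because billing is based on logical rather than physical size, trimming per-vector metadata (or marking large fields as non-filterable, which query charges exclude) directly reduces both PUT and query costs.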
More details on pricing are available on the pricing page.
