Instacart has redesigned its search infrastructure by replacing Elasticsearch with PostgreSQL, combining keyword and embedding-based retrieval in a single system. By consolidating catalog and search data into Postgres, the company aimed to simplify operations, reduce synchronization overhead, and improve precision and recall in search results.
A key part of the redesign was improving how results are retrieved. Traditional keyword search excels at matching exact product attributes; for example, a query like “pesto pasta sauce 8oz” benefits from precise lexical matching. But broader intent-driven queries, such as “healthy foods”, are better handled through semantic retrieval, which understands relationships between terms and concepts. By combining both approaches in Postgres, Instacart can balance precision (returning only relevant results) with recall (capturing as many relevant items as possible), ensuring that customers see both the exact products they’re looking for and meaningful options for discovery.
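The article does not describe how Instacart blends the two ranked lists, but a common technique for this kind of hybrid merge is reciprocal rank fusion (RRF). The sketch below is illustrative only; the product IDs and the `k` constant are hypothetical.

```python
# Illustrative sketch: merging lexical and semantic result lists with
# reciprocal rank fusion (RRF). Not Instacart's actual blending logic;
# product IDs and k are made up for the example.

def rrf_merge(lexical_ids, semantic_ids, k=60):
    """Score each item by summing 1 / (k + rank) across both ranked lists."""
    scores = {}
    for ranked in (lexical_ids, semantic_ids):
        for rank, item in enumerate(ranked, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest combined score first.
    return sorted(scores, key=scores.get, reverse=True)

# "pesto pasta sauce 8oz": the exact match tops both lists, so it wins,
# while semantically related items remain available for discovery.
lexical = ["pesto-8oz", "pesto-16oz", "marinara-8oz"]
semantic = ["pesto-8oz", "basil-pesto", "alfredo-sauce"]
print(rrf_merge(lexical, semantic))
```

An item that ranks well in both lists outscores one that ranks well in only one, which is how the merge trades off precision against recall without hand-tuned score normalization.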
According to the Instacart engineering team, the migration improved development velocity by removing the need to reconcile data between systems. The hybrid infrastructure also provided greater flexibility in handling dynamic inventory and complex user preferences, enabling the platform to process millions of search requests daily. Real-time updates to prices, availability, and discounts are reflected instantly, supporting a more efficient and personalized shopping experience for customers.
As Ankit Mittal, an engineer at Instacart, remarked:
A normalized data model allowed us to achieve a 10x reduction in write workload compared to the denormalized data model we used in Elasticsearch. This resulted in nearly 80% savings on storage and indexing costs, reduced dead-end searches, and improved the overall customer experience.
Previously, Elasticsearch handled full-text queries while transactional data was stored in Postgres. Maintaining two separate databases introduced synchronization challenges and higher operational costs. To add semantic search capabilities, the team initially implemented FAISS before transitioning to a hybrid model using the pgvector extension in Postgres. This approach allows both lexical and embedding-based retrieval to run in a single system, reducing data duplication and complexity.
Previous retrieval architecture with FAISS and Postgres (Source: Instacart Engineering Blog)
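pgvector ranks rows by vector distance, for instance cosine distance via its `<=>` operator. As a rough illustration of what that nearest-neighbor step computes, here is a pure-Python equivalent over toy three-dimensional vectors; the product names and embeddings are invented, and real embeddings have far more dimensions.

```python
import math

# Toy illustration of the cosine-distance nearest-neighbor search that
# pgvector performs inside Postgres (its <=> operator). The catalog,
# vectors, and query embedding are all hypothetical.

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

catalog = {
    "kale salad": [0.9, 0.1, 0.0],
    "granola bar": [0.7, 0.3, 0.1],
    "soda": [0.1, 0.9, 0.2],
}
query = [0.8, 0.2, 0.0]  # stand-in embedding for "healthy foods"

# Roughly: ORDER BY embedding <=> query LIMIT 2
nearest = sorted(catalog, key=lambda name: cosine_distance(catalog[name], query))[:2]
print(nearest)
```

Running the lexical match and this distance ordering inside the same database is what removes the cross-system synchronization the FAISS setup required.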
The redesigned architecture uses sharded Postgres instances with a normalized data model to scale horizontally. Each shard contains catalog and search indexes, and queries are routed through a service layer to the appropriate shard. According to Instacart engineers, the team achieved high-performance text matching by leveraging Postgres GIN indexes and a modified ts_rank function, while the relational model allowed ML features and model coefficients to be stored in separate tables. Normalization reduced write workloads tenfold compared to Elasticsearch, cutting storage and indexing costs while supporting hundreds of gigabytes of ML feature data for more advanced retrieval models.
Hybrid retrieval architecture with pgvector and Postgres (Source: Instacart Engineering Blog)
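The article does not detail the routing logic in the service layer. A minimal hash-based routing sketch might look like the following; the shard count and the choice of a retailer identifier as the shard key are assumptions for illustration.

```python
import hashlib

# Minimal sketch of a service layer routing a query to one of several
# sharded Postgres instances. The shard count and the retailer_id shard
# key are assumptions; the article does not specify either.

NUM_SHARDS = 8

def shard_for(key: str) -> int:
    """Stable hash so the same key always maps to the same shard."""
    digest = hashlib.sha256(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

def route(retailer_id: str) -> str:
    return f"postgres-shard-{shard_for(retailer_id)}"

print(route("retailer-123"))  # deterministic: same input, same shard
```

A stable hash keeps each retailer's catalog and search indexes on one shard, so a query only ever touches a single Postgres instance.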
Postgres extensions were central to the redesign. Extensions such as pg_trgm for trigram-based text search and pgvector for embedding-based search allow the database to handle both traditional keyword and semantic queries. Queries pass through a routing layer to the shards containing the necessary indexes, returning results efficiently without cross-system synchronization.
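pg_trgm scores a match by comparing the sets of three-character substrings of two strings. A simplified pure-Python version of that similarity measure (pg_trgm also lowercases and space-pads each word, which this sketch approximates) shows why it tolerates typos:

```python
# Simplified re-implementation of the trigram similarity pg_trgm computes:
# shared trigrams divided by total distinct trigrams. Like pg_trgm, each
# string is space-padded before extracting three-character windows.

def trigrams(text: str) -> set:
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def similarity(a: str, b: str) -> float:
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

# A typo like "peso" still shares several trigrams with "pesto",
# so the misspelled query can still surface the right product.
print(similarity("pesto", "peso"))
print(similarity("pesto", "pesto"))  # identical strings score 1.0
```

Because the score degrades gradually rather than failing outright on a misspelling, trigram matching complements exact full-text search for noisy user queries.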