Google recently introduced a columnar engine for its globally distributed database, Spanner, intending to resolve the long-standing conflict between online transaction processing (OLTP) and analytical query processing (OLAP). The new feature, currently in preview, allows Spanner (Enterprise and Enterprise Plus editions) to handle both workloads simultaneously on a single database, eliminating the need for separate data warehouses and complex ETL (Extract, Transform, Load) pipelines.
Historically, organizations have used row-oriented databases for high-volume, low-latency OLTP workloads, while offloading analytics to separate data warehouses with columnar storage. With the columnar engine, that separation is no longer necessary: its hybrid architecture transparently maintains a secondary copy of the data in a columnar format, optimized for analytical queries. When a query is executed, Spanner’s optimizer directs it either to the existing row-based storage for fast transactional lookups or to the new columnar storage for large-scale scans and aggregations.
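To make the dual-format idea concrete, the following is a minimal Python sketch of a table that keeps a row-oriented copy and a columnar copy in sync. It is only an illustration of the general technique, not Spanner's implementation: the `HybridTable` class and its `insert`, `point_lookup`, and `column_sum` methods are hypothetical names invented for this example, and the "routing" is done by the caller choosing a method rather than by a query optimizer.

```python
from dataclasses import dataclass, field


@dataclass
class HybridTable:
    """Toy table (hypothetical, not a Spanner API) holding two copies of the data."""

    columns: tuple
    rows: dict = field(default_factory=dict)         # primary key -> full row
    column_data: dict = field(default_factory=dict)  # column name -> list of values

    def __post_init__(self):
        # Start every declared column with an empty value list.
        for name in self.columns:
            self.column_data.setdefault(name, [])

    def insert(self, key, row):
        """Write the row once, then mirror each field into the columnar copy."""
        if key in self.rows:
            raise KeyError(f"duplicate key: {key!r}")
        self.rows[key] = dict(row)
        for name in self.columns:
            self.column_data[name].append(row[name])

    def point_lookup(self, key):
        """Transactional-style access: fetch one whole row by key from the row copy."""
        return self.rows[key]

    def column_sum(self, name):
        """Analytical-style access: scan one column without touching the others."""
        return sum(self.column_data[name])


orders = HybridTable(columns=("customer", "amount"))
orders.insert(1, {"customer": "alice", "amount": 10})
orders.insert(2, {"customer": "bob", "amount": 20})
orders.insert(3, {"customer": "alice", "amount": 30})

single_order = orders.point_lookup(2)       # served from the row-oriented copy
total_amount = orders.column_sum("amount")  # served from the columnar copy
```

The point lookup reads a single dictionary entry, while the aggregation reads one contiguous list and never touches the `customer` values. That difference in access pattern is what a row store and a column store are each optimized for.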
This dual-storage approach, combined with vectorized query execution that processes data in batches, allows the columnar engine to provide a significant performance boost. The authors of a Google Cloud blog post write:
Spanner columnar engine integrates a columnar format alongside its existing row-oriented storage. This unified transactional and analytical processing design allows Spanner to maintain its OLTP performance while accelerating analytical queries up to 200X on your live operational data.
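As a rough illustration of what processing data in batches means, the Python sketch below compares a row-at-a-time aggregation with a batched aggregation over a single column. It is a simplified model under stated assumptions, not Spanner's executor: the function names, the `amount` column, and the batch size of 1024 are all hypothetical choices made for this example.

```python
from array import array


def sum_row_at_a_time(rows, column):
    """Row-oriented execution: visit every row and pick out one field each time."""
    total = 0
    for row in rows:
        total += row[column]
    return total


def sum_in_batches(values, batch_size=1024):
    """Batched execution: apply one operation to a whole slice of a column at once."""
    total = 0
    for start in range(0, len(values), batch_size):
        batch = values[start:start + batch_size]  # contiguous slice of a single column
        total += sum(batch)                       # one call handles the entire batch
    return total


# Hypothetical sample data: the same values stored as rows and as a typed column.
rows = [{"id": i, "amount": i * 2} for i in range(5000)]
amounts = array("q", (row["amount"] for row in rows))

row_total = sum_row_at_a_time(rows, "amount")
batch_total = sum_in_batches(amounts)
```

Both functions return the same total. The batched version does its per-value work inside a single call over a contiguous, uniformly typed slice, which is the property that real vectorized engines exploit to reduce per-row overhead.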
According to Walter Lee, a principal engineer at Wells Fargo, Spanner’s columnar engine is a boon for AI applications, particularly those requiring real-time data for model training and inference. By enabling fast, scalable analytical queries on live transactional data, it supports AI workloads like real-time recommendation systems, predictive analytics, and anomaly detection. The engine’s ability to process large datasets efficiently with vectorized execution accelerates feature engineering and data preprocessing, which are critical for machine learning pipelines.
Google is not alone in its pursuit of hybrid transactional/analytical processing (HTAP). Other providers, such as AWS (Aurora), Microsoft (Azure Cosmos DB), and Snowflake, have also been adding integrated analytical capabilities to their platforms. Additionally, open-source projects like ClickHouse, Apache Doris, and PostgreSQL extensions are moving toward unified architectures.
The Spanner columnar engine currently supports the GoogleSQL interface and requires explicit query hints to enable columnar reads. While a free trial is available, billing for the columnar engine is based on the additional storage consumed by the columnar data.