World of Software
Databricks Contributes Spark Declarative Pipelines to Apache Spark

News Room
Published 3 July 2025. Last updated: 2025/07/03 at 9:02 AM

At the Databricks Data+AI Summit, held in San Francisco, USA, from June 10 to 12, Databricks announced that it is contributing the technology behind Delta Live Tables (DLT) to the Apache Spark project, where it will be called Spark Declarative Pipelines. This move will make it easier for Spark users to develop and maintain streaming pipelines, and furthers Databricks' commitment to open source.

The new feature will allow developers to define data streaming pipelines without needing to create the usual imperative commands in Spark. While the changes simplify the task of writing and maintaining pipeline code, users will still need to understand the runtime behavior of Spark and be able to troubleshoot issues such as performance and correctness.

In a blog post describing the new feature, Databricks wrote that pipelines can be defined using SQL syntax or via a simple Python SDK that declares the stream data sources, the tables, and their relationships, rather than by writing imperative Spark commands. The company claims this will reduce the need for orchestrators such as Apache Airflow to manage pipelines.

Behind the scenes, the framework interprets the queries, then builds a dependency graph and an optimized execution plan.
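
As a rough illustration of the dependency-graph step, the sketch below derives a run order from a set of dataset definitions using Python's standard-library graphlib module. It is not Spark's implementation. The pipeline mapping, the execution_order helper, and the weekly_summary dataset are hypothetical names chosen only for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each dataset maps to the set of datasets it reads from.
pipeline = {
    "taxi_raw_records": set(),  # ingested from an external source
    "flagged_rides": {"taxi_raw_records"},
    "weekly_summary": {"flagged_rides", "taxi_raw_records"},
}


def execution_order(definitions):
    """Return dataset names ordered so that every dataset follows its inputs."""
    return list(TopologicalSorter(definitions).static_order())


order = execution_order(pipeline)
print(order)
```

Because the order is derived from the declared inputs, the author of the pipeline never has to state it explicitly. In this sketch a circular definition would raise graphlib.CycleError rather than produce an order.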

Declarative Pipelines supports streaming tables from stream data sources such as Apache Kafka topics, and materialized views for storing aggregates and results. The materialized views are updated automatically as new data arrives from the streaming tables.
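
The relationship between a streaming table and a materialized view can be pictured with a toy model in plain Python. This is only a sketch of the idea that a view is kept current as rows arrive. It says nothing about how Spark actually refreshes views, and the StreamingTable and FareTotalsByZip classes are hypothetical.

```python
from collections import defaultdict


class StreamingTable:
    """Append-only table that passes newly arrived rows to dependent views."""

    def __init__(self):
        self.rows = []
        self.views = []

    def append(self, new_rows):
        new_rows = list(new_rows)
        self.rows.extend(new_rows)
        for view in self.views:
            view.refresh(new_rows)


class FareTotalsByZip:
    """Toy materialized view: running fare total per pickup zip code."""

    def __init__(self, source):
        self.totals = defaultdict(float)
        source.views.append(self)
        self.refresh(source.rows)  # account for rows that already exist

    def refresh(self, new_rows):
        for row in new_rows:
            self.totals[row["pickup_zip"]] += row["fare_amount"]


trips = StreamingTable()
view = FareTotalsByZip(trips)
trips.append([{"pickup_zip": "10001", "fare_amount": 12.5}])
trips.append([
    {"pickup_zip": "10001", "fare_amount": 7.5},
    {"pickup_zip": "10002", "fare_amount": 30.0},
])
print(dict(view.totals))
```

The caller only appends rows to the table; the aggregate is brought up to date as a side effect, which is the behavior the declarative model promises to manage on the user's behalf.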

Databricks provides an overview of the SQL syntax in its documentation. An excerpt is shown here. The example is based on the New York City TLC Trip Record Data.


-- Bronze layer: Raw data ingestion
CREATE OR REFRESH STREAMING TABLE taxi_raw_records 
(CONSTRAINT valid_distance EXPECT (trip_distance > 0.0) ON VIOLATION DROP ROW)
AS SELECT *
FROM STREAM(samples.nyctaxi.trips);

-- Silver layer 1: Flagged rides
CREATE OR REFRESH STREAMING TABLE flagged_rides 
AS SELECT
  date_trunc("week", tpep_pickup_datetime) as week,
  pickup_zip as zip, 
  fare_amount, trip_distance
FROM
  STREAM(LIVE.taxi_raw_records)
WHERE ((pickup_zip = dropoff_zip AND fare_amount > 50) OR
       (trip_distance < 5 AND fare_amount > 50));

The example shows how a pipeline can be built by defining streams with the CREATE OR REFRESH STREAMING TABLE command and then consuming them in the FROM clause of subsequent queries. Of note in the example is the ability to include data quality checks in the pipeline with the syntax CONSTRAINT … EXPECT … ON VIOLATION.
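
To make the effect of a drop-on-violation check concrete, here is a minimal sketch in plain Python that mimics what the valid_distance constraint does to incoming rows. It is not the Spark or Databricks API. The apply_expectation helper and the sample rows are hypothetical, and the handling of missing values is deliberately left out.

```python
def apply_expectation(rows, predicate):
    """Keep rows that satisfy the predicate and count the rows that do not."""
    kept, dropped = [], 0
    for row in rows:
        if predicate(row):
            kept.append(row)
        else:
            dropped += 1
    return kept, dropped


# Made-up sample rows; the column names follow the SQL excerpt.
trips = [
    {"trip_distance": 2.5, "fare_amount": 12.0},
    {"trip_distance": 0.0, "fare_amount": 55.0},  # violates valid_distance
    {"trip_distance": 4.1, "fare_amount": 60.0},
]

# Equivalent in spirit to: EXPECT (trip_distance > 0.0) ON VIOLATION DROP ROW
valid_trips, dropped_count = apply_expectation(
    trips, lambda row: row["trip_distance"] > 0.0
)
print(len(valid_trips), dropped_count)
```

Returning the dropped count alongside the surviving rows reflects the point of declaring the check inside the pipeline: violations can be counted and reported instead of being filtered out silently.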

While the Apache Spark changes are not yet released, many articles already describe the experience of engineers using Databricks DLT. In a Medium article titled “Why I Liked Delta Live Tables in Databricks,” Mariusz Kujawski describes the features of DLT and how they can best be used: “With DLT, you can build an ingestion pipeline in just a few hours, compared to the days required to develop a custom framework. Additionally, built-in data quality enforcement provides an extra layer of reliability.”

In addition to a declarative syntax for defining a pipeline, Spark Declarative Pipelines also supports change data capture (CDC), batch and stream logic, built-in retry logic, and observability hooks.

Declarative Pipelines is in the process of being merged into the Spark project. The feature is planned for the next Spark release, 4.1.0, which is expected in January 2026. Progress can be followed in the Apache Spark Jira project under ticket SPARK-51727.
