- DataPelago has created a new engine called Nucleus that dramatically speeds up data processing for AI and analytics.
- It outperforms Nvidia’s cuDF library by large margins while working across different types of hardware.
- Today’s GPUs are powerful, but older software often wastes their potential, making faster tools like Nucleus especially valuable.
- This shift could have dramatic implications for Nvidia.
For years, enterprises have leaned on GPUs (graphics processing units) to accelerate their workloads. Every generative AI model, recommendation engine, and analytics dashboard depends on data libraries to prepare, join, and transform massive datasets.
Yet the industry faces a quiet challenge: despite advances in hardware, performance often stalls because the software stack struggles to fully exploit it. Many legacy data libraries were optimized for CPUs, not GPUs. As a result, memory bandwidth and compute throughput go underutilized, and every time data moves between CPU and GPU, much of the performance advantage evaporates.
To address this, Nvidia launched cuDF in 2018 as part of its open-source RAPIDS suite, a GPU-accelerated DataFrame library that quickly became a standard choice for data operations. It delivered speedups over CPU-based libraries and better utilization of GPU hardware.
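For readers unfamiliar with the library, the snippet below is a minimal sketch of the pandas-style operations cuDF accelerates on a GPU; the join, sort, and filter steps mirror the operations benchmarked later in this story. The toy data and column names are purely illustrative, and running it assumes an Nvidia GPU with CUDA support and the cudf package installed.

```python
# A minimal, illustrative sketch of GPU-accelerated DataFrame work with cuDF.
# Assumes an Nvidia GPU with CUDA support and the `cudf` package installed;
# the toy data and column names are made up for illustration.
import cudf

orders = cudf.DataFrame({"customer_id": [1, 2, 2, 3],
                         "amount": [10.0, 25.5, 7.25, 40.0]})
customers = cudf.DataFrame({"customer_id": [1, 2, 3],
                            "region": ["US", "EU", "APAC"]})

# Join, sort, filter, and projection all execute on the GPU.
joined = orders.merge(customers, on="customer_id")              # hash join
ranked = joined.sort_values("amount", ascending=False)          # sort
result = ranked[ranked["amount"] > 10.0][["region", "amount"]]  # filter + projection

print(result.to_pandas())  # copy the final result back to host memory
```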
But cuDF also has limits. It requires an Nvidia GPU with ample memory and CUDA support, ruling out environments without compatible hardware. In many ways, cuDF is the industry’s ceiling: powerful enough to accelerate AI and analytics pipelines, yet constrained by the quirks of GPU architecture itself.
Now, California-based data startup DataPelago says it has surpassed those limits with its universal data processing engine, Nucleus. Built atop Nvidia hardware, Nucleus reportedly delivers performance gains so steep they could reset the economics of GPU acceleration. In a benchmark test, Nucleus outpaced cuDF by 38.6 times on hash joins, eight times on sorts, and 10 times on filters and projections.
“To fully realize the benefits of GPUs, data processing engines need to fully leverage the hardware’s strengths while compensating for its limitations,” says DataPelago CEO Rajan Goyal, something he argues demands fresh algorithms built for data workloads.
The implications go far beyond engineering bragging rights. Cloud GPUs are expensive, and enterprises face pressure to maximize every compute cycle. Faster data processing means lower cloud bills and quicker time-to-insight. Goyal says Nucleus is designed to run on any hardware and handle any type of data, while integrating with existing frameworks without requiring changes to customer applications.
“We slot into the existing environments that developers are already working in,” he adds.
Empowering Enterprise AI with a Hardware-Neutral Approach
DataPelago’s benchmark test ran on standard public-cloud servers with both entry-level Tesla T4 and high-end H100 GPUs. The test mimicked real-world tasks: moving data from CPU to GPU, processing it, and returning results to the host. Using the same dataset and harness, Nucleus was compared head-to-head with cuDF on core AI and analytics operations.
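The benchmark code itself has not been published, but the round trip the test describes, host to device, compute, then back to the host, can be sketched roughly as below. The dataset size, column layout, and the choice of a sort as the workload are assumptions for illustration, and the cudf calls simply stand in for whichever engine is being measured.

```python
# A rough sketch of the host-to-device round trip described above: copy data to
# the GPU, run one operation, copy results back, and time the whole path.
# Dataset size, columns, and the chosen operation are illustrative assumptions.
import time

import numpy as np
import pandas as pd
import cudf

n = 10_000_000
host_df = pd.DataFrame({"key": np.random.randint(0, 1_000, n),
                        "val": np.random.rand(n)})

start = time.perf_counter()
gpu_df = cudf.from_pandas(host_df)      # CPU -> GPU transfer
sorted_df = gpu_df.sort_values("key")   # GPU-side processing
back_on_host = sorted_df.to_pandas()    # GPU -> CPU transfer
elapsed = time.perf_counter() - start

print(f"Round trip for {n:,} rows took {elapsed:.3f}s")
```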
“We wanted to improve the performance ceiling for GPUs, and the only way to do that credibly was to compare ourselves directly to cuDF,” says Goyal. He notes that accelerating data prep by an order of magnitude gives businesses the capacity to process exponentially more information for AI training and retrieval tasks, keeping systems up to date.
The engine achieved these results by redesigning its execution layer to handle complex workloads, including kernel fusion, native multi-column support, and optimized handling of variable-length data such as strings. Interestingly, while Nucleus runs on Nvidia’s CUDA framework and GPUs, it delivers higher performance, essentially out-engineering Nvidia on its own tech stack.
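DataPelago has not published its kernels, but the general idea behind kernel fusion can be illustrated with a small Numba CUDA sketch: instead of launching one GPU kernel per operation and materializing an intermediate array in device memory, a fused kernel applies both steps in a single pass. The kernels and data below are hypothetical and only demonstrate the concept.

```python
# Illustrative sketch of kernel fusion (not DataPelago's implementation):
# two separate kernels need an intermediate array and two passes over memory,
# while the fused kernel does both steps in one launch, one read, one write.
# Requires an Nvidia GPU plus the `numba` and `numpy` packages.
import numpy as np
from numba import cuda

@cuda.jit
def scale_kernel(x, out):        # step 1 on its own: out = x * 2
    i = cuda.grid(1)
    if i < x.size:
        out[i] = x[i] * 2.0

@cuda.jit
def offset_kernel(x, out):       # step 2 on its own: out = x + 1
    i = cuda.grid(1)
    if i < x.size:
        out[i] = x[i] + 1.0

@cuda.jit
def fused_kernel(x, out):        # both steps fused into a single kernel
    i = cuda.grid(1)
    if i < x.size:
        out[i] = x[i] * 2.0 + 1.0

data = cuda.to_device(np.random.rand(1_000_000))
tmp = cuda.device_array_like(data)
out = cuda.device_array_like(data)

threads = 256
blocks = (data.size + threads - 1) // threads

scale_kernel[blocks, threads](data, tmp)   # unfused path: two launches, extra memory traffic
offset_kernel[blocks, threads](tmp, out)
fused_kernel[blocks, threads](data, out)   # fused path: one launch, no intermediate array

print(out.copy_to_host()[:5])
```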
DataPelago president JG Chirapurath says Nucleus delivers “far greater performance from existing hardware investments” and stresses that enterprises prefer solutions that build on what they already have rather than forcing a rip-and-replace.
Goyal argues that cuDF is tightly coupled to Nvidia’s GPU ecosystem, creating vendor lock-in and limiting hardware flexibility. This dependence restricts open innovation and ties enterprises to Nvidia’s roadmap.
“Nucleus is designed to work across any hardware (not only GPUs), while also getting the most out of each type of hardware,” claims Goyal. The engine also includes built-in intelligence that automatically maps data operations to the most suitable hardware and dynamically reconfigures tasks to maximize performance.
Software over Silicon: The Emerging Battle for Enterprise AI Efficiency
If DataPelago’s approach takes hold, enterprises may begin prioritizing universality and efficiency over single-vendor ecosystems when building AI infrastructure. Still, analysts caution that benchmark results often look stronger in controlled tests than in real-world production, and risks remain if the hardware landscape evolves quickly.
“The offering will appeal to those looking to avoid vendor lock-in,” says Alvin Nguyen, senior analyst at Forrester. “But with tools like AMD’s CUDA translation for its data center GPUs, the real advantage is if you’re also targeting CPUs and field-programmable gate arrays (FPGAs). Enterprises are experienced with Nvidia’s ecosystem, so moving away from Nvidia now means a bigger short-term investment in other options.”
Nguyen also notes that progress on transformer-based workloads, such as training large foundational models, is slowing compared with prior years. As inference begins to outpace training, raw GPU horsepower is no longer the main driver. “A more balanced view, including the software layer, is a smart way to look at things.”
Still, investors are buying in. DataPelago has raised $47 million in seed and Series A funding from Eclipse, Qualcomm Ventures, and Taiwania Capital, and recently hired industry veteran JG Chirapurath as president. CEO Goyal himself worked at Cisco and Oracle before founding the company.
For years, the AI industry has fixated on chip shortages and the race for ever-more powerful GPUs. Nucleus points instead to a different kind of competition. If the biggest performance gains now come from software rather than hardware, the battleground could shift from chip foundries to algorithmic innovation. The future of AI infrastructure may depend less on building bigger chips and more on how we harness the ones we already have.
“Hardware neutrality is strategic differentiation, not just technical capability. Enterprises want infrastructure investments that remain valuable as technology evolves,” says Goyal. “My long-term vision is positioning DataPelago as the universal data processing foundation that accelerates the next decade of AI and analytics innovation and makes it economically feasible.”