

Modern data platforms are under constant pressure to do more—process more data, move faster, and do it all without driving up cloud costs. Over the past several months, our Empower engineering team has been exploring a promising approach to that challenge: using GPU acceleration to dramatically speed up data transformation workloads without requiring code changes.
I’m excited to announce that we’re sharing what we’ve learned at NVIDIA GTC, March 16–19, where Hitachi Solutions will be demonstrating this work live at Microsoft booth #521. I’ll be there from 5 to 7 p.m. It’s an early but compelling example of how emerging GPU technologies can reshape the economics of large-scale data engineering.
Why GPUs for Data Engineering?
When people think about GPUs, they often think about graphics, gaming, or AI model training. But at their core, GPUs excel at massively parallel computation—which makes them well suited for the kinds of heavy, memory‑intensive operations common in modern data pipelines.
NVIDIA’s RAPIDS is a suite of open-source, GPU-accelerated libraries designed to speed up data processing and analytics workloads. The promise is straightforward: move expensive operations like joins and aggregations onto GPUs, run them faster, and finish jobs sooner, often by enough to offset the higher per-hour cost of GPU infrastructure.
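To make that concrete, here is a minimal sketch (not taken from our benchmark) of the kind of operation RAPIDS accelerates: a join plus aggregation written with cuDF, the RAPIDS GPU DataFrame library, using an API shape very close to pandas. The table and column names are purely illustrative.

```python
# Minimal illustration of a GPU-accelerated join + aggregation with RAPIDS cuDF.
# Assumes a CUDA-capable GPU and the cudf package installed; names are illustrative.
import cudf

# Two tiny example tables; in practice these would be large fact and dimension tables.
sales = cudf.DataFrame({
    "customer_id": [1, 2, 1, 3, 2],
    "amount": [120.0, 75.5, 30.0, 210.0, 15.25],
})
customers = cudf.DataFrame({
    "customer_id": [1, 2, 3],
    "region": ["east", "west", "east"],
})

# Join and aggregate entirely on the GPU -- the same operations that dominate
# runtime in typical transformation layers.
result = (
    sales.merge(customers, on="customer_id")
         .groupby("region")["amount"]
         .sum()
         .reset_index()
)
print(result)
```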
Our team wanted to see if that promise held up in real-world data engineering scenarios.
What We Tested with Empower
The Empower Data Platform is built on Azure Databricks and uses a configuration‑driven approach to orchestrate data pipelines across Bronze, Silver, and Gold layers. That architecture creates a unique opportunity: performance optimizations can be introduced at the infrastructure level without rewriting customer pipelines.
Using the TPC‑DS benchmark at the 1 TB scale factor, our engineering team compared standard CPU-based Databricks clusters with cost‑equivalent GPU clusters running the NVIDIA RAPIDS Accelerator for Apache Spark.
The results validated the hypothesis:
- ~2× faster execution times for full pipeline runs
- 74% reduction in Databricks Units (DBUs) consumed
- An overall ~4× improvement in cost efficiency
- Zero code changes required: pipelines ran as-is, with acceleration enabled through cluster configuration (a sketch of what that looks like follows below)
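As an illustration of what “configuration only” means in practice, here is a hedged sketch of the kind of Spark settings the RAPIDS Accelerator for Apache Spark uses. On Databricks these would normally be set in the cluster’s Spark config rather than in code; they are shown on a SparkSession builder for readability, the values are indicative rather than tuned recommendations, and the sketch assumes the RAPIDS Accelerator jar is already available on the cluster.

```python
# Sketch of enabling the RAPIDS Accelerator for Apache Spark via configuration only.
# Settings are illustrative; on Databricks they typically live in the cluster Spark config.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("rapids-accelerated-pipeline")
    # Load the RAPIDS SQL plugin, which rewrites supported operators to run on GPU.
    .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
    .config("spark.rapids.sql.enabled", "true")
    # Tell Spark each executor has one GPU and let multiple tasks share it.
    .config("spark.executor.resource.gpu.amount", "1")
    .config("spark.task.resource.gpu.amount", "0.25")
    # Log which operators will not be placed on the GPU, to aid tuning.
    .config("spark.rapids.sql.explain", "NOT_ON_GPU")
    .getOrCreate()
)

# Existing pipeline code runs unchanged; supported joins and aggregations
# are executed on the GPU by the plugin.
```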
The biggest gains showed up in heavy transformation workloads, particularly large fact table joins—exactly the areas that tend to dominate runtime and cost in enterprise data platforms.
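For a sense of what those workloads look like, here is a minimal PySpark sketch of a TPC‑DS-style fact/dimension join and aggregation. The table and column names come from the TPC‑DS schema; the code itself is illustrative rather than taken from our pipelines. With the plugin enabled as shown above, code like this runs unchanged, and the join, shuffle, and aggregation can be offloaded to the GPU.

```python
# Illustrative TPC-DS-style transformation: join a large fact table to a
# dimension table and aggregate. With the RAPIDS Accelerator enabled, the same
# unmodified code can have its join, shuffle, and aggregation run on the GPU.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

store_sales = spark.table("store_sales")   # large fact table
date_dim = spark.table("date_dim")         # small dimension table

yearly_revenue = (
    store_sales
    .join(date_dim, store_sales.ss_sold_date_sk == date_dim.d_date_sk)
    .groupBy("d_year")
    .agg(F.sum("ss_net_paid").alias("total_net_paid"))
    .orderBy("d_year")
)

yearly_revenue.show()
```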
Importantly, this work represents a proof of concept, not a generally available feature in Empower as of today. It’s early validation that GPU acceleration can deliver meaningful value when applied thoughtfully to data engineering workloads.
What This Means for Customers
If these capabilities are productized in the future, the implications are significant:
- Faster pipelines without refactoring existing data models
- Lower compute costs for large-scale transformation workloads
- A simpler path to adopting GPU acceleration without specialized tuning
- More predictable performance as data volumes continue to grow
From a customer perspective, the goal is simple: flip a switch, run faster, and spend less—without introducing new operational complexity.
See It Live at NVIDIA GTC
At NVIDIA GTC, we’re demonstrating this work side by side at the Microsoft booth #521, showing Empower running with and without GPU acceleration. The demo highlights how the same pipeline behaves under each configuration, making the performance and efficiency gains easy to see.
If you’re attending GTC, stop by the Microsoft booth to talk with our team about what we’re exploring – and where this could go next.
This work reflects our broader commitment to continuously evaluating emerging technologies—and validating them with real data—so customers can benefit from meaningful innovation, not just theoretical performance gains.