Unleashing the Power of Microsoft Azure on SAP Data

If you have ever extracted data from an SAP system, you know it can quickly become a complex process. Transactional tables often contain millions or even billions of rows, and because extraction is resource-intensive, reprocessing the entire dataset every day is usually impractical.

But no longer. Hitachi Solutions was part of a collaborative team that worked directly with Microsoft to design and implement a solution using the SAP Change Data Capture (CDC) connector. Now, on-premises SAP workloads can be re-powered in the Azure platform quickly, efficiently, and at scale. It’s a game changer for anyone who has struggled with the intricacies of extracting SAP data.

Why SAP users need it

Everyone wants more capability out of their data sources, especially SAP systems that contain the transactional and master data that are critical for daily operations. To get at that on-premises data for analysis, companies need to consolidate and move it to a cloud-based modern data estate.

There’s a massive benefit to efficiently moving SAP data to the cloud—it makes important core business data available for analysis by industry-leading innovations like data lakes, Power BI, artificial intelligence, and machine learning. When you have those tools available to use on your SAP data, you can generate holistic business insights that can improve and accelerate business decision-making.

How it works

The SAP CDC solution in Azure Data Factory or Azure Synapse is a connector between SAP and Azure. It’s an efficient way to extract and process core SAP data from a wide range of source objects, making it an invaluable tool for modern data analytics in an Azure environment.

The SAP side

The SAP ODP connector invokes the SAP Operational Data Provisioning API over standard Remote Function Call (RFC) modules to extract SAP transaction and master data, modeling the data and making it available for search. The CDC process monitors the delta in the source database and moves it downstream in real time to Azure. It specifically tracks the INSERT, UPDATE, and DELETE transactions performed on a table to capture data changes.
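To make the idea of tracking INSERT, UPDATE, and DELETE transactions concrete, here is a minimal sketch of how a stream of captured changes could be applied to a keyed copy of a table. It is only an illustration: the `ChangeRecord` shape and the `apply_changes` function are hypothetical names chosen for this example, not the SAP ODP record format or any part of the connector's API.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeRecord:
    """One captured change. Hypothetical shape, not the SAP ODP record format."""
    op: str                     # "INSERT", "UPDATE", or "DELETE"
    key: str                    # primary key of the affected row
    row: Optional[dict] = None  # new row image; None for deletes


def apply_changes(target: Dict[str, dict], changes: Iterable[ChangeRecord]) -> Dict[str, dict]:
    """Apply changes in order to a copy of the target and return the copy."""
    result = dict(target)
    for change in changes:
        if change.op in ("INSERT", "UPDATE"):
            if change.row is None:
                raise ValueError(f"{change.op} for key {change.key!r} has no row image")
            result[change.key] = change.row  # upsert the new row image
        elif change.op == "DELETE":
            result.pop(change.key, None)     # deleting a missing key is a no-op
        else:
            raise ValueError(f"unknown operation: {change.op!r}")
    return result


if __name__ == "__main__":
    target = {"1000": {"status": "open"}, "1001": {"status": "open"}}
    delta = [
        ChangeRecord("UPDATE", "1000", {"status": "shipped"}),
        ChangeRecord("DELETE", "1001"),
        ChangeRecord("INSERT", "1002", {"status": "open"}),
    ]
    print(apply_changes(target, delta))
```

Only the three changed rows travel downstream here, however large the full table is, which is the core of the efficiency argument for change data capture.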

The Azure side

In Azure Data Factory or Azure Synapse, you’ll need to set up a pipeline and build a mapping data flow to transform and load the SAP data into a supported data store. Supported stores include Azure Data Lake Storage Gen2 and databases such as Azure SQL Database. When results are loaded into Data Lake Storage Gen2 in Delta Lake format, they can be queried using Synapse serverless SQL pools, Synapse dedicated SQL pools via PolyBase, and Apache Spark pools. This flexibility lets data analysts and scientists work with data in a way that suits their needs and expertise.

[Figure: simplified diagram of the architecture]

Working only with changed data reduces resource consumption and provides a tremendous efficiency boost in high-volume scenarios, with cleaner, more organized data management and governance. Previously, doing this in an Azure Data Factory pipeline meant building complex logic that used watermarks to select the relevant data, which could easily add about 30 percent more time to the effort.
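For readers who haven't built one, the watermark approach mentioned above looks roughly like the sketch below. It is a simplified illustration in plain Python rather than an actual Data Factory pipeline, and the `extract_since` function and the `modified_at` column are assumptions made for the example.

```python
from datetime import datetime
from typing import List, Tuple


def extract_since(rows: List[dict], watermark: datetime) -> Tuple[List[dict], datetime]:
    """Return rows modified after the watermark, plus the watermark for the next run."""
    changed = [row for row in rows if row["modified_at"] > watermark]
    # Advance the watermark to the newest change seen; keep it if nothing changed.
    new_watermark = max((row["modified_at"] for row in changed), default=watermark)
    return changed, new_watermark


if __name__ == "__main__":
    source = [
        {"id": 1, "modified_at": datetime(2024, 1, 1, 8, 0)},
        {"id": 2, "modified_at": datetime(2024, 1, 2, 9, 30)},
    ]
    # First run: start from the earliest possible watermark and pick up everything.
    batch, mark = extract_since(source, datetime.min)
    print(len(batch), mark)

    # Second run: only the row added since the stored watermark is selected.
    source.append({"id": 3, "modified_at": datetime(2024, 1, 3, 7, 15)})
    batch, mark = extract_since(source, mark)
    print([row["id"] for row in batch], mark)
```

Even in this toy form, the pipeline has to store the watermark between runs and rely on a trustworthy modification timestamp, and a row deleted at the source simply never shows up in a later batch. Those are the kinds of details that a change data capture connector handles for you.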

Why Partner with Hitachi Solutions for Your SAP Integrations?

At Hitachi Solutions, we’ve meticulously crafted a reusable framework with the necessary operational runbook, or how-to guide, tailored for various SAP environments including SAP S/4HANA, SAP ECC, and SAP BW. Our data platform accelerators enable swift framework deployment, so you can start capturing data from your SAP environments in hours, rather than days or weeks.

Our low-code approach to change data capture and extensive experience with SAP integrations make us the ideal partner to help you begin taking advantage of your SAP data. Hitachi Solutions’ partnership with Microsoft ensures that we stay at the forefront, offering the best solutions for integrating SAP workloads in the Azure cloud, all using a step-by-step approach that shows you can realize incremental business value as you move along.

Contact us today!

Learn more

Overview and architecture of the SAP CDC capabilities

SAP on Azure podcast

Set up a self-hosted integration runtime for the SAP CDC connector

Set up a linked service and source dataset for the SAP CDC connector

Anatomy of the Operational Delta Queues in SAP ODP Extractors

About the Author

Will Crayger is a seasoned data and analytics professional, currently serving as Director of Data & Analytics at Hitachi Solutions America. Will leads a growing team of 16 diverse analytics professionals while driving projects in challenging modern data estate situations. With a reputation for building and leading successful teams, he continues to drive innovation and excellence in the field of data and analytics.