SAP's May 2026 acquisition of Dremio transforms SAP Business Data Cloud into an Apache Iceberg-native enterprise lakehouse — eliminating ETL pipelines, unifying SAP and non-SAP data, and removing the data-readiness bottleneck that has stalled most enterprise agentic AI programmes.
The Acquisition That Solves Enterprise AI's Biggest Problem
On May 4, 2026, SAP announced a definitive agreement to acquire Dremio Corporation, an Apache Iceberg-native data lakehouse platform. The deal, expected to close in Q3 2026, addresses what SAP CTO Philipp Herzig called the fundamental bottleneck of enterprise AI: "Enterprise AI doesn't stall because the models aren't good enough; it stalls because the data isn't ready for AI agents."
This single acquisition reshapes SAP's data architecture strategy for the next decade. SAP Business Data Cloud — SAP's unified data and analytics platform — will become an Apache Iceberg-native enterprise lakehouse, capable of providing AI agents with governed, unified, and semantically enriched access to both SAP and non-SAP data without data movement, ETL pipelines, or format conversion.
For enterprises running SAP S/4HANA alongside Oracle, Salesforce, third-party WMS or MES systems, and cloud data stores, this is the architecture that makes true cross-enterprise agentic AI feasible — not as a future aspiration, but as an implementable 2026–2027 roadmap.
What Dremio Is — And Why SAP Chose It
Dremio is an open, high-performance data lakehouse platform built specifically for analytical workloads and AI access patterns. Before this acquisition, it was already deployed by enterprises as a query acceleration and data virtualisation layer over cloud data lakes — enabling data scientists and analysts to query data in place (S3, ADLS, GCS) without copying it into a warehouse.
Three technical capabilities made Dremio the right acquisition target for SAP's agentic AI strategy:
- Apache Iceberg-native architecture: Dremio is built on Apache Iceberg — the open table format that has become the industry standard for data lakehouses. Iceberg provides ACID transactions, schema evolution, time travel, and partition pruning at petabyte scale. By acquiring Dremio, SAP makes Iceberg the native storage format for SAP Business Data Cloud, eliminating proprietary lock-in and enabling direct interoperability with any Iceberg-compatible engine.
- AI semantic layer: Dremio's semantic layer adds business context across data sources — meaning, relationships, access rights, and data lineage. For AI agents, this is critical: an agent querying sales data needs to understand not just the schema but the business meaning of "net revenue" versus "gross revenue," the organisational hierarchy behind "region," and the access rights governing what data it can consume.
- Zero-ETL data access: Dremio enables governed access to enterprise data without ETL pipelines — data stays in place, and Dremio's query engine reaches across sources. For agentic AI, this means agents get access to current data, not yesterday's warehouse load.
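To make the semantic-layer idea above concrete, here is a minimal sketch in plain Python of what such a layer does: it translates a business term like "net revenue" into a physical expression and enforces access rights before any query is issued. All names and structures here are invented for illustration — this is not Dremio's actual API.

```python
# Toy illustration of what a semantic layer adds on top of raw schemas.
# The metric names, sources, and roles below are hypothetical.

SEMANTIC_MODEL = {
    "net_revenue": {
        "expression": "gross_revenue - discounts - returns",
        "source": "sales.orders",
        "allowed_roles": {"finance_analyst", "sales_agent"},
    },
    "gross_revenue": {
        "expression": "SUM(line_amount)",
        "source": "sales.order_lines",
        "allowed_roles": {"finance_analyst"},
    },
}

def resolve_metric(term: str, role: str) -> str:
    """Translate a business term into its physical expression,
    enforcing access rights before any query is issued."""
    entry = SEMANTIC_MODEL.get(term)
    if entry is None:
        raise KeyError(f"unknown business term: {term}")
    if role not in entry["allowed_roles"]:
        raise PermissionError(f"role {role!r} may not read {term!r}")
    return f"{entry['expression']} FROM {entry['source']}"

print(resolve_metric("net_revenue", "sales_agent"))
```

The point for agents: the same lookup that supplies business meaning also acts as the permission gate, so an agent can never silently bypass governance on its way to the data.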
The New SAP Business Data Cloud Architecture
Post-acquisition, SAP Business Data Cloud will be rebuilt around three integrated layers that together form the data foundation for agentic AI:
- Apache Iceberg-Native Storage Layer: All enterprise data — SAP and non-SAP — is stored or mirrored in Apache Iceberg format. This is the single source of truth that all compute engines (SAP HANA Cloud, Dremio, Spark, Flink) read from and write to. No more siloed copies in separate warehouses. SAP is also committing to continued investment in open-source projects including Apache Polaris and Apache Arrow — ensuring the architecture remains open and interoperable.
- Universal Open Catalog (Apache Polaris): A unified catalog built on the Apache Polaris standard and the Apache Iceberg REST Catalog API. This catalog serves as both discovery and semantic layer — providing business context (meaning, relationships, lineage, access rights) across all data assets. AI agents navigating enterprise data will use this catalog as their index and permission gate.
- Multi-Engine Compute Layer: SAP HANA Cloud provides in-memory processing for real-time transactional analytics. Dremio's serverless, elastic lakehouse engine handles large-scale analytical queries, scaling automatically with demand. Customers are not locked into a single compute engine — the open catalog ensures any Iceberg-compatible tool can participate.
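The storage layer's "time travel" and ACID guarantees rest on Iceberg's snapshot-based table metadata. The following is a deliberately simplified, in-memory model of that idea — real Iceberg persists snapshots in metadata and manifest files with a far richer format — meant only to show why multiple engines can safely read consistent versions of the same table.

```python
# Toy model of Iceberg-style snapshot metadata and "time travel".
# Real Iceberg tracks snapshots in table metadata files; this in-memory
# sketch only illustrates the concept, not the actual format.
from dataclasses import dataclass, field

@dataclass
class Snapshot:
    snapshot_id: int
    rows: list

@dataclass
class IcebergLikeTable:
    snapshots: list = field(default_factory=list)

    def commit(self, new_rows: list) -> int:
        """Each commit produces an immutable snapshot (an ACID append)."""
        current = self.snapshots[-1].rows if self.snapshots else []
        snap = Snapshot(len(self.snapshots) + 1, current + new_rows)
        self.snapshots.append(snap)
        return snap.snapshot_id

    def scan(self, as_of=None) -> list:
        """Read the latest data, or 'time travel' to an older snapshot."""
        if not self.snapshots:
            return []
        if as_of is None:
            return self.snapshots[-1].rows
        return next(s.rows for s in self.snapshots if s.snapshot_id == as_of)

t = IcebergLikeTable()
s1 = t.commit([{"order": 1}])
s2 = t.commit([{"order": 2}])
print(len(t.scan()))          # latest snapshot: 2 rows
print(len(t.scan(as_of=s1)))  # time travel: 1 row
```

Because every reader sees a specific immutable snapshot, HANA Cloud, Dremio, Spark, and Flink can all operate on the same table without coordinating with each other — which is what makes the multi-engine layer workable.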
How This Changes the Agentic AI Equation for Enterprises
The most direct impact of the Dremio acquisition is on SAP's agentic AI programme. Joule agents — SAP's AI agent layer, now covering 35+ solutions with 2,500+ skills — face a fundamental constraint: they can only act on data they can access. For many enterprises, the most valuable data for agent decision-making is fragmented across:
- SAP S/4HANA (ERP transactions, master data)
- SAP SuccessFactors (workforce data)
- Third-party CRM systems (Salesforce, Dynamics)
- Legacy WMS and MES systems
- Supplier portals and external data providers
- Cloud data lakes built on AWS S3 or Azure ADLS
Building ETL pipelines to consolidate this data into a single warehouse for AI agents to query is expensive, slow, and prone to stale-data problems. Dremio removes this step by enabling agents to query all these sources in their native locations through the unified semantic layer, with business context and access governance applied consistently across every source.
The practical result: a procurement agent assessing supplier risk can now query SAP Ariba transaction history, third-party financial risk data, external logistics performance records, and internal quality management data — in a single agent step, with data current as of today, governed by the same access rules applied to human users.
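The supplier-risk scenario above can be sketched as a simple fan-out: one agent query reaches each governed source in place, rather than reading from a pre-built warehouse copy. The source names, fields, and values below are invented placeholders, not real SAP Ariba or Dremio interfaces.

```python
# Hypothetical sketch of zero-ETL federation: an agent's single request
# fans out to sources queried in place. Each source is represented here
# by a stub function standing in for a live, governed connection.

SOURCES = {
    "ariba_history": lambda supplier: {"late_deliveries": 3},
    "risk_feed":     lambda supplier: {"credit_score": 61},
    "quality_mgmt":  lambda supplier: {"defect_rate_pct": 1.8},
}

def supplier_risk_profile(supplier: str, granted: set) -> dict:
    """Query each source live; access rules are enforced per source,
    the same way they would be for a human user."""
    profile = {}
    for name, fetch in SOURCES.items():
        if name not in granted:
            continue  # the agent simply never sees ungranted sources
        profile.update(fetch(supplier))
    return profile

print(supplier_risk_profile("ACME", granted={"ariba_history", "risk_feed"}))
```

Note the design choice: governance is applied at the source boundary, so narrowing an agent's grants changes what it can see without changing the agent's logic.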
Apache Iceberg: Why the Open Standard Matters for Indian Enterprises
For Indian enterprises evaluating long-term data architecture strategies, SAP's commitment to Apache Iceberg as the foundation of Business Data Cloud is strategically significant beyond the immediate Dremio use case.
Apache Iceberg has emerged as the dominant open table format, supported by the major cloud providers (AWS, Google Cloud, Microsoft Azure), the leading analytical platforms (Snowflake, Databricks, BigQuery, Redshift), and the principal data engineering frameworks (Spark, Flink, Trino). SAP's adoption of Iceberg means:
- No data lock-in: Data stored in Iceberg format on your cloud storage can be read by any Iceberg-compatible engine. If you run Business Data Cloud on Google Cloud today and want to add Databricks for data science workloads next year, the data format is already compatible — no migration required.
- Cloud portability: Indian enterprises with multi-cloud strategies or regulatory requirements around data residency can host their Iceberg data on any compliant cloud storage, regardless of which SAP services they use for compute.
- Partner ecosystem integration: The open catalog means SAP data is natively accessible to third-party analytics tools, data science platforms, and custom applications — without SAP-specific APIs or proprietary connectors.
SAP HANA Cloud Plus Dremio: Complementary, Not Competing
A question that will arise for existing SAP HANA Cloud customers is whether Dremio replaces HANA Cloud's in-memory analytical capabilities. The architecture makes clear they are complementary:
- SAP HANA Cloud continues to be optimised for real-time operational analytics — the sub-second transactional intelligence that drives live operational dashboards, real-time inventory availability, and order-to-cash process monitoring. Its in-memory architecture is unmatched for latency-sensitive analytical queries on live SAP data.
- Dremio's lakehouse engine handles large-scale historical analytics, cross-enterprise data queries, AI model training workloads, and data science exploration — where elasticity and breadth of data access matter more than sub-millisecond latency.
Together, they form a complete analytical data platform — real-time operations through HANA Cloud, large-scale intelligence through the Iceberg lakehouse — under a single unified catalog and governance model.
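One way to picture the complementary split is as workload routing: latency-sensitive operational queries go to the in-memory engine, large scans to the elastic lakehouse engine. The routing rule and threshold below are purely illustrative assumptions, not documented SAP behaviour.

```python
# Hypothetical sketch of workload routing across complementary engines.
# The "operational vs historical" rule and the row threshold are invented
# for illustration only.

def route_query(kind: str, scan_rows: int) -> str:
    if kind == "operational" and scan_rows < 1_000_000:
        return "hana_cloud"        # sub-second, in-memory, live SAP data
    return "iceberg_lakehouse"     # elastic engine over shared Iceberg tables

print(route_query("operational", 10_000))       # small live query
print(route_query("historical", 500_000_000))   # petabyte-scale scan
```

Because both engines read the same Iceberg tables under one catalog, the routing decision affects cost and latency, not correctness.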
The SAP Knowledge Graph: Dremio's Role in Long-Term AI Intelligence
SAP's roadmap points to an ambitious long-term capability: the SAP Knowledge Graph — a structured representation of enterprise business knowledge that AI agents can navigate to understand not just data values but business relationships, process dependencies, and organisational context.
Dremio's semantic layer is the data infrastructure foundation for this knowledge graph. By maintaining consistent business context across all data sources — understanding that a "customer" in Salesforce is the same entity as a "sold-to party" in S/4HANA, or that a cost centre hierarchy in SuccessFactors maps to the company code structure in Finance — Dremio creates the semantic fabric that makes a cross-enterprise knowledge graph possible.
For AI agents, this means moving from data retrieval to genuine business reasoning — understanding the relationships between entities, the dependencies between processes, and the governance rules that determine which actions are permissible in which context.
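The entity-resolution idea behind the knowledge graph can be shown with a toy mapping table: the same real-world customer appears under different identifiers in Salesforce and S/4HANA, and a canonical key lets an agent recognise both records as one entity. The system names follow the article's example; the identifiers themselves are invented.

```python
# Toy cross-system entity mapping of the kind a knowledge graph needs.
# Local IDs and the canonical key format are hypothetical placeholders.

ENTITY_MAP = {
    ("salesforce", "account",       "001A000123"): "customer:globex",
    ("s4hana",     "sold_to_party", "0000100042"): "customer:globex",
}

def canonical_entity(system: str, obj: str, local_id: str) -> str:
    """Resolve a system-local record to its canonical enterprise entity."""
    return ENTITY_MAP[(system, obj, local_id)]

# An agent can now reason that both records describe one customer:
a = canonical_entity("salesforce", "account", "001A000123")
b = canonical_entity("s4hana", "sold_to_party", "0000100042")
print(a == b)
```

In practice this mapping is the hard part — it depends on the master data governance work the article recommends, since no semantic layer can link entities that are defined inconsistently underneath.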
What SAP Customers Should Do Now
The Dremio acquisition is announced but not yet closed. However, enterprises with active SAP Business Data Cloud implementations or roadmap plans should begin acting on the architectural implications immediately:
- Audit your current data integration architecture: Identify all ETL pipelines currently feeding data into analytical layers. Each one is a candidate for elimination through Dremio's zero-ETL access model — and understanding the current landscape is the first step to rationalising it.
- Evaluate Apache Iceberg adoption: If you are planning new data lake or lakehouse investments, choose Apache Iceberg as your table format now. Any Iceberg-native investment you make today will be directly compatible with SAP Business Data Cloud's post-Dremio architecture.
- Map your non-SAP data sources: The business case for the Dremio integration is strongest for enterprises with significant non-SAP data that must be combined with SAP data for AI decision-making. Document your key non-SAP data systems and the business questions that require cross-system analysis — this becomes your BDC value case.
- Plan your SAP Knowledge Graph readiness: Begin master data governance initiatives that create consistent entity definitions across your SAP and non-SAP systems. The semantic layer can only be as good as the underlying data quality and consistency.
SAVIC's Perspective: Data Readiness Is the AI Competitive Advantage
SAP's twin May 2026 acquisitions — Prior Labs for Tabular Foundation Models and Dremio for data lakehouse architecture — tell a coherent story: the AI capabilities that will differentiate enterprises in the next three to five years are not model-level; they are data-level. The enterprises that win with AI will be those whose data is unified, governed, semantically enriched, and AI-ready.
SAVIC's data architecture practice helps Indian enterprises build exactly this foundation — through SAP Business Data Cloud implementations, Apache Iceberg migration planning, master data governance programmes, and SAP AI Core integration on BTP. As India's No. 1 SAP Platinum Partner, SAVIC combines SAP platform depth with data engineering expertise to help enterprises close the data readiness gap and activate the full potential of SAP's agentic AI vision. Speak with our Data & Analytics team to assess your organisation's readiness for the post-Dremio SAP data architecture.
Frequently Asked Questions
How does SAVIC approach SAP implementation projects?
SAVIC follows a structured One Piece Flow methodology — delivering SAP projects in focused, iterative waves that reduce risk, accelerate time-to-value, and keep business disruption minimal. Each phase is scoped, tested, and signed off before the next begins.
What industries does SAVIC serve with SAP solutions?
SAVIC serves 12+ industries including manufacturing, automotive, consumer products, retail, life sciences, chemicals, oil & gas, real estate, and financial services — across India, UAE, Singapore, the US, UK, Nigeria, and Kenya.
How long does a typical SAP S/4HANA implementation take with SAVIC?
Timelines vary by scope. GROW with SAP public cloud deployments can go live in 8–12 weeks using SAVIC's pre-configured accelerators. Full RISE with SAP private cloud transformations typically take 6–18 months depending on landscape complexity, data migration volume, and custom code remediation.
Does SAVIC provide post-go-live SAP support?
Yes. SAVIC's MAXCare managed services programme provides post-go-live application management, Basis & infrastructure support, continuous improvement, and defined SLA-backed support across all SAP modules — with 24/7 coverage options for critical production environments.