An open data lakehouse is a data management architecture built on open-source technologies that combines the flexibility of a data lake with the structure and performance of a data warehouse.
IBM watsonx.data is more than the architecture itself. It is a real-world, enterprise-grade implementation of a hybrid and open data lakehouse.
Because watsonx.data is both hybrid and open, we can combine it with Snowflake and AWS S3 to form a truly hybrid setup that delivers a tangible benefit.
In this article, I’ll walk through how IBM watsonx.data components can be used together with Snowflake and AWS S3, and demonstrate how to configure them to work in harmony.
Hybrid Open Data Lakehouse
The image below shows the components of IBM watsonx.data. The highlighted components are the ones that make up the hybrid open data lakehouse solution used in this article and demo.
These components together form the foundation of a hybrid, interoperable data platform:
- 3rd Party Engines – Snowflake is integrated as the query engine.
- Unified Metadata – The Iceberg REST API is used to access watsonx.data’s metadata service.
- Open-Source Data Formats – Apache Iceberg and Apache Parquet ensure interoperability between different query engines.
- AWS S3 – The underlying storage for all data.
- Hybrid Infrastructure – Spanning multiple clouds and environments (see below).
These elements make the hybrid open data lakehouse not just an idea, but a reality.
Demo Scenario
Let’s imagine a use case where Snowflake serves as the query engine for users and applications, while IBM watsonx.data handles data ingestion, transformation, and writes to AWS S3 storage.
In the video, you saw how IBM watsonx.data brings the open data lakehouse to life—using open-source Apache Iceberg as the enabler for integration between watsonx.data and Snowflake.
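To make the Snowflake side of this scenario more concrete, here is a minimal sketch in Python that renders the three SQL statements such a setup typically needs: an external volume pointing at the S3 bucket, a catalog integration pointing at an Iceberg REST catalog, and an Iceberg table that references both. It assumes Snowflake's documented CREATE EXTERNAL VOLUME, CREATE CATALOG INTEGRATION, and CREATE ICEBERG TABLE statements. The class and function names, the object names (wxd_s3_volume, wxd_rest_catalog), and all URLs, tokens, and ARNs are hypothetical placeholders, and the exact endpoint and authentication type that watsonx.data expects are not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LakehouseConfig:
    """Hypothetical settings; replace every value with your own."""
    catalog_uri: str    # Iceberg REST endpoint of the watsonx.data metadata service
    catalog_name: str   # catalog name as registered in watsonx.data
    namespace: str      # Iceberg namespace (schema) that holds the tables
    bearer_token: str   # credential for the REST catalog (assumes bearer auth)
    s3_base_url: str    # bucket/prefix where watsonx.data writes Iceberg data
    aws_role_arn: str   # IAM role Snowflake assumes to read the bucket


def snowflake_statements(cfg: LakehouseConfig, table: str) -> list:
    """Render the Snowflake SQL for a read-only Iceberg REST setup.

    Values are interpolated directly, so only pass trusted input.
    """
    external_volume = f"""CREATE EXTERNAL VOLUME IF NOT EXISTS wxd_s3_volume
  STORAGE_LOCATIONS = ((
    NAME = 'wxd-s3'
    STORAGE_PROVIDER = 'S3'
    STORAGE_BASE_URL = '{cfg.s3_base_url}'
    STORAGE_AWS_ROLE_ARN = '{cfg.aws_role_arn}'
  ))
  ALLOW_WRITES = FALSE;"""

    catalog_integration = f"""CREATE CATALOG INTEGRATION IF NOT EXISTS wxd_rest_catalog
  CATALOG_SOURCE = ICEBERG_REST
  TABLE_FORMAT = ICEBERG
  CATALOG_NAMESPACE = '{cfg.namespace}'
  REST_CONFIG = (
    CATALOG_URI = '{cfg.catalog_uri}'
    CATALOG_NAME = '{cfg.catalog_name}'
  )
  REST_AUTHENTICATION = (
    TYPE = BEARER
    BEARER_TOKEN = '{cfg.bearer_token}'
  )
  ENABLED = TRUE;"""

    iceberg_table = f"""CREATE ICEBERG TABLE IF NOT EXISTS {table}
  EXTERNAL_VOLUME = 'wxd_s3_volume'
  CATALOG = 'wxd_rest_catalog'
  CATALOG_TABLE_NAME = '{table}';"""

    return [external_volume, catalog_integration, iceberg_table]


if __name__ == "__main__":
    demo = LakehouseConfig(
        catalog_uri="https://wxd.example.com/iceberg",  # placeholder
        catalog_name="iceberg_data",                     # placeholder
        namespace="sales",                               # placeholder
        bearer_token="<token>",                          # placeholder
        s3_base_url="s3://example-bucket/lake/",         # placeholder
        aws_role_arn="arn:aws:iam::123456789012:role/example",  # placeholder
    )
    for statement in snowflake_statements(demo, "orders"):
        print(statement, end="\n\n")
```

The external volume is rendered with ALLOW_WRITES = FALSE because in this scenario watsonx.data is the only writer and Snowflake only reads. The statements are returned as strings so they can be reviewed before being run in a Snowflake worksheet or through a client of your choice.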
A Tangible Benefit
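Looking again at the scenario overview, the benefit of this hybrid solution is clear: each component does the job it is best suited for. IBM watsonx.data ingests and transforms the data, AWS S3 stores it in open Apache Iceberg and Parquet formats, and Snowflake serves as the query engine for users and applications on top of that same data. (Hint: smart architectures tend to pay off.)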