Exciting Tech Previews in watsonx.data: Introducing Spark C++ and Materialized Views

By David Paul posted Mon October 28, 2024 01:10 PM

IBM’s watsonx.data is stepping up with two exciting features in tech preview: Spark C++ (Gluten) and Materialized Views. Here’s what you need to know about these features and how you can be among the first to try them.

What’s New?

1. Materialized Views

We’re thrilled to announce that Materialized Views is the first public preview for watsonx.data, available exclusively in the SaaS environment.

So, what are Materialized Views? They are precomputed results of complex queries. By storing these results, watsonx.data can significantly speed up query execution, especially for queries that involve joins and aggregations. Instead of re-running the original query each time, watsonx.data can read the already computed Materialized View directly, which gives faster query responses and better overall performance. To use Materialized Views, you must use Presto C++ with the query optimizer enabled.

Benefits of Materialized Views:

Improved Query Performance: Materialized Views significantly reduce execution times for frequently used complex queries.
Reduced Resource Consumption: By avoiding repeated calculations, Materialized Views lower the workload on resources, leading to cost savings.
Transparent Integration: Materialized Views are seamlessly integrated into the query process, requiring minimal user intervention.
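
To make the idea concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. It is not watsonx.data or Presto syntax: SQLite has no native materialized views, so an ordinary table stands in for one, and the table and function names (orders, mv_sales_by_region, refresh_materialized_view) are hypothetical. It only illustrates the general pattern of storing an aggregation once and reading the stored result afterwards. Unlike the transparent integration described above, this sketch queries the stored table explicitly.

```python
import sqlite3

# The "expensive" aggregation that the stored result stands in for.
BASE_QUERY = "SELECT region, SUM(amount) AS total FROM orders GROUP BY region"


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a small, made-up orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("east", 10.0), ("east", 15.0), ("west", 7.5), ("west", 2.5)],
    )
    return conn


def refresh_materialized_view(conn: sqlite3.Connection) -> None:
    """Recompute and store the aggregation result in a plain table."""
    conn.execute("DROP TABLE IF EXISTS mv_sales_by_region")
    conn.execute(f"CREATE TABLE mv_sales_by_region AS {BASE_QUERY}")


def totals_from_base(conn: sqlite3.Connection) -> dict:
    """Run the aggregation against the base table every time."""
    return dict(conn.execute(BASE_QUERY).fetchall())


def totals_from_view(conn: sqlite3.Connection) -> dict:
    """Read the stored result without re-running the aggregation."""
    return dict(
        conn.execute("SELECT region, total FROM mv_sales_by_region").fetchall()
    )


conn = build_demo_db()
refresh_materialized_view(conn)
print(totals_from_view(conn))
```

One design point the sketch makes visible: a stored result reflects the base table only as of its last refresh, so rows inserted afterwards do not appear until refresh_materialized_view runs again.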

2. Spark C++ (Gluten)

Spark C++, by contrast, is currently in private preview and is available only for on-premises environments. The new Spark C++ engine is designed to maximize performance by leveraging native libraries. It integrates two open-source components: Velox, a high-performance C++ library optimized for complex data processing, and Gluten, which serves as the glue connecting Spark to Velox while maintaining Spark's familiar control flow. This combination allows Spark C++ to offload demanding workloads to Velox, resulting in significantly improved execution speed and efficiency for data analytics.

Benefits of Spark C++:

Faster Data Ingestion: The Spark C++ engine ingests data in Parquet and CSV formats into watsonx.data in Iceberg format more quickly than the Spark (Java) engine.
Effortless Integration: Transitioning to the Spark C++ engine is straightforward, allowing users to maintain their existing workflows with minimal adjustments.
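
For readers curious how Gluten attaches to Spark in the open-source project, the sketch below builds the Spark settings that the Apache Gluten documentation lists for enabling its plugin with the Velox backend. This is an assumption-laden illustration, not the watsonx.data setup: the managed Spark C++ engine may apply these settings for you or use different ones, the plugin class name has varied across Gluten releases, and the helper names (gluten_spark_conf, to_spark_submit_args) and the off-heap size are hypothetical. The Gluten jar must also be on the Spark classpath, which is not shown here.

```python
from typing import Dict, List


def gluten_spark_conf(offheap_size: str = "4g") -> Dict[str, str]:
    """Return Spark settings used by open-source Apache Gluten (assumed, not watsonx.data-specific)."""
    return {
        # Register Gluten as a Spark plugin so it can hand supported operators to Velox.
        "spark.plugins": "org.apache.gluten.GlutenPlugin",
        # The native engine allocates memory outside the JVM heap.
        "spark.memory.offHeap.enabled": "true",
        "spark.memory.offHeap.size": offheap_size,
        # Columnar shuffle keeps data in columnar batches between stages.
        "spark.shuffle.manager": "org.apache.spark.shuffle.sort.ColumnarShuffleManager",
    }


def to_spark_submit_args(conf: Dict[str, str]) -> List[str]:
    """Turn a settings dict into repeated --conf key=value arguments."""
    args: List[str] = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


print(" ".join(to_spark_submit_args(gluten_spark_conf())))
```

Because the plugin is enabled purely through configuration, the application code itself stays ordinary Spark code, which is consistent with the point above about keeping existing workflows.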

How to Get Started

Both features are free to use! If you’re eager to try out these tech previews, simply fill out this form. If your primary interest lies in Materialized Views, you can find a detailed guide on how to get started here.

New users interested in the public preview of Materialized Views must sign up for the Lite Plan. This plan provides access for 30 days or up to 2,000 Resource Units (RUs) to explore watsonx.data.

Don’t miss the chance to experience these new features firsthand! Whether you’re looking to speed up query performance or accelerate your data ingestion tasks, these tech previews offer a compelling opportunity to advance your analytics strategy.

Sign up today to explore the future of data with watsonx.data!

#watsonx.data

Permalink

https://community.ibm.com/community/user/blogs/david-paul/2024/10/28/exciting-tech-previews-in-watsonxdata-introducing
