File and Object Storage
Software-defined storage for building a global AI, HPC and analytics data platform
How to configure and tune the performance of Spark workloads on an IBM Spectrum Scale Sharing Nothing Cluster
By Archive User, posted Mon November 27, 2017 02:17 AM
The IBM Spectrum Scale Sharing Nothing Cluster performance tuning guide has been posted; please refer to the
link
before making the changes below.
Here are the tuning steps.
Step 1: Configure spark.shuffle.file.buffer
By default, this is configured in
$SPARK_HOME/conf/spark-defaults.conf
.
To optimize Spark workloads on an IBM Spectrum Scale filesystem, the key tuning value is the 'spark.shuffle.file.buffer' Spark configuration option (defined in a Spark config file), which must be set to match the block size of the IBM Spectrum Scale filesystem being used.
The user can query the block size of an IBM Spectrum Scale filesystem by running: 'mmlsfs
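As a sketch of the step above: query the filesystem block size with the mmlsfs command (the `mmlsfs <device> -B` form reports it in bytes; the device name and the 2 MiB value below are illustrative assumptions, not values from this post), then set the matching buffer size in spark-defaults.conf:

```
# $SPARK_HOME/conf/spark-defaults.conf
# Match the shuffle write buffer to the Spectrum Scale filesystem block size.
# (2m assumes mmlsfs reported a 2097152-byte / 2 MiB block size; substitute
# the value your filesystem actually reports.)
spark.shuffle.file.buffer    2m
```

Spark accepts size suffixes such as k and m here; the default is 32k, so on a large-block Spectrum Scale filesystem the buffer is typically raised substantially.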
#cognitivecomputing #Real-timeanalytics #Softwaredefinedstorage #Customerexperienceandengagement #sparkworkloadtuning #Data-centricdesign #Workloadandresourceoptimization #FPO