Spark + SPSS Modeler: Boosted Trees, K-Means, and Naive Bayes
By Archive User, posted Mon April 11, 2016 04:28 PM
We are excited to announce the release of three new extensions for SPSS Modeler, built with PySpark on algorithms implemented in MLlib. The three extensions are Gradient-Boosted Trees, K-Means Clustering, and Multinomial Naive Bayes. Niall McCarroll, IBM SPSS Analytic Server Software Engineer, and I developed these extensions in Modeler version 18, where it is now possible to run PySpark algorithms locally. As a result, users who have Modeler 18 with Server Enablement can use these extensions to build models on local data or on distributed data in a Spark cluster on Analytic Server.
Gradient-Boosted Trees - Supervised learning algorithm that can be used for either binary classification or regression tasks. Learn more about the implementation here.
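To make the idea of boosting concrete, here is a minimal plain-Python sketch of gradient boosting for regression with squared error. It is not the MLlib implementation that the extension uses: the trees here are single-split stumps on one numeric input, and the names fit_stump, fit_gbt, and predict_gbt, as well as the default of 50 trees and a learning rate of 0.1, are assumptions made for illustration only.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimises squared error on the residuals."""
    best = None
    # Skip the largest value so the right-hand side is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gbt(xs, ys, n_trees=50, learning_rate=0.1):
    """Each new stump is fitted to the residuals left by the ensemble so far."""
    base = sum(ys) / len(ys)          # start from the mean of the target
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump_predict(stump, x)
                       for x, p in zip(xs, predictions)]
    return base, learning_rate, stumps


def predict_gbt(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


# Toy data: low targets for small x, high targets for large x.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
model = fit_gbt(xs, ys)
print(predict_gbt(model, 1), predict_gbt(model, 6))
```

The small learning rate means each stump corrects only part of the remaining error, which is why many trees are combined rather than one large one.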
K-Means Clustering - Unsupervised clustering technique that accepts a user-defined number of clusters (k). Learn more about the implementation here.
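As an illustration of what choosing k means in practice, the following is a small plain-Python sketch of the classic k-means iteration (assign each point to its nearest center, then move each center to the mean of its points). It is not the MLlib code behind the extension. The function names are hypothetical, and taking the first k points as initial centers is a simplification assumed here to keep the example deterministic.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def nearest_center(point, centers):
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))


def kmeans(points, k, iterations=20):
    # Assumption for this sketch: the first k points serve as initial centers.
    centers = list(points[:k])
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_center(p, centers)].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [mean_point(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    labels = [nearest_center(p, centers) for p in points]
    return centers, labels


# Two well-separated groups of points, clustered with k = 2.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The result depends on k: asking for more clusters than the data naturally contains will still return k groups, so the choice is left to the user.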
Multinomial Naive Bayes - Supervised learning variation of Naive Bayes used for classification. The inputs used for this algorithm should be frequencies. A classic example is using a term-document frequency matrix to perform document classification. Learn more about the implementation here.
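The term-document example can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the MLlib implementation used by the extension: the four-word vocabulary, the tiny training set, the function names, and the additive smoothing value of 1.0 are all assumptions made for the example.

```python
import math
from collections import defaultdict


def train_multinomial_nb(rows, labels, smoothing=1.0):
    """rows are term-frequency vectors; returns per-class log prior and log likelihoods."""
    n_features = len(rows[0])
    docs_per_class = defaultdict(int)
    term_counts = defaultdict(lambda: [0.0] * n_features)
    for row, label in zip(rows, labels):
        docs_per_class[label] += 1
        for j, count in enumerate(row):
            term_counts[label][j] += count
    model = {}
    for label, counts in term_counts.items():
        total = sum(counts) + smoothing * n_features
        log_prior = math.log(docs_per_class[label] / len(rows))
        log_likelihood = [math.log((c + smoothing) / total) for c in counts]
        model[label] = (log_prior, log_likelihood)
    return model


def predict_nb(model, row):
    """Pick the class with the highest log posterior for a term-frequency vector."""
    def score(label):
        log_prior, log_likelihood = model[label]
        return log_prior + sum(c * ll for c, ll in zip(row, log_likelihood))
    return max(model, key=score)


# Hypothetical vocabulary: ["goal", "match", "vote", "election"]
rows = [
    [3, 2, 0, 0],   # sports document
    [2, 1, 0, 1],   # sports document
    [0, 0, 3, 2],   # politics document
    [0, 1, 2, 3],   # politics document
]
labels = ["sports", "sports", "politics", "politics"]

model = train_multinomial_nb(rows, labels)
print(predict_nb(model, [2, 1, 0, 0]))  # sports
print(predict_nb(model, [0, 0, 1, 2]))  # politics
```

Because each count multiplies a log probability, the inputs need to be non-negative frequencies, which is why the algorithm suits term counts rather than arbitrary numeric fields.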
Ready to get the extensions and try them out? Great! Search for the extensions by name in the Extension Hub in Modeler 18, or visit the repository for each extension:
Gradient-Boosted Trees
K-Means Clustering
Multinomial Naive Bayes
#spss
#Algorithms
#python
#Spark
#SPSSModeler
#Programmability
Copyright © 2019 IBM Data Science Community. All rights reserved.