Spark + SPSS Modeler: Boosted Trees, K-Means, and Naive Bayes
By Archive User, posted Mon April 11, 2016 04:28 PM
We are excited to announce the release of three new extensions for SPSS Modeler, built with PySpark on algorithms implemented in MLlib: Gradient-Boosted Trees, K-Means Clustering, and Multinomial Naive Bayes. Niall McCarroll, IBM SPSS Analytic Server Software Engineer, and I developed these extensions in Modeler version 18, where it is now possible to run PySpark algorithms locally. This means that users who have Modeler 18 with Server Enablement can use these extensions to build models on local data or on distributed data in a Spark cluster on Analytic Server.
Gradient-Boosted Trees - Supervised learning algorithm that can be used for either binary classification or regression tasks. Learn more about the implementation here.
K-Means Clustering - Unsupervised clustering technique that accepts a user-defined number of clusters (k). Learn more about the implementation here.
Multinomial Naive Bayes - Supervised learning variation of Naive Bayes used for classification. The inputs to this algorithm should be frequencies (non-negative counts). A classic example is using a term-document frequency matrix to perform document classification. Learn more about the implementation here.
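To make the "inputs should be frequencies" point concrete, here is a minimal pure-Python sketch of multinomial Naive Bayes on a toy term-document matrix. It is not the MLlib implementation and not the code in the extension; the function names, the toy vocabulary, and the smoothing value `alpha=1.0` are assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def train_multinomial_nb(rows, labels, alpha=1.0):
    """Fit multinomial naive Bayes on non-negative count vectors.

    rows   : list of equal-length lists of term counts (one per document)
    labels : list of class labels, one per row
    alpha  : additive smoothing constant (assumed value, for illustration)
    """
    n_features = len(rows[0])
    doc_counts = defaultdict(int)
    term_counts = defaultdict(lambda: [0.0] * n_features)

    for row, label in zip(rows, labels):
        if any(v < 0 for v in row):
            # The model treats each value as "how many times term j occurred".
            raise ValueError("inputs must be non-negative frequencies")
        doc_counts[label] += 1
        for j, v in enumerate(row):
            term_counts[label][j] += v

    model = {}
    for label, n_docs in doc_counts.items():
        total = sum(term_counts[label]) + alpha * n_features
        log_prior = math.log(n_docs / len(rows))
        log_likelihood = [
            math.log((count + alpha) / total) for count in term_counts[label]
        ]
        model[label] = (log_prior, log_likelihood)
    return model


def predict(model, row):
    """Return the label with the highest log posterior for one count vector."""
    def score(label):
        log_prior, log_likelihood = model[label]
        # Each count weights the log-probability of its term.
        return log_prior + sum(v * w for v, w in zip(row, log_likelihood))
    return max(model, key=score)


# Toy term-document matrix; columns are counts of the hypothetical terms
# ["goal", "match", "election", "vote"].
docs = [
    [3, 2, 0, 0],
    [2, 4, 0, 1],
    [0, 0, 3, 2],
    [0, 1, 2, 4],
]
labels = ["sports", "sports", "politics", "politics"]

model = train_multinomial_nb(docs, labels)
prediction = predict(model, [1, 2, 0, 0])
print(prediction)
```

Because every input value multiplies a log-probability, the values only make sense as occurrence counts, which is why a term-document frequency matrix is a natural input.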
Ready to get the extensions and try them out? Great! Search for the extensions by name in the Extension Hub in Modeler 18, or visit the repository for each extension:
Gradient-Boosted Trees
K-Means Clustering
Multinomial Naive Bayes
#spss
#Algorithms
#python
#Spark
#SPSSModeler
#Programmability
Copyright © 2019 IBM Data Science Community. All rights reserved.