SPSS Statistics

Using train dataset to predict the test dataset

1. Using train dataset to predict the test dataset

Hesham Dabbas
Posted Wed May 22, 2024 11:29 AM

Good morning,

I am sure you are familiar with the Titanic dataset for predicting survival. I created one regression model, one linear discriminant analysis model, one Naive Bayes model, and one K-Nearest Neighbor model to predict survival using the train set. I am supposed to predict the test set data using all four models. However, the test set does not include the Survived variable. How can I complete this task?

------------------------------
Hesham Dabbas
------------------------------
2. RE: Using train dataset to predict the test dataset

David Dwyer
Posted Thu May 23, 2024 11:07 AM

Hi @Hesham Dabbas,

What is your source for the titanic.sav dataset?

I googled for it and found
https://github.com/datasciencedojo/datasets/blob/master/titanic.csv

The second variable in the file is "Survived", dichotomous, coded 0 and 1. Is this the same dataset you are referencing?

------------------------------
David Dwyer
SPSS Technical Support
IBM Software
------------------------------

3. RE: Using train dataset to predict the test dataset

Hesham Dabbas
Posted Thu May 23, 2024 11:35 AM

Thank you. The instructor provided two separate files: train and test. The train file includes the Survived variable, but the test file does not. Please see attached.

------------------------------
Hesham Dabbas
------------------------------

Attachment(s)

test.csv (29 KB)
train.csv (56 KB)

4. RE: Using train dataset to predict the test dataset

Jon Peck

IBM Champion
Posted Thu May 23, 2024 12:48 PM

Well, you can predict the outcomes in the test dataset (using the Scoring Wizard), and you can summarize those outcomes. But since you can't evaluate the quality of the predictions without the Survived variable, maybe the instructor just wants your predictions dataset.

You can compare the predictions across the different estimated models. It would be interesting to see how well they agree.
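For anyone who wants to check agreement outside SPSS, here is a minimal Python sketch of the comparison described above. It assumes the four sets of predicted values have already been exported as lists of 0/1 codes in the same case order. The model names and the five predicted values are made up for illustration, and `pairwise_agreement` is a hypothetical helper, not an SPSS function.

```python
from itertools import combinations


def pairwise_agreement(predictions):
    """Return {(model_a, model_b): share of cases where both predict the same value}."""
    lengths = {len(values) for values in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("every model must score the same set of cases")
    (n_cases,) = lengths
    if n_cases == 0:
        raise ValueError("no cases to compare")

    result = {}
    for a, b in combinations(predictions, 2):
        matches = sum(x == y for x, y in zip(predictions[a], predictions[b]))
        result[(a, b)] = matches / n_cases
    return result


# Made-up predicted Survived codes for five test passengers (illustration only).
preds = {
    "regression":  [0, 1, 0, 0, 1],
    "lda":         [0, 1, 0, 1, 1],
    "naive_bayes": [0, 1, 1, 1, 1],
    "knn":         [0, 0, 0, 0, 1],
}

agreement = pairwise_agreement(preds)
for (model_a, model_b), share in agreement.items():
    print(f"{model_a} vs {model_b}: {share:.0%} agreement")
```

A low agreement rate between two models does not say which one is right, only that they disagree on those passengers; without the Survived variable there is nothing to score them against.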

------------------------------
Jon Peck
------------------------------

5. RE: Using train dataset to predict the test dataset

Hesham Dabbas
Posted Fri May 24, 2024 08:35 AM

Thank you. Do you have, or can you recommend, instructions on how to use the Scoring Wizard?

------------------------------
Hesham Dabbas
------------------------------

6. RE: Using train dataset to predict the test dataset

Jon Peck

IBM Champion
Posted Fri May 24, 2024 09:44 AM

In the estimation procedure, save the model (Export Model Information to XML file).
Then switch to the test dataset, open Utilities > Scoring Wizard, select the model you saved, and choose the variables you want to create.
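
The same save-then-score pattern can be sketched in plain Python, in case the logic is easier to follow as code. This is only an analogy and not what SPSS does internally: it fits a deliberately trivial model (survival rate by Sex) on a few made-up training rows, writes it to a JSON file in place of the XML export, reloads it, and adds a predicted variable to test rows that have no Survived column. The functions `fit` and `score`, the file name `model.json`, and the variable name `PredictedSurvived` are all hypothetical.

```python
import json
import os
import tempfile
from collections import defaultdict


def fit(train_rows, feature, target):
    """Estimate the mean of `target` for each level of `feature`."""
    totals = defaultdict(lambda: [0, 0])  # level -> [sum of target, case count]
    for row in train_rows:
        totals[row[feature]][0] += row[target]
        totals[row[feature]][1] += 1
    rates = {level: total / count for level, (total, count) in totals.items()}
    return {"feature": feature, "rates": rates}


def score(model, rows, new_var="PredictedSurvived"):
    """Add a predicted 0/1 variable to each row; the target column is not needed."""
    scored = []
    for row in rows:
        rate = model["rates"].get(row[model["feature"]])
        prediction = None if rate is None else int(rate >= 0.5)
        scored.append({**row, new_var: prediction})
    return scored


# Made-up rows for illustration; the real train.csv and test.csv are not used here.
train = [
    {"Sex": "female", "Survived": 1},
    {"Sex": "female", "Survived": 1},
    {"Sex": "female", "Survived": 0},
    {"Sex": "male", "Survived": 0},
    {"Sex": "male", "Survived": 0},
    {"Sex": "male", "Survived": 1},
]
test = [{"Sex": "male"}, {"Sex": "female"}]

# Step 1: estimate on the train data and save the model to a file.
model_path = os.path.join(tempfile.mkdtemp(), "model.json")
with open(model_path, "w") as f:
    json.dump(fit(train, "Sex", "Survived"), f)

# Step 2: switch to the test data, load the saved model, and create the new variable.
with open(model_path) as f:
    saved_model = json.load(f)
scored_test = score(saved_model, test)
print(scored_test)
```

The point of the sketch is that scoring only reads the predictor columns, which is why the missing Survived variable in the test file does not block prediction.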

--
Jon K Peck
jkpeck@gmail.com
