Multiple Imputation

6. RE: Multiple Imputation

Courtney B Francis

Posted Mon September 19, 2022 04:17 PM

Hi @Rick Marcantonio!

They are being imputed (the number of cases has increased considerably), and the amount of missingness has decreased from the original dataset. But it was my understanding that once all the variables of interest were imputed, there should no longer be any missingness, especially in the pooled dataset.

------------------------------
Courtney B Francis
------------------------------

7. RE: Multiple Imputation

Rick Marcantonio

Posted Mon September 19, 2022 04:46 PM

Yes, I understand what you mean.

I'm curious if the student weight variable (STUDWGT) has any 0 or missing values...
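
One way to check this is to count the cases whose weight is missing, zero, or negative. The sketch below assumes the weight variable is named STUDWGT, as in the /ANALYSISWEIGHT subcommand of the original syntax; bad_weight is a hypothetical helper variable. The Command Syntax Reference describes cases with a negative or zero analysis weight as excluded from the procedure, which would leave their missing values in place, so that is worth confirming for the installed version.

```spss
* Sketch only: STUDWGT comes from the thread and bad_weight is a hypothetical helper.
* bad_weight is 1 when the weight is zero, negative, or missing, and 0 otherwise.
COUNT bad_weight = STUDWGT (LOWEST THRU 0, MISSING).
FREQUENCIES VARIABLES=bad_weight.
```

If the frequency table shows any cases with bad_weight = 1, those cases are candidates for the missing values that remain after imputation.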

------------------------------
Rick Marcantonio
Quality Assurance
IBM
------------------------------

8. RE: Multiple Imputation

IBM Champion

Jon Peck

Posted Mon September 19, 2022 05:04 PM

But did the tables that the MI procedure produces show that some variables/values could not be imputed?

--

Jon K Peck
jkpeck@gmail.com

9. RE: Multiple Imputation

Rick Marcantonio

Posted Mon September 19, 2022 05:12 PM

For example, Courtney, this table. What do you have for "Not imputed"?

------------------------------
Rick Marcantonio
Quality Assurance
IBM
------------------------------

10. RE: Multiple Imputation

Courtney B Francis

Posted Mon September 19, 2022 08:17 PM

Hi Rick!

That section was blank, just as it appears in your image. There were three variables that were "not imputed" due to missing values, but the "Not Imputed (Too Many Missing Values)" section was blank, as yours appears above.

Best,
Courtney B. Francis

------------------------------
Courtney B Francis
------------------------------

11. RE: Multiple Imputation

Rick Marcantonio

Posted Mon September 19, 2022 08:47 PM

OK, maybe we've narrowed it down to those 3 variables that are not imputed due to missing values.

Try adding /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0)

Do those three variables appear at the top of the list in the Variable Summary Table?

------------------------------
Rick Marcantonio
Quality Assurance
IBM
------------------------------

12. RE: Multiple Imputation

Courtney B Francis

Posted Mon September 19, 2022 09:15 PM

Hi @Rick Marcantonio,

I noticed a typo in my reply to your previous message. I meant to say: there were three variables in the box below labeled "not imputed (no missing values)".
Are you suggesting I add /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0) to the overall syntax and rerun the model?

Those three variables do not appear at the top of the list in the variable summary table.

------------------------------
Courtney B Francis
------------------------------

13. RE: Multiple Imputation

Rick Marcantonio

Posted Mon September 19, 2022 10:10 PM

Well, no, they'd be at the bottom of the list, since they have no missing data.

I am suggesting that you re-run your original syntax, but please change /MISSINGSUMMARIES NONE to /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0).

I'm trying to get some idea what the "missingness" looks like in these data.
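
If rerunning the full imputation is slow, the missing-value summaries can also be requested without imputing anything. A minimal sketch, assuming the installed version accepts METHOD=NONE on /IMPUTE, and using three variable names from the thread purely for illustration:

```spss
* Sketch only: summaries of missingness, no imputation performed.
* The three variables are illustrative.
* Substitute the full analysis variable list.
MULTIPLE IMPUTATION ACT_FINAL INCOME HSGPA
  /IMPUTE METHOD=NONE
  /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0).
```

With MINPCTMISSING=0, the Variable Summary table should list every variable up to the MAXVARS limit, including those with no missing values.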

------------------------------
Rick Marcantonio
Quality Assurance
IBM
------------------------------

14. RE: Multiple Imputation

Courtney B Francis

Posted Mon September 19, 2022 10:25 PM

Okay @Rick Marcantonio, I will give that a try by including that line in my syntax and share the results.

Best,

------------------------------
Courtney B Francis
------------------------------

15. RE: Multiple Imputation

Courtney B Francis

Posted Mon September 19, 2022 11:49 PM
Edited by System Test Fri January 20, 2023 04:45 PM

Hi @Rick Marcantonio,

I ran the model again (with fewer iterations and imputations, for the sake of time) and got the same result. I still have missingness in my pooled dataset for ALL variables.

I used the same syntax, except that at the bottom I changed /MISSINGSUMMARIES NONE to /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0).

If you have any insight into what the issue is, please let me know!

Best,

------------------------------
Courtney B Francis
------------------------------

16. RE: Multiple Imputation

Rick Marcantonio

Posted Mon September 19, 2022 11:58 PM

I can't see the table I want to see - the Variable Summary.

------------------------------
Rick Marcantonio
Quality Assurance
IBM
------------------------------

17. RE: Multiple Imputation

Courtney B Francis

Posted Tue September 20, 2022 12:03 AM

I pasted it below:

Variable Summary
Variable	Missing N	Missing Percent	Valid N	Mean	Std. Deviation
ACT_FINAL	122581	25.4%	359361	24.7110	16.94311
WHITE_XACT	69418	14.4%	412524	13.2600	40.98339
RECODE of DISAB01_LEARNING DISABILITY	63094	13.1%	418848	.0438	.63238
RECODE of INCOME (Parental Income)	49476	10.3%	432466	1.97	2.875
RECODE of DISAB01_CHRONIC ILLNESS	47338	9.8%	434604	.0356	.57287
RECODE of DISAB01_PSYCHOLOGICAL DISORDER	46662	9.7%	435280	.0221	.45473
RECODE of DISAB01_OTHER DISABILITY	46533	9.7%	435409	.0745	.81162
RECODE of DISAB01_ADHD	45443	9.4%	436499	.0567	.71469
RECODE of DISAB01_LEARNING DISABILITY	45336	9.4%	436606	.0328	.55037
TFS Likelihood of College Involvement Score	44385	9.2%	437557	49.4252	25.04735
DEGREE ASPIRATIONS DICHOTOMOUS	42437	8.8%	439505	.7639	1.31083
BLACK_BROWN_XACT	41513	8.6%	440429	2.6916	22.11014
WHITE_XINCOME	36537	7.6%	445405	1.2194	3.98811
AUTISM_XACT	31898	6.6%	450044	.1061	5.15740
AUTISM_XINCOME	31462	6.5%	450480	.0103	.48015
AUTISM_XCOLLINV	31354	6.5%	450588	.2608	11.01939
AUTISM_XSSCG	31210	6.5%	450732	.0095	.41915
AUTISM_XASCG	31203	6.5%	450739	.0117	.50244
AUTISM_XHSGPA	31183	6.5%	450759	.0367	1.50114
AUTISM_XHOMG	31168	6.5%	450774	.0120	.50833
RECODE of AUTISM	31116	6.5%	450826	.0061	.23762
WHITE_XDEGASP	30886	6.4%	451056	.4121	1.51961
WHITE_XCOLLINV	30022	6.2%	451920	27.2186	76.92240
AUTISM_XDEGASP	25605	5.3%	456337	.0036	.18283
TFS Social Self-Concept Group	21697	4.5%	460245	1.92	2.366
TFS Academic Self-Concept Group	20658	4.3%	461284	1.94	2.226
AUTISM_XWHITE	18863	3.9%	463079	.0036	.18470
BLACK_BROWN_XCOLLINV	18023	3.7%	463919	8.5708	58.87801
BLACK_BROWN_XINCOME	17799	3.7%	464143	.2465	1.86394
WHITE_XSSCG	17458	3.6%	464484	1.0970	3.41879
WHITE_XASCG	16917	3.5%	465025	1.1271	3.43757
BLACK_BROWN_XDEGASP	16077	3.3%	465865	.1395	1.06979
AUTISM_XSEX	15996	3.3%	465946	.0018	.12976
RECODE of FIRSTGEN (First generation status based on parent(s) with less than 's	13686	2.8%	468256	.19	1.217
BLACK_BROWN_XSSCG	13200	2.7%	468742	.3558	2.54229
BLACK_BROWN_XASCG	12949	2.7%	468993	.3372	2.39447
WHITE_XHOMG	12777	2.7%	469165	1.1525	3.51353
WHITE_XHSGPA	11872	2.5%	470070	3.8132	10.48926
TFS Habits of Mind Group	11731	2.4%	470211	1.99	2.335
BLACK_BROWN_XHOMG	11189	2.3%	470753	.3650	2.56174
AUTISM_XFIRSTGEN	11000	2.3%	470942	.0008	.08888
BLACK_BROWN_XHSGPA	10521	2.2%	471421	1.0982	7.29505
AUTISM_XBLACK_BROWN	10239	2.1%	471703	.0007	.08021
WHITE_XFIRSTGEN	10102	2.1%	471840	.0649	.75961
This is the group of ASAIN & HISPAINIC & OTHER	9515	2.0%	472427	.1108	.96923
This is the Black and Brown combined variable	9515	2.0%	472427	.1909	1.21370
This is the Hispanic Race Code	9515	2.0%	472427	.1019	.93436
This is the first Recode of the Black Variable	9515	2.0%	472427	.0889	.87919
This is the White Race Code	9515	2.0%	472427	.5818	1.52343
BLACK_BROWN_XFIRSTGEN	8582	1.8%	473360	.0821	.84733
What was your average grade in high school?	5275	1.1%	476667	6.39	4.275
WHITE_XSEX	4808	1.0%	477134	.3077	1.42634
BLACK_BROWN_XSEX	4808	1.0%	477134	.1096	.96532
ED INST TYPE (UNIVERSITY =0 & 4 YEAR COLLEGE =1)	794	0.2%	481148	.5016	1.54678
RECODE of SEX (Your sex:)	0	0.0%	481942	.54	1.540
BLACK_BROWN_AUTISM	0	0.0%	481942	.0007	.07935
INSTITUTIONAL CONTROL	0	0.0%	481942	1.32	1.447

------------------------------
Courtney B Francis
------------------------------

18. RE: Multiple Imputation

Rick Marcantonio

Posted Tue September 20, 2022 12:39 AM

OK, I'm just about out of bullets.

Try this. In a syntax window, paste this and run it:

* Count, for each case, how many of the analysis variables are missing.
COUNT num_missing= INSTDICO INSTCONT WHITE BLACK Brown Black_Brown_Total
    BLACK_BROWN_AUTISM MINORITIZED_POC RECODEAUTISM HSGPA
    COLLEGE_INVOLVEMENT HABITS_OF_MIND_GRP ACADEMIC_SELFCONCEPT_GRP
    SOCIAL_SELFCONCEPT_GRP DEGASPDICO SEX FIRSTGEN INCOME ACT_FINAL
    BLACK_BROWN_XINCOME BLACK_BROWN_XHSGPA BLACK_BROWN_XHOMG
    BLACK_BROWN_XCOLLINV BLACK_BROWN_XASCG BLACK_BROWN_XSSCG
    BLACK_BROWN_XACT BLACK_BROWN_XSEX BLACK_BROWN_XFIRSTGEN
    BLACK_BROWN_XDEGASP WHITE_XINCOME WHITE_XHSGPA WHITE_XHOMG
    WHITE_XCOLLINV WHITE_XASCG WHITE_XSSCG WHITE_XACT WHITE_XSEX
    WHITE_XFIRSTGEN WHITE_XDEGASP AUTISM_XINCOME AUTISM_XHSGPA
    AUTISM_XHOMG AUTISM_XCOLLINV AUTISM_XASCG AUTISM_XSSCG
    AUTISM_XACT AUTISM_XSEX AUTISM_XFIRSTGEN AUTISM_XDEGASP
    AUTISM_XBLACK_BROWN AUTISM_XWHITE RECODE_DISAB01
    RECODE_DISAB02 RECODE_DISAB04 RECODE_DISAB05 RECODE_DISAB06
    RECODE_DISAB07 (MISSING, SYSMIS).

* Frequency table of the per-case missing counts.
FREQUENCIES VARIABLES=num_missing.

I think I would also like to see the correlation matrix of these variables.

Maybe you could just send me the dataset.

marcantr@us.ibm.com

------------------------------
Rick Marcantonio
Quality Assurance
IBM
------------------------------

19. RE: Multiple Imputation

Courtney B Francis

Posted Tue September 20, 2022 08:34 AM

Should I run this syntax on the imputed dataset or on the original dataset?

Best,

------------------------------
Courtney B Francis
------------------------------

Original Message:
Sent: Mon September 19, 2022 04:11 PM
From: Rick Marcantonio
Subject: Multiple Imputation

So then, no values are being imputed at all?

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 04:04 PM
From: Courtney B Francis
Subject: Multiple Imputation

Hi @Rick Marcantonio,

Thank you for your response!

All of the variables listed, including the ones specified in "/CONSTRAINTS ... ROLE=IND", have missing data in all the imputed datasets, including the pooled dataset, and the amount of missingness is the same across all imputed datasets.

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 01:27 PM
From: Rick Marcantonio
Subject: Multiple Imputation

Hi.

Do you mean apart from the variables specified in /CONSTRAINTS as ROLE=IND?
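
For context, ROLE=IND asks the procedure to use a variable as a predictor only. The sketch below is a hypothetical three-variable example, not the model from the thread; imputed_sketch is an assumed dataset name. Under that reading, WHITE_XINCOME would help predict the other two variables but would not itself be imputed, so any missing values it has would be expected to remain in the output.

```spss
* Hypothetical three-variable sketch, not the poster's model.
* WHITE_XINCOME is declared predictor-only, so it is not imputed.
MULTIPLE IMPUTATION ACT_FINAL INCOME WHITE_XINCOME
  /IMPUTE METHOD=FCS NIMPUTATIONS=5
  /CONSTRAINTS WHITE_XINCOME (ROLE=IND)
  /IMPUTATIONSUMMARIES MODELS
  /OUTFILE IMPUTATIONS=imputed_sketch.
```

The Imputation Specifications and Imputation Results tables from a run like this should show which variables were imputed and which were not.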

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 11:32 AM
From: Courtney B Francis
Subject: Multiple Imputation

Hello!

I have attempted to run Multiple Imputation on my dataset. After the imputation ran, I still had missing data in the pooled dataset.

Do you know what might have caused this? I'm not sure whether this is common or whether something in my syntax or data is causing it. Please see my syntax below:

SET THREADS = 4.
USE ALL.
FILTER OFF.
SORT CASES BY YEAR SUBJID.
EXECUTE.
SET SEED=20220913.

MULTIPLE IMPUTATION
    INSTDICO INSTCONT WHITE BLACK Brown Black_Brown_Total
    BLACK_BROWN_AUTISM MINORITIZED_POC RECODEAUTISM HSGPA
    COLLEGE_INVOLVEMENT HABITS_OF_MIND_GRP ACADEMIC_SELFCONCEPT_GRP
    SOCIAL_SELFCONCEPT_GRP DEGASPDICO SEX FIRSTGEN INCOME ACT_FINAL
    BLACK_BROWN_XINCOME BLACK_BROWN_XHSGPA BLACK_BROWN_XHOMG
    BLACK_BROWN_XCOLLINV BLACK_BROWN_XASCG BLACK_BROWN_XSSCG
    BLACK_BROWN_XACT BLACK_BROWN_XSEX BLACK_BROWN_XFIRSTGEN
    BLACK_BROWN_XDEGASP WHITE_XINCOME WHITE_XHSGPA WHITE_XHOMG
    WHITE_XCOLLINV WHITE_XASCG WHITE_XSSCG WHITE_XACT WHITE_XSEX
    WHITE_XFIRSTGEN WHITE_XDEGASP AUTISM_XINCOME AUTISM_XHSGPA
    AUTISM_XHOMG AUTISM_XCOLLINV AUTISM_XASCG AUTISM_XSSCG
    AUTISM_XACT AUTISM_XSEX AUTISM_XFIRSTGEN AUTISM_XDEGASP
    AUTISM_XBLACK_BROWN AUTISM_XWHITE RECODE_DISAB01
    RECODE_DISAB02 RECODE_DISAB04 RECODE_DISAB05 RECODE_DISAB06
    RECODE_DISAB07
  /ANALYSISWEIGHT STUDWGT
  /IMPUTE METHOD=FCS MAXITER=100 NIMPUTATIONS=10 SCALEMODEL=LINEAR INTERACTIONS=NONE
    SINGULAR=1E-012 MAXPCTMISSING=NONE MAXMODELPARAM=10000
  /CONSTRAINTS BLACK_BROWN_XINCOME (ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XHSGPA (ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XHOMG (ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XCOLLINV (ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XASCG (ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XSSCG (ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XACT (ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XSEX (ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XFIRSTGEN (ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XDEGASP (ROLE=IND)
  /CONSTRAINTS WHITE_XINCOME (ROLE=IND)
  /CONSTRAINTS WHITE_XHSGPA (ROLE=IND)
  /CONSTRAINTS WHITE_XHOMG (ROLE=IND)
  /CONSTRAINTS WHITE_XCOLLINV (ROLE=IND)
  /CONSTRAINTS WHITE_XASCG (ROLE=IND)
  /CONSTRAINTS WHITE_XSSCG (ROLE=IND)
  /CONSTRAINTS WHITE_XACT (ROLE=IND)
  /CONSTRAINTS WHITE_XSEX (ROLE=IND)
  /CONSTRAINTS WHITE_XFIRSTGEN (ROLE=IND)
  /CONSTRAINTS WHITE_XDEGASP (ROLE=IND)
  /CONSTRAINTS AUTISM_XINCOME (ROLE=IND)
  /CONSTRAINTS AUTISM_XHSGPA (ROLE=IND)
  /CONSTRAINTS AUTISM_XHOMG (ROLE=IND)
  /CONSTRAINTS AUTISM_XCOLLINV (ROLE=IND)
  /CONSTRAINTS AUTISM_XASCG (ROLE=IND)
  /CONSTRAINTS AUTISM_XSSCG (ROLE=IND)
  /CONSTRAINTS AUTISM_XACT (ROLE=IND)
  /CONSTRAINTS AUTISM_XSEX (ROLE=IND)
  /CONSTRAINTS AUTISM_XFIRSTGEN (ROLE=IND)
  /CONSTRAINTS AUTISM_XDEGASP (ROLE=IND)
  /CONSTRAINTS AUTISM_XBLACK_BROWN (ROLE=IND)
  /CONSTRAINTS AUTISM_XWHITE (ROLE=IND)
  /CONSTRAINTS DEGASP (RND=1 MIN=0 MAX=1)
  /MISSINGSUMMARIES NONE
  /IMPUTATIONSUMMARIES MODELS DESCRIPTIVES
  /OUTFILE IMPUTATIONS=courtney_syntax_9_14_22.sav FCSITERATIONS=iteration_history.

------------------------------
Courtney B Francis
------------------------------
#SPSSStatistics

20. RE: Multiple Imputation

Rick Marcantonio

Posted Tue September 20, 2022 08:36 AM

The original data, before imputation.

Also, it looks like a lot (the majority, perhaps) of these variables are binary (0/1). Is that true?
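
If the same count is later run on the imputed output file for comparison, keep in mind that the file stacks the original cases (Imputation_ = 0) ahead of the imputed copies (Imputation_ = 1 and up), so missing values in the first block are expected. A minimal sketch, assuming the imputed file is the active dataset and that num_missing has been recomputed there with the COUNT command quoted earlier in the thread:

```spss
* Sketch only: compare missing counts across the original and imputed blocks.
* Assumes the imputed output file is the active dataset.
* Assumes num_missing was recomputed in that dataset.
SORT CASES BY Imputation_.
SPLIT FILE LAYERED BY Imputation_.
FREQUENCIES VARIABLES=num_missing.
SPLIT FILE OFF.
```

Nonzero counts that appear only for Imputation_ = 0 would point to the original block rather than to a failed imputation.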

------------------------------
Rick Marcantonio
Quality Assurance
IBM
------------------------------

Original Message

Original Message:
Sent: Tue September 20, 2022 08:33 AM
From: Courtney B Francis
Subject: Multiple Imputation

Should I be running this syntax on the imputed data set? Or the original dataset?

Best,

------------------------------
Courtney B Francis

Original Message:
Sent: Tue September 20, 2022 12:38 AM
From: Rick Marcantonio
Subject: Multiple Imputation

OK, I'm just about out of bullets.

Try this. In a syntax window, paste this and run it:

COUNT num_missing= INSTDICO INSTCONT WHITE BLACK Brown Black_Brown_Total    BLACK_BROWN_AUTISM MINORITIZED_POC RECODEAUTISM HSGPA    COLLEGE_INVOLVEMENT HABITS_OF_MIND_GRP ACADEMIC_SELFCONCEPT_GRP    SOCIAL_SELFCONCEPT_GRP DEGASPDICO SEX FIRSTGEN INCOME ACT_FINAL    BLACK_BROWN_XINCOME BLACK_BROWN_XHSGPA BLACK_BROWN_XHOMG    BLACK_BROWN_XCOLLINV BLACK_BROWN_XASCG BLACK_BROWN_XSSCG    BLACK_BROWN_XACT BLACK_BROWN_XSEX BLACK_BROWN_XFIRSTGEN    BLACK_BROWN_XDEGASP WHITE_XINCOME WHITE_XHSGPA WHITE_XHOMG    WHITE_XCOLLINV WHITE_XASCG WHITE_XSSCG WHITE_XACT WHITE_XSEX    WHITE_XFIRSTGEN WHITE_XDEGASP AUTISM_XINCOME AUTISM_XHSGPA    AUTISM_XHOMG AUTISM_XCOLLINV AUTISM_XASCG AUTISM_XSSCG    AUTISM_XACT AUTISM_XSEX AUTISM_XFIRSTGEN AUTISM_XDEGASP    AUTISM_XBLACK_BROWN AUTISM_XWHITE RECODE_DISAB01    RECODE_DISAB02 RECODE_DISAB04 RECODE_DISAB05 RECODE_DISAB06    RECODE_DISAB07 (MISSING, SYSMIS).FRE VAR num_missing.

I think I would also like to see the correlation matrix of these variables.

Maybe you could just send me the dataset.

marcantr@us.ibm.com

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Tue September 20, 2022 12:02 AM
From: Courtney B Francis
Subject: Multiple Imputation

I pasted it below:

Variable Summary
	Missing	Valid N	Mean	Std. Deviation
	N	Percent	Mean	Std. Deviation
ACT_FINAL	122581	25.4%	359361	24.7110	16.94311
WHITE_XACT	69418	14.4%	412524	13.2600	40.98339
RECODE of DISAB01_LEARNING DISABILITY	63094	13.1%	418848	.0438	.63238
RECODE of INCOME (Parental Income)	49476	10.3%	432466	1.97	2.875
RECODE of DISAB01_CHRONIC ILLNESS	47338	9.8%	434604	.0356	.57287
RECODE of DISAB01_PSYCHOLOGICAL DISORDER	46662	9.7%	435280	.0221	.45473
RECODE of DISAB01_OTHER DISABILITY	46533	9.7%	435409	.0745	.81162
RECODE of DISAB01_ADHD	45443	9.4%	436499	.0567	.71469
RECODE of DISAB01_LEARNING DISABILITY	45336	9.4%	436606	.0328	.55037
TFS Likelihood of College Involvement Score	44385	9.2%	437557	49.4252	25.04735
DEGREE ASPIRATIONS DICHOTOMOUS	42437	8.8%	439505	.7639	1.31083
BLACK_BROWN_XACT	41513	8.6%	440429	2.6916	22.11014
WHITE_XINCOME	36537	7.6%	445405	1.2194	3.98811
AUTISM_XACT	31898	6.6%	450044	.1061	5.15740
AUTISM_XINCOME	31462	6.5%	450480	.0103	.48015
AUTISM_XCOLLINV	31354	6.5%	450588	.2608	11.01939
AUTISM_XSSCG	31210	6.5%	450732	.0095	.41915
AUTISM_XASCG	31203	6.5%	450739	.0117	.50244
AUTISM_XHSGPA	31183	6.5%	450759	.0367	1.50114
AUTISM_XHOMG	31168	6.5%	450774	.0120	.50833
RECODE of AUTISM	31116	6.5%	450826	.0061	.23762
WHITE_XDEGASP	30886	6.4%	451056	.4121	1.51961
WHITE_XCOLLINV	30022	6.2%	451920	27.2186	76.92240
AUTISM_XDEGASP	25605	5.3%	456337	.0036	.18283
TFS Social Self-Concept Group	21697	4.5%	460245	1.92	2.366
TFS Academic Self-Concept Group	20658	4.3%	461284	1.94	2.226
AUTISM_XWHITE	18863	3.9%	463079	.0036	.18470
BLACK_BROWN_XCOLLINV	18023	3.7%	463919	8.5708	58.87801
BLACK_BROWN_XINCOME	17799	3.7%	464143	.2465	1.86394
WHITE_XSSCG	17458	3.6%	464484	1.0970	3.41879
WHITE_XASCG	16917	3.5%	465025	1.1271	3.43757
BLACK_BROWN_XDEGASP	16077	3.3%	465865	.1395	1.06979
AUTISM_XSEX	15996	3.3%	465946	.0018	.12976
RECODE of FIRSTGEN (First generation status based on parent(s) with less than 's	13686	2.8%	468256	.19	1.217
BLACK_BROWN_XSSCG	13200	2.7%	468742	.3558	2.54229
BLACK_BROWN_XASCG	12949	2.7%	468993	.3372	2.39447
WHITE_XHOMG	12777	2.7%	469165	1.1525	3.51353
WHITE_XHSGPA	11872	2.5%	470070	3.8132	10.48926
TFS Habits of Mind Group	11731	2.4%	470211	1.99	2.335
BLACK_BROWN_XHOMG	11189	2.3%	470753	.3650	2.56174
AUTISM_XFIRSTGEN	11000	2.3%	470942	.0008	.08888
BLACK_BROWN_XHSGPA	10521	2.2%	471421	1.0982	7.29505
AUTISM_XBLACK_BROWN	10239	2.1%	471703	.0007	.08021
WHITE_XFIRSTGEN	10102	2.1%	471840	.0649	.75961
This is the group of ASAIN & HISPAINIC & OTHER	9515	2.0%	472427	.1108	.96923
This is the Black and Brown combined variable	9515	2.0%	472427	.1909	1.21370
This is the Hispanic Race Code	9515	2.0%	472427	.1019	.93436
This is the first Recode of the Black Variable	9515	2.0%	472427	.0889	.87919
This is the White Race Code	9515	2.0%	472427	.5818	1.52343
BLACK_BROWN_XFIRSTGEN	8582	1.8%	473360	.0821	.84733
What was your average grade in high school?	5275	1.1%	476667	6.39	4.275
WHITE_XSEX	4808	1.0%	477134	.3077	1.42634
BLACK_BROWN_XSEX	4808	1.0%	477134	.1096	.96532
ED INST TYPE (UNIVERSITY =0 & 4 YEAR COLLEGE =1)	794	0.2%	481148	.5016	1.54678
RECODE of SEX (Your sex:)	0	0.0%	481942	.54	1.540
BLACK_BROWN_AUTISM	0	0.0%	481942	.0007	.07935
INSTITUTIONAL CONTROL	0	0.0%	481942	1.32	1.447

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 11:57 PM
From: Rick Marcantonio
Subject: Multiple Imputation

I can't see the table I want to see - the Variable Summary.

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 11:49 PM
From: Courtney B Francis
Subject: Multiple Imputation

Hi @Rick Marcantonio,

I ran the model again (less iterations and imputation for the sake of time), and I received the same situation. I still have missingness in my pooled dataset for ALL variables.

I used the same syntax, except at the bottom, instead of MISSINGSUMMARIES = NONE, I changed it to: /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0).

If you have any insight into what the issue is, please let me know!

Best,

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 10:10 PM
From: Rick Marcantonio
Subject: Multiple Imputation

Well, no, they'd be at the bottom of the list, since they have no missing data.

I am suggesting that you re-run your original syntax, just please change /MISSINGSUMMARIES NONE to /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0).

I'm trying to get some idea what the "missingness" looks like in these data.

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 09:14 PM
From: Courtney B Francis
Subject: Multiple Imputation

HI @Rick Marcantonio,

I realized a typo in my reply to your previous message. I meant to say: There were three variables in the box below labeled: "not imputed (no missing values)"
Are you suggesting I add /MISSINGSUMMARIES OVERALL VARIABLES (MAXVARS=100 MINPCTMISSING=0) to the overall syntax and rerun the model.

Those three variables do not appear at the top of the list in the variable summary table.

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 08:46 PM
From: Rick Marcantonio
Subject: Multiple Imputation

OK, maybe we've narrowed it down to those 3 variables that are not imputed due to missing values.

Try adding /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0)

Do those three variables appear at the top of the list in the Variable Summary Table?

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 04:11 PM
From: Rick Marcantonio
Subject: Multiple Imputation

So then, no values are being imputed at all?

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 04:04 PM
From: Courtney B Francis
Subject: Multiple Imputation

Hi @Rick Marcantonio,

Thank you for your response!

All of the variables listed, including the ones specified in /CONSTRAINTS as ROLE=IND, have missing data in all the imputed datasets, including the pooled dataset. The amount of missingness is the same across all imputed datasets.

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 01:27 PM
From: Rick Marcantonio
Subject: Multiple Imputation

Hi.

Do you mean apart from the variables specified in /CONSTRAINTS as ROLE=IND?

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 11:32 AM
From: Courtney B Francis
Subject: Multiple Imputation

Hello!

I have attempted to run Multiple Imputation on my dataset. After the imputation ran I still had missing data in the pooled data set.

Do you know what might have caused this? I'm not sure whether this is common or whether something in my syntax or data is causing it. Please see my syntax below:

SET THREADS = 4.
USE ALL.
FILTER OFF.
SORT CASES BY YEAR SUBJID.
EXECUTE.
SET SEED=20220913.
MULTIPLE IMPUTATION
  INSTDICO INSTCONT WHITE BLACK Brown Black_Brown_Total
  BLACK_BROWN_AUTISM MINORITIZED_POC RECODEAUTISM HSGPA
  COLLEGE_INVOLVEMENT HABITS_OF_MIND_GRP ACADEMIC_SELFCONCEPT_GRP
  SOCIAL_SELFCONCEPT_GRP DEGASPDICO SEX FIRSTGEN INCOME ACT_FINAL
  BLACK_BROWN_XINCOME BLACK_BROWN_XHSGPA BLACK_BROWN_XHOMG
  BLACK_BROWN_XCOLLINV BLACK_BROWN_XASCG BLACK_BROWN_XSSCG
  BLACK_BROWN_XACT BLACK_BROWN_XSEX BLACK_BROWN_XFIRSTGEN
  BLACK_BROWN_XDEGASP
  WHITE_XINCOME WHITE_XHSGPA WHITE_XHOMG WHITE_XCOLLINV
  WHITE_XASCG WHITE_XSSCG WHITE_XACT WHITE_XSEX
  WHITE_XFIRSTGEN WHITE_XDEGASP
  AUTISM_XINCOME AUTISM_XHSGPA AUTISM_XHOMG AUTISM_XCOLLINV
  AUTISM_XASCG AUTISM_XSSCG AUTISM_XACT AUTISM_XSEX
  AUTISM_XFIRSTGEN AUTISM_XDEGASP AUTISM_XBLACK_BROWN AUTISM_XWHITE
  RECODE_DISAB01 RECODE_DISAB02 RECODE_DISAB04
  RECODE_DISAB05 RECODE_DISAB06 RECODE_DISAB07
  /ANALYSISWEIGHT STUDWGT
  /IMPUTE METHOD=FCS MAXITER= 100 NIMPUTATIONS=10 SCALEMODEL=LINEAR INTERACTIONS=NONE
    SINGULAR=1E-012 MAXPCTMISSING=NONE MAXMODELPARAM =10000
  /CONSTRAINTS BLACK_BROWN_XINCOME (ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XHSGPA (ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XHOMG (ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XCOLLINV (ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XASCG (ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XSSCG (ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XACT (ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XSEX (ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XFIRSTGEN (ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XDEGASP (ROLE=IND)
  /CONSTRAINTS WHITE_XINCOME (ROLE=IND)
  /CONSTRAINTS WHITE_XHSGPA (ROLE=IND)
  /CONSTRAINTS WHITE_XHOMG (ROLE=IND)
  /CONSTRAINTS WHITE_XCOLLINV (ROLE=IND)
  /CONSTRAINTS WHITE_XASCG (ROLE=IND)
  /CONSTRAINTS WHITE_XSSCG (ROLE=IND)
  /CONSTRAINTS WHITE_XACT (ROLE=IND)
  /CONSTRAINTS WHITE_XSEX (ROLE=IND)
  /CONSTRAINTS WHITE_XFIRSTGEN (ROLE=IND)
  /CONSTRAINTS WHITE_XDEGASP (ROLE=IND)
  /CONSTRAINTS AUTISM_XINCOME (ROLE=IND)
  /CONSTRAINTS AUTISM_XHSGPA (ROLE=IND)
  /CONSTRAINTS AUTISM_XHOMG (ROLE=IND)
  /CONSTRAINTS AUTISM_XCOLLINV (ROLE=IND)
  /CONSTRAINTS AUTISM_XASCG (ROLE=IND)
  /CONSTRAINTS AUTISM_XSSCG (ROLE=IND)
  /CONSTRAINTS AUTISM_XACT (ROLE=IND)
  /CONSTRAINTS AUTISM_XSEX (ROLE=IND)
  /CONSTRAINTS AUTISM_XFIRSTGEN (ROLE=IND)
  /CONSTRAINTS AUTISM_XDEGASP (ROLE=IND)
  /CONSTRAINTS AUTISM_XBLACK_BROWN (ROLE=IND)
  /CONSTRAINTS AUTISM_XWHITE (ROLE=IND)
  /CONSTRAINTS DEGASP (RND=1 MIN=0 MAX=1)
  /MISSINGSUMMARIES NONE
  /IMPUTATIONSUMMARIES MODELS DESCRIPTIVES
  /OUTFILE IMPUTATIONS=courtney_syntax_9_14_22.sav FCSITERATIONS=iteration_history.

------------------------------
Courtney B Francis
------------------------------
#SPSSStatistics

21. RE: Multiple Imputation

Courtney B Francis

Posted Tue September 20, 2022 08:39 AM

Yes!

The majority are binary.

------------------------------
Courtney B Francis
------------------------------

Original Message:
Sent: Tue September 20, 2022 08:35 AM
From: Rick Marcantonio
Subject: Multiple Imputation

The original data, before imputation.

Also, it looks like a lot (the majority, perhaps) of these variables are binary (0/1). Is that true?

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Tue September 20, 2022 08:33 AM
From: Courtney B Francis
Subject: Multiple Imputation

Should I be running this syntax on the imputed data set? Or the original dataset?

Best,

------------------------------
Courtney B Francis

Original Message:
Sent: Tue September 20, 2022 12:38 AM
From: Rick Marcantonio
Subject: Multiple Imputation

OK, I'm just about out of bullets.

Try this. In a syntax window, paste this and run it:

COUNT num_missing=
  INSTDICO INSTCONT WHITE BLACK Brown Black_Brown_Total
  BLACK_BROWN_AUTISM MINORITIZED_POC RECODEAUTISM HSGPA
  COLLEGE_INVOLVEMENT HABITS_OF_MIND_GRP ACADEMIC_SELFCONCEPT_GRP
  SOCIAL_SELFCONCEPT_GRP DEGASPDICO SEX FIRSTGEN INCOME ACT_FINAL
  BLACK_BROWN_XINCOME BLACK_BROWN_XHSGPA BLACK_BROWN_XHOMG
  BLACK_BROWN_XCOLLINV BLACK_BROWN_XASCG BLACK_BROWN_XSSCG
  BLACK_BROWN_XACT BLACK_BROWN_XSEX BLACK_BROWN_XFIRSTGEN
  BLACK_BROWN_XDEGASP WHITE_XINCOME WHITE_XHSGPA WHITE_XHOMG
  WHITE_XCOLLINV WHITE_XASCG WHITE_XSSCG WHITE_XACT WHITE_XSEX
  WHITE_XFIRSTGEN WHITE_XDEGASP AUTISM_XINCOME AUTISM_XHSGPA
  AUTISM_XHOMG AUTISM_XCOLLINV AUTISM_XASCG AUTISM_XSSCG
  AUTISM_XACT AUTISM_XSEX AUTISM_XFIRSTGEN AUTISM_XDEGASP
  AUTISM_XBLACK_BROWN AUTISM_XWHITE RECODE_DISAB01
  RECODE_DISAB02 RECODE_DISAB04 RECODE_DISAB05 RECODE_DISAB06
  RECODE_DISAB07 (MISSING, SYSMIS).
FRE VAR num_missing.

I think I would also like to see the correlation matrix of these variables.
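A correlation matrix for a handful of these variables could be requested along the following lines. This is only a sketch: the four variables are picked from the list above purely for illustration, and pairwise deletion is an assumption about how missing values should be handled.

```spss
* Pairwise correlations for a small subset of the analysis variables.
CORRELATIONS
  /VARIABLES=HSGPA INCOME ACT_FINAL COLLEGE_INVOLVEMENT
  /PRINT=TWOTAIL NOSIG
  /MISSING=PAIRWISE.
```

Extending /VARIABLES to the full list would give the complete matrix, at the cost of a very large table.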

Maybe you could just send me the dataset.

marcantr@us.ibm.com

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Tue September 20, 2022 12:02 AM
From: Courtney B Francis
Subject: Multiple Imputation

I pasted it below:

Variable Summary
Variable	Missing N	Missing Percent	Valid N	Mean	Std. Deviation
ACT_FINAL	122581	25.4%	359361	24.7110	16.94311
WHITE_XACT	69418	14.4%	412524	13.2600	40.98339
RECODE of DISAB01_LEARNING DISABILITY	63094	13.1%	418848	.0438	.63238
RECODE of INCOME (Parental Income)	49476	10.3%	432466	1.97	2.875
RECODE of DISAB01_CHRONIC ILLNESS	47338	9.8%	434604	.0356	.57287
RECODE of DISAB01_PSYCHOLOGICAL DISORDER	46662	9.7%	435280	.0221	.45473
RECODE of DISAB01_OTHER DISABILITY	46533	9.7%	435409	.0745	.81162
RECODE of DISAB01_ADHD	45443	9.4%	436499	.0567	.71469
RECODE of DISAB01_LEARNING DISABILITY	45336	9.4%	436606	.0328	.55037
TFS Likelihood of College Involvement Score	44385	9.2%	437557	49.4252	25.04735
DEGREE ASPIRATIONS DICHOTOMOUS	42437	8.8%	439505	.7639	1.31083
BLACK_BROWN_XACT	41513	8.6%	440429	2.6916	22.11014
WHITE_XINCOME	36537	7.6%	445405	1.2194	3.98811
AUTISM_XACT	31898	6.6%	450044	.1061	5.15740
AUTISM_XINCOME	31462	6.5%	450480	.0103	.48015
AUTISM_XCOLLINV	31354	6.5%	450588	.2608	11.01939
AUTISM_XSSCG	31210	6.5%	450732	.0095	.41915
AUTISM_XASCG	31203	6.5%	450739	.0117	.50244
AUTISM_XHSGPA	31183	6.5%	450759	.0367	1.50114
AUTISM_XHOMG	31168	6.5%	450774	.0120	.50833
RECODE of AUTISM	31116	6.5%	450826	.0061	.23762
WHITE_XDEGASP	30886	6.4%	451056	.4121	1.51961
WHITE_XCOLLINV	30022	6.2%	451920	27.2186	76.92240
AUTISM_XDEGASP	25605	5.3%	456337	.0036	.18283
TFS Social Self-Concept Group	21697	4.5%	460245	1.92	2.366
TFS Academic Self-Concept Group	20658	4.3%	461284	1.94	2.226
AUTISM_XWHITE	18863	3.9%	463079	.0036	.18470
BLACK_BROWN_XCOLLINV	18023	3.7%	463919	8.5708	58.87801
BLACK_BROWN_XINCOME	17799	3.7%	464143	.2465	1.86394
WHITE_XSSCG	17458	3.6%	464484	1.0970	3.41879
WHITE_XASCG	16917	3.5%	465025	1.1271	3.43757
BLACK_BROWN_XDEGASP	16077	3.3%	465865	.1395	1.06979
AUTISM_XSEX	15996	3.3%	465946	.0018	.12976
RECODE of FIRSTGEN (First generation status based on parent(s) with less than 's	13686	2.8%	468256	.19	1.217
BLACK_BROWN_XSSCG	13200	2.7%	468742	.3558	2.54229
BLACK_BROWN_XASCG	12949	2.7%	468993	.3372	2.39447
WHITE_XHOMG	12777	2.7%	469165	1.1525	3.51353
WHITE_XHSGPA	11872	2.5%	470070	3.8132	10.48926
TFS Habits of Mind Group	11731	2.4%	470211	1.99	2.335
BLACK_BROWN_XHOMG	11189	2.3%	470753	.3650	2.56174
AUTISM_XFIRSTGEN	11000	2.3%	470942	.0008	.08888
BLACK_BROWN_XHSGPA	10521	2.2%	471421	1.0982	7.29505
AUTISM_XBLACK_BROWN	10239	2.1%	471703	.0007	.08021
WHITE_XFIRSTGEN	10102	2.1%	471840	.0649	.75961
This is the group of ASAIN & HISPAINIC & OTHER	9515	2.0%	472427	.1108	.96923
This is the Black and Brown combined variable	9515	2.0%	472427	.1909	1.21370
This is the Hispanic Race Code	9515	2.0%	472427	.1019	.93436
This is the first Recode of the Black Variable	9515	2.0%	472427	.0889	.87919
This is the White Race Code	9515	2.0%	472427	.5818	1.52343
BLACK_BROWN_XFIRSTGEN	8582	1.8%	473360	.0821	.84733
What was your average grade in high school?	5275	1.1%	476667	6.39	4.275
WHITE_XSEX	4808	1.0%	477134	.3077	1.42634
BLACK_BROWN_XSEX	4808	1.0%	477134	.1096	.96532
ED INST TYPE (UNIVERSITY =0 & 4 YEAR COLLEGE =1)	794	0.2%	481148	.5016	1.54678
RECODE of SEX (Your sex:)	0	0.0%	481942	.54	1.540
BLACK_BROWN_AUTISM	0	0.0%	481942	.0007	.07935
INSTITUTIONAL CONTROL	0	0.0%	481942	1.32	1.447

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 11:57 PM
From: Rick Marcantonio
Subject: Multiple Imputation

I can't see the table I want to see - the Variable Summary.

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 11:49 PM
From: Courtney B Francis
Subject: Multiple Imputation

Hi @Rick Marcantonio,

I ran the model again (with fewer iterations and imputations for the sake of time) and got the same result. I still have missingness in my pooled dataset for ALL variables.

I used the same syntax, except that at the bottom I changed /MISSINGSUMMARIES NONE to /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0).

If you have any insight into what the issue is, please let me know!

Best,

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 10:10 PM
From: Rick Marcantonio
Subject: Multiple Imputation

Well, no, they'd be at the bottom of the list, since they have no missing data.

I am suggesting that you re-run your original syntax, just please change /MISSINGSUMMARIES NONE to /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0).

I'm trying to get some idea what the "missingness" looks like in these data.

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 09:14 PM
From: Courtney B Francis
Subject: Multiple Imputation

Hi @Rick Marcantonio,

I noticed a typo in my reply to your previous message. I meant to say: there were three variables in the box below labeled "Not imputed (no missing values)".
Are you suggesting I add /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0) to the overall syntax and rerun the model?

Those three variables do not appear at the top of the list in the variable summary table.

------------------------------
Courtney B Francis


22. RE: Multiple Imputation

Rick Marcantonio

Posted Tue September 20, 2022 09:26 AM

Courtney;

I think I see what is happening. Cases that are completely missing (e.g., let's say that case #5 has no valid data at all for any of the analysis variables) are not imputed. Here is an example. Open a new syntax window and run it.

preserve.
*output close all.
set undefined=nowarn.
dataset close all.
new file.
DATA LIST FREE /id a b c d e f g studwgt.
begin data.
01 1 4 3 5 6 7 5 12
02 8 6 7 5 6 4 . 21
03 4 5 . 5 4 6 3 10
04 2 1 4 4 3 6 7 21
05 . . . . . . . 14
06 7 7 8 6 8 7 4 11
07 5 4 6 7 8 9 1 16
08 7 6 . 7 . 6 5 14
09 5 3 4 3 2 1 3 16
10 6 6 5 7 . 4 . 12
11 3 3 4 2 3 1 2 15
12 . . . . . . . 20
end data.
restore.
recode a b c e g (lo thru 5=0) (6 thru hi=1).
variable level a to g (scale).
dataset declare data.
MULTIPLE IMPUTATION a to g
/IMPUTE METHOD=FCS MAXITER= 100 NIMPUTATIONS=10 SCALEMODEL=LINEAR INTERACTIONS=NONE
SINGULAR=1E-012 MAXPCTMISSING=NONE MAXMODELPARAM =10000
/MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0)
/IMPUTATIONSUMMARIES MODELS DESCRIPTIVES
/ANALYSISWEIGHT STUDWGT
/OUTFILE IMPUTATIONS=data.

dataset activate data.
des var all.
***.

Go down to the DESCRIPTIVES output. You will see that the 2 cases with no data receive no imputed values.

I missed that in the manual but it is there:

"Cases that have a missing value for each analysis variable are included in analyses of missingness but are excluded from imputation. Specifically, values of such cases are not imputed and are excluded when building imputation models. The determination of which cases are completely missing is made after any variables are filtered out of the imputation model by the MAXPCTMISSING keyword."
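One way to see which cases fall under this rule is to count missing values per case before running MULTIPLE IMPUTATION. The following is a minimal sketch, assuming the original (non-imputed) data are active and the analysis variables a to g sit next to each other in the file, as in the example above. The names n_miss and all_missing are hypothetical and introduced here only for illustration.

```spss
* Count missing values across the 7 analysis variables for each case.
COMPUTE n_miss = NMISS(a TO g).
* Flag cases that have no valid value on any analysis variable.
COMPUTE all_missing = (n_miss = 7).
EXECUTE.
FREQUENCIES VARIABLES=all_missing.
* Optional step: drop the flagged cases, since they receive no imputed values anyway.
SELECT IF (all_missing = 0).
EXECUTE.
```

With the 12-case example data, cases 05 and 12 should be the ones flagged. For a longer variable list, the same idea applies with the full list and the number of variables in place of 7; the COUNT command earlier in the thread produces the equivalent per-case count.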

------------------------------
Rick Marcantonio
Quality Assurance
IBM
------------------------------


23. RE: Multiple Imputation

Courtney B Francis

Posted Tue September 20, 2022 03:42 PM

Hi again @Rick Marcantonio,

Based on what you shared from the manual, what is the solution? How do I go about checking for "cases that have a missing value for each analysis variable"? Would I sort the data to find all the missing values and then somehow remove those particular cases from my model?

OR

Are you saying the MAXPCTMISSING keyword will filter these cases out? If so, I'm not seeing where it does that.

Please let me know your thoughts.

Please forgive the delay in response!

------------------------------
Courtney B Francis
------------------------------

24. RE: Multiple Imputation

Rick Marcantonio

Posted Tue September 20, 2022 04:01 PM

I started suspecting this last night, which is why I asked you to create the COUNT variable. Then I was going to find those completely missing cases by sorting the data in descending order by that variable and looking at it.

That's an interesting exercise, but that's about all, unless you have good reason to believe that there was some kind of data entry error and those cases should have some observed data. But that's not a statistical question.

As for doing something, there really is nothing to "do." If a person gave no data at all for the variables in your imputation model, then they gave no data... that's that. The good news is that you have plenty of data that was complete and/or imputed; more than enough to draw some solid research conclusions. The "empty" cases are causing no harm by being there. Statistically, we cannot give any analysis degrees of freedom it does not deserve by (essentially) "making up" entire cases, no matter how well-intentioned we are in wanting to.
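
If you do want to look at those cases, here is a minimal sketch of the COUNT-and-sort check described above. It is written against the small a-to-g example from earlier in the thread, so the variable list (a TO g), the count of 7 analysis variables, and the new variable names num_missing and all_missing are assumptions for illustration; with your data you would substitute your own list of analysis variables and their number. Run it on the original data, before imputation.

```spss
* Count, per case, how many analysis variables are missing (system- or user-missing).
* a TO g assumes the 7 analysis variables are adjacent in the file.
COUNT num_missing = a TO g (MISSING).

* Flag cases that have no valid value on any of the 7 analysis variables.
COMPUTE all_missing = (num_missing = 7).
FREQUENCIES VARIABLES=num_missing all_missing.

* Optional: bring the emptiest cases to the top and list only the flagged ones.
SORT CASES BY num_missing (D).
TEMPORARY.
SELECT IF (all_missing = 1).
LIST VARIABLES=id num_missing.
```

In the small example this should flag the two cases entered with all periods (ids 5 and 12). The TEMPORARY command limits the SELECT IF to the LIST that follows it, so nothing is dropped from the dataset.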

------------------------------
Rick Marcantonio
Quality Assurance
IBM
------------------------------

Original Message

Original Message:
Sent: Tue September 20, 2022 03:41 PM
From: Courtney B Francis
Subject: Multiple Imputation

Hi again @Rick Marcantonio,

Based on what you shared from the manual, what is the solution? How do I go about examining for "cases that have a missing value for each analysis variable?" Would I use the sort by function to find all the missing values and then remove those particular cases somehow from my model?

OR

Are you saying the MAXPCTMISSING keyword will filter these cases out? If so I'm not seeing were it does that?

Please let me know your thoughts.

Please forgive the delay in response!

------------------------------
Courtney B Francis

Original Message:
Sent: Tue September 20, 2022 09:25 AM
From: Rick Marcantonio
Subject: Multiple Imputation

Courtney;

I think I see what is happening. Cases that are completely missing (e.g., let's say that case #5 has no valid data at all for any of the analysis variables) are not imputed. Here is an example. Open a new syntax window and run it.

preserve.
*output close all.
set undefined=nowarn.
dataset close all.
new file.
DATA LIST FREE /id a b c d e f g studwgt.
begin data.
01 1 4 3 5 6 7 5 12
02 8 6 7 5 6 4 . 21
03 4 5 . 5 4 6 3 10
04 2 1 4 4 3 6 7 21
05 . . . . . . . 14
06 7 7 8 6 8 7 4 11
07 5 4 6 7 8 9 1 16
08 7 6 . 7 . 6 5 14
09 5 3 4 3 2 1 3 16
10 6 6 5 7 . 4 . 12
11 3 3 4 2 3 1 2 15
12 . . . . . . . 20
end data.
restore.
recode a b c e g (lo thru 5=0) (6 thru hi=1).
variable level a to g (scale).
dataset declare data.
MULTIPLE IMPUTATION a to g
/IMPUTE METHOD=FCS MAXITER= 100 NIMPUTATIONS=10 SCALEMODEL=LINEAR INTERACTIONS=NONE
SINGULAR=1E-012 MAXPCTMISSING=NONE MAXMODELPARAM =10000
/MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0)
/IMPUTATIONSUMMARIES MODELS DESCRIPTIVES
/ANALYSISWEIGHT STUDWGT
/OUTFILE IMPUTATIONS=data.

dataset activate data.
des var all.
***.

Go down to the DESCRIPTIVES output. You will see that the 2 cases with no data receive no imputed values.

I missed that in the manual but it is there:

"Cases that have a missing value for each analysis variable are included in analyses of missingness but are excluded from imputation. Specifically, values of such cases are not imputed and are excluded when when building imputation models. The determination of which cases are completely missing is made after any variables are filtered out of the imputation model by the MAXPCTMISSING keyword."

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Tue September 20, 2022 08:39 AM
From: Courtney B Francis
Subject: Multiple Imputation

Yes!

majority is binary.

------------------------------
Courtney B Francis

Original Message:
Sent: Tue September 20, 2022 08:35 AM
From: Rick Marcantonio
Subject: Multiple Imputation

The original data, before imputation.

Also, it looks like a lot (the majority, perhaps) of these variables are binary (0/1). Is that true?

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Tue September 20, 2022 08:33 AM
From: Courtney B Francis
Subject: Multiple Imputation

Should I be running this syntax on the imputed data set? Or the original dataset?

Best,

------------------------------
Courtney B Francis

Original Message:
Sent: Tue September 20, 2022 12:38 AM
From: Rick Marcantonio
Subject: Multiple Imputation

OK, I'm just about out of bullets.

Try this. In a syntax window, paste this and run it:

COUNT num_missing= INSTDICO INSTCONT WHITE BLACK Brown Black_Brown_Total    BLACK_BROWN_AUTISM MINORITIZED_POC RECODEAUTISM HSGPA    COLLEGE_INVOLVEMENT HABITS_OF_MIND_GRP ACADEMIC_SELFCONCEPT_GRP    SOCIAL_SELFCONCEPT_GRP DEGASPDICO SEX FIRSTGEN INCOME ACT_FINAL    BLACK_BROWN_XINCOME BLACK_BROWN_XHSGPA BLACK_BROWN_XHOMG    BLACK_BROWN_XCOLLINV BLACK_BROWN_XASCG BLACK_BROWN_XSSCG    BLACK_BROWN_XACT BLACK_BROWN_XSEX BLACK_BROWN_XFIRSTGEN    BLACK_BROWN_XDEGASP WHITE_XINCOME WHITE_XHSGPA WHITE_XHOMG    WHITE_XCOLLINV WHITE_XASCG WHITE_XSSCG WHITE_XACT WHITE_XSEX    WHITE_XFIRSTGEN WHITE_XDEGASP AUTISM_XINCOME AUTISM_XHSGPA    AUTISM_XHOMG AUTISM_XCOLLINV AUTISM_XASCG AUTISM_XSSCG    AUTISM_XACT AUTISM_XSEX AUTISM_XFIRSTGEN AUTISM_XDEGASP    AUTISM_XBLACK_BROWN AUTISM_XWHITE RECODE_DISAB01    RECODE_DISAB02 RECODE_DISAB04 RECODE_DISAB05 RECODE_DISAB06    RECODE_DISAB07 (MISSING, SYSMIS).FRE VAR num_missing.

I think I would also like to see the correlation matrix of these variables.

Maybe you could just send me the dataset.

marcantr@us.ibm.com

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Tue September 20, 2022 12:02 AM
From: Courtney B Francis
Subject: Multiple Imputation

I pasted it below:

Variable Summary
	Missing	Valid N	Mean	Std. Deviation
	N	Percent	Mean	Std. Deviation
ACT_FINAL	122581	25.4%	359361	24.7110	16.94311
WHITE_XACT	69418	14.4%	412524	13.2600	40.98339
RECODE of DISAB01_LEARNING DISABILITY	63094	13.1%	418848	.0438	.63238
RECODE of INCOME (Parental Income)	49476	10.3%	432466	1.97	2.875
RECODE of DISAB01_CHRONIC ILLNESS	47338	9.8%	434604	.0356	.57287
RECODE of DISAB01_PSYCHOLOGICAL DISORDER	46662	9.7%	435280	.0221	.45473
RECODE of DISAB01_OTHER DISABILITY	46533	9.7%	435409	.0745	.81162
RECODE of DISAB01_ADHD	45443	9.4%	436499	.0567	.71469
RECODE of DISAB01_LEARNING DISABILITY	45336	9.4%	436606	.0328	.55037
TFS Likelihood of College Involvement Score	44385	9.2%	437557	49.4252	25.04735
DEGREE ASPIRATIONS DICHOTOMOUS	42437	8.8%	439505	.7639	1.31083
BLACK_BROWN_XACT	41513	8.6%	440429	2.6916	22.11014
WHITE_XINCOME	36537	7.6%	445405	1.2194	3.98811
AUTISM_XACT	31898	6.6%	450044	.1061	5.15740
AUTISM_XINCOME	31462	6.5%	450480	.0103	.48015
AUTISM_XCOLLINV	31354	6.5%	450588	.2608	11.01939
AUTISM_XSSCG	31210	6.5%	450732	.0095	.41915
AUTISM_XASCG	31203	6.5%	450739	.0117	.50244
AUTISM_XHSGPA	31183	6.5%	450759	.0367	1.50114
AUTISM_XHOMG	31168	6.5%	450774	.0120	.50833
RECODE of AUTISM	31116	6.5%	450826	.0061	.23762
WHITE_XDEGASP	30886	6.4%	451056	.4121	1.51961
WHITE_XCOLLINV	30022	6.2%	451920	27.2186	76.92240
AUTISM_XDEGASP	25605	5.3%	456337	.0036	.18283
TFS Social Self-Concept Group	21697	4.5%	460245	1.92	2.366
TFS Academic Self-Concept Group	20658	4.3%	461284	1.94	2.226
AUTISM_XWHITE	18863	3.9%	463079	.0036	.18470
BLACK_BROWN_XCOLLINV	18023	3.7%	463919	8.5708	58.87801
BLACK_BROWN_XINCOME	17799	3.7%	464143	.2465	1.86394
WHITE_XSSCG	17458	3.6%	464484	1.0970	3.41879
WHITE_XASCG	16917	3.5%	465025	1.1271	3.43757
BLACK_BROWN_XDEGASP	16077	3.3%	465865	.1395	1.06979
AUTISM_XSEX	15996	3.3%	465946	.0018	.12976
RECODE of FIRSTGEN (First generation status based on parent(s) with less than 's	13686	2.8%	468256	.19	1.217
BLACK_BROWN_XSSCG	13200	2.7%	468742	.3558	2.54229
BLACK_BROWN_XASCG	12949	2.7%	468993	.3372	2.39447
WHITE_XHOMG	12777	2.7%	469165	1.1525	3.51353
WHITE_XHSGPA	11872	2.5%	470070	3.8132	10.48926
TFS Habits of Mind Group	11731	2.4%	470211	1.99	2.335
BLACK_BROWN_XHOMG	11189	2.3%	470753	.3650	2.56174
AUTISM_XFIRSTGEN	11000	2.3%	470942	.0008	.08888
BLACK_BROWN_XHSGPA	10521	2.2%	471421	1.0982	7.29505
AUTISM_XBLACK_BROWN	10239	2.1%	471703	.0007	.08021
WHITE_XFIRSTGEN	10102	2.1%	471840	.0649	.75961
This is the group of ASAIN & HISPAINIC & OTHER	9515	2.0%	472427	.1108	.96923
This is the Black and Brown combined variable	9515	2.0%	472427	.1909	1.21370
This is the Hispanic Race Code	9515	2.0%	472427	.1019	.93436
This is the first Recode of the Black Variable	9515	2.0%	472427	.0889	.87919
This is the White Race Code	9515	2.0%	472427	.5818	1.52343
BLACK_BROWN_XFIRSTGEN	8582	1.8%	473360	.0821	.84733
What was your average grade in high school?	5275	1.1%	476667	6.39	4.275
WHITE_XSEX	4808	1.0%	477134	.3077	1.42634
BLACK_BROWN_XSEX	4808	1.0%	477134	.1096	.96532
ED INST TYPE (UNIVERSITY =0 & 4 YEAR COLLEGE =1)	794	0.2%	481148	.5016	1.54678
RECODE of SEX (Your sex:)	0	0.0%	481942	.54	1.540
BLACK_BROWN_AUTISM	0	0.0%	481942	.0007	.07935
INSTITUTIONAL CONTROL	0	0.0%	481942	1.32	1.447

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 11:57 PM
From: Rick Marcantonio
Subject: Multiple Imputation

I can't see the table I want to see - the Variable Summary.

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 11:49 PM
From: Courtney B Francis
Subject: Multiple Imputation

Hi @Rick Marcantonio,

I ran the model again (less iterations and imputation for the sake of time), and I received the same situation. I still have missingness in my pooled dataset for ALL variables.

I used the same syntax, except at the bottom, instead of MISSINGSUMMARIES = NONE, I changed it to: /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0).

If you have any insight into what the issue is, please let me know!

Best,

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 10:10 PM
From: Rick Marcantonio
Subject: Multiple Imputation

Well, no, they'd be at the bottom of the list, since they have no missing data.

I am suggesting that you re-run your original syntax, just please change /MISSINGSUMMARIES NONE to /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0).

I'm trying to get some idea what the "missingness" looks like in these data.

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 09:14 PM
From: Courtney B Francis
Subject: Multiple Imputation

HI @Rick Marcantonio,

I realized a typo in my reply to your previous message. I meant to say: There were three variables in the box below labeled: "not imputed (no missing values)"
Are you suggesting I add /MISSINGSUMMARIES OVERALL VARIABLES (MAXVARS=100 MINPCTMISSING=0) to the overall syntax and rerun the model.

Those three variables do not appear at the top of the list in the variable summary table.

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 08:46 PM
From: Rick Marcantonio
Subject: Multiple Imputation

OK, maybe we've narrowed it down to those 3 variables that are not imputed due to missing values.

Try adding /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0)

Do those three variables appear at the top of the list in the Variable Summary Table?

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 08:17 PM
From: Courtney B Francis
Subject: Multiple Imputation

Hi Rick!

That section was blank just as it appears in your image. There were three variables that were "not imputed" due to do missing values" Bur the "Not Imputed (too Many Missing Values) was blank as yours appears above.

Best,
Courtney B. Francis

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 05:12 PM
From: Rick Marcantonio
Subject: Multiple Imputation

For example, Courtney, this table. What do you have, for "Not imputed"?

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 05:03 PM
From: Jon Peck
Subject: Multiple Imputation

But did the tables that the MI procedure produces show that some variables/values could not be imputed?

--

Jon K Peck
jkpeck@gmail.com


25. RE: Multiple Imputation

Courtney B Francis

Posted Tue September 20, 2022 04:49 PM

Okay, that makes sense, @Rick Marcantonio.

So the "missing" data I'm seeing in the pooled cases is just going to "be there"? I guess I've never seen that in all my readings, so it has me a bit alarmed.

------------------------------
Courtney B Francis
------------------------------


26. RE: Multiple Imputation

Rick Marcantonio

Posted Tue September 20, 2022 04:58 PM

Don't let it alarm you. You did impute missing data where it could be imputed. It isn't like you did nothing. You did quite a bit!

Missing CASES (cases with no valid value on any of the analysis variables) are a different story. Those can safely be ignored.
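
For anyone who wants to confirm that what remains is only those wholly empty cases, here is a minimal sketch. It reuses the toy variables a to g from the small example earlier in this thread; n_valid and empty_case are hypothetical names introduced here, so substitute your own list of analysis variables.

```spss
* Number of valid (non-missing) values per case across the analysis variables.
COMPUTE n_valid = NVALID(a TO g).
* Flag cases with no valid value on any analysis variable (1 = empty case).
COMPUTE empty_case = (n_valid = 0).
EXECUTE.
* Tabulate how many cases are flagged as empty.
FREQUENCIES VARIABLES=empty_case.
```

Run on an imputed dataset, and assuming nothing else kept a value from being imputed, the number of cases with empty_case = 1 should match the missing count that each imputed variable still shows.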

------------------------------
Rick Marcantonio
Quality Assurance
IBM
------------------------------

Original Message

Original Message:
Sent: Tue September 20, 2022 04:48 PM
From: Courtney B Francis
Subject: Multiple Imputation

Okay,

that makes sense @Rick Marcantonio

So the "missing" data I'm seeing in the pooled cases is just going to "be there"? I guess I've never seen that in all my readings, so it has me a bit alarmed.

------------------------------
Courtney B Francis

Original Message:
Sent: Tue September 20, 2022 04:00 PM
From: Rick Marcantonio
Subject: Multiple Imputation

I started suspecting this last night, which is why I asked you to create the COUNT variable. Then I was going to find those completely missing cases by sorting the data in descending order by that variable and looking at it.

That's an interesting exercise, but that's about all - unless you have good reason to believe that there were some kind of data entry errors and those cases should have some observed data. But that's not a statistical question.

As for doing something, there really is nothing to "do." If a person gave no data at all for the variables in your imputation model, then they gave no data... that's that. The good news is that you have plenty of data that was complete and/or imputed; more than enough to draw some solid research conclusions. The "empty" cases are causing no harm by being there. Statistically, we cannot give any analysis degrees of freedom it does not deserve by (essentially) "making up" entire cases, no matter how well-intentioned we are in wanting to.

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Tue September 20, 2022 03:41 PM
From: Courtney B Francis
Subject: Multiple Imputation

Hi again @Rick Marcantonio,

Based on what you shared from the manual, what is the solution? How do I go about examining for "cases that have a missing value for each analysis variable?" Would I use the sort by function to find all the missing values and then remove those particular cases somehow from my model?

OR

Are you saying the MAXPCTMISSING keyword will filter these cases out? If so I'm not seeing were it does that?

Please let me know your thoughts.

Please forgive the delay in response!

------------------------------
Courtney B Francis

Original Message:
Sent: Tue September 20, 2022 09:25 AM
From: Rick Marcantonio
Subject: Multiple Imputation

Courtney;

I think I see what is happening. Cases that are completely missing (e.g., let's say that case #5 has no valid data at all for any of the analysis variables) are not imputed. Here is an example. Open a new syntax window and run it.

preserve.
*output close all.
set undefined=nowarn.
dataset close all.
new file.
DATA LIST FREE /id a b c d e f g studwgt.
begin data.
01 1 4 3 5 6 7 5 12
02 8 6 7 5 6 4 . 21
03 4 5 . 5 4 6 3 10
04 2 1 4 4 3 6 7 21
05 . . . . . . . 14
06 7 7 8 6 8 7 4 11
07 5 4 6 7 8 9 1 16
08 7 6 . 7 . 6 5 14
09 5 3 4 3 2 1 3 16
10 6 6 5 7 . 4 . 12
11 3 3 4 2 3 1 2 15
12 . . . . . . . 20
end data.
restore.
recode a b c e g (lo thru 5=0) (6 thru hi=1).
variable level a to g (scale).
dataset declare data.
MULTIPLE IMPUTATION a to g
/IMPUTE METHOD=FCS MAXITER= 100 NIMPUTATIONS=10 SCALEMODEL=LINEAR INTERACTIONS=NONE
SINGULAR=1E-012 MAXPCTMISSING=NONE MAXMODELPARAM =10000
/MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0)
/IMPUTATIONSUMMARIES MODELS DESCRIPTIVES
/ANALYSISWEIGHT STUDWGT
/OUTFILE IMPUTATIONS=data.

dataset activate data.
des var all.
***.

Go down to the DESCRIPTIVES output. You will see that the 2 cases with no data receive no imputed values.

I missed that in the manual but it is there:

"Cases that have a missing value for each analysis variable are included in analyses of missingness but are excluded from imputation. Specifically, values of such cases are not imputed and are excluded when when building imputation models. The determination of which cases are completely missing is made after any variables are filtered out of the imputation model by the MAXPCTMISSING keyword."

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Tue September 20, 2022 08:39 AM
From: Courtney B Francis
Subject: Multiple Imputation

Yes!

majority is binary.

------------------------------
Courtney B Francis

Original Message:
Sent: Tue September 20, 2022 08:35 AM
From: Rick Marcantonio
Subject: Multiple Imputation

The original data, before imputation.

Also, it looks like a lot (the majority, perhaps) of these variables are binary (0/1). Is that true?

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Tue September 20, 2022 08:33 AM
From: Courtney B Francis
Subject: Multiple Imputation

Should I be running this syntax on the imputed data set? Or the original dataset?

Best,

------------------------------
Courtney B Francis

Original Message:
Sent: Tue September 20, 2022 12:38 AM
From: Rick Marcantonio
Subject: Multiple Imputation

OK, I'm just about out of bullets.

Try this. In a syntax window, paste this and run it:

COUNT num_missing= INSTDICO INSTCONT WHITE BLACK Brown Black_Brown_Total    BLACK_BROWN_AUTISM MINORITIZED_POC RECODEAUTISM HSGPA    COLLEGE_INVOLVEMENT HABITS_OF_MIND_GRP ACADEMIC_SELFCONCEPT_GRP    SOCIAL_SELFCONCEPT_GRP DEGASPDICO SEX FIRSTGEN INCOME ACT_FINAL    BLACK_BROWN_XINCOME BLACK_BROWN_XHSGPA BLACK_BROWN_XHOMG    BLACK_BROWN_XCOLLINV BLACK_BROWN_XASCG BLACK_BROWN_XSSCG    BLACK_BROWN_XACT BLACK_BROWN_XSEX BLACK_BROWN_XFIRSTGEN    BLACK_BROWN_XDEGASP WHITE_XINCOME WHITE_XHSGPA WHITE_XHOMG    WHITE_XCOLLINV WHITE_XASCG WHITE_XSSCG WHITE_XACT WHITE_XSEX    WHITE_XFIRSTGEN WHITE_XDEGASP AUTISM_XINCOME AUTISM_XHSGPA    AUTISM_XHOMG AUTISM_XCOLLINV AUTISM_XASCG AUTISM_XSSCG    AUTISM_XACT AUTISM_XSEX AUTISM_XFIRSTGEN AUTISM_XDEGASP    AUTISM_XBLACK_BROWN AUTISM_XWHITE RECODE_DISAB01    RECODE_DISAB02 RECODE_DISAB04 RECODE_DISAB05 RECODE_DISAB06    RECODE_DISAB07 (MISSING, SYSMIS).FRE VAR num_missing.

I think I would also like to see the correlation matrix of these variables.

Maybe you could just send me the dataset.

marcantr@us.ibm.com

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Tue September 20, 2022 12:02 AM
From: Courtney B Francis
Subject: Multiple Imputation

I pasted it below:

Variable Summary
	Missing	Valid N	Mean	Std. Deviation
	N	Percent	Mean	Std. Deviation
ACT_FINAL	122581	25.4%	359361	24.7110	16.94311
WHITE_XACT	69418	14.4%	412524	13.2600	40.98339
RECODE of DISAB01_LEARNING DISABILITY	63094	13.1%	418848	.0438	.63238
RECODE of INCOME (Parental Income)	49476	10.3%	432466	1.97	2.875
RECODE of DISAB01_CHRONIC ILLNESS	47338	9.8%	434604	.0356	.57287
RECODE of DISAB01_PSYCHOLOGICAL DISORDER	46662	9.7%	435280	.0221	.45473
RECODE of DISAB01_OTHER DISABILITY	46533	9.7%	435409	.0745	.81162
RECODE of DISAB01_ADHD	45443	9.4%	436499	.0567	.71469
RECODE of DISAB01_LEARNING DISABILITY	45336	9.4%	436606	.0328	.55037
TFS Likelihood of College Involvement Score	44385	9.2%	437557	49.4252	25.04735
DEGREE ASPIRATIONS DICHOTOMOUS	42437	8.8%	439505	.7639	1.31083
BLACK_BROWN_XACT	41513	8.6%	440429	2.6916	22.11014
WHITE_XINCOME	36537	7.6%	445405	1.2194	3.98811
AUTISM_XACT	31898	6.6%	450044	.1061	5.15740
AUTISM_XINCOME	31462	6.5%	450480	.0103	.48015
AUTISM_XCOLLINV	31354	6.5%	450588	.2608	11.01939
AUTISM_XSSCG	31210	6.5%	450732	.0095	.41915
AUTISM_XASCG	31203	6.5%	450739	.0117	.50244
AUTISM_XHSGPA	31183	6.5%	450759	.0367	1.50114
AUTISM_XHOMG	31168	6.5%	450774	.0120	.50833
RECODE of AUTISM	31116	6.5%	450826	.0061	.23762
WHITE_XDEGASP	30886	6.4%	451056	.4121	1.51961
WHITE_XCOLLINV	30022	6.2%	451920	27.2186	76.92240
AUTISM_XDEGASP	25605	5.3%	456337	.0036	.18283
TFS Social Self-Concept Group	21697	4.5%	460245	1.92	2.366
TFS Academic Self-Concept Group	20658	4.3%	461284	1.94	2.226
AUTISM_XWHITE	18863	3.9%	463079	.0036	.18470
BLACK_BROWN_XCOLLINV	18023	3.7%	463919	8.5708	58.87801
BLACK_BROWN_XINCOME	17799	3.7%	464143	.2465	1.86394
WHITE_XSSCG	17458	3.6%	464484	1.0970	3.41879
WHITE_XASCG	16917	3.5%	465025	1.1271	3.43757
BLACK_BROWN_XDEGASP	16077	3.3%	465865	.1395	1.06979
AUTISM_XSEX	15996	3.3%	465946	.0018	.12976
RECODE of FIRSTGEN (First generation status based on parent(s) with less than 's	13686	2.8%	468256	.19	1.217
BLACK_BROWN_XSSCG	13200	2.7%	468742	.3558	2.54229
BLACK_BROWN_XASCG	12949	2.7%	468993	.3372	2.39447
WHITE_XHOMG	12777	2.7%	469165	1.1525	3.51353
WHITE_XHSGPA	11872	2.5%	470070	3.8132	10.48926
TFS Habits of Mind Group	11731	2.4%	470211	1.99	2.335
BLACK_BROWN_XHOMG	11189	2.3%	470753	.3650	2.56174
AUTISM_XFIRSTGEN	11000	2.3%	470942	.0008	.08888
BLACK_BROWN_XHSGPA	10521	2.2%	471421	1.0982	7.29505
AUTISM_XBLACK_BROWN	10239	2.1%	471703	.0007	.08021
WHITE_XFIRSTGEN	10102	2.1%	471840	.0649	.75961
This is the group of ASAIN & HISPAINIC & OTHER	9515	2.0%	472427	.1108	.96923
This is the Black and Brown combined variable	9515	2.0%	472427	.1909	1.21370
This is the Hispanic Race Code	9515	2.0%	472427	.1019	.93436
This is the first Recode of the Black Variable	9515	2.0%	472427	.0889	.87919
This is the White Race Code	9515	2.0%	472427	.5818	1.52343
BLACK_BROWN_XFIRSTGEN	8582	1.8%	473360	.0821	.84733
What was your average grade in high school?	5275	1.1%	476667	6.39	4.275
WHITE_XSEX	4808	1.0%	477134	.3077	1.42634
BLACK_BROWN_XSEX	4808	1.0%	477134	.1096	.96532
ED INST TYPE (UNIVERSITY =0 & 4 YEAR COLLEGE =1)	794	0.2%	481148	.5016	1.54678
RECODE of SEX (Your sex:)	0	0.0%	481942	.54	1.540
BLACK_BROWN_AUTISM	0	0.0%	481942	.0007	.07935
INSTITUTIONAL CONTROL	0	0.0%	481942	1.32	1.447

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 11:57 PM
From: Rick Marcantonio
Subject: Multiple Imputation

I can't see the table I want to see - the Variable Summary.

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 11:49 PM
From: Courtney B Francis
Subject: Multiple Imputation

Hi @Rick Marcantonio,

I ran the model again (less iterations and imputation for the sake of time), and I received the same situation. I still have missingness in my pooled dataset for ALL variables.

I used the same syntax, except at the bottom, instead of MISSINGSUMMARIES = NONE, I changed it to: /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0).

If you have any insight into what the issue is, please let me know!

Best,

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 10:10 PM
From: Rick Marcantonio
Subject: Multiple Imputation

Well, no, they'd be at the bottom of the list, since they have no missing data.

I am suggesting that you re-run your original syntax, just please change /MISSINGSUMMARIES NONE to /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0).

I'm trying to get some idea what the "missingness" looks like in these data.

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 09:14 PM
From: Courtney B Francis
Subject: Multiple Imputation

Hi @Rick Marcantonio,

I realized there was a typo in my reply to your previous message. I meant to say: there were three variables in the box below labeled "Not imputed (no missing values)."
Are you suggesting I add /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0) to the overall syntax and rerun the model?

Those three variables do not appear at the top of the list in the variable summary table.

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 08:46 PM
From: Rick Marcantonio
Subject: Multiple Imputation

OK, maybe we've narrowed it down to those 3 variables that are not imputed due to missing values.

Try adding /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0)

Do those three variables appear at the top of the list in the Variable Summary Table?

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 08:17 PM
From: Courtney B Francis
Subject: Multiple Imputation

Hi Rick!

That section was blank, just as it appears in your image. There were three variables that were "not imputed due to missing values," but the "Not Imputed (Too Many Missing Values)" section was blank, as yours appears above.

Best,
Courtney B. Francis

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 05:12 PM
From: Rick Marcantonio
Subject: Multiple Imputation

For example, Courtney, this table. What do you have, for "Not imputed"?

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 05:03 PM
From: Jon Peck
Subject: Multiple Imputation

But did the tables that the MI procedure produces show that some variables/values could not be imputed?

--

Jon K Peck
jkpeck@gmail.com

Original Message:
Sent: 9/19/2022 4:46:00 PM
From: Rick Marcantonio
Subject: RE: Multiple Imputation

Yes, I understand what you mean.

I'm curious if the student weight variable (STUDWGT) has any 0 or missing values...
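A rough way to check the weight question, sketched in Python rather than SPSS syntax. Only STUDWGT comes from this thread; the case records and the helper name `flag_unusable_weights` are made up for illustration. The sketch assumes that a missing, zero, or negative analysis weight makes a case unusable, which is my reading of the MULTIPLE IMPUTATION documentation and worth verifying for your version.

```python
import math

def flag_unusable_weights(cases, weight_key="STUDWGT"):
    """Return the ids of cases whose weight is missing, zero, or negative."""
    flagged = []
    for case in cases:
        w = case.get(weight_key)
        is_missing = w is None or (isinstance(w, float) and math.isnan(w))
        if is_missing or w <= 0:
            flagged.append(case["id"])
    return flagged

# Hypothetical records; only the variable name STUDWGT is from the thread.
cases = [
    {"id": 1, "STUDWGT": 12.0},
    {"id": 2, "STUDWGT": 0.0},
    {"id": 3, "STUDWGT": None},
    {"id": 4, "STUDWGT": -1.5},
    {"id": 5, "STUDWGT": 21.0},
]
print(flag_unusable_weights(cases))
```

If the returned list is empty, the weight variable is probably not the explanation.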

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 04:16 PM
From: Courtney B Francis
Subject: Multiple Imputation

Hi @Rick Marcantonio!

They are being imputed (the number of cases has increased considerably), and the amount of missingness has decreased from the original dataset. But it was my understanding that once all the variables of interest were imputed, there should no longer be any missingness, especially in the pooled dataset.
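One thing that may help when reading the output file: as far as I know, the file written by /OUTFILE IMPUTATIONS is stacked, with the original cases stored under Imputation_ = 0 and each completed dataset under Imputation_ = 1, 2, and so on, which would explain the larger case count. Counting missing values separately for each Imputation_ value shows where the missingness actually sits. A minimal Python sketch with made-up rows; the function name `missing_by_imputation` is hypothetical:

```python
from collections import defaultdict

def missing_by_imputation(rows, variables, split_key="Imputation_"):
    """Count missing (None) values for each imputation number."""
    counts = defaultdict(int)
    for row in rows:
        counts[row[split_key]] += sum(row[v] is None for v in variables)
    return dict(counts)

# Made-up stacked file: Imputation_ 0 is assumed to hold the original data.
rows = [
    {"Imputation_": 0, "id": 1, "a": 1, "b": None},
    {"Imputation_": 0, "id": 2, "a": None, "b": None},
    {"Imputation_": 1, "id": 1, "a": 1, "b": 3},
    {"Imputation_": 1, "id": 2, "a": None, "b": None},  # stays missing
    {"Imputation_": 2, "id": 1, "a": 1, "b": 4},
    {"Imputation_": 2, "id": 2, "a": None, "b": None},  # stays missing
]
print(missing_by_imputation(rows, ["a", "b"]))
```

A count that is identical across Imputation_ 1, 2, ... but lower than the count for Imputation_ 0 suggests that some values were imputed and a fixed set of cells was not.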

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 04:11 PM
From: Rick Marcantonio
Subject: Multiple Imputation

So then, no values are being imputed at all?

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 04:04 PM
From: Courtney B Francis
Subject: Multiple Imputation

Hi @Rick Marcantonio,

Thank you for your response!

All of the variables listed, including the ones specified in /CONSTRAINTS as ROLE=IND, have missing data in all the imputed datasets, including the pooled dataset. The amount of missingness is also the same across all imputed datasets.

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 01:27 PM
From: Rick Marcantonio
Subject: Multiple Imputation

Hi.

Do you mean apart from the variables specified in /CONSTRAINTS as ROLE=IND?

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 11:32 AM
From: Courtney B Francis
Subject: Multiple Imputation

Hello!

I have attempted to run Multiple Imputation on my dataset. After the imputation ran, I still had missing data in the pooled dataset.

Do you know what might have caused this? I'm not sure whether this is common or whether something in my syntax or data is creating the problem. Please see my syntax below:

SET THREADS = 4.
USE ALL.
FILTER OFF.
SORT CASES BY YEAR SUBJID.
EXECUTE.
SET SEED=20220913.
MULTIPLE IMPUTATION
  INSTDICO
  INSTCONT
  WHITE
  BLACK
  Brown
  Black_Brown_Total
  BLACK_BROWN_AUTISM
  MINORITIZED_POC
  RECODEAUTISM
  HSGPA
  COLLEGE_INVOLVEMENT
  HABITS_OF_MIND_GRP
  ACADEMIC_SELFCONCEPT_GRP
  SOCIAL_SELFCONCEPT_GRP
  DEGASPDICO
  SEX
  FIRSTGEN
  INCOME
  ACT_FINAL
  BLACK_BROWN_XINCOME
  BLACK_BROWN_XHSGPA
  BLACK_BROWN_XHOMG
  BLACK_BROWN_XCOLLINV
  BLACK_BROWN_XASCG
  BLACK_BROWN_XSSCG
  BLACK_BROWN_XACT
  BLACK_BROWN_XSEX
  BLACK_BROWN_XFIRSTGEN
  BLACK_BROWN_XDEGASP
  WHITE_XINCOME
  WHITE_XHSGPA
  WHITE_XHOMG
  WHITE_XCOLLINV
  WHITE_XASCG
  WHITE_XSSCG
  WHITE_XACT
  WHITE_XSEX
  WHITE_XFIRSTGEN
  WHITE_XDEGASP
  AUTISM_XINCOME
  AUTISM_XHSGPA
  AUTISM_XHOMG
  AUTISM_XCOLLINV
  AUTISM_XASCG
  AUTISM_XSSCG
  AUTISM_XACT
  AUTISM_XSEX
  AUTISM_XFIRSTGEN
  AUTISM_XDEGASP
  AUTISM_XBLACK_BROWN
  AUTISM_XWHITE
  RECODE_DISAB01
  RECODE_DISAB02
  RECODE_DISAB04
  RECODE_DISAB05
  RECODE_DISAB06
  RECODE_DISAB07
  /ANALYSISWEIGHT STUDWGT
  /IMPUTE METHOD=FCS MAXITER= 100 NIMPUTATIONS=10 SCALEMODEL=LINEAR INTERACTIONS=NONE
  SINGULAR=1E-012 MAXPCTMISSING=NONE MAXMODELPARAM =10000
  /CONSTRAINTS BLACK_BROWN_XINCOME( ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XHSGPA( ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XHOMG( ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XCOLLINV( ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XASCG( ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XSSCG( ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XACT( ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XSEX( ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XFIRSTGEN( ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XDEGASP( ROLE=IND)
  /CONSTRAINTS WHITE_XINCOME( ROLE=IND)
  /CONSTRAINTS WHITE_XHSGPA( ROLE=IND)
  /CONSTRAINTS WHITE_XHOMG( ROLE=IND)
  /CONSTRAINTS WHITE_XCOLLINV( ROLE=IND)
  /CONSTRAINTS WHITE_XASCG( ROLE=IND)
  /CONSTRAINTS WHITE_XSSCG( ROLE=IND)
  /CONSTRAINTS WHITE_XACT( ROLE=IND)
  /CONSTRAINTS WHITE_XSEX( ROLE=IND)
  /CONSTRAINTS WHITE_XFIRSTGEN( ROLE=IND)
  /CONSTRAINTS WHITE_XDEGASP ( ROLE=IND)
  /CONSTRAINTS AUTISM_XINCOME( ROLE=IND)
  /CONSTRAINTS AUTISM_XHSGPA( ROLE=IND)
  /CONSTRAINTS AUTISM_XHOMG( ROLE=IND)
  /CONSTRAINTS AUTISM_XCOLLINV( ROLE=IND)
  /CONSTRAINTS AUTISM_XASCG( ROLE=IND)
  /CONSTRAINTS AUTISM_XSSCG( ROLE=IND)
  /CONSTRAINTS AUTISM_XACT( ROLE=IND)
  /CONSTRAINTS AUTISM_XSEX( ROLE=IND)
  /CONSTRAINTS AUTISM_XFIRSTGEN( ROLE=IND)
  /CONSTRAINTS AUTISM_XDEGASP ( ROLE=IND)
  /CONSTRAINTS AUTISM_XBLACK_BROWN( ROLE=IND)
  /CONSTRAINTS AUTISM_XWHITE( ROLE=IND)
  /CONSTRAINTS DEGASP (RND=1 MIN=0 MAX=1)
  /MISSINGSUMMARIES NONE
  /IMPUTATIONSUMMARIES MODELS DESCRIPTIVES
  /OUTFILE IMPUTATIONS=courtney_syntax_9_14_22.sav FCSITERATIONS=iteration_history.

------------------------------
Courtney B Francis
------------------------------
#SPSSStatistics

27. RE: Multiple Imputation


IBM Champion

Jon Peck

Posted Tue September 20, 2022 05:24 PM

It does matter, though, if those cases were systematically missing in a way that relates to the variables of interest. Obviously, though, you would have to infer that from external facts and empty cases don't talk - even under torture.

--

Jon K Peck
jkpeck@gmail.com

Original Message

Original Message:
Sent: 9/20/2022 4:58:00 PM
From: Rick Marcantonio
Subject: RE: Multiple Imputation

Don't let it alarm you. You did impute missing data where it could be imputed. It isn't like you did nothing. You did quite a bit!

Missing CASES are a different story. Those can safely be ignored.

------------------------------
Rick Marcantonio
Quality Assurance
IBM
------------------------------

Original Message:
Sent: Tue September 20, 2022 04:48 PM
From: Courtney B Francis
Subject: Multiple Imputation

Okay,

that makes sense @Rick Marcantonio

So the "missing" data I'm seeing in the pooled cases is just going to "be there"? I guess I've never seen that in all my readings, so it has me a bit alarmed.

------------------------------
Courtney B Francis

Original Message:
Sent: Tue September 20, 2022 04:00 PM
From: Rick Marcantonio
Subject: Multiple Imputation

I started suspecting this last night, which is why I asked you to create the COUNT variable. Then I was going to find those completely missing cases by sorting the data in descending order by that variable and looking at it.

That's an interesting exercise, but that's about all, unless you have good reason to believe that there was some kind of data entry error and those cases should have some observed data. But that's not a statistical question.

As for doing something, there really is nothing to "do." If a person gave no data at all for the variables in your imputation model, then they gave no data... that's that. The good news is that you have plenty of data that was complete and/or imputed; more than enough to draw some solid research conclusions. The "empty" cases are causing no harm by being there. Statistically, we cannot give any analysis degrees of freedom it does not deserve by (essentially) "making up" entire cases, no matter how well-intentioned we are in wanting to.

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Tue September 20, 2022 03:41 PM
From: Courtney B Francis
Subject: Multiple Imputation

Hi again @Rick Marcantonio,

Based on what you shared from the manual, what is the solution? How do I go about checking for "cases that have a missing value for each analysis variable"? Would I use the sort function to find all the missing values and then somehow remove those particular cases from my model?

OR

Are you saying the MAXPCTMISSING keyword will filter these cases out? If so, I'm not seeing where it does that.

Please let me know your thoughts.

Please forgive the delay in response!

------------------------------
Courtney B Francis

Original Message:
Sent: Tue September 20, 2022 09:25 AM
From: Rick Marcantonio
Subject: Multiple Imputation

Courtney;

I think I see what is happening. Cases that are completely missing (e.g., let's say that case #5 has no valid data at all for any of the analysis variables) are not imputed. Here is an example. Open a new syntax window and run it.

preserve.
*output close all.
set undefined=nowarn.
dataset close all.
new file.
DATA LIST FREE /id a b c d e f g studwgt.
begin data.
01 1 4 3 5 6 7 5 12
02 8 6 7 5 6 4 . 21
03 4 5 . 5 4 6 3 10
04 2 1 4 4 3 6 7 21
05 . . . . . . . 14
06 7 7 8 6 8 7 4 11
07 5 4 6 7 8 9 1 16
08 7 6 . 7 . 6 5 14
09 5 3 4 3 2 1 3 16
10 6 6 5 7 . 4 . 12
11 3 3 4 2 3 1 2 15
12 . . . . . . . 20
end data.
restore.
recode a b c e g (lo thru 5=0) (6 thru hi=1).
variable level a to g (scale).
dataset declare data.
MULTIPLE IMPUTATION a to g
/IMPUTE METHOD=FCS MAXITER= 100 NIMPUTATIONS=10 SCALEMODEL=LINEAR INTERACTIONS=NONE
SINGULAR=1E-012 MAXPCTMISSING=NONE MAXMODELPARAM =10000
/MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0)
/IMPUTATIONSUMMARIES MODELS DESCRIPTIVES
/ANALYSISWEIGHT STUDWGT
/OUTFILE IMPUTATIONS=data.

dataset activate data.
des var all.
***.

Go down to the DESCRIPTIVES output. You will see that the 2 cases with no data receive no imputed values.

I missed that in the manual but it is there:

"Cases that have a missing value for each analysis variable are included in analyses of missingness but are excluded from imputation. Specifically, values of such cases are not imputed and are excluded when building imputation models. The determination of which cases are completely missing is made after any variables are filtered out of the imputation model by the MAXPCTMISSING keyword."
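The order of operations in that passage can be sketched as follows. This is plain Python on made-up data, not the procedure's actual code; the function name `excluded_cases` and the rule that a variable is dropped when its missing percentage exceeds the cutoff are my assumptions about how MAXPCTMISSING behaves.

```python
def excluded_cases(cases, variables, max_pct_missing=None):
    """Step 1: drop variables whose missing percentage exceeds the cutoff.
    Step 2: flag cases with no observed value on the variables that remain."""
    n = len(cases)
    kept = []
    for v in variables:
        pct = 100.0 * sum(c[v] is None for c in cases) / n
        if max_pct_missing is None or pct <= max_pct_missing:
            kept.append(v)
    empty = [c["id"] for c in cases if all(c[v] is None for v in kept)]
    return kept, empty

# Made-up cases; None stands for a missing value.
cases = [
    {"id": 1, "a": 1, "b": 4, "c": None},
    {"id": 2, "a": None, "b": None, "c": None},
    {"id": 3, "a": 2, "b": None, "c": None},
    {"id": 4, "a": None, "b": None, "c": 7},
]
print(excluded_cases(cases, ["a", "b", "c"]))       # no cutoff (like MAXPCTMISSING=NONE)
print(excluded_cases(cases, ["a", "b", "c"], 60))   # cutoff of 60 percent
```

With no cutoff only case 2 is completely missing. With the 60 percent cutoff, b and c drop out first, so case 4 also counts as completely missing, which is why the order matters.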

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Tue September 20, 2022 08:39 AM
From: Courtney B Francis
Subject: Multiple Imputation

Yes!

majority is binary.

------------------------------
Courtney B Francis

Original Message:
Sent: Tue September 20, 2022 08:35 AM
From: Rick Marcantonio
Subject: Multiple Imputation

The original data, before imputation.

Also, it looks like a lot (the majority, perhaps) of these variables are binary (0/1). Is that true?

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Tue September 20, 2022 08:33 AM
From: Courtney B Francis
Subject: Multiple Imputation

Should I be running this syntax on the imputed data set? Or the original dataset?

Best,

------------------------------
Courtney B Francis

Original Message:
Sent: Tue September 20, 2022 12:38 AM
From: Rick Marcantonio
Subject: Multiple Imputation

OK, I'm just about out of bullets.

Try this. In a syntax window, paste this and run it:

COUNT num_missing= INSTDICO INSTCONT WHITE BLACK Brown Black_Brown_Total
  BLACK_BROWN_AUTISM MINORITIZED_POC RECODEAUTISM HSGPA
  COLLEGE_INVOLVEMENT HABITS_OF_MIND_GRP ACADEMIC_SELFCONCEPT_GRP
  SOCIAL_SELFCONCEPT_GRP DEGASPDICO SEX FIRSTGEN INCOME ACT_FINAL
  BLACK_BROWN_XINCOME BLACK_BROWN_XHSGPA BLACK_BROWN_XHOMG
  BLACK_BROWN_XCOLLINV BLACK_BROWN_XASCG BLACK_BROWN_XSSCG
  BLACK_BROWN_XACT BLACK_BROWN_XSEX BLACK_BROWN_XFIRSTGEN
  BLACK_BROWN_XDEGASP WHITE_XINCOME WHITE_XHSGPA WHITE_XHOMG
  WHITE_XCOLLINV WHITE_XASCG WHITE_XSSCG WHITE_XACT WHITE_XSEX
  WHITE_XFIRSTGEN WHITE_XDEGASP AUTISM_XINCOME AUTISM_XHSGPA
  AUTISM_XHOMG AUTISM_XCOLLINV AUTISM_XASCG AUTISM_XSSCG
  AUTISM_XACT AUTISM_XSEX AUTISM_XFIRSTGEN AUTISM_XDEGASP
  AUTISM_XBLACK_BROWN AUTISM_XWHITE RECODE_DISAB01
  RECODE_DISAB02 RECODE_DISAB04 RECODE_DISAB05 RECODE_DISAB06
  RECODE_DISAB07 (MISSING, SYSMIS).
FRE VAR num_missing.
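For anyone who wants to see the logic of that COUNT plus frequency table outside SPSS, here is a small Python sketch on made-up data. It treats None as missing, so unlike the MISSING keyword it does not cover user-defined missing codes; the variable names and the helper `count_missing` are hypothetical.

```python
from collections import Counter

def count_missing(case, variables):
    """Number of the listed variables that are missing (None) for one case."""
    return sum(case[v] is None for v in variables)

variables = ["a", "b", "c"]
cases = [
    {"id": 1, "a": 1, "b": 4, "c": 3},
    {"id": 2, "a": 8, "b": None, "c": 7},
    {"id": 3, "a": None, "b": None, "c": None},
    {"id": 4, "a": 2, "b": 1, "c": None},
    {"id": 5, "a": None, "b": None, "c": None},
]

num_missing = [count_missing(c, variables) for c in cases]
freq = Counter(num_missing)

# Frequency table, highest count first.
for k in sorted(freq, reverse=True):
    print(k, freq[k])
```

Cases whose count equals the number of listed variables (3 here) have no observed data at all on those variables.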

I think I would also like to see the correlation matrix of these variables.

Maybe you could just send me the dataset.

marcantr@us.ibm.com

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Tue September 20, 2022 12:02 AM
From: Courtney B Francis
Subject: Multiple Imputation

I pasted it below:

Variable Summary
Variable	Missing N	Missing Percent	Valid N	Mean	Std. Deviation
ACT_FINAL	122581	25.4%	359361	24.7110	16.94311
WHITE_XACT	69418	14.4%	412524	13.2600	40.98339
RECODE of DISAB01_LEARNING DISABILITY	63094	13.1%	418848	.0438	.63238
RECODE of INCOME (Parental Income)	49476	10.3%	432466	1.97	2.875
RECODE of DISAB01_CHRONIC ILLNESS	47338	9.8%	434604	.0356	.57287
RECODE of DISAB01_PSYCHOLOGICAL DISORDER	46662	9.7%	435280	.0221	.45473
RECODE of DISAB01_OTHER DISABILITY	46533	9.7%	435409	.0745	.81162
RECODE of DISAB01_ADHD	45443	9.4%	436499	.0567	.71469
RECODE of DISAB01_LEARNING DISABILITY	45336	9.4%	436606	.0328	.55037
TFS Likelihood of College Involvement Score	44385	9.2%	437557	49.4252	25.04735
DEGREE ASPIRATIONS DICHOTOMOUS	42437	8.8%	439505	.7639	1.31083
BLACK_BROWN_XACT	41513	8.6%	440429	2.6916	22.11014
WHITE_XINCOME	36537	7.6%	445405	1.2194	3.98811
AUTISM_XACT	31898	6.6%	450044	.1061	5.15740
AUTISM_XINCOME	31462	6.5%	450480	.0103	.48015
AUTISM_XCOLLINV	31354	6.5%	450588	.2608	11.01939
AUTISM_XSSCG	31210	6.5%	450732	.0095	.41915
AUTISM_XASCG	31203	6.5%	450739	.0117	.50244
AUTISM_XHSGPA	31183	6.5%	450759	.0367	1.50114
AUTISM_XHOMG	31168	6.5%	450774	.0120	.50833
RECODE of AUTISM	31116	6.5%	450826	.0061	.23762
WHITE_XDEGASP	30886	6.4%	451056	.4121	1.51961
WHITE_XCOLLINV	30022	6.2%	451920	27.2186	76.92240
AUTISM_XDEGASP	25605	5.3%	456337	.0036	.18283
TFS Social Self-Concept Group	21697	4.5%	460245	1.92	2.366
TFS Academic Self-Concept Group	20658	4.3%	461284	1.94	2.226
AUTISM_XWHITE	18863	3.9%	463079	.0036	.18470
BLACK_BROWN_XCOLLINV	18023	3.7%	463919	8.5708	58.87801
BLACK_BROWN_XINCOME	17799	3.7%	464143	.2465	1.86394
WHITE_XSSCG	17458	3.6%	464484	1.0970	3.41879
WHITE_XASCG	16917	3.5%	465025	1.1271	3.43757
BLACK_BROWN_XDEGASP	16077	3.3%	465865	.1395	1.06979
AUTISM_XSEX	15996	3.3%	465946	.0018	.12976
RECODE of FIRSTGEN (First generation status based on parent(s) with less than 's	13686	2.8%	468256	.19	1.217
BLACK_BROWN_XSSCG	13200	2.7%	468742	.3558	2.54229
BLACK_BROWN_XASCG	12949	2.7%	468993	.3372	2.39447
WHITE_XHOMG	12777	2.7%	469165	1.1525	3.51353
WHITE_XHSGPA	11872	2.5%	470070	3.8132	10.48926
TFS Habits of Mind Group	11731	2.4%	470211	1.99	2.335
BLACK_BROWN_XHOMG	11189	2.3%	470753	.3650	2.56174
AUTISM_XFIRSTGEN	11000	2.3%	470942	.0008	.08888
BLACK_BROWN_XHSGPA	10521	2.2%	471421	1.0982	7.29505
AUTISM_XBLACK_BROWN	10239	2.1%	471703	.0007	.08021
WHITE_XFIRSTGEN	10102	2.1%	471840	.0649	.75961
This is the group of ASAIN & HISPAINIC & OTHER	9515	2.0%	472427	.1108	.96923
This is the Black and Brown combined variable	9515	2.0%	472427	.1909	1.21370
This is the Hispanic Race Code	9515	2.0%	472427	.1019	.93436
This is the first Recode of the Black Variable	9515	2.0%	472427	.0889	.87919
This is the White Race Code	9515	2.0%	472427	.5818	1.52343
BLACK_BROWN_XFIRSTGEN	8582	1.8%	473360	.0821	.84733
What was your average grade in high school?	5275	1.1%	476667	6.39	4.275
WHITE_XSEX	4808	1.0%	477134	.3077	1.42634
BLACK_BROWN_XSEX	4808	1.0%	477134	.1096	.96532
ED INST TYPE (UNIVERSITY =0 & 4 YEAR COLLEGE =1)	794	0.2%	481148	.5016	1.54678
RECODE of SEX (Your sex:)	0	0.0%	481942	.54	1.540
BLACK_BROWN_AUTISM	0	0.0%	481942	.0007	.07935
INSTITUTIONAL CONTROL	0	0.0%	481942	1.32	1.447

------------------------------
Courtney B Francis

------------------------------
#SPSSStatistics

28. RE: Multiple Imputation


Rick Marcantonio

Posted Tue September 20, 2022 05:28 PM

Yes, that's true. That brings up whether the data are MAR, MCAR, or NMAR.

MI is going to assume MAR (and then of course MCAR as well).
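A rough way to probe this, sketched in Python with made-up numbers: compare an observed variable between cases that are missing on another variable and cases that are not. A clear difference argues against MCAR. It cannot separate MAR from NMAR, because that depends on the values nobody observed. The variable names and the helper `mean_by_missingness` are hypothetical.

```python
from statistics import mean

def mean_by_missingness(cases, target, covariate):
    """Mean of `covariate` among cases missing vs. observed on `target`."""
    miss = [c[covariate] for c in cases
            if c[target] is None and c[covariate] is not None]
    obs = [c[covariate] for c in cases
           if c[target] is not None and c[covariate] is not None]
    return mean(miss), mean(obs)

# Made-up cases: the test score is missing mostly for low-GPA students.
cases = [
    {"hsgpa": 2.0, "act": None},
    {"hsgpa": 2.4, "act": None},
    {"hsgpa": 3.0, "act": 22},
    {"hsgpa": 3.5, "act": 27},
    {"hsgpa": 3.9, "act": 31},
    {"hsgpa": None, "act": 25},
]
miss_mean, obs_mean = mean_by_missingness(cases, "act", "hsgpa")
print(round(miss_mean, 2), round(obs_mean, 2))
```

This check needs at least one observed variable per case, so it says nothing about cases that are empty on every variable.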

------------------------------
Rick Marcantonio
Quality Assurance
IBM
------------------------------

Original Message

Original Message:
Sent: Tue September 20, 2022 12:02 AM
From: Courtney B Francis
Subject: Multiple Imputation

I pasted it below:

Variable Summary
Variable	Missing N	Missing Percent	Valid N	Mean	Std. Deviation
ACT_FINAL	122581	25.4%	359361	24.7110	16.94311
WHITE_XACT	69418	14.4%	412524	13.2600	40.98339
RECODE of DISAB01_LEARNING DISABILITY	63094	13.1%	418848	.0438	.63238
RECODE of INCOME (Parental Income)	49476	10.3%	432466	1.97	2.875
RECODE of DISAB01_CHRONIC ILLNESS	47338	9.8%	434604	.0356	.57287
RECODE of DISAB01_PSYCHOLOGICAL DISORDER	46662	9.7%	435280	.0221	.45473
RECODE of DISAB01_OTHER DISABILITY	46533	9.7%	435409	.0745	.81162
RECODE of DISAB01_ADHD	45443	9.4%	436499	.0567	.71469
RECODE of DISAB01_LEARNING DISABILITY	45336	9.4%	436606	.0328	.55037
TFS Likelihood of College Involvement Score	44385	9.2%	437557	49.4252	25.04735
DEGREE ASPIRATIONS DICHOTOMOUS	42437	8.8%	439505	.7639	1.31083
BLACK_BROWN_XACT	41513	8.6%	440429	2.6916	22.11014
WHITE_XINCOME	36537	7.6%	445405	1.2194	3.98811
AUTISM_XACT	31898	6.6%	450044	.1061	5.15740
AUTISM_XINCOME	31462	6.5%	450480	.0103	.48015
AUTISM_XCOLLINV	31354	6.5%	450588	.2608	11.01939
AUTISM_XSSCG	31210	6.5%	450732	.0095	.41915
AUTISM_XASCG	31203	6.5%	450739	.0117	.50244
AUTISM_XHSGPA	31183	6.5%	450759	.0367	1.50114
AUTISM_XHOMG	31168	6.5%	450774	.0120	.50833
RECODE of AUTISM	31116	6.5%	450826	.0061	.23762
WHITE_XDEGASP	30886	6.4%	451056	.4121	1.51961
WHITE_XCOLLINV	30022	6.2%	451920	27.2186	76.92240
AUTISM_XDEGASP	25605	5.3%	456337	.0036	.18283
TFS Social Self-Concept Group	21697	4.5%	460245	1.92	2.366
TFS Academic Self-Concept Group	20658	4.3%	461284	1.94	2.226
AUTISM_XWHITE	18863	3.9%	463079	.0036	.18470
BLACK_BROWN_XCOLLINV	18023	3.7%	463919	8.5708	58.87801
BLACK_BROWN_XINCOME	17799	3.7%	464143	.2465	1.86394
WHITE_XSSCG	17458	3.6%	464484	1.0970	3.41879
WHITE_XASCG	16917	3.5%	465025	1.1271	3.43757
BLACK_BROWN_XDEGASP	16077	3.3%	465865	.1395	1.06979
AUTISM_XSEX	15996	3.3%	465946	.0018	.12976
RECODE of FIRSTGEN (First generation status based on parent(s) with less than 's	13686	2.8%	468256	.19	1.217
BLACK_BROWN_XSSCG	13200	2.7%	468742	.3558	2.54229
BLACK_BROWN_XASCG	12949	2.7%	468993	.3372	2.39447
WHITE_XHOMG	12777	2.7%	469165	1.1525	3.51353
WHITE_XHSGPA	11872	2.5%	470070	3.8132	10.48926
TFS Habits of Mind Group	11731	2.4%	470211	1.99	2.335
BLACK_BROWN_XHOMG	11189	2.3%	470753	.3650	2.56174
AUTISM_XFIRSTGEN	11000	2.3%	470942	.0008	.08888
BLACK_BROWN_XHSGPA	10521	2.2%	471421	1.0982	7.29505
AUTISM_XBLACK_BROWN	10239	2.1%	471703	.0007	.08021
WHITE_XFIRSTGEN	10102	2.1%	471840	.0649	.75961
This is the group of ASAIN & HISPAINIC & OTHER	9515	2.0%	472427	.1108	.96923
This is the Black and Brown combined variable	9515	2.0%	472427	.1909	1.21370
This is the Hispanic Race Code	9515	2.0%	472427	.1019	.93436
This is the first Recode of the Black Variable	9515	2.0%	472427	.0889	.87919
This is the White Race Code	9515	2.0%	472427	.5818	1.52343
BLACK_BROWN_XFIRSTGEN	8582	1.8%	473360	.0821	.84733
What was your average grade in high school?	5275	1.1%	476667	6.39	4.275
WHITE_XSEX	4808	1.0%	477134	.3077	1.42634
BLACK_BROWN_XSEX	4808	1.0%	477134	.1096	.96532
ED INST TYPE (UNIVERSITY =0 & 4 YEAR COLLEGE =1)	794	0.2%	481148	.5016	1.54678
RECODE of SEX (Your sex:)	0	0.0%	481942	.54	1.540
BLACK_BROWN_AUTISM	0	0.0%	481942	.0007	.07935
INSTITUTIONAL CONTROL	0	0.0%	481942	1.32	1.447

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 11:57 PM
From: Rick Marcantonio
Subject: Multiple Imputation

I can't see the table I want to see - the Variable Summary.

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 11:49 PM
From: Courtney B Francis
Subject: Multiple Imputation

Hi @Rick Marcantonio,

I ran the model again (with fewer iterations and imputations for the sake of time) and got the same result. I still have missingness in my pooled dataset for ALL variables.

I used the same syntax, except at the bottom, instead of MISSINGSUMMARIES = NONE, I changed it to: /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0).

If you have any insight into what the issue is, please let me know!

Best,

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 10:10 PM
From: Rick Marcantonio
Subject: Multiple Imputation

Well, no, they'd be at the bottom of the list, since they have no missing data.

I am suggesting that you re-run your original syntax, just please change /MISSINGSUMMARIES NONE to /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0).

I'm trying to get some idea what the "missingness" looks like in these data.

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 09:14 PM
From: Courtney B Francis
Subject: Multiple Imputation

Hi @Rick Marcantonio,

I realized there was a typo in my reply to your previous message. I meant to say: There were three variables in the box below labeled "Not Imputed (No Missing Values)".
Are you suggesting I add /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0) to the overall syntax and rerun the model?

Those three variables do not appear at the top of the list in the variable summary table.

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 08:46 PM
From: Rick Marcantonio
Subject: Multiple Imputation

OK, maybe we've narrowed it down to those 3 variables that are not imputed due to missing values.

Try adding /MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0)

Do those three variables appear at the top of the list in the Variable Summary Table?

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 04:11 PM
From: Rick Marcantonio
Subject: Multiple Imputation

So then, no values are being imputed at all?

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 04:04 PM
From: Courtney B Francis
Subject: Multiple Imputation

Hi @Rick Marcantonio,

Thank you for your response!

All of the variables listed, including the ones specified in /CONSTRAINTS as ROLE=IND, have missing data in all the imputed datasets, including the pooled dataset. The amount of missingness is the same across all imputed datasets.

------------------------------
Courtney B Francis

Original Message:
Sent: Mon September 19, 2022 01:27 PM
From: Rick Marcantonio
Subject: Multiple Imputation

Hi.

Do you mean apart from the variables specified in /CONSTRAINTS as ROLE=IND?

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Mon September 19, 2022 11:32 AM
From: Courtney B Francis
Subject: Multiple Imputation

Hello!

I have attempted to run Multiple Imputation on my dataset. After the imputation ran I still had missing data in the pooled data set.

Do you know what might have caused this issue? I'm not sure if this is common or if there is an issue in my syntax or Data that may be creating this issue. Please see my syntax below:

SET THREADS = 4.
USE ALL.
FILTER OFF.
SORT CASES BY YEAR SUBJID.
EXECUTE.
SET SEED=20220913.
MULTIPLE IMPUTATION
    INSTDICO INSTCONT WHITE BLACK Brown Black_Brown_Total
    BLACK_BROWN_AUTISM MINORITIZED_POC RECODEAUTISM HSGPA
    COLLEGE_INVOLVEMENT HABITS_OF_MIND_GRP ACADEMIC_SELFCONCEPT_GRP
    SOCIAL_SELFCONCEPT_GRP DEGASPDICO SEX FIRSTGEN INCOME ACT_FINAL
    BLACK_BROWN_XINCOME BLACK_BROWN_XHSGPA BLACK_BROWN_XHOMG
    BLACK_BROWN_XCOLLINV BLACK_BROWN_XASCG BLACK_BROWN_XSSCG
    BLACK_BROWN_XACT BLACK_BROWN_XSEX BLACK_BROWN_XFIRSTGEN
    BLACK_BROWN_XDEGASP WHITE_XINCOME WHITE_XHSGPA WHITE_XHOMG
    WHITE_XCOLLINV WHITE_XASCG WHITE_XSSCG WHITE_XACT WHITE_XSEX
    WHITE_XFIRSTGEN WHITE_XDEGASP AUTISM_XINCOME AUTISM_XHSGPA
    AUTISM_XHOMG AUTISM_XCOLLINV AUTISM_XASCG AUTISM_XSSCG
    AUTISM_XACT AUTISM_XSEX AUTISM_XFIRSTGEN AUTISM_XDEGASP
    AUTISM_XBLACK_BROWN AUTISM_XWHITE RECODE_DISAB01
    RECODE_DISAB02 RECODE_DISAB04 RECODE_DISAB05 RECODE_DISAB06
    RECODE_DISAB07
  /ANALYSISWEIGHT STUDWGT
  /IMPUTE METHOD=FCS MAXITER= 100 NIMPUTATIONS=10 SCALEMODEL=LINEAR INTERACTIONS=NONE
    SINGULAR=1E-012 MAXPCTMISSING=NONE MAXMODELPARAM =10000
  /CONSTRAINTS BLACK_BROWN_XINCOME( ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XHSGPA( ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XHOMG( ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XCOLLINV( ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XASCG( ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XSSCG( ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XACT( ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XSEX( ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XFIRSTGEN( ROLE=IND)
  /CONSTRAINTS BLACK_BROWN_XDEGASP( ROLE=IND)
  /CONSTRAINTS WHITE_XINCOME( ROLE=IND)
  /CONSTRAINTS WHITE_XHSGPA( ROLE=IND)
  /CONSTRAINTS WHITE_XHOMG( ROLE=IND)
  /CONSTRAINTS WHITE_XCOLLINV( ROLE=IND)
  /CONSTRAINTS WHITE_XASCG( ROLE=IND)
  /CONSTRAINTS WHITE_XSSCG( ROLE=IND)
  /CONSTRAINTS WHITE_XACT( ROLE=IND)
  /CONSTRAINTS WHITE_XSEX( ROLE=IND)
  /CONSTRAINTS WHITE_XFIRSTGEN( ROLE=IND)
  /CONSTRAINTS WHITE_XDEGASP ( ROLE=IND)
  /CONSTRAINTS AUTISM_XINCOME( ROLE=IND)
  /CONSTRAINTS AUTISM_XHSGPA( ROLE=IND)
  /CONSTRAINTS AUTISM_XHOMG( ROLE=IND)
  /CONSTRAINTS AUTISM_XCOLLINV( ROLE=IND)
  /CONSTRAINTS AUTISM_XASCG( ROLE=IND)
  /CONSTRAINTS AUTISM_XSSCG( ROLE=IND)
  /CONSTRAINTS AUTISM_XACT( ROLE=IND)
  /CONSTRAINTS AUTISM_XSEX( ROLE=IND)
  /CONSTRAINTS AUTISM_XFIRSTGEN( ROLE=IND)
  /CONSTRAINTS AUTISM_XDEGASP ( ROLE=IND)
  /CONSTRAINTS AUTISM_XBLACK_BROWN( ROLE=IND)
  /CONSTRAINTS AUTISM_XWHITE( ROLE=IND)
  /CONSTRAINTS DEGASP (RND=1 MIN=0 MAX=1)
  /MISSINGSUMMARIES NONE
  /IMPUTATIONSUMMARIES MODELS DESCRIPTIVES
  /OUTFILE IMPUTATIONS=courtney_syntax_9_14_22.sav FCSITERATIONS=iteration_history.

------------------------------
Courtney B Francis
------------------------------
#SPSSStatistics

29. RE: Multiple Imputation

Courtney B Francis

Posted Wed September 21, 2022 10:23 AM

Well, thank you both, @Rick Marcantonio and @Jon Peck!

I appreciate you both supporting me as I try to understand why there are still "missing values" in my pooled dataset.

I ran the imputation model again with a few variable adjustments (it's taking such a LONG TIME this time around; if you have any advice for speeding it up, let me know! :'D), and hopefully I can simply move on to the final stages of my analysis, even if the pooled dataset still has missing values.

Thank you again! If any other thoughts come up regarding this thread, please let me know!

Best,

------------------------------
Courtney B Francis
------------------------------

Original Message:
Sent: Tue September 20, 2022 05:27 PM
From: Rick Marcantonio
Subject: Multiple Imputation

Yes, that's true. That brings up whether the data are MAR, MCAR, or NMAR.

MI assumes the data are MAR (MCAR, being the stricter condition, is covered as well).

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Tue September 20, 2022 05:24 PM
From: Jon Peck
Subject: Multiple Imputation

It does matter, though, if those cases were systematically missing in a way that relates to the variables of interest. Obviously, though, you would have to infer that from external facts and empty cases don't talk - even under torture.

--

Jon K Peck
jkpeck@gmail.com

Original Message:
Sent: 9/20/2022 4:58:00 PM
From: Rick Marcantonio
Subject: RE: Multiple Imputation

Don't let it alarm you. You did impute missing data where it could be imputed. It isn't like you did nothing. You did quite a bit!

Missing CASES are a different story. Those can safely be ignored.

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Tue September 20, 2022 04:48 PM
From: Courtney B Francis
Subject: Multiple Imputation

Okay,

that makes sense @Rick Marcantonio

So the "missing" data I'm seeing in the pooled cases is just going to "be there"? I guess I've never seen that in all my readings, so it has me a bit alarmed.

------------------------------
Courtney B Francis

Original Message:
Sent: Tue September 20, 2022 04:00 PM
From: Rick Marcantonio
Subject: Multiple Imputation

I started suspecting this last night, which is why I asked you to create the COUNT variable. Then I was going to find those completely missing cases by sorting the data in descending order by that variable and looking at it.

That's an interesting exercise, but that's about all - unless you have good reason to believe that there was some kind of data entry error and those cases should have some observed data. But that's not a statistical question.

As for doing something, there really is nothing to "do." If a person gave no data at all for the variables in your imputation model, then they gave no data... that's that. The good news is that you have plenty of data that was complete and/or imputed; more than enough to draw some solid research conclusions. The "empty" cases are causing no harm by being there. Statistically, we cannot give any analysis degrees of freedom it does not deserve by (essentially) "making up" entire cases, no matter how well-intentioned we are in wanting to.
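
For anyone who wants to try the inspection described above, here is a minimal sketch of one way to flag the completely empty cases. It uses NVALID instead of the COUNT command from earlier in the thread, and the names a TO g, n_valid, and all_missing are placeholders (a TO g assumes the variables sit next to each other in the file), so substitute the real imputation variables.

```spss
* Sketch only. a TO g, n_valid and all_missing are placeholder names.
COMPUTE n_valid = NVALID(a TO g).
COMPUTE all_missing = (n_valid = 0).
FREQUENCIES VARIABLES=all_missing.
* Sort so the cases with no valid values come first.
SORT CASES BY n_valid (A).
```

Cases with all_missing = 1 are the ones that have no observed value on any of the listed variables, which are the cases the procedure leaves unimputed.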

------------------------------
Rick Marcantonio
Quality Assurance
IBM

Original Message:
Sent: Tue September 20, 2022 03:41 PM
From: Courtney B Francis
Subject: Multiple Imputation

Hi again @Rick Marcantonio,

Based on what you shared from the manual, what is the solution? How do I go about checking for "cases that have a missing value for each analysis variable"? Would I use the sort function to find all the missing values and then somehow remove those particular cases from my model?

OR

Are you saying the MAXPCTMISSING keyword will filter these cases out? If so, I'm not seeing where it does that.

Please let me know your thoughts.

Please forgive the delay in response!

------------------------------
Courtney B Francis

Original Message:
Sent: Tue September 20, 2022 09:25 AM
From: Rick Marcantonio
Subject: Multiple Imputation

Courtney;

I think I see what is happening. Cases that are completely missing (e.g., let's say that case #5 has no valid data at all for any of the analysis variables) are not imputed. Here is an example. Open a new syntax window and run it.

preserve.
*output close all.
set undefined=nowarn.
dataset close all.
new file.
DATA LIST FREE /id a b c d e f g studwgt.
begin data.
01 1 4 3 5 6 7 5 12
02 8 6 7 5 6 4 . 21
03 4 5 . 5 4 6 3 10
04 2 1 4 4 3 6 7 21
05 . . . . . . . 14
06 7 7 8 6 8 7 4 11
07 5 4 6 7 8 9 1 16
08 7 6 . 7 . 6 5 14
09 5 3 4 3 2 1 3 16
10 6 6 5 7 . 4 . 12
11 3 3 4 2 3 1 2 15
12 . . . . . . . 20
end data.
restore.
recode a b c e g (lo thru 5=0) (6 thru hi=1).
variable level a to g (scale).
dataset declare data.
MULTIPLE IMPUTATION a to g
/IMPUTE METHOD=FCS MAXITER= 100 NIMPUTATIONS=10 SCALEMODEL=LINEAR INTERACTIONS=NONE
SINGULAR=1E-012 MAXPCTMISSING=NONE MAXMODELPARAM =10000
/MISSINGSUMMARIES OVERALL VARIABLES(MAXVARS=100 MINPCTMISSING=0)
/IMPUTATIONSUMMARIES MODELS DESCRIPTIVES
/ANALYSISWEIGHT STUDWGT
/OUTFILE IMPUTATIONS=data.

dataset activate data.
des var all.
***.

Go down to the DESCRIPTIVES output. You will see that the 2 cases with no data receive no imputed values.

I missed that in the manual but it is there:

"Cases that have a missing value for each analysis variable are included in analyses of missingness but are excluded from imputation. Specifically, values of such cases are not imputed and are excluded when building imputation models. The determination of which cases are completely missing is made after any variables are filtered out of the imputation model by the MAXPCTMISSING keyword."
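
To see this directly instead of inferring it from the DESCRIPTIVES table, the following sketch lists the two empty cases in the imputed file. It assumes the example above has just been run, so the output dataset is named data and contains the Imputation_ variable that the procedure adds.

```spss
* Sketch only. Assumes the example above was just run and wrote to the dataset named data.
DATASET ACTIVATE data.
TEMPORARY.
SELECT IF (id = 5 OR id = 12).
LIST VARIABLES=Imputation_ id a TO g.
```

Cases 5 and 12 should appear once for the original data (Imputation_ = 0) and once per imputation, with a through g still system-missing in every row.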

------------------------------
Rick Marcantonio
Quality Assurance
IBM

------------------------------
#SPSSStatistics

30. RE: Multiple Imputation

Rick Marcantonio

Posted Tue September 20, 2022 08:21 AM
Edited by System Test Fri January 20, 2023 04:40 PM

By the way, do these means and standard deviations look right to you? The means are very large and variance practically non-existent.

WAIT. Sorry, this was an artifact of the way it reads in my email. N and Mean have no separator in my view.

------------------------------
Rick Marcantonio
Quality Assurance
IBM
------------------------------

31. RE: Multiple Imputation

Courtney B Francis

Posted Tue September 20, 2022 08:33 AM

@Rick Marcantonio

No worries!

I will run the syntax you suggested earlier and email you!

Best,

------------------------------
Courtney B Francis
------------------------------

32. RE: Multiple Imputation

Courtney B Francis

Posted Mon September 19, 2022 08:19 PM

All variables were imputed with the exception of the three in the "Not Imputed (No Missing Values)" section.

I used the example image from Rick, below, to illustrate.

------------------------------
Courtney B Francis
------------------------------

33. RE: Multiple Imputation

Courtney B Francis

Posted Mon September 19, 2022 11:51 PM

Hi @Jon Peck,

Please see the results from my attempt:

Let me know if you have any advice for changes I should make.

Best,

------------------------------
Courtney B Francis
------------------------------

34. RE: Multiple Imputation

Frank Furter

Posted Thu September 22, 2022 02:03 AM

@Courtney B Francis, while this thread discusses several 'technical' aspects of performing MI in SPSS, let's not forget the impact that MI will have on the validity of the inference drawn from the results of the imputed data set. @Jon Peck and @Rick Marcantonio have already mentioned that the type of missingness (MCAR, MAR, MNAR) is important. Moreover, is there a 'healthy' ratio between the proportions of imputed and observed data? Are we even performing MI based on already imputed data?

In any case, consider performing sensitivity analyses under different scenarios in order to assess whether and how the imputations impact the results. Do analyses using different assumptions about missing data come to different conclusions regarding your main outcomes of interest?

------------------------------
Frank Furter
------------------------------

35. RE: Multiple Imputation

Courtney B Francis

Posted Thu September 22, 2022 09:37 PM
Edited by System Test Fri January 20, 2023 04:25 PM

Thank you for the insight, @Frank Furter!
You are very right! And I did address patterns of missingness before deciding to complete multiple imputation on my original (not already imputed) data. Thank you for tapping in!

@Rick Marcantonio @Jon Peck @Frank Furter

I'm curious (not sure if I need to start a separate thread): Does anyone know why my pooled data set now has no values for minimum, maximum, or standard deviation in the Descriptive Statistics? My original data (and the subsequent MI datasets) have N, Minimum, Maximum, Mean, and Std. Deviation.

My pooled data has an N and a Mean, and that's it. Any thoughts on what might have gone wrong? Syntax is still the same:

Is there some way to get a pooled dataset that has all the descriptive statistics?

Best,

------------------------------
Courtney B Francis
------------------------------

36. RE: Multiple Imputation

Matthew Heller

Posted Wed April 10, 2024 08:38 PM

Hello Courtney and others,

This thread may have run cold, and you may be long gone, but I have the same question. Wondering if you or anyone has a clear answer? Similar to your situation, I have used MI in SPSS to create a number of sets of data with missing values imputed. So far, when I have tried to run an analysis, I get separate lines of output for the original data, as well as imputations 1 through X. However, the "pooled" section at the end has only contained counts, means, and SE in some cases. I have tried running crosstabs with chi-square and one-way ANOVA so far, and the result is the same (meaning no actual statistical output such as test values or p-values).

My understanding is that the point of creating multiple imputations is to make different models of reasonable data to fill missing values. Further, I figured that since sampling error will be better contained with larger samples, that a pooling of the various imputed sets of data would be the best outcome for analysis. If this is incorrect, could someone please help me better understand?

As it stands, I have 11 sets of results (original + 10 MI sets), and reading through them to try to make sense of them is quite challenging. In the few test analyses I have run, the results from all 11 sets match perfectly, so interpretation is straightforward. However, what happens if two or more result patterns emerge? Do I just split the difference or go with the most common result? This seems quite unscientific. Furthermore, what are the standards for reporting results in a paper if I am choosing between 11 different outcomes?

I have gotten a little deeper than I planned in this post, but thank you for any advice!

Best -

Matt

------------------------------
Matthew Heller
------------------------------

37. RE: Multiple Imputation

IBM Champion

Jon Peck

Posted Wed April 10, 2024 09:42 PM

You are not using MI to create a larger dataset nor to see how results vary by set. You do the MI step and then just run the procedure you want, e.g., regression. You don't use the individual repetitions. They are created in order to get proper estimates of the variances that can be used to pool the results. What you care about is only the pooled results. There are some statistics that don't get pooled, but the main results should be present after all the individual samples are estimated.
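To make the "proper estimates of the variances" point concrete, here is a minimal sketch of the standard pooling rules (Rubin's rules) for a single scalar estimate. It is an illustration only, not SPSS's implementation: the function name `pool_rubin` and the coefficient and standard-error values are hypothetical, and the degrees-of-freedom adjustment used for pooled significance tests is left out.

```python
from statistics import mean, variance

def pool_rubin(estimates, std_errors):
    """Pool one scalar estimate across m imputed datasets (Rubin's rules)."""
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching lists with at least two imputations")
    pooled_estimate = mean(estimates)                 # average of the m estimates
    within = mean([se ** 2 for se in std_errors])     # average within-imputation variance
    between = variance(estimates)                     # variance of the estimates across imputations
    total = within + (1 + 1 / m) * between            # total variance of the pooled estimate
    return pooled_estimate, total ** 0.5

# Hypothetical coefficients and standard errors from 5 imputed datasets.
coefs = [0.50, 0.54, 0.46, 0.52, 0.48]
ses = [0.10, 0.10, 0.10, 0.10, 0.10]
pooled_coef, pooled_se = pool_rubin(coefs, ses)
print(pooled_coef, pooled_se)
```

The pooled standard error comes out larger than any single-imputation standard error because the between-imputation spread is added on top, which is the extra uncertainty that the individual repetitions exist to measure.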

--

Jon K Peck
jkpeck@gmail.com

38. RE: Multiple Imputation

Matthew Heller

Posted Thu April 11, 2024 05:46 PM

Hello Jon - thank you for your response!

I may not be clearly communicating the problem (as I understand it), so let me offer a bit more context and some examples.

I filled in missing values in my dataset using the SPSS 29 process:

Analyze --> Multiple Imputation --> Impute Missing Data Values

The dataset looks right: each imputed set has the missing values filled in, and the imputed values are highlighted yellow. The default setting is to create 5 sets of imputed data, and because more is (sometimes) better, I created 10 imputed sets. So, now my dataset has the original data cases (N = 1552), followed by imputation set 1 (cases 1553 - 3104), imputation set 2, etc.

As you said, my goal was simply to run the analysis of choice, read the pooled data results, and move on with my life. However, SPSS is not producing any helpful pooled results. Here is a screenshot of the relevant section of a basic Frequencies analysis I ran:

Note that "Imputation 10" includes counts as well as percentage, valid percentage, etc. On the other hand, the "Pooled" section only includes counts, and no additional calculations. I had to manually calculate each percentage to understand the patterns I was seeing. This same pattern repeats in other analyses as well. For example, here is a Crosstabs with chi-square analysis. First, a portion of the Crosstabulation table, where I asked for a bunch of content, such as "expected count", "row/column/total percentages", and "adjusted residual". You can see (in contrast to Imputation 10) that the only output for "Pooled" section is the actual counts:

And then, the actual chi-square output. Here you can see that SPSS does not even bother to create a section for "Pooled" - it just skips it entirely, ending with Imputation 10:

I hope this clarifies the problem I am having. I am asking SPSS for an analysis, and not getting what I believe I am asking for. Could this be a problem based on menu commands (vs. syntax), since I am simply using the menus? Could this be something to do with having 10 instead of 5 imputations? Or am I misreading the output and expecting the wrong thing?

Any help would be appreciated. Thank you!

Matt

------------------------------
Matthew Heller
------------------------------

39. RE: Multiple Imputation

IBM Champion

Jon Peck

Posted Thu April 11, 2024 06:34 PM

The statistics in the pooled output would be computed by aggregating the values in the individual segments, not from the other statistics in the pooled section of the output. Doing the latter would pretty much require recreating the formulas in each procedure that supports MI analysis. Some procedure output isn't amenable to that approach.

There is a section in the Algorithms doc, "Multiple Imputation: Pooling Algorithms", that might be helpful, but it doesn't enumerate all the possibilities.

Beyond that, I have to leave this to the statistician team to go deeper.

--

Jon K Peck
jkpeck@gmail.com

40. RE: Multiple Imputation

Frank Furter

Posted Fri April 12, 2024 03:02 AM

When multiply imputing missing data, you do not pool the imputed data sets and then perform the analysis. Instead, you perform the analysis separately on each of the imputed data sets and then pool the results. Some procedures in SPSS can do this automatically when they recognize a multiply imputed data set generated by the MI procedure, whereas unfortunately others can't. See, e.g., https://www.ibm.com/docs/en/spss-statistics/29.0.0?topic=imputation-analyzing-multiple-data and https://bookdown.org/mwheymans/bookmi/data-analysis-after-multiple-imputation.html
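As a rough illustration of "analyze each set, then pool", the sketch below computes category percentages separately within each imputed dataset and then averages them. The rows are made-up toy data and `pooled_percentages` is a hypothetical helper. The layout mirrors how SPSS stacks the datasets under an `Imputation_` indicator (0 = original data), but this is not necessarily how SPSS itself computes its pooled rows.

```python
from collections import Counter, defaultdict

# Hypothetical stacked rows: (Imputation_, response).
# Imputation_ 0 is the original data (None = missing); 1..m are completed sets.
rows = [
    (0, "yes"), (0, "no"), (0, None), (0, "yes"),
    (1, "yes"), (1, "no"), (1, "yes"), (1, "yes"),
    (2, "yes"), (2, "no"), (2, "no"), (2, "yes"),
]

def pooled_percentages(rows):
    """Percentages per imputed dataset, then averaged across datasets."""
    by_imputation = defaultdict(list)
    for imputation, value in rows:
        if imputation > 0:                      # skip the original, incomplete data
            by_imputation[imputation].append(value)
    per_set = []
    for values in by_imputation.values():       # step 1: analyze each set separately
        counts = Counter(values)
        per_set.append({k: 100 * c / len(values) for k, c in counts.items()})
    categories = sorted({k for pct in per_set for k in pct})
    # step 2: pool the per-set results by averaging them
    return {k: sum(pct.get(k, 0.0) for pct in per_set) / len(per_set)
            for k in categories}

pooled = pooled_percentages(rows)
print(pooled)
```

The key design point is the order of operations: the statistic is computed inside each imputed dataset first, and only the resulting statistics are combined, never the raw rows.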

------------------------------
Frank Furter
------------------------------

41. RE: Multiple Imputation

Matthew Heller

Posted Fri April 12, 2024 01:52 PM

Thank you for your feedback, Jon and Frank.

My understanding of what you are saying, and the attached documentation, is that using MI to fill in missing values is not nearly as straightforward in SPSS as I would have liked to see it. It seems that means and N are readily calculated in the pooled condition, but other analyses, including test statistics, p-values, or effect sizes are not in many cases.

My original goal in using MI was to fill in reasonable "guesses" for the missing values in my survey responses and then use this more complete data set in analysis. What I am thinking about now, as a solution to my problem, is whether there is a meaningful way that I can take my 10 imputations, and perhaps average them (?), in order to create a single new, complete data set? I feel like I am being distracted by these 10 different imputation sets. Is it accurate to imagine each imputation as a sort of random sample of reasonable data points for that missing value? In other words, one imputation is not better or worse than another, as far as we know, because we do not know the true "population" value, but they should cluster around the population mean, following rules of normal distributions, etc.? If this is correct, then if I averaged the 10 imputations for each missing value, and entered them in a final, complete data set, I could run analyses without worrying about MI and which analyses support pooling or not. In other words, I would have a single dataset composed of a) my original, real data, and b) missing values composed of the average of the 10 imputations for each value. Would that make sense?

------------------------------
Matthew Heller
------------------------------

42. RE: Multiple Imputation

IBM Champion

Jon Peck

Posted Fri April 12, 2024 02:14 PM

This would be defeating the point of multiple imputation. You would be better off just using the single imputation method with an appropriate choice of imputation method. The point of MI is to account for variance, and you would be eliminating that.
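A small numeric sketch of why averaging eliminates the variance: the values below are invented purely for illustration (four observed values, two missing cases, three imputations). Replacing each missing case with the average of its imputations pulls the filled-in values toward the middle, so the resulting single dataset looks less variable than any of the individual completed datasets.

```python
from statistics import mean, stdev

observed = [2.0, 4.0, 6.0, 8.0]   # hypothetical observed values

# Hypothetical imputed values for two missing cases, one row per imputation.
imputations = [
    [1.0, 9.0],
    [9.0, 1.0],
    [2.0, 8.0],
]

# Standard deviation within each completed dataset (observed + one imputation).
per_set_sd = [stdev(observed + imp) for imp in imputations]

# Single "averaged" dataset: each missing case replaced by the mean of its imputations.
averaged = [mean(case) for case in zip(*imputations)]
averaged_sd = stdev(observed + averaged)

print(per_set_sd, averaged, averaged_sd)
```

Here the averaged dataset has a smaller standard deviation than every completed dataset, so any standard error computed from it would be too small, and the between-imputation variability is lost entirely.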

--

Jon K Peck
jkpeck@gmail.com
