Hi, CP. As for your first question, please see
this.
As for the second question, it depends. It is true that you want to understand missing data, but not always true that you have to actually *do* something. Sometimes listwise deletion is fine.
I do not advise adding constants to the data. Instead, begin with the Missing Values Analysis feature in SPSS. Select the variables of interest and run Little's test of MCAR (missing completely at random). If that comes back non-significant, then ignoring missing data (MISSING=LISTWISE) may be OK; PAIRWISE can also be used if you want to make better use of the available data.
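To make the LISTWISE vs. PAIRWISE trade-off concrete, here is a small sketch in plain Python (not SPSS syntax). The data and the helper names `listwise_cases` and `pairwise_n` are made up for illustration; the point is only how many cases each approach gets to use.

```python
# Illustrative only: plain Python, not SPSS syntax. The data are made up.
from itertools import combinations

# Each row is one case (three variables); None marks a missing value.
rows = [
    (10.0, 20.0, 70.0),
    (15.0, None, 60.0),
    (30.0, 25.0, None),
    (None, 40.0, 35.0),
    (50.0, 30.0, 20.0),
    (45.0, 35.0, 20.0),
]

def listwise_cases(data):
    """Keep only cases with no missing value on any variable."""
    return [r for r in data if all(v is not None for v in r)]

def pairwise_n(data, i, j):
    """Count cases where both variable i and variable j are observed."""
    return sum(1 for r in data if r[i] is not None and r[j] is not None)

print("listwise N:", len(listwise_cases(rows)))
for i, j in combinations(range(3), 2):
    print(f"pairwise N for variables {i} and {j}:", pairwise_n(rows, i, j))
```

Pairwise deletion keeps more cases for each pair, but every correlation is then computed on a different subset of cases, which is part of why it is only reasonable when the data look MCAR.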
If, however, Little's test comes back significant, then the data are not MCAR; at best they are MAR (missing at random; see here for a better understanding of MCAR vs. MAR). In that event, you do not want PAIRWISE or MEANSUB. I don't recommend MEANSUB anyway, since substituting the mean underestimates the variances and covariances and gives the analysis degrees of freedom it doesn't necessarily warrant having; see the first few chapters of Little and Rubin's 1987 book on analysis with missing data to learn more about all of that.
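The problem with mean substitution is easy to see on a toy example. The sketch below is plain Python with made-up numbers (not SPSS output), and the helper `sample_cov` is just a hand-written sample covariance: it fills two missing values of `y` with the observed mean and compares the result with the complete cases.

```python
# Illustrative only: made-up numbers showing what mean substitution does.

def sample_cov(a, b):
    """Sample covariance with the usual n - 1 denominator."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.0, 6.0, 8.0, None, None]  # last two cases are missing on y

# Complete cases only (what listwise deletion would use).
complete = [(u, v) for u, v in zip(x, y) if v is not None]
x_obs = [u for u, _ in complete]
y_obs = [v for _, v in complete]

# Mean substitution: replace each missing y with the observed mean of y.
y_mean = sum(y_obs) / len(y_obs)
y_filled = [v if v is not None else y_mean for v in y]

var_obs = sample_cov(y_obs, y_obs)
var_filled = sample_cov(y_filled, y_filled)
cov_obs = sample_cov(x_obs, y_obs)
cov_filled = sample_cov(x, y_filled)

print("N used:", len(y_obs), "vs", len(y_filled))
print("variance of y:", var_obs, "vs", var_filled)
print("covariance of x and y:", cov_obs, "vs", cov_filled)
```

The filled-in cases sit exactly at the mean, so they add nothing to the sums of squares and cross-products while still increasing N. The variance and covariance shrink, and the analysis is credited with cases it never really observed.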
Sorry; this is kind of a difficult subject but, as you already know, missingness in data is important to understand.
------------------------------
Rick Marcantonio
Quality Assurance
IBM
------------------------------
Original Message:
Sent: Tue November 23, 2021 02:52 AM
From: CP T
Subject: PCA for animal stomach food composition?
Hi, I'm doing PCA on the stomach food composition of 13 animal species, with 6 types of food measured as percentages on a scale of 0-100%.
Question 1:
Since all 6 variables are measured on the same scale (0-100%), is data standardization required here, despite the missing data (it is natural that some foods are consumed by some species but not others) and the mix of low (~0-20%) and high (~90-100%) values?
Question 2:
My understanding is that missing data in PCA need to be dealt with. Considering the nature of the missing data in my case, what would you suggest? Should I fill in the missing values with calculated standardized data, use mean imputation, or replace them with a constant?
Thank you.
------------------------------
CP T
------------------------------
#SPSSStatistics