Hello, I am attempting to analyze a data set supplied by the National Cancer Database (NCDB). The data was supplied as a .dat file. I am able to open the .dat file within SPSS just fine; however, there is a significant formatting issue: the variable names and labels are not being displayed, and I'm not sure whether the data is being formatted properly within SPSS.
Through my initial discussions with the IBM support staff, I learned that the issue likely stems from the fact that, unlike past years' versions of the dataset, the "script" required to read the data is no longer provided. The data information documents say, "we no longer provide STATA or SPSS scripts...if you use software other than SAS, and need to create your own script." They also state that "NCDB PUF data files are provided to investigators in a flat text file format and can be read with any standard statistical software package of the investigator's choosing."
However, the NCDB website does offer a separate SPSS script for download, which I have downloaded; it is titled "ncdb-puf-spss-labels-2020 (1).sps". My questions are: 1) Is this the script they say we "need to create", or is it something separate, meaning we still have to create a script ourselves? 2) If this is the correct script, how do I use it within SPSS so that my .dat file opens in a form I can properly understand and work with?
I am hoping that someone in this community has run into the same issue and can help; if not, hopefully someone will still be able to understand the problem I'm having and provide much-needed support. I hope the information I have provided is adequate and makes sense, but if not, please feel free to ask for clarification or more information.
Thank you very much in advance for your help.