SPSS Statistics


Problem with user-missing value identification from string variables (Python class spssdata)

  • 1.  Problem with user-missing value identification from string variables (Python class spssdata)

    Posted Tue August 24, 2021 04:06 PM
    I need help with SPSS Statistics integration with Python 3.

    I can't identify user-missing values of string variables with the hasmissing and ismissing methods of the Python class spssdata. A single example (syntax file and SPSS Statistics output) is attached. The program block identifies the cases with missing values for the variable numVar, but not the ones for stringVar (case 4 has a user-missing value of stringVar).

    ------------------------------
    Estefano Souza
    ------------------------------

    Attachment(s)

    zip
    example.zip   1 KB 1 version


  • 2.  RE: Problem with user-missing value identification from string variables (Python class spssdata)

    Posted Tue August 24, 2021 05:23 PM
    I see the problem looking at the spssdata.ismissing code.  It looks like this.
    stringmv = isinstance(value, str)
    # with long strings, only the first 8 bytes are compared to the defined missing values
    # truncating a multibyte/utf-8 string could result in a partial character at the end
    # but then the first 8 bytes could not be a missing value anyway.

    if stringmv:
        value = value[:8].rstrip() # 8-24-2021 add rstrip
    if value is None or value in missingtuple[-3:]:
        return True
    if missingtuple[0] == 0 or stringmv:
        return False

    At line 809 (version 28), it should be this
        value = value[:8].rstrip() # 8-24-2021 add rstrip

    I suggest you change the code accordingly.  However, there is a comment elsewhere in the module that the rstrip call was removed.  I can't see why.  Try this change and let me know if everything works for you while I investigate.
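
    A minimal sketch of why stripping trailing blanks matters. It assumes (this is not confirmed in the thread) that string values arrive padded with blanks to the declared variable width, while the defined missing values are held without padding. The function name is_user_missing_string and the sample values are hypothetical; this is an illustration, not the spssdata implementation.

```python
def is_user_missing_string(value, missing_values, strip=True):
    """Return True if the first 8 characters of value match a missing code.

    Hypothetical stand-in for the string branch of ismissing.
    missing_values is assumed to hold codes without trailing blanks.
    """
    if value is None:
        return True
    candidate = value[:8]
    if strip:
        # Remove the blank padding before comparing.
        candidate = candidate.rstrip()
    return candidate in missing_values


missing_values = ("NA",)   # hypothetical user-missing code
padded = "NA      "        # assumed: value padded with blanks to width 8

print(is_user_missing_string(padded, missing_values, strip=False))  # False
print(is_user_missing_string(padded, missing_values, strip=True))   # True
```

    Under that padding assumption, the unstripped value never equals the defined code, which would be consistent with the string case going undetected.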

    To find where the spssdata.py module is located, run this after the import:
    print(spssdata)
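
    As an alternative sketch, a module's file path can also be read from its __file__ attribute. The helper name module_path is hypothetical, and the example uses the standard-library json module so that it runs outside SPSS Statistics; inside SPSS you would presumably pass "spssdata" instead.

```python
import importlib


def module_path(name):
    """Import the named module and return the file it was loaded from.

    Returns None for modules that have no __file__ (e.g. some built-ins).
    """
    module = importlib.import_module(name)
    return getattr(module, "__file__", None)


# Stand-in module for illustration; inside SPSS Statistics, try "spssdata".
print(module_path("json"))
```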

    Jon Peck (jkpeck@gmail.com)

    ------------------------------
    Jon Peck
    ------------------------------



  • 3.  RE: Problem with user-missing value identification from string variables (Python class spssdata)

    Posted Wed August 25, 2021 10:57 AM
    Hello, Jon.

    I changed the spssdata.py code and everything worked: the value of stringVar in case 4 is identified as user-missing and there were no problems with the rstrip method. However, I ran some tests using long strings as user-missing values and, as expected, they were not identified as missing.

    Thank you for the help.

    Estefano

    ------------------------------
    Estefano Souza
    ------------------------------



  • 4.  RE: Problem with user-missing value identification from string variables (Python class spssdata)

    Posted Wed August 25, 2021 11:49 AM
    Yes, it's unfortunate but official that only the first eight bytes are checked.  I will submit this as a fix for the V28 fixpack after a bit more checking.
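
    A sketch of what the eight-byte limit implies, assuming the check truncates the value before comparing, as in the code quoted earlier in the thread. The function name matches_missing and the sample values are hypothetical. The sketch slices characters, which equals bytes only for single-byte (e.g. ASCII) text.

```python
def matches_missing(value, missing_values):
    # Compare only the first 8 characters, with trailing blanks removed,
    # mirroring the truncation in the code quoted earlier in the thread.
    return value[:8].rstrip() in missing_values


# A missing code longer than 8 characters can never match, because the
# truncated value is at most 8 characters long.
print(matches_missing("NOT APPLICABLE", ("NOT APPLICABLE",)))  # False

# Two different long values that share their first 8 characters are
# indistinguishable to the check.
print(matches_missing("REFUSED_BY_PROXY", ("REFUSED_",)))  # True
print(matches_missing("REFUSED_OTHER", ("REFUSED_",)))     # True
```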
