SPSS Statistics


Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

  • 1.  Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Wed November 13, 2024 01:43 PM

    When I am writing code in the SPSS syntax editor, I frequently get a Failed to save document box when I select the save icon. If I try to save again, I get a box with the following: 

    This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results.
    Press OK to continue. Press Cancel to select a different encoding.

    When I select OK, the file saves, but frequently the cursor disappears from the syntax window and nothing works. I can work with the data and output windows but sometimes they are a problem as well. 

    I notice that I get multiple * Encoding: lines at the top of the file. Sometimes the top line is: * Encoding: UTF-8.

    I am using version 30, but this started happening with the previous version. I have been using SPSS for years and have not seen this before. It really puts a hit on productivity, and I can't seem to find an answer.

    Any help would be really appreciated.



    ------------------------------
    Kevin Taylor
    ------------------------------


  • 2.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Wed November 13, 2024 02:59 PM

    Hi @Kevin Taylor

    A couple things:

    • For some time now IBM SPSS Statistics has defaulted to Unicode mode.  Note that it is possible to have SPSS Statistics set for Unicode mode, but have an SPSS Statistics Syntax Editor document in Codepage mode.  This can lead to confusion and failed runs.  My suggestion is to put your SPSS Statistics into Unicode mode and never look back.  You can change the underlying SPSS Locale to fit whatever character set you'd like to use.  But for most (all) occasions you will want to stay in Unicode mode.
    • The very first release of IBM SPSS Statistics 30.0.0.0 (build 171) showed the behavior you are describing in the syntax editor.  This release was refreshed with IBM SPSS Statistics 30.0.0.0 (build 172).  Be sure that your instance of Statistics 30.0.0.0 is build 172.  If it is not, please download the media once more.  The Part number is on the Download Document for 30.0.



    ------------------------------
    David Dwyer
    SPSS Technical Support
    IBM Software
    ------------------------------



  • 3.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Tue April 29, 2025 12:44 PM

    Hi @David Dwyer

    I receive a similar warning message when I try to run an INSERT command: "SPSS Statistics read a line of syntax which contains one or more characters which are invalid in the current locale.  These characters have been converted to question marks." When I open the syntax file and simply run all, I do not get this warning, so I assume it is related to the INSERT command.

    I am using IBM SPSS Statistics 30.0.0.0 (build 172) on a Windows 11 machine and both my syntax windows (parent/child, where child is in the INSERT command) have "* Encoding: UTF-8." at the very top line of the syntax file. I also ran SHOW LOCALE and the results show "en_US.windows-1252 (English)".
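
Since the warning talks about characters that are "invalid in the current locale" and SHOW LOCALE reports windows-1252, one rough way to narrow things down outside SPSS is to list the characters in the syntax text that a windows-1252 code page cannot represent. The sketch below uses Python's `cp1252` codec as a stand-in. Whether SPSS applies exactly this check is an assumption, and the function name is only illustrative.

```python
def chars_outside_codepage(text, codepage="cp1252"):
    """Return (line_no, column, char) for every character in `text`
    that cannot be encoded in the given code page.

    Assumption: Python's cp1252 codec approximates the windows-1252
    locale reported by SHOW LOCALE; SPSS may behave differently.
    """
    problems = []
    # Split on "\n" only, so line numbers match what a text editor shows.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            try:
                ch.encode(codepage)
            except UnicodeEncodeError:
                problems.append((line_no, col, ch))
    return problems


# Example call on a file that is known to be valid UTF-8
# (the file name here is hypothetical):
#   with open("child.sps", encoding="utf-8-sig") as f:
#       for line_no, col, ch in chars_outside_codepage(f.read()):
#           print(line_no, col, repr(ch), hex(ord(ch)))
```

An empty result would suggest the text itself fits the locale, which points back toward how the file is being read (BOM, declared encoding) rather than its characters.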

    Any thoughts?



    ------------------------------
    Chris Keran
    ------------------------------



  • 4.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Tue April 29, 2025 12:50 PM
    There were some circumstances where an external invocation of SPSS would not start in Unicode mode.  I don't know whether that is still true, but try running a SET UNICODE ON command before the INSERT.  You can't put the SET command in the INSERT file as it reads the file before executing the contents IIRC.






  • 5.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Tue April 29, 2025 12:56 PM
    Edited by Chris Keran Tue April 29, 2025 01:00 PM

    Thanks for the suggestion, Jon. Unfortunately, that didn't work.

    FYI, here is the syntax I just tried that didn't work to prevent the warning (the warning appears only in the syntax window NOT the Output window).

    SET UNICODE ON.
    INSERT FILE='K:\CEA\MI\RESOURCES\SPSS\World Regions syntax.sps' ENCODING='UTF8'.

    FYI, I tried with and without ENCODING='UTF8'.



    ------------------------------
    ************
    Chris Keran
    ************
    ------------------------------



  • 6.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Tue April 29, 2025 01:02 PM
    Hmm. Does the file have a BOM (byte order mark) at the start?






  • 7.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Tue April 29, 2025 01:10 PM
    Edited by Chris Keran Tue April 29, 2025 01:11 PM

    This is at the very top of the syntax file: * Encoding: UTF-8.

    The rest is just SPSS syntax/comments.

    FYI, I still have that copy/paste issue between SPSS and anything else, so it took me a bit to paste the encoding text. :)

    ------------------------------
    Chris Keran
    ------------------------------



  • 8.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Tue April 29, 2025 03:28 PM

    Hi @Chris Keran,

    Assumptions I'm making about your scenario:

    • You have IBM SPSS Statistics 30.0.0.0 (172) installed.  If you are connecting to an IBM SPSS Statistics Server, then that too should be version 30.0.0.0 (172).  Check your version with "SHOW VERSION." command syntax.
    • Statistics is running in Unicode mode with the Locale set to the same as your Windows display language.  Check this with "SHOW UNICODE LOCALE." command syntax.
    • Your command syntax file is also in Unicode with all characters appropriate for your chosen LOCALE

    You mentioned previously that you are getting multiple encoding lines at the top of your syntax file?  That seems odd!

    Ideally, I would have you open a Support case and we could dig into your environment and this (these?) syntax file(s) more thoroughly.



    ------------------------------
    David Dwyer
    Global Solution Consultant
    IBM
    ------------------------------



  • 9.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Tue April 29, 2025 03:51 PM

    Hi @David Dwyer, Here are my answers to each of your assumptions...

    • You have IBM SPSS Statistics 30.0.0.0 (172) installed. YES, and confirmed with SHOW VERSION.
    • If you are connecting to an IBM SPSS Statistics Server, then that too should be version 30.0.0.0 (172).  I have SPSS installed on my local machine and am not connecting to an SPSS server.
    • Statistics is running in Unicode mode with the Locale set to the same as your Windows display language.  Check this with "SHOW UNICODE LOCALE." command syntax. The SHOW UNICODE LOCALE command returns: en_US.windows-1252 (English) and my Windows display language shows: English (United States).
    • Your command syntax file is also in Unicode with all characters appropriate for your chosen LOCALE. Each syntax file has this at the top: * Encoding: UTF-8.

    You mentioned previously that you are getting multiple encoding lines at the top of your syntax file?  That seems odd! This syntax occurs only once in each syntax file: * Encoding: UTF-8.



    ------------------------------
    Chris Keran
    ------------------------------



  • 10.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Wed April 30, 2025 02:29 AM

    Hi @Chris Keran.

    This happens when the syntax file claims to be UTF-8 encoded (with the * Encoding: UTF-8. tag), but in reality, it is not, or the BOM is missing. When you open the syntax manually and execute it, there is no error, but there is one with INSERT FILE. I check and correct this by opening the syntax with Notepad++. It shows whether the file has a BOM. Usually, it helps me if I change the file's encoding to UTF-8-BOM using Notepad++. This problem can arise when not all editors of the file use the same encoding settings in SPSS options.
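
For anyone without Notepad++, the same three facts (is there a BOM, do the bytes decode as UTF-8, does the first line declare UTF-8) can be read from the raw bytes. This is a minimal sketch, not an SPSS tool. The function name and the returned keys are made up for illustration, and the file name in the comment is hypothetical.

```python
import codecs


def describe_encoding(raw: bytes) -> dict:
    """Report BOM presence, UTF-8 validity, and the declared encoding line
    for the raw bytes of a syntax file."""
    has_bom = raw.startswith(codecs.BOM_UTF8)  # the bytes EF BB BF
    body = raw[len(codecs.BOM_UTF8):] if has_bom else raw
    try:
        text = body.decode("utf-8")
        valid = True
    except UnicodeDecodeError:
        # Decode leniently so the first line can still be inspected.
        text = body.decode("utf-8", errors="replace")
        valid = False
    first_line = text.split("\n", 1)[0].strip()
    return {
        "has_bom": has_bom,
        "valid_utf8": valid,
        "declares_utf8": first_line == "* Encoding: UTF-8.",
    }


# Example call (hypothetical file name):
#   from pathlib import Path
#   print(describe_encoding(Path("child.sps").read_bytes()))
```

A file that declares UTF-8 but reports `valid_utf8` as False, or one with no BOM, would match the two cases described above.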



    ------------------------------
    Frederic Dahl
    ------------------------------



  • 11.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Wed April 30, 2025 09:18 AM

    @Frederic Dahl,

     Thank you for the BOM suggestion. Both files only had "* Encoding: UTF-8." at the top of the file. I tried your suggestion, so I opened the INSERT file within Notepad, replaced "* Encoding: UTF-8." with "* Encoding: UTF-8-BOM." but when I used the INSERT command on this file, I got a File not found error. I then opened this file with SPSS and when I tried to save this .sps file, I received this message:

    This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results. Press OK to continue. Press Cancel to select a different encoding.

    Thanks anyway!



    ------------------------------
    Chris Keran
    ------------------------------



  • 12.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Wed April 30, 2025 09:24 AM

    Ok, an UPDATE. SUCCESS! After doing all that above with the BOM, I reverted back to the INSERT file without the BOM, and now all is well without any warnings. I rebooted and tried again and no warnings. Not sure exactly how/why this worked but all seems well now. 

    Thank you all, especially, @Frederic Dahl!



    ------------------------------
    Chris Keran
    ------------------------------



  • 13.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Wed April 30, 2025 09:32 AM

    Hello @Chris Keran,

    That's not what I meant! There are two separate things here. One is the string at the beginning of the file. Do not change this! It must remain * Encoding: UTF-8. The other is the encoding of the file itself, which does not seem to match. You can change the encoding of the file itself with Notepad++ (Encoding > Convert to UTF-8-BOM). But make a backup of the file first just to be safe.
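
The same conversion can be scripted, which may help when many files are affected. The sketch below copies the file to a `.bak` backup, then rewrites it as UTF-8 with a BOM using Python's `utf-8-sig` codec. It assumes the file currently decodes as UTF-8. If it was really saved in a code page, pass that as `source_encoding` (for example `"cp1252"`). The function name and backup naming are illustrative only.

```python
import shutil
from pathlib import Path


def convert_to_utf8_bom(path, source_encoding="utf-8-sig"):
    """Rewrite `path` as UTF-8 with a BOM, keeping a .bak copy.

    The text inside the file, including the "* Encoding: UTF-8." line,
    is left unchanged; only the byte-level encoding is rewritten.
    """
    path = Path(path)
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)  # backup first, as advised above

    # newline="" keeps the existing line endings untouched.
    # Reading with "utf-8-sig" accepts files with or without a BOM.
    # If decoding fails, an exception is raised before anything is written.
    with open(path, "r", encoding=source_encoding, newline="") as f:
        text = f.read()

    # Writing with "utf-8-sig" puts the BOM (EF BB BF) at the start.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        f.write(text)
    return backup
```

If the read step raises `UnicodeDecodeError`, the original file is left as it was, and that error is itself a sign that the file is not really UTF-8.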



    ------------------------------
    Frederic Dahl
    ------------------------------



  • 14.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Wed April 30, 2025 10:36 AM

    I went in and saved a file as you specified in TextPad. I opened it in SPSS and began editing it. I saved my changes once, and the syntax editor let me. As usual, after the second save attempt (sometimes the third if I am lucky), I got a "Failed to save document" message box. When I try to save again, I get the "This file contains one or more characters not recognized by the default encoding. ...." message that I always get.

     

    When I go ahead and save, the syntax editor stops working.  I still get a new "* Encoding: ." line below the initial "* Encoding: UTF-8." line at the top of the file. SPSS is set to Unicode under the Options menu.

     

    I would really like to get this fixed. I have tried everything and the issue persists.

     

    Thank you.

     






  • 15.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Wed April 30, 2025 12:22 PM
    Is it possible that the error message is actually correct?  That is, might there be some invalid byte strings in the file?
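
One way to test that idea, as a rough sketch: scan the raw bytes of a file and report every sequence that does not decode as UTF-8, along with its line number. The function name is hypothetical, and this only checks UTF-8 validity. It says nothing about what the Syntax Editor does on save.

```python
def find_invalid_utf8(raw: bytes):
    """Return (line_no, byte_offset, bad_bytes) for each byte sequence
    in `raw` that is not valid UTF-8."""
    problems = []
    pos = 0
    while pos < len(raw):
        try:
            raw[pos:].decode("utf-8")
            break  # the rest of the data is clean
        except UnicodeDecodeError as err:
            # err.start and err.end are relative to the slice, so shift them.
            start, end = pos + err.start, pos + err.end
            line_no = raw.count(b"\n", 0, start) + 1
            problems.append((line_no, start, raw[start:end]))
            pos = end  # resume just after the bad sequence
    return problems


# Example call (hypothetical file name):
#   from pathlib import Path
#   for line_no, offset, bad in find_invalid_utf8(Path("syntax.sps").read_bytes()):
#       print(f"line {line_no}, byte {offset}: {bad!r}")
```

An empty list means every byte decodes as UTF-8, in which case the message would not be explained by invalid byte strings in the file.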







  • 16.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Wed April 30, 2025 12:26 PM
    It happens with every file. I have hundreds and every time I open one or create a new file this happens.

    Kevin Taylor





  • 17.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Wed April 30, 2025 12:34 PM
    Can you send me a file that triggers this behavior (jkpeck@gmail.com)?







  • 18.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Tue December 03, 2024 04:01 PM

    Kevin, did either of the solutions below work for you? I also have this issue! I already appear to be in Unicode mode, but I'm not sure which build of the software I have. The latter involves opening a ticket with our help desk (I don't have admin rights), so I'd like to know if this worked. Thanks!



    ------------------------------
    Amanda Felbab
    ------------------------------



  • 19.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Tue December 03, 2024 04:13 PM
    Amanda,
    Neither worked.

    Kevin Taylor





  • 20.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Tue December 03, 2024 04:54 PM

    Thanks, Kevin. That is a bummer. 

    @David Dwyer, do you have any other ideas for this one? It is a frustrating one.

    Also, even though it doesn't seem to fix this issue, I will eventually want the newer build of v30. However, I don't use PAO. My view looks like the one below. Maybe this is because it's a subscription. If I download this here, will it automatically be the newer build (172)? I currently have 171. 



    ------------------------------
    Amanda Felbab
    ------------------------------



  • 21.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Tue April 29, 2025 03:53 PM

    Hi @Amanda Felbab,

    IBM SPSS Statistics Subscription is identical to IBM SPSS Statistics 30.0.0.0 (172).  The only difference between the two is their licensing mechanism.  To be sure you have the latest and greatest, launch IBM SPSS Statistics Subscription and use the "Help -> Check for updates" menus.  If there is a newer release than what you have installed currently, this will be one route to update to it.



    Has there ever been a time where your syntax file was edited outside of the Syntax Editor?  It is a text file after all and can be edited by any text editor you like.  Just be sure of the encoding when you save it from that favored editor.  Just adding "* Encoding: UTF-8." to the top of the file does not make it so. Also, when you open the file in the Syntax Editor, take a moment to ensure the file is being read "As Declared" or forcefully choose how to read it:


    The key is to make sure everything lines up.

    • The file really is Unicode
    • The characters in the file actually match the Unicode characters for your chosen locale

    Is this a command syntax file that you share with other co-workers?  Is it possible one (or more) of them do not have IBM SPSS Statistics in Unicode mode and/or are not using the same Locale as you?  Many times I've seen corruption in both Syntax and Data files when members of the same team are not interacting with shared files using the same settings.



    ------------------------------
    David Dwyer
    Global Solution Consultant
    IBM
    ------------------------------



  • 22.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Tue April 29, 2025 04:49 PM

    @David Dwyer - thank you for these ideas! It is still happening, so I hope something works. One thing is that I sometimes use Generative AI to create code that I'm pasting into my syntax. Do you think this could be the culprit? Also, similar to @Chris Keran, it's pretty typical to see something like below at the top of my syntax. 

    * Encoding: UTF-8.
    * Encoding: .
    * Encoding: .
    * Encoding: .
    * Encoding: .
    * Encoding: .
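
As a cleanup workaround only (it does not address why the lines keep getting added), the stacked header can be collapsed back to its first line before the file is reopened. The sketch below works on the text of a syntax file and only touches consecutive `* Encoding:` lines at the very top. The function name is illustrative, and whether the extra lines cause any harm is not established in this thread.

```python
import re

# Matches a comment line such as "* Encoding: UTF-8." or "* Encoding: ."
ENCODING_LINE = re.compile(r"^\*\s*Encoding:.*$")


def collapse_encoding_header(text):
    """Keep only the first of the consecutive "* Encoding:" lines at the
    top of `text`; everything after the header is returned unchanged."""
    lines = text.split("\n")
    header_end = 0
    while header_end < len(lines) and ENCODING_LINE.match(lines[header_end]):
        header_end += 1
    if header_end <= 1:
        return text  # zero or one encoding line: nothing to collapse
    return "\n".join(lines[:1] + lines[header_end:])
```

Applied to the header shown above, this leaves the single `* Encoding: UTF-8.` line followed by the rest of the syntax.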



    ------------------------------
    Amanda Felbab
    ------------------------------



  • 23.  RE: Frequently Get :This file contains one or more characters not recognized by the selected or default encoding. The syntax could produce errors or could produce unintended results

    Posted Tue April 29, 2025 05:21 PM
    I still get this all of the time and I only use the syntax editor to write code. When I try to save, I still get the error messages that I posted earlier. The editor won't then let me write any further code. If I close the editor then reopen the file, I can continue. I don't have to exit spss to continue working. I have tried every solution that was suggested, with no result. I have used spss for over 40 years and have never before run into an issue that can't be easily fixed until now. Very frustrating when working under time constraints.

    Kevin Taylor