SPSS Statistics

  • 1.  Syntax editor encoding issue

    Posted Tue March 07, 2023 04:12 PM

    Hello,

    We have several syntax files that start with the declaration "* Encoding: UTF-8." but have no BOM (byte order mark). When we open such a file by double-clicking it in Windows Explorer, the SPSS syntax editor unexpectedly uses code page encoding instead of UTF-8. This has caused some confusion.

    Regards, fdsbonn



    ------------------------------
    forplan Statistik
    ------------------------------


  • 2.  RE: Syntax editor encoding issue

    Posted Tue March 07, 2023 04:37 PM

    Hi. Just so I am clear, how do you know that it is codepage encoding? Do you have UNICODE OFF?

    If so, then the file will be treated as non-UTF8.



    ------------------------------
    Rick Marcantonio
    Quality Assurance
    IBM
    ------------------------------



  • 3.  RE: Syntax editor encoding issue

    Posted Tue March 07, 2023 05:10 PM

    When I select the file via File > Open > Syntax, the dialog automatically switches the encoding to "as declared" ("Wie deklariert" in German), and the file is shown correctly. But when I open the file from Windows Explorer, there is no choice of encoding and umlauts are shown incorrectly. Unicode is always ON.

    I personally use Notepad++ for editing and check the file encoding there. That is also why so many files without a BOM exist: I never set one. We always assumed the declaration line takes priority in SPSS and were puzzled by the resulting mess.



    ------------------------------
    forplan Statistik
    ------------------------------



  • 4.  RE: Syntax editor encoding issue

    Posted Tue March 07, 2023 05:17 PM

    In Notepad++, could you change from UTF-8 to UTF-8-BOM and see if that makes a difference?



    ------------------------------
    Rick Marcantonio
    Quality Assurance
    IBM
    ------------------------------



  • 5.  RE: Syntax editor encoding issue

    Posted Tue March 07, 2023 05:18 PM

    Yes, it does. Files with a BOM are always displayed correctly.



    ------------------------------
    forplan Statistik
    ------------------------------



  • 6.  RE: Syntax editor encoding issue

    Posted Tue March 07, 2023 05:34 PM

    Good!

    The commented line with UTF-8 (or not) isn't a definitive instruction to the program on how to treat the file; the byte order mark is.

    We should make the syntax editor more intelligent and robust in this regard, no doubt, but at least for the time being, be sure to save with the BOM when working in Notepad++ and you should be fine.
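
    For anyone with many existing files, the advice above could be applied in bulk with a small script. This is only a sketch in Python, not an SPSS feature: the function names and the assumption that syntax files use the ".sps" extension and already contain valid UTF-8 are illustrative. It checks whether a file starts with the UTF-8 BOM bytes (EF BB BF) and prepends them if not.

```python
import codecs
from pathlib import Path


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the raw bytes start with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def add_utf8_bom(path: Path) -> bool:
    """Prepend a UTF-8 BOM to one file; return True if the file was changed."""
    data = path.read_bytes()
    if has_utf8_bom(data):
        return False
    # Assumption: the file is meant to be UTF-8. Decoding raises
    # UnicodeDecodeError for other encodings, so such files are not touched.
    data.decode("utf-8")
    path.write_bytes(codecs.BOM_UTF8 + data)
    return True


def add_bom_to_syntax_files(folder: Path) -> list[Path]:
    """Add a BOM to every *.sps file in a folder (hypothetical helper)."""
    changed = []
    for path in sorted(folder.glob("*.sps")):
        if add_utf8_bom(path):
            changed.append(path)
    return changed
```

    Calling add_bom_to_syntax_files(Path("some_folder")) returns the list of files that were rewritten. Running it a second time changes nothing, since files that already have a BOM are skipped. Try it on a copy of the folder first.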



    ------------------------------
    Rick Marcantonio
    Quality Assurance
    IBM
    ------------------------------



  • 7.  RE: Syntax editor encoding issue

    Posted Tue March 07, 2023 06:01 PM

    Yes, now that we have figured out under which circumstances the corruption happens, we can work around it.

    Alas, Notepad++ has no way to set a default encoding per file type, and I don't want BOMs in all text files. Unfortunately, some programs cannot handle UTF-8 with a BOM correctly. For example, the CSV import dialog in SPSS generates code in which the BOM is part of the first variable name. And opinions on UTF-8 with a BOM differ.
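
    The "BOM as part of the first variable name" effect is easy to reproduce outside SPSS. The following Python sketch is only an illustration of the general problem, not of how the SPSS import dialog works; the function name and sample data are made up. Decoding with plain "utf-8" keeps the BOM as the character U+FEFF in front of the first column name, while Python's "utf-8-sig" codec strips it if present.

```python
import csv
import io


def read_header(raw: bytes, encoding: str) -> list[str]:
    """Decode raw CSV bytes and return the first row (the column names)."""
    text = raw.decode(encoding)
    return next(csv.reader(io.StringIO(text)))


# Hypothetical sample: a CSV file saved as UTF-8 with BOM.
sample = b"\xef\xbb\xbfid,name\n1,M\xc3\xbcller\n"

# Plain UTF-8 decoding leaves U+FEFF glued to the first column name.
naive = read_header(sample, "utf-8")

# "utf-8-sig" removes a leading BOM and also accepts files without one.
clean = read_header(sample, "utf-8-sig")
```

    Here naive[0] is "\ufeffid" and clean[0] is "id". Tools that read with a BOM-aware decoder handle both kinds of file, which is one way to live with a mix of files with and without a BOM.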



    ------------------------------
    forplan Statistik
    ------------------------------