IBM Security QRadar SOAR


UnicodeEncodeError 'ascii' 'ordinal not in range(128)'

  • 1.  UnicodeEncodeError 'ascii' 'ordinal not in range(128)'

    Posted Fri April 10, 2020 07:00 AM
    In my post process results I have this as a dictionary value:

    {u'type': u'email_subject', u'value': u'Win a \xa3100 gift card by telling Sky what you think!'}

    It is a result passed directly from my function script to Resilient. When I run my function, which does some other operations on the dictionary (as JSON in this case), I get the error below:

    UnicodeEncodeError('ascii', u'Win a \xa3100 gift card by telling Sky what you think!', 6, 7, 'ordinal not in range(128)')

    I first tried some operations in the post-process script:
    - used unicode() instead of casting to str(): nothing changed
    - used .encode() or .decode(), but I get a very annoying message: An error occurred while processing the action acknowledgement. Additional information: Post-processing script for Function 'Cisco Threat Response' from Workflow 'Cisco Sightings' was unable to complete because: access denied ("java.lang.RuntimePermission" "accessClassInPackage.encodings") - I don't know why the encoding functions are disabled...

    Then, in the function script (or my local script):
    - encode() and decode(): these change how the character is escaped (from \xa3 to \u00a3, for example, and vice versa), but the problem remains: I don't get the pound symbol £
    - in my "offline" script I used json.dumps (keep in mind that we are starting from JSON) with ensure_ascii=False. This finally showed me the £ symbol in the console. I use PyCharm, and I learned that it tries to show console output as UTF-8, so this is not really a solution, since I also learned that Resilient uses ASCII encoding.
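
    To illustrate the json.dumps behaviour mentioned in the last bullet, here is a minimal sketch that uses only the standard json module. The variable names are illustrative. It suggests that the \u00a3 escape and the literal £ are two spellings of the same data, so neither output has lost the character:

```python
import json

data = {u"type": u"email_subject",
        u"value": u"Win a \xa3100 gift card by telling Sky what you think!"}

# Default behaviour: non-ASCII characters are written as \uXXXX escapes,
# so the output is pure ASCII but still describes the same text.
escaped = json.dumps(data, sort_keys=True)

# ensure_ascii=False keeps the character itself in the output.
literal = json.dumps(data, sort_keys=True, ensure_ascii=False)

# Both forms load back to the same dictionary.
assert json.loads(escaped) == json.loads(literal) == data
```

    Whether the symbol is then displayed correctly depends on whatever renders the string (console, log file, Notes), not on json itself.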

    What should I do? How can I solve this problem?
    My goal is to have the pound symbol £ displayed correctly in the final Notes.

    Thanks

    ------------------------------
    Lucian Sipos
    ------------------------------


  • 2.  RE: UnicodeEncodeError 'ascii' 'ordinal not in range(128)'

    IBM Champion
    Posted Mon April 13, 2020 03:59 AM
    This is a product limitation. You can only work with ASCII, but the documentation also states that it is possible to use unicode string literals.

    So, you can ignore the characters (as you mentioned), or you can use unicode. I have not tested it, but you might be able to get away with this depending on what you're doing:
    unicode(your_string_here, "utf-8")

    You could also do the operations dynamically within your function (or via a utility function) to eliminate this issue. The processors will only fail when you try to manipulate variables containing non-ASCII.

    References:
    https://www.ibm.com/support/knowledgecenter/en/SSBRUQ_36.0.0/doc/playbook/resilient_playbook_configscripts_considerations.html

    ------------------------------
    Jared Fagel
    Cyber Security Analyst I
    Public Utility
    ------------------------------



  • 3.  RE: UnicodeEncodeError 'ascii' 'ordinal not in range(128)'

    Posted Mon April 13, 2020 08:54 AM
    I would like to understand this use case better. Resilient should be able to handle unicode. It may be that some python code is assuming ASCII where a unicode string is present that may result in errors. 

    This link https://www.ibm.com/support/knowledgecenter/en/SSBRUQ_36.0.0/doc/playbook/resilient_playbook_configscripts_considerations.html only refers to hard coded strings within in-product scripts. If you have constructed the dictionary with unicode (which it looks like you have) and the python code executing handles unicode then it should work.

    You mention this error:

    UnicodeEncodeError('ascii', u'Win a \xa3100 gift card by telling Sky what you think!', 6, 7, 'ordinal not in range(128)')


    Is this error showing up in the circuits logs? Or is this error showing up in the Resilient logs? I'm trying to figure out if the code causing this is circuits code or in-product script code. Can you post the code that is executing when this error occurs?

    Ben



    ------------------------------
    Ben Lurie
    ------------------------------



  • 4.  RE: UnicodeEncodeError 'ascii' 'ordinal not in range(128)'

    IBM Champion
    Posted Mon April 13, 2020 11:47 AM
    Edited by Jared Fagel Mon April 13, 2020 11:47 AM
    Hey Ben,

    I believe the pre/post processor scripts essentially have this defined on the backend:

    # -*- coding: ascii -*-

    This error is reported within the action status view when it occurs. As a test, you can try to add the mentioned symbol into the incident name by directly modifying the field. Then, try to run a script or a pre/post processor containing a method that accesses the field. Simply doing a .replace() should produce this.

    We've had this issue, and our solution has been to strip the characters or to handle them in the functions directly.
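
    As a rough sketch of the "strip the characters" approach, something like the helper below could run in the function code (not in the pre/post processor, where encode/decode was reported as blocked earlier in this thread). The name to_ascii is hypothetical, and the helper assumes every string in the payload is already a unicode string:

```python
def to_ascii(obj):
    """Return a copy of obj with non-ASCII characters dropped from every string."""
    if isinstance(obj, dict):
        return dict((to_ascii(k), to_ascii(v)) for k, v in obj.items())
    if isinstance(obj, (list, tuple)):
        return [to_ascii(item) for item in obj]
    if isinstance(obj, type(u"")):  # unicode on Python 2, str on Python 3
        return obj.encode("ascii", "ignore").decode("ascii")
    return obj  # numbers, None, booleans are returned unchanged


cleaned = to_ascii({u"related": {u"type": u"email_subject",
                                 u"value": u"Win a \xa3100 gift card"}})
```

    The trade-off is that the character is lost entirely ("Win a 100 gift card"), so this only fits cases where the symbol does not matter.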

    ------------------------------
    Jared Fagel
    Cyber Security Analyst I
    Public Utility
    ------------------------------



  • 5.  RE: UnicodeEncodeError 'ascii' 'ordinal not in range(128)'

    Posted Tue April 14, 2020 05:54 AM
    Edited by Lucian Sipos Tue April 14, 2020 06:21 AM
    Thanks for the answers.

    Using unicode() is not a viable option since I have a dictionary at code level. But I tried anyway, first using json.dumps() and then applying unicode(str, "utf-8"). This is the result now:

    {"type": "email_subject", "value": "Win a \u00a3100 gift card by telling Sky what you think!"}

    I would like to avoid handling each value of my dict individually to get the desired result, because the dictionary itself is big and, maybe, that would not be a programmatic solution.

    What I did locally after reading your answers was this:

    import json

    my_dict = {u'related': {u'type': u'email_subject', u'value': u'Win a \xa3100 gift card by telling Sky what you think!'}}


    # option 1
    def safe_str(obj):
        try:
            return str(obj)
        except UnicodeEncodeError:
            return obj.encode('ascii', 'ignore').decode('ascii')


    print(safe_str(my_dict))
    # >>> {u'related': {u'type': u'email_subject', u'value': u'Win a \xa3100 gift card by telling Sky what you think!'}}

    # option 2
    dumped = json.dumps(my_dict)
    print(dumped)
    # >>> {"related": {"type": "email_subject", "value": "Win a \u00a3100 gift card by telling Sky what you think!"}}

    loaded = json.loads(dumped)
    print(loaded)
    # >>> {u'related': {u'type': u'email_subject', u'value': u'Win a \xa3100 gift card by telling Sky what you think!'}}

    # option 3
    print(dumped.encode('ascii', 'ignore').decode('ascii'))
    # >>> {"related": {"type": "email_subject", "value": "Win a \u00a3100 gift card by telling Sky what you think!"}}

    # option 4
    print(unicode(dumped, "utf-8"))
    # >>> {"related": {"type": "email_subject", "value": "Win a \u00a3100 gift card by telling Sky what you think!"}}

    # option 5
    for k, v in my_dict.items():
        print(k, v)
        for g, h in v.items():
            print(safe_str(g), safe_str(h))

    # >>> (u'related', {u'type': u'email_subject', u'value': u'Win a \xa3100 gift card by telling Sky what you think!'})
    # >>> ('type', 'email_subject')
    # >>> ('value', u'Win a 100 gift card by telling Sky what you think!') ---> here the pound symbol disappeared...

    I still have not tried anything new (apart from what I had already done before). At this point I don't know whether it is right to expect the pound symbol in offline code or at function level.

    In the end, what is mandatory is that the post-process receives a dictionary as input (so a dictionary as output from the function), because the whole post-process code is based on a dict (and it is already working - :) - with normal characters).

    To answer @Ben Lurie, I tried to print my function response to app.log, but it still fails with an error (master_response is my dictionary):

    Traceback (most recent call last):
      File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/cisco_threat_response/components/funct_cisco_threat_response.py", line 128, in _cisco_threat_response_function
        log.info("CTR - MASTER RESPONSE: {}".format(u'' + str(master_response)))
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in position 53331: ordinal not in range(128)

    Before the test I made above, the value is logged normally.
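
    The traceback points at the str() call: on Python 2, str() on a unicode value containing £ tries to encode it as ASCII. A possible way around it, sketched below under the assumption that master_response is a unicode string at that point, is to build the message from a unicode template and skip str() entirely. The logger name and the helper format_response are hypothetical:

```python
import logging

log = logging.getLogger("cisco_threat_response")  # hypothetical logger name


def format_response(response):
    """Build the log message from a unicode template, with no str() call."""
    return u"CTR - MASTER RESPONSE: {}".format(response)


# A unicode payload with a literal pound sign, similar to the failing case.
master_response = u'{"value": "Win a \xa3100 gift card"}'
message = format_response(master_response)
log.info(message)
```

    Whether the log file then shows £ or an escape depends on how the log handler is configured, which this sketch does not cover.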

    This is a major part of the code I use in my post-process script:

    cisco_sightings = results["value"]["cisco_sightings"]

    sightings_list = []

    for d in cisco_sightings["data"]:
        try:
            if d["module"] == "SMA Email":
                for s in d["data"]["sightings"]["docs"]:
                    cisco_sightings_response = {"artifact_value": None, "observed_time": None, "confidence": None,
                                                "resolution": None,
                                                "relations": None,
                                                "description": None, "type": None}

                    cisco_sightings_response["artifact_value"] = artifact.value
                    cisco_sightings_response["observed_time"] = s["observed_time"]
                    cisco_sightings_response["confidence"] = s["confidence"]
                    cisco_sightings_response["resolution"] = s["resolution"]
                    cisco_sightings_response["relations"] = s["relations"]
                    cisco_sightings_response["description"] = s["description"]
                    cisco_sightings_response["type"] = s["type"]

                    sightings_list.append(cisco_sightings_response)

        except KeyError as key_error:
            continue

    result = ""
    relations_dict = {"Origin": None, "Source type": None, "Source value": None, "Relation": None, "Related type": None,
                      "Related value": None}

    for i, s in enumerate(sightings_list):
        result += "Sighting {}".format(i + 1)
        result += "\nArtifact value: {}".format(s["artifact_value"])
        result += "\nObserved time: {}".format(s["observed_time"]["start_time"])
        result += "\nConfidence: {}".format(s["confidence"])
        result += "\nResolution: {}".format(s["resolution"])

        result += "\nDescription: {}".format(s["description"])
        result += "\nType: {}".format(s["type"])

        relations_list = []
        for r in s["relations"]:
            relations_list.append(r)

        for e, rel in enumerate(relations_list):
            relations_dict["Origin"] = rel["origin"]
            relations_dict["Source type"] = rel["source"]["type"]
            relations_dict["Source value"] = rel["source"]["value"]
            relations_dict["Relation"] = rel["relation"]
            relations_dict["Related type"] = rel["related"]["type"]
            relations_dict["Related value"] = rel["related"]["value"]

            result += "\n\nRelation {}:\n{}".format(e + 1, "\n".join(
                [str("\t" + k + ": {}".format(v)) for k, v in relations_dict.items()]))

        result += "\n\n"
        result += "#" * 50
        result += "\n\n"

    I think the part where the code breaks is this (from my debugging, this is where the pound symbol should appear):

    relations_dict["Related value"] = rel["related"]["value"]

    If I do a simple incident.addNote(str(results["value"])), I get the following output in Notes (I extracted just the related part):

    {u'related': {u'type': u'email_subject', u'value': u'Win a \xa3100 gift card by telling Sky what you think!'} ... }

    Finally, I think Jared's last message touched on the key point about ASCII encoding. I also thought about this, but the question is why? Is it because of Python 2?

    If I print this in my IDE:
    print(u"Win a \u00a3100 gift card by telling Sky what you think!")
    the output is correct:
    Win a £100 gift card by telling Sky what you think!
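
    That print works because the literal is a unicode string. By the same reasoning, one thing that may be worth trying in the note-building code is to make every template a unicode literal and drop the str() wrapper, since on Python 2 a plain "...{}".format(v) with a non-ASCII unicode v raises exactly this UnicodeEncodeError. A minimal sketch with a cut-down relations_dict (the sample values are taken from this thread; that the in-product script engine accepts this is an assumption):

```python
relations_dict = {
    u"Related type": u"email_subject",
    u"Related value": u"Win a \xa3100 gift card by telling Sky what you think!",
}

# Every template carries the u prefix and nothing is wrapped in str(),
# so no implicit ASCII encoding is attempted on Python 2.
lines = [u"\t{}: {}".format(k, v) for k, v in sorted(relations_dict.items())]
result = u"\n\nRelation {}:\n{}".format(1, u"\n".join(lines))
```

    In the post-process script, result would then be passed to incident.addNote() as before.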

    Feel free to suggest any solution.

    Thanks


    ------------------------------
    Lucian Sipos
    ------------------------------



  • 6.  RE: UnicodeEncodeError 'ascii' 'ordinal not in range(128)'

    IBM Champion
    Posted Wed April 15, 2020 03:17 PM
    Can you move this code into the function code itself or a util function inside of the package instead? This would solve your issue.

    The reason for the encoding issue is not directly because of Python 2 (although Python 2 does default to ASCII, whereas Python 3 defaults to UTF-8). Resilient uses Jython on the back end for scripts, and they likely specified ASCII as the encoding type.

    I don't know exactly how Jython is implemented, but I imagine it could have been changed like we see here:
    https://stackoverflow.com/a/28348970
    https://www.jython.org/jython-old-sites/docs/library/codecs.html
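
    For reference, the error from the first post can be reproduced in plain Python, outside Resilient, by encoding the subject with the ASCII codec explicitly. This is only a sketch of the codec behaviour and says nothing about how Resilient configures Jython:

```python
subject = u"Win a \xa3100 gift card by telling Sky what you think!"

error_args = None
try:
    # The ASCII codec only covers code points 0-127; the pound sign is 163.
    subject.encode("ascii")
except UnicodeEncodeError as exc:
    # (encoding, object, start, end, reason): the same five fields
    # that appear in the error reported at the top of the thread.
    error_args = exc.args

# UTF-8 can represent the pound sign (as the two bytes C2 A3).
utf8_bytes = subject.encode("utf-8")
```

    The start and end positions (6 and 7) are the index of the £ character in the string, which matches the reported error.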


    ------------------------------
    Jared Fagel
    Cyber Security Analyst I
    Public Utility
    ------------------------------



  • 7.  RE: UnicodeEncodeError 'ascii' 'ordinal not in range(128)'

    Posted Thu April 16, 2020 07:02 AM
    Edited by System Thu November 11, 2021 11:15 AM
    Somehow, yesterday I managed to show the pound symbol in the incident's Notes for the example dictionary:

    my_dict = {u'related': {u'type': u'email_subject', u'value': u'Win a \xa3100 gift card by telling Sky what you think!'}}

    While waiting for answers, I thought I would try this other piece of code:

    def myprint(d):
        stack = d.items()
        while stack:
            k, v = stack.pop()
            if isinstance(v, dict):
                stack.extend(v.iteritems())
            else:
                print("%s: %s" % (k, v))

    This correctly prints "£" in notes (replace "print()" with "incident.addNote()").

    @Jared Fagel the code you can see in the post-process was originally written in my local code (an offline Python script), which, when I tested it with the "faulty" dictionary, returned the same result (with no error, unlike Resilient, but with \xa3 or \u00a3 instead of the symbol). Also, for internal needs, part of the code has to stay in the post-process.

    About Resilient: is it possible to change the encoding of the internal processor from ASCII to UTF-8?

    EDIT: I tried the code above again on my whole dictionary and it is still not working; I keep seeing \xa3 in the output...
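
    One possible explanation, which is an assumption and not confirmed in this thread: the full dictionary contains lists (for example the relations), and on Python 2 formatting a list or dict with %s uses its repr(), which writes £ as the escape \xa3. A walker that also descends into lists and only ever formats leaf values might avoid that. The function name flatten and the dotted key paths are hypothetical:

```python
def flatten(obj, prefix=u""):
    """Yield one unicode 'path: value' line per leaf of a nested dict/list."""
    if isinstance(obj, dict):
        for key in sorted(obj):
            for line in flatten(obj[key], u"{}{}.".format(prefix, key)):
                yield line
    elif isinstance(obj, (list, tuple)):
        for index, item in enumerate(obj):
            for line in flatten(item, u"{}{}.".format(prefix, index)):
                yield line
    else:
        # Leaf value: formatted directly, never through repr() of a container.
        yield u"{}: {}".format(prefix.rstrip(u"."), obj)


sample = {u"relations": [{u"related": {u"type": u"email_subject",
                                       u"value": u"Win a \xa3100 gift card"}}]}
note_text = u"\n".join(flatten(sample))
```

    As with myprint, the resulting text would go to incident.addNote() instead of print().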

    ------------------------------
    Lucian Sipos
    ------------------------------



  • 8.  RE: UnicodeEncodeError 'ascii' 'ordinal not in range(128)'

    Posted Mon April 20, 2020 10:06 AM
    Not sure if it helps, but I have seen this new support document:
    How to use Unicode characters via in-product scripting
    Hope this helps.

    ------------------------------
    BENOIT ROSTAGNI
    ------------------------------



  • 9.  RE: UnicodeEncodeError 'ascii' 'ordinal not in range(128)'

    Posted Wed August 19, 2020 11:09 AM

    Does this problem still exist?
    If not, how did you solve it? I am facing the same issue.

    Thanks in advance



    ------------------------------
    Mohamed El Bagory
    ------------------------------



  • 10.  RE: UnicodeEncodeError 'ascii' 'ordinal not in range(128)'

    IBM Champion
    Posted Wed August 19, 2020 12:11 PM
    @Mohamed El Bagory

    https://www.ibm.com/support/pages/node/6193791?myns=swgother&mynp=OCSS5E58&mync=E&cm_sp=swgother-_-OCSS5E58-_-E

    Otherwise handle it in the function code using native Python modules.

    ------------------------------
    Jared Fagel
    Cyber Security Analyst I
    Public Utility
    ------------------------------



  • 11.  RE: UnicodeEncodeError 'ascii' 'ordinal not in range(128)'

    Posted Thu October 01, 2020 06:04 AM

    When I add u"\"{0}\"".format to the code, I get errors from the variables that follow, because it can't understand the value, e.g.: var x = 

    I am thinking of a way to skip this line when the error appears, since it is an added feature in the code and not essential.



    ------------------------------
    Mohamed El Bagory
    ------------------------------