IBM QRadar SOAR


Script limitation issue

  • 1.  Script limitation issue

    Posted Thu April 25, 2019 03:05 AM
    Hi Team,
    I have a request to modify the default email parsing script so that it extracts email addresses.
    To achieve this, I modified the default email parsing script as follows:
    1. I defined a new static method:

    def makeEmailPattern():
      return "(^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$)"

    and added the following line at the bottom of the script to capture email addresses from the message body:

    processor.processArtifactCategory(processor.makeEmailPattern(), "String", "Email address in a message body")

    When I ran the script, I got the following error message:
    "Error Running Script: either the script was running longer than the timeout period of 5 seconds or the script length was more than  50000 lines"

    At first, I thought the root cause was on the regexp side. But if I comment out any other processor line used for extraction (for example, IPV6), the script works. I also tried simplifying the regex (for example, "^.+@[^\.].*\.[a-z]{2,}$"). In that case the script works, but we cannot extract emails with multiple dots in the name or domain part (like user.name@sub.domain.com).

    So, if you have any ideas about how to extract all pre-built artifact types plus the complex email type, I would much appreciate any recommendations.

    BR,
    Alexander Saulenko 


    ------------------------------
    Alexander Saulenko
    ------------------------------


  • 2.  RE: Script limitation issue
    Best Answer

    Posted Thu April 25, 2019 06:35 AM
    Hello,
      Please find attached a script that addresses the problem you are having, while following closely the instructions you posted about how you went about adapting the default script to your needs.
      I ran this script against an email with the following body:

      someone@somewhere.com
      another.name@somewhereelse.org
      Pah
      NoDomain@
      @invalid.name.org
      user.name@sub.domain.com

    This led to an incident being created with "String"-type artifacts.

    Please note that the entire string passed as the first parameter of processor.processArtifactCategory() is used as the capturing group for the regex, so it is not necessary to enclose the string in "(...)". Please also note that ^ and $ should be avoided.
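
To illustrate the point about anchors, here is a minimal sketch using Python's standard `re` module rather than the Resilient `processor` object, whose internals are not shown in this thread. It assumes the pattern is searched across the whole multi-line message body; `BODY`, `ANCHORED`, and `UNANCHORED` are names made up for the example, and the pattern is a lightly tidied version of the one from the original post.

```python
import re

# Sample body from the post above, joined into one multi-line string.
BODY = "\n".join([
    "someone@somewhere.com",
    "another.name@somewhereelse.org",
    "Pah",
    "NoDomain@",
    "@invalid.name.org",
    "user.name@sub.domain.com",
])

# Original-style pattern: anchored to the start and end of the whole string.
ANCHORED = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+$"

# Same pattern without ^, $, or the enclosing parentheses.
UNANCHORED = r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+"

anchored_hits = re.findall(ANCHORED, BODY)
unanchored_hits = re.findall(UNANCHORED, BODY)

print(anchored_hits)    # []
print(unanchored_hits)  # the three well-formed addresses
```

Without `re.MULTILINE`, `^` and `$` only match at the very start and end of the text, so the anchored pattern finds nothing in a body with several lines. Whether the product script applies the pattern in exactly this way is an assumption.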


    Yours,
      PJ McKenna

    ------------------------------
    Patrick (PJ) McKenna
    Resilient Development
    ------------------------------


  • 3.  RE: Script limitation issue

    Posted Thu April 25, 2019 11:17 AM
    Thank you, Patrick, for your reply. The script is working like a charm!
    BR,
    Alex

    ------------------------------
    Alexander Saulenko
    ------------------------------



  • 4.  RE: Script limitation issue

    Posted Thu April 25, 2019 12:10 PM
    Dear Alex,
      Great news!
    -P.J.

    ------------------------------
    Patrick (PJ) McKenna
    Resilient Development
    ------------------------------



  • 5.  RE: Script limitation issue

    Posted Mon April 29, 2019 11:50 AM

    Hi PJ,  

    I received the same error for a different use case. I am trying to remove all occurrences of "re:", "fw:", or "fwd:" from the email subject. I had the following lines of code in there:
    email_subject = re.sub(r"re\:|fw\:|fwd\:", "", emailmessage.subject, re.IGNORECASE)
    newIncidentTitle = u"\"{0}\"".format(email_subject)

    I also had additional regex parsing to decode URLs, which was working fine and not returning any errors until I added this. Then I received the error: "Error Running Script: either the script was running longer than the timeout period of 5 seconds or the script length was more than  50000 lines"

    Any tips or ideas how to resolve it?

    Thanks!



    ------------------------------
    Adina
    ------------------------------



  • 6.  RE: Script limitation issue

    Posted Tue April 30, 2019 07:31 AM
    Hi Adina,
      Did you want to share the script you are using?
    -PJ

    ------------------------------
    Patrick (PJ) McKenna
    Resilient Development
    ------------------------------



  • 7.  RE: Script limitation issue

    Posted Tue April 30, 2019 09:10 AM
    Dear Adina,
      I might suggest using

    email_subject = re.sub(r"re\:\s+|fw\:\s+|fwd\:\s+", "", emailmessage.subject, flags=re.IGNORECASE)

      because without using flags= you don't get the case insensitivity you evidently want: the fourth positional argument of re.sub() is count, not flags. Secondly, there is usually some whitespace after re:, fw:, or fwd:, and I addressed this in the regular expression too.
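
A small sketch of the difference, runnable outside the product with the standard `re` module. The subject line and the variable names are invented for illustration; the pattern is the one suggested above.

```python
import re

subject = "RE: FW: Quarterly report"  # hypothetical subject line
PATTERN = r"re\:\s+|fw\:\s+|fwd\:\s+"

# Signature: re.sub(pattern, repl, string, count=0, flags=0).
# Passed positionally, re.IGNORECASE becomes count, so the match
# stays case sensitive and nothing is removed from this subject.
positional = re.sub(PATTERN, "", subject, re.IGNORECASE)

# Passed by keyword, the flag is applied as intended.
keyword = re.sub(PATTERN, "", subject, flags=re.IGNORECASE)

print(positional)  # RE: FW: Quarterly report
print(keyword)     # Quarterly report
```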

      Regarding overrunning the 50000-line maximum, it would help to see your script in its entirety.
    -P.J.



    ------------------------------
    Patrick (PJ) McKenna
    Resilient Development
    ------------------------------



  • 8.  RE: Script limitation issue

    Posted Mon May 06, 2019 04:50 PM
    Hi Patrick,

    Sorry for the delayed response. Please find our modified script attached. We are still getting the line-limit error when trying to use the email_subject regex line.

    Thank you!

    ------------------------------
    Adina Bodkins
    ------------------------------

    Attachment(s)

    zip
    Email Script.zip   8 KB 1 version


  • 9.  RE: Script limitation issue

    Posted Thu May 09, 2019 11:02 AM
    Hi Adina,
      I ran your script and did not encounter any problems.
      Could you please send me an example of the email message that is confounding the script?
    -P.J.

    ------------------------------
    Patrick (PJ) McKenna
    Resilient Development
    ------------------------------



  • 10.  RE: Script limitation issue

    Posted Thu May 09, 2019 11:13 AM
    Hi PJ,

    One thing to bear in mind here as regards the "Error Running Script: either the script was running longer than the timeout period of 5 seconds or the script length was more than 50000 lines" message - the 50k lines are not necessarily unique lines, so for example any lines executed as part of a for loop would all count towards that maximum number. Beware of iterations over very large collections and things like that.
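
As a rough illustration of executed lines versus written lines, the sketch below counts line events with CPython's `sys.settrace`. How the product itself counts lines is not described in this thread, so this is only an analogy; `count_executed_lines` and `sum_items` are hypothetical helpers.

```python
import sys

def count_executed_lines(func, *args):
    """Run func(*args) and return how many line events were traced."""
    executed = [0]

    def tracer(frame, event, arg):
        if event == "line":
            executed[0] += 1
        return tracer  # keep tracing inside this frame

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return executed[0]

def sum_items(items):
    # Four written lines, but the loop lines run once per item.
    total = 0
    for item in items:
        total += item
    return total

small = count_executed_lines(sum_items, range(10))
large = count_executed_lines(sum_items, range(10000))
print(small, large)
```

The function body never changes, yet the traced count grows with the size of the collection, which is how a short script can still exceed a limit on executed lines.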

    -P.

    ------------------------------
    PAUL CURRAN
    ------------------------------



  • 11.  RE: Script limitation issue

    Posted Thu May 09, 2019 11:46 AM
    Hi Paul,
      Yes, entirely correct. Regular expression matching in particular can be computationally intensive, and how much work is done is dictated by the regular expression itself.
      Seeing the offending email will hopefully help determine whether one or more of the regexes are consuming a disproportionate number of interpreted lines.
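
One well-known way a pattern can dictate a lot of work is nested quantifiers on input that almost matches. The sketch below uses standard Python `re` with a deliberately contrived pattern and input, not anything taken from the scripts in this thread; `time_match` is a hypothetical helper and the timings will vary by machine.

```python
import re
import time

def time_match(pattern, text):
    """Return (match result, elapsed seconds) for re.match."""
    start = time.perf_counter()
    result = re.match(pattern, text)
    return result, time.perf_counter() - start

# Twenty 'a' characters followed by a 'b', so neither pattern can match.
text = "a" * 20 + "b"

simple_result, simple_seconds = time_match(r"a+$", text)
nested_result, nested_seconds = time_match(r"(a+)+$", text)

print(simple_result, simple_seconds)
print(nested_result, nested_seconds)
```

Both calls return `None`, but the nested form has to try a very large number of ways to split the run of `a` characters before giving up, so it takes noticeably longer, and the cost grows quickly as the input gets longer.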
    -P.J.


    ------------------------------
    Patrick (PJ) McKenna
    Resilient Development
    ------------------------------