Monitoring for Log4Shell exploitation remains an important task for security teams across most industries. The need stems from the sheer number of applications and platforms that rely on Log4j's functionality. While many vendors have released patches or workarounds, many more products remain vulnerable with no patch available or have an unknown vulnerability status. Furthermore, there is growing evidence of adversaries weaponising Log4Shell to compromise organisations, with activities including malware distribution, theft of sensitive data, and encryption of networks with ransomware. These attacks use both the standard malicious strings and obfuscated variants designed to evade detection, which highlights the need for detection that goes beyond looking for the malicious string in its most basic form.
When the Log4Shell exploits were first published, and even when obfuscation methods first started being discussed, many teams focused on building regular expressions (regex) to perform pattern matching against the malicious strings. As more obfuscation methods were released, one thing rapidly became clear: creating and maintaining regular expressions was becoming increasingly difficult.
Work performed by back2root did result in a fairly robust regex, which can be found here: https://github.com/back2root/log4shell-rex. This regex has been shown to capture many methods of obfuscation and, as shown by the regex101 examples, has a fairly good execution time. However, as QRadar SIEM supports the use of tools and scripts through custom AQL functions, it makes sense to pursue a solution that we can maintain fairly easily and update frequently, either by introducing new malicious strings or by adding new functions for alternative methods of obfuscation. It is worth keeping in mind that regex searching still has its benefits for quickly performing historical searches, similar to QRadar quick filters, to get an indication of possible malicious activity.
Developing a custom AQL function is exactly the path I have taken, as you may have seen from Adam Frank's community blog post. This blog will guide you through the benefits of integrating the AQL function in your SIEM, discuss considerations you need to make around performance, and give you the confidence to enhance your detection of Log4Shell exploits.
What is the Log4Shell Detection AQL Function?
The AQL function we developed was the result of needing a way to confidently detect Log4Shell exploit attempts of varying obfuscation, while maintaining an efficient level of performance on QRadar's Ariel queries and Custom Rule Engine. We achieved this by initially using logic developed by Florian Roth's open source project log4shell-detector.
Our aim with the function is to continually align its detection capabilities with the obfuscation techniques outlined in Maciej Pulikowski's list of bypasses without compromising on performance. As of Release 2.1 of the AQL function, we successfully detect the 11 obfuscations outlined on Maciej's page (further testing is planned for the invalid Unicode character obfuscation) and plan to continue development as new obfuscation methods are documented.
Benefits of Detecting Log4Shell Exploits Through AQL Functions
Before we get into the benefits of using an AQL function for Log4Shell exploit detection, note that you should always consider introducing multiple layers of detection across your QRadar SIEM. Multiple layers give QRadar the opportunity to detect exploitation at several stages of the attack; Jose Bravo discusses detection at various stages of the exploit in his Log4Shell YouTube series.
Maintainable and Customisable
AQL functions use a JavaScript engine to support extending functionality and capabilities of AQL searching and filters. This is how we have built our Log4Shell exploit detection function. By building the logic into JavaScript functions we are able to easily maintain a set of smaller functions, which, in comparison to large regex strings, becomes a significantly more manageable task.
An example of how easy it is to update the AQL function can be found in the way the malicious strings are maintained:
var DETECTION_STRINGS = ['${jndi:'];
To introduce new protocols or new forms of the exploit, you simply modify the JavaScript file and add the further items to the list. Additionally, refining the detection to be less prone to false positives is as simple as adding more specific items to the list, as follows:
var DETECTION_STRINGS = ['${jndi:ldap:', '${jndi:rmi:', '${jndi:ldaps:', '${jndi:dns:',
'${jndi:nis:', '${jndi:nds:', '${jndi:corba:', '${jndi:iiop:', '${jndi:http:'];
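As a minimal sketch of how such a list might be consumed, the following checks an already deobfuscated payload against each entry. The helper name containsDetectionString is hypothetical and is not taken from the released function, which may perform this check differently.

```javascript
// Shortened copy of the list above, for illustration only.
var DETECTION_STRINGS = ['${jndi:ldap:', '${jndi:rmi:', '${jndi:ldaps:', '${jndi:dns:'];

// Hypothetical helper (an assumption, not the released implementation):
// returns true when the payload contains any of the configured strings.
function containsDetectionString(payload, detectionStrings) {
    // Lower-case once so that '${JNDI:LDAP:' is treated like '${jndi:ldap:'.
    var haystack = String(payload).toLowerCase();
    for (var i = 0; i < detectionStrings.length; i++) {
        if (haystack.indexOf(detectionStrings[i]) !== -1) {
            return true;
        }
    }
    return false;
}
```

Adding a protocol then only means adding one more entry to the array; the checking logic itself does not change.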
Better Performance for Analysing Payloads Without Indicators
Regex can become very expensive when it is analysing payloads that do not contain the pattern you wish to match. This can be especially taxing on the system during historical searches and real-time monitoring through the Custom Rule Engine (CRE). The load on the system can increase quite significantly as the number of steps the regex has to perform increases.
Our AQL function takes a different approach to validating strings: it deobfuscates the payload and then checks for a key component of the malicious string, '${'. Where the function does not find these two characters, it exits immediately by returning 'false' to indicate the payload does not contain data that could eventually lead to a positive match. This means we do not waste time processing the payload, which reduces load on the CRE and significantly improves the performance of historical searches.
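The early exit described above can be sketched as follows. This is an illustration under stated assumptions, not the released code: the function names are hypothetical, and the deobfuscation step is reduced to a single pass of percent-decoding.

```javascript
// Assumption: percent-decoding stands in for the full deobfuscation step.
function decodePercentEscapes(payload) {
    // Replace each %XX sequence with the character it encodes.
    return String(payload).replace(/%([0-9a-fA-F]{2})/g, function (match, hex) {
        return String.fromCharCode(parseInt(hex, 16));
    });
}

// Hypothetical pre-check: without '${' no lookup expression can be present,
// so the caller can return false before doing any further analysis.
function mayContainLookup(payload) {
    return decodePercentEscapes(payload).indexOf('${') !== -1;
}
```

Because indexOf is a single linear scan, payloads without the indicator cost very little compared with running a large regex over them.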
Considerations For Using The AQL Function
Making use of the AQL function does require some thought about how you implement it. As part of the development process, the latest releases brought significant improvements in how payloads are processed, leading to major efficiency gains compared to version 1.0. However, you always need to be diligent when introducing new use cases or performing excessively large searches.
While performing historical searches, you may encounter performance issues in the following scenarios:
Additionally, when implementing the AQL function as a rule, you should always consider adding further rule tests above the AQL filter test. This will reduce the number of payloads you send through to the function and reduce the load on the CRE, as many log source types are unlikely to provide any relevant results for Log4Shell exploitation. You can read more about rule optimisation and tuning here.
Examples of Malicious Strings Detected
This section discusses the detection capabilities of the AQL function and provides insight into custom properties that could be useful to pass as a function parameter. As mentioned earlier in the blog, the desired outcome was to achieve at least sufficient coverage for the techniques discussed in Maciej Pulikowski's list of bypasses, excluding the DoS. You could add part or all of the DoS string to the malicious string array to introduce this level of coverage, although this has not been tested.
As part of our testing in lab environments, we managed to detect the following strings for both historical searching and real-time detection:
${${env:ENV_NAME:-j}ndi${env:ENV_NAME:-:}${env:ENV_NAME:-l}dap${env:ENV_NAME:-:}
${${lower:j}ndi:${lower:l}${lower:d}a${lower:p}:
${${upper:j}ndi:${upper:l}${upper:d}a${lower:p}:
${${::-j}${::-n}${::-d}${::-i}:${::-l}${::-d}${::-a}${::-p}:
${jnd${upper:ı}:ldap:
${jnd${sys:SYS_NAME:-i}:ldap:
${j${${:-l}${:-o}${:-w}${:-e}${:-r}:n}di:ldap:
${${date:'j'}${date:'n'}${date:'d'}${date:'i'}:${date:'l'}${date:'d'}${date:'a'}${date:'p'}:
%24%7bjNd${sys:SYS_NAME:-i}:ldap:
${\u006a\u006e\u0064\u0069:ldap:
one-${jnd${a":"a:-i}:ld${", "two":"o:-a}p:
Additionally, we successfully tested detection on the following example to determine the nesting capabilities of our AQL function:
\u0025\u0032\u0035\u0032\u0034\u0025\u0032\u0035\u0037\u0042\u006A\u0025\u0032\u0035\u0032\u0034\u0025\u0032\u0035\u0037\u0042\u0025\u0032\u0035\u0032\u0034\u0025\u0032\u0035\u0037\u0042\u0025\u0032\u0035\u0033\u0041\u0025\u0032\u0035\u0032\u0044\u006C\u0025\u0032\u0035\u0037\u0044\u0025\u0032\u0035\u0032\u0034\u0025\u0032\u0035\u0037\u0042\u0025\u0032\u0035\u0033\u0041\u0025\u0032\u0035\u0032\u0044\u006F\u0025\u0032\u0035\u0037\u0044\u0025\u0032\u0035\u0032\u0034\u0025\u0032\u0035\u0037\u0042\u0025\u0032\u0035\u0033\u0041\u0025\u0032\u0035\u0032\u0044\u0077\u0025\u0032\u0035\u0037\u0044\u0025\u0032\u0035\u0032\u0034\u0025\u0032\u0035\u0037\u0042\u0025\u0032\u0035\u0033\u0041\u0025\u0032\u0035\u0032\u0044\u0065\u0025\u0032\u0035\u0037\u0044\u0025\u0032\u0035\u0032\u0034\u0025\u0032\u0035\u0037\u0042\u0025\u0032\u0035\u0033\u0041\u0025\u0032\u0035\u0032\u0044\u0072\u0025\u0032\u0035\u0037\u0044\u0025\u0032\u0035\u0033\u0041\u006E\u0025\u0032\u0035\u0037\u0044\u0064\u0069:ldap:
Note: The above string was artificially created for the detection testing. You can view the layers of decoding required to detect this string through CyberChef. Base64 encoded methods were not shown in any above examples, but our AQL function has the capability to detect these encoded exploits where an environment may have implemented the base64 features.
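To illustrate the layers of decoding involved, here is a minimal sketch that repeats Unicode-escape and percent-decoding until the text stops changing. The function names and the maxPasses parameter are hypothetical, and the released function may order or limit its decoding differently.

```javascript
// Replace \uXXXX escape sequences with the characters they represent.
function decodeUnicodeEscapes(text) {
    return text.replace(/\\u([0-9a-fA-F]{4})/g, function (match, hex) {
        return String.fromCharCode(parseInt(hex, 16));
    });
}

// Replace %XX sequences with the characters they represent.
function decodePercentEscapes(text) {
    return text.replace(/%([0-9a-fA-F]{2})/g, function (match, hex) {
        return String.fromCharCode(parseInt(hex, 16));
    });
}

// Hypothetical driver: applies both decoders until nothing changes or
// maxPasses is reached, so nested encodings are peeled off layer by layer.
function decodeLayers(payload, maxPasses) {
    var current = String(payload);
    for (var pass = 0; pass < maxPasses; pass++) {
        var next = decodePercentEscapes(decodeUnicodeEscapes(current));
        if (next === current) {
            break; // stable: no further layers to remove
        }
        current = next;
    }
    return current;
}

// The first eleven escapes of the test string above decode to '${j'.
var prefix = '\\u0025\\u0032\\u0035\\u0032\\u0034\\u0025\\u0032\\u0035\\u0037\\u0042\\u006A';
var decoded = decodeLayers(prefix, 5);
```

Capping the number of passes keeps the work bounded even if a payload is deliberately encoded many times over.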
While our detection examples were performed using the full UTF8-formatted payload of an event, it is possible to refine the use of the function to focus on specific custom properties. Based on detected activity from mass scanners, we have identified at least the following parameters being populated with the malicious string in web application firewall (WAF) and web server logs:
Authorization
GET Request
Query string
Referer
User-Agent
X-Api-Version
X-Forwarded-For
X-IP
X-Real-IP
Note: The above are examples of parameters where detection has been successful. However, you should always design your properties to align with the needs of your environment and the data within your logs. If you are unsure about specific parameters, you could consider passing the whole payload for checks and limiting the checks to specific log source types.
Using the AQL Function - Rule and Searches
This final section covers how to use the AQL function in a rule or searches. We will be working with the full UTF8-formatted payload in the examples and our historical search AQL will be fairly broad. You should be familiar with uploading extensions via the extensions management feature. Additionally, there is the assumption that you are familiar with creating custom rules and performing AQL searches.
Adding Your Rule
To create your rule, you will need to use two lines of logic:
'and when the event(s) were detected by one or more of these log source types'
'and when the event matches this AQL filter query'
You should select log source types relevant to your environment; you could use the results of a historical search to get an idea of where your logging capabilities exist. For the AQL filter query line, you will want to use a variation of the following AQL filter logic. In this example, 40 is used for the maximum distance; the field takes an integer value, so you can adjust it higher or lower. Lower maximum distances will reduce detection capabilities. Higher maximum distances can increase detection capabilities, especially where nesting is significant, but they will put more load on the system.
The distance value tells the function how many characters it may search over to find the next character in the malicious string. As an example, if an attacker attempted to exploit the vulnerability with a malicious string like '${j${lower:n}di:', the distance between 'j' and 'n' is 7 characters, meaning a maximum distance of 5 would not detect this attempt. Higher maximum distance values therefore give you an increased chance of detecting exploitation attempts, although this might come with performance impacts if your maximum distance is excessively large (in the hundreds).
EXPLOITDETECT::LOG4J(UTF8(payload),40)
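To make the role of the maximum distance more concrete, the following is a minimal sketch of distance-limited matching in the spirit of the log4shell-detector logic. The function name is hypothetical, and the way skipped characters are counted here is an assumption that may differ from how the released function counts distance.

```javascript
// Hypothetical sketch: walks the payload once and looks for the characters of
// 'needle' in order, allowing at most maxDistance unrelated characters between
// two consecutive matches. 'needle' is expected in lower case.
function matchesWithinDistance(payload, needle, maxDistance) {
    var text = String(payload).toLowerCase();
    var level = 0;   // number of needle characters matched so far
    var skipped = 0; // characters skipped since the last matched character
    for (var i = 0; i < text.length; i++) {
        var ch = text.charAt(i);
        if (ch === needle.charAt(level)) {
            level++;
            skipped = 0;
            if (level === needle.length) {
                return true;
            }
        } else if (level > 0) {
            skipped++;
            if (skipped > maxDistance) {
                // Too far from the last match: start again, keeping this
                // character if it happens to begin a new attempt.
                level = (ch === needle.charAt(0)) ? 1 : 0;
                skipped = 0;
            }
        }
    }
    return false;
}

var sample = '${j${lower:n}di:ldap://x';
var foundWith40 = matchesWithinDistance(sample, '${jndi:', 40); // true
var foundWith5 = matchesWithinDistance(sample, '${jndi:', 5);   // false
```

With a maximum distance of 5, the lookup inserted between 'j' and 'n' is long enough to break the match, which is why the larger value of 40 is used in the filter above.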
Your completed rule logic should look similar to my example:

Performing Your Search
Our example search provides a good baseline set of columns to support a more efficient investigation. Using an AQL function in a search is fairly simple, and you can find examples of these searches in Jose Bravo's playlist mentioned earlier in the blog. The following search excludes the Custom Rule Engine and SIM Audit log source types to prevent false positives. As with the rule logic, you can change the '40' value passed to the function to a maximum distance of your choice.
SELECT QIDNAME(qid) AS "Event Name", sourceip AS "Source IP", destinationip AS "Destination IP", destinationport AS "Destination Port", DATEFORMAT(devicetime, 'yyyy-MM-dd HH:mm:ss z') AS "Log Source Time", LOGSOURCENAME(logsourceid) AS "Log Source", SUM(eventcount)
FROM events
WHERE EXPLOITDETECT::LOG4J(UTF8(payload),40) AND devicetype != 105 AND devicetype != 18
GROUP BY "Event Name", "Source IP"
LAST 24 HOURS
Closing Notes
Thank you for taking the time to read through the blog post. I hope that you found value and a fresh insight into using AQL functions to help make the detection of Log4Shell exploitation manageable. Please be sure to follow my repository on GitHub to get the latest updates to the AQL function the moment they are released.
If you encounter any issues with using the AQL function or identify possible gaps, please raise an issue via the GitHub repository.
You can find out more about developing AQL functions through Jose Bravo's new video:
Writing your very own AQL Custom Function.