This post was originally published here by James Bower.
In this article, we’ll discuss a couple of starting points for hunting web shells on your network. A web shell offers varied functionality to an attacker in a single file. Imagine an attacker having command line access to your web server through an executable file placed somewhere on the web server. It’s even scarier when you imagine that single file hidden somewhere amongst thousands of other legitimate files on the server.
Web shells are a server-side compromise that results from a vulnerability such as local file inclusion (LFI), remote file inclusion (RFI), or some type of upload functionality that an attacker can exploit in the web application.
So, if you aren’t familiar with fighting web shells, you really need to be. In my opinion, web shells are in a category of their own compared to other malicious tools and are quite hard to fight using traditional defenses. By traditional defenses, I’m referring particularly to signature detection using something like an IDS/IPS. One of the main reasons is that the most popular web shells are typically open source, which makes code reuse quite common. A low-level attacker simply needs to change a few options in the web shell and rename it to make signature detection more difficult.
As a reminder, Sqrrl has developed a hunting methodology called the Threat Hunting Loop. The hunting loop has four steps:
Although web shells can be written in almost any scripting language, they most often use a traditional web language and carry an extension such as .php, .asp, .aspx, .jsp, or .js. Another method attackers use to avoid detection is to employ authentication in the web shell. Because the malicious file is hidden among other legitimate web server files, finding web shells can feel like searching for a needle in a haystack.
As with any threat, searching for specific indicators has a variety of pitfalls. Therefore, I don’t search for particular web shells such as WSO, C99, or others. Instead, I prefer to hunt for behaviors that are typical of many of the popular web shells observed in use. By focusing on how web shells are used, we can decrease our investigation time and increase our efficiency, which is a positive outcome for any SOC.
Hunting web shells can be done using either web server logs such as Apache or IIS logs (including those from your Exchange servers), or network logs from a tool such as Bro IDS.
We will be making use of Bro’s http.log which contains all the unencrypted web traffic detected by Bro on our network. This data is fed into Sqrrl to give us the ability to query specific data in the HTTP traffic.
Starting Point #1
Look for HTTP POST Request with Successful Response
In our first query we’ll be using Sqrrl and looking for evidence of an attacker successfully uploading or logging in to a web shell on one of our web servers.
The query below looks specifically for HTTP POST requests to a file with an extension of php, jsp, cfm, asp, or aspx that returned a status code of 403 or lower. By including responses at or below 403, we’ll see both successful requests such as a 200 response and authorization responses such as 401 and 403. These file extensions pretty much cover the gamut of popular web shells seen in the wild. We’re sorting the results of the query so that we can see the outliers in our results. This makes sense when we consider that an attacker’s activity should be much less frequent than a legitimate application’s.
SELECT uri, orig_h, host, count(*) FROM BroHttp WHERE status_code<=403 AND method='POST' AND (uri LIKE '%php' OR uri LIKE '%jsp' OR uri LIKE '%cfm' OR uri LIKE '%asp' OR uri LIKE '%aspx') GROUP BY uri, orig_h, host
As we can see in the results of our sample data, Sqrrl found 162 POST requests to the file “chsquery.php” on ts.eset.com which happens to be the anti-virus on that particular machine.
To gain more insight into this I simply search by the host to see what assets it has connections to.
In the Wireshark image below we can see the POST login request for the C99 web shell and how our above query would help us to detect this particular web shell.
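For readers without Sqrrl, the same POST filter can be approximated directly against a Bro http.log. The snippet below is only a minimal sketch: the sample rows, addresses, and the function names parse_bro_log and suspicious_posts are made up for illustration, the log is trimmed to five columns, and the status check treats 403 as included, as the text describes. Unlike the LIKE patterns in the query, it drops any query string before checking the extension.

```python
from collections import Counter

# Hypothetical sample in Bro's tab-separated http.log layout (columns trimmed).
SAMPLE_LOG = "\n".join([
    "#fields\tid.orig_h\tmethod\thost\turi\tstatus_code",
    "10.0.0.5\tPOST\tts.eset.com\t/query/chsquery.php\t200",
    "10.0.0.5\tPOST\tts.eset.com\t/query/chsquery.php\t200",
    "203.0.113.9\tPOST\twww.example.com\t/uploads/img.php\t200",
    "10.0.0.7\tGET\twww.example.com\t/index.php\t200",
    "10.0.0.8\tPOST\twww.example.com\t/login.aspx\t500",
])

SCRIPT_EXTENSIONS = (".php", ".jsp", ".cfm", ".asp", ".aspx")


def parse_bro_log(text):
    """Yield one dict per record, taking column names from the '#fields' line."""
    fields = []
    for line in text.splitlines():
        if line.startswith("#fields"):
            fields = line.split("\t")[1:]
        elif line and not line.startswith("#"):
            yield dict(zip(fields, line.split("\t")))


def suspicious_posts(records, max_status=403):
    """Count POSTs to script files with status <= max_status, rarest first."""
    counts = Counter()
    for rec in records:
        status = rec.get("status_code", "-")
        if not status.isdigit() or int(status) > max_status:
            continue
        # Assumption: ignore the query string when testing the extension.
        path = rec.get("uri", "").split("?", 1)[0].lower()
        if rec.get("method") == "POST" and path.endswith(SCRIPT_EXTENSIONS):
            counts[(rec["uri"], rec["id.orig_h"], rec["host"])] += 1
    # Ascending order puts the low-volume outliers at the top.
    return sorted(counts.items(), key=lambda item: item[1])


results = suspicious_posts(parse_bro_log(SAMPLE_LOG))
for (uri, orig_h, host), n in results:
    print(n, orig_h, host, uri)
```

Sorting in ascending order is a deliberate choice here: a busy legitimate endpoint sinks to the bottom, while a script that only one client has posted to a handful of times rises to the top for review.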
Starting Point #2
Outliers and Out of Date User Agents
This query focuses on finding obscure or out of date user agents seen on the network. By obscure, we’re referring to user agent strings that aren’t seen very often on your network and may be worth investigating further. The benefits of this are twofold. The first benefit in finding older user agents stems from the code reuse often observed among web shells. Many adversaries won’t bother changing the user agent or other defaults associated with a particular web shell, which makes those defaults useful detection points. One example is the China Chopper web shell where the default user agent used for communication is
“Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1)”. This user agent can be considered anomalous on many networks because it is the user agent for Internet Explorer 6 running on Windows XP.
User Agent analysis also provides some potential benefit in identifying out of date software or proprietary code that has long been forgotten but continues to run on the network.
In Sqrrl, we’ll be utilizing the CounterOps model when running our query to look for the 50 least common browser user agents in our Bro HTTP logs.
MATCH TOP 50 UserAgent FROM CounterOps ORDER BY UserAgent.totalRequests ASC
The results of our query come back in the nice looking graphical format we see below.
The graphical nature of the results allows us to easily find out more about a particular user agent by right clicking on it and selecting the next action we’d like to take.
We can then begin to see exactly which hosts had used this particular user agent. This can be seen in the image below.
If this user agent looked suspicious to us we could now begin investigating both of our hosts for more information.
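The least-common-user-agent idea can also be approximated outside Sqrrl with a simple frequency count. This is a rough sketch, not the CounterOps model itself: the request pairs, addresses, and the function name rarest_user_agents are hypothetical, and the only real-world value is the China Chopper default string quoted above.

```python
from collections import Counter, defaultdict

MODERN = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
CHOPPER = "Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1)"

# Hypothetical (orig_h, user_agent) pairs, as could be pulled from http.log.
REQUESTS = [
    ("10.0.0.5", MODERN),
    ("10.0.0.6", MODERN),
    ("10.0.0.5", MODERN),
    ("203.0.113.9", CHOPPER),
    ("10.0.0.7", "curl/7.58.0"),
    ("10.0.0.7", "curl/7.58.0"),
]


def rarest_user_agents(requests, top=50):
    """Return (user_agent, total_requests, hosts) for the least-seen agents."""
    totals = Counter()
    hosts = defaultdict(set)
    for orig_h, user_agent in requests:
        totals[user_agent] += 1
        hosts[user_agent].add(orig_h)
    # Fewest requests first; ties are broken alphabetically for stable output.
    ranked = sorted(totals.items(), key=lambda item: (item[1], item[0]))
    return [(ua, n, sorted(hosts[ua])) for ua, n in ranked[:top]]


rarest = rarest_user_agents(REQUESTS)
for user_agent, total, seen_on in rarest:
    print(total, user_agent, seen_on)
```

Keeping the set of hosts next to each count mirrors the pivot described above: once a rare user agent stands out, the hosts that used it are the next thing to look at.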
Starting Point #3
Most common web shells allow the attacker to use some sort of authentication when communicating with the web shell. This authentication is generally either basic, form, or digest HTTP authentication. This gives us a great opportunity for tracking web shells, as our legitimate web applications shouldn’t be using basic authentication or should be modified to at least use SSL.
In order to search Sqrrl for similar output we can use the following query.
SELECT uri, count(*) FROM BroHttp WHERE username is not null GROUP BY uri
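A comparable check can be sketched in Python over parsed http.log records. This is an assumption-laden example: the records and the function name authenticated_uris are invented, and it relies on Bro writing "-" for an unset field in its tab-separated logs. Bro documents the username field as being set when basic authentication is performed, so form and digest logins would likely need a different signal.

```python
from collections import Counter

UNSET = "-"  # Bro's default marker for an unset field in its TSV logs

# Hypothetical records already parsed from http.log into dicts.
RECORDS = [
    {"uri": "/admin/panel.php", "username": "admin"},
    {"uri": "/admin/panel.php", "username": "admin"},
    {"uri": "/index.php", "username": UNSET},
    {"uri": "/uploads/x.php", "username": "root"},
    {"uri": "/about.html"},
]


def authenticated_uris(records):
    """Count requests per URI where a username was captured."""
    counts = Counter(
        rec["uri"]
        for rec in records
        if rec.get("username") not in (None, "", UNSET)
    )
    return counts.most_common()


hits = authenticated_uris(RECORDS)
for uri, n in hits:
    print(n, uri)
```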
I’m sure most readers will be able to use the basic queries we’ve shown as a stepping stone to create much more robust and complex queries. However, one of Sqrrl’s main advantages is its interactive graph of the relationships between entities within the organization’s network. The graph is clear and explorable, which allows hunters to drill down into details or step back and take a wider view of the situation. Sqrrl extracts specific fields from log files and reorganizes them in security graph form, which enables analysts to easily traverse the data without having to write a lot of search scripts and queries.
Sqrrl also leverages numerous behavioral analytics capabilities. For example, Sqrrl’s lateral movement detector first uses an unsupervised machine learning algorithm to look for suspicious login events and then uses a multi-hop graph algorithm to chain those login events into predicted lateral movement pathways. By looking for connected series of anomalies using graph algorithms, Sqrrl is more accurate in its detections because connected series of anomalies are rarer than a single anomaly.
A threat hunter can choose one of these algorithm-powered detections as a starting point for a hunt. Some specific examples of Sqrrl’s TTP Detectors are below.
Malicious Beaconing: Sqrrl detects beaconing at regular intervals by using signal processing techniques to cut through the noise of network traffic.
Data Staging and Exfiltration: Sqrrl’s analytics can determine when data staging and exfiltration are happening by looking for abnormal spikes in internal-to-external network traffic and anomalous data flows within a network.
DNS Tunneling: Sqrrl detects DNS Tunneling by grouping DNS traffic according to the internal endpoint making the request, and the external registered domain it is querying for. Sqrrl also detects DGAs by singling out unregistered websites in DNS traffic.