Apache log analysis with Sublime Text 3

Analyzing log files is generally a tedious task, especially when you are hunting for anomalies without an initial lead or indication of evil. Removing all the legitimate entries while leaving the malicious ones requires not only knowledge of common attacker techniques and an understanding of their patterns, but also a flexible tool. In this post, we’re going to cover analysis of Apache Tomcat access logs and Catalina logs using a text editor called “Sublime Text 3”.
The Scenario
To make things semi-realistic, I’ve deployed Apache Tomcat on top of Windows Server 2012 with ports 80, 443 and 8080 exposed. For now, we’re not going to deploy any apps such as WordPress, Drupal or Jenkins. In our scenario, the customer (who owns this Tomcat server) has tasked our team with analyzing both the Apache and Catalina logs to help identify some suspicious activity.

In many real-world cases, web applications sit in a DMZ on their own, behind a load balancer, inside a Docker container, or directly connected to the internet with very few protections such as a Web Application Firewall (WAF). Many applications are not kept up to date, resulting in web application compromises and often leading to web shells.

Sublime Text
Sublime Text is a cross-platform text editor that exposes a Python API and supports formatting and syntax highlighting for a variety of programming languages. Sublime also supports third-party plugins; there is a significant plugin community, and plugins are relatively easy for users to write. While Sublime has many features, the ones we will cover for log file analysis are mass cursor, mass highlighting and invert selection. We will use these features to carve out malicious entries and form a timeline of events to better understand the scope of the incident. We can also use Sublime to help format the raw log into other formats like CSV.
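The CSV conversion mentioned above can also be scripted outside the editor. The following is a minimal sketch, assuming the access log follows the Apache common log format; the regular expression, the function name log_to_csv and the sample line are illustrative assumptions and are not taken from the logs analyzed in this post.

```python
import csv
import io
import re

# Assumed layout: Apache "common" log format, optionally followed by extra fields.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

FIELDS = ["host", "ident", "user", "time", "request", "status", "size"]


def log_to_csv(lines):
    """Return CSV text for every line that matches the assumed format."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(FIELDS)
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:  # lines that do not fit the pattern are skipped
            writer.writerow([match.group(name) for name in FIELDS])
    return out.getvalue()


if __name__ == "__main__":
    # Made-up sample line using a documentation-range IP address.
    sample = [
        '203.0.113.5 - - [03/Dec/2018:04:14:27 +0000] "GET /index.html HTTP/1.1" 200 1024'
    ]
    print(log_to_csv(sample))
```

Skipped lines are worth reviewing separately, since malformed requests are sometimes the interesting ones.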

Digging into log analysis
After installing Sublime, we need to acquire copies of the log files from our target system. For our example, the Apache logs are located at “C:\xampp\apache\logs” and we will be analyzing the access.log. The Tomcat logs are located at “C:\xampp\tomcat\logs”. The image below outlines the common logs in this directory:

We will get started by copying all the access.log files from the target system: archive them into a zip file and extract it on our analysis system. Depending on the size of the log files, we can do some pre-processing to limit our scope, such as grepping for a specific date/time of known activity or a keyword/phrase and writing any matches to a new file. After the new file is created, we can use Sublime to carve out more with some precision. To begin, let’s load the access.log into Sublime Text.
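If grep is not available on the analysis system, the pre-processing step can be approximated in a few lines of Python. This is only a sketch: the function names, the file names and the search strings in the usage example are hypothetical placeholders.

```python
def prefilter(lines, needles):
    """Keep only lines that contain at least one of the given substrings."""
    return [line for line in lines if any(needle in line for needle in needles)]


def prefilter_file(src_path, dst_path, needles):
    """Write matching lines from src_path to dst_path and return the match count."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(src_path, encoding="utf-8", errors="replace") as src:
        kept = prefilter(src, needles)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(kept)
    return len(kept)


if __name__ == "__main__":
    # Hypothetical usage: keep entries from one day or touching one path.
    count = prefilter_file("access.log", "access_filtered.log",
                           ["03/Dec/2018", "/manager/"])
    print(count, "lines kept")
```

The original log is left untouched, so the smaller file can be carved up freely in the editor.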

By default, Sublime uses the “Plain Text” format, as seen in the bottom right-hand corner. We can change the syntax highlighting to make the log more readable. To do this, click on the bottom right corner where “Plain Text” is currently shown; a menu will appear showing all the syntax highlighting formats Sublime Text supports.

Since we don’t have an Apache formatter in Sublime by default, I’ve found that “Lua” formatting for the Apache logs does a decent job highlighting the right fields, as shown below:

We can see this formatting significantly helps the readability of the log. We can also see the mini-map on the right hand side. This mini-map is great when looking for patterns in log files based on the log line structure. Next, we may want to filter down our result set to just HTTP 200 requests. To do this, we can use the Mass Cursor, Expand Selection to Line and Invert Selection features to quickly reduce the log lines shown. To use the mass cursor, we first highlight the text “HTTP/1.1" 200” on any log line, then from the Find menu, select Quick Find All. Your editor should look like below, with only the log lines including “HTTP/1.1" 200” entries highlighted.

Once our selected entries are highlighted, from the top menu, select Selection > Expand Selection to Line followed promptly by Selection > Invert Selection. What we now have selected are all the log lines that do not contain the text “HTTP/1.1" 200”. We can remove them by pressing the backspace/delete key. The end result takes our log file from 745 lines to 33 lines, as shown in the image below:
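The same reduction can be cross-checked with a short script. The sketch below is an assumption-laden alternative, not the editor workflow itself: instead of matching the literal text “HTTP/1.1" 200”, it pulls the three-digit status code that follows the quoted request, so HTTP/1.0 requests with a 200 status are kept as well. The regular expression and function names are illustrative.

```python
import re
from collections import Counter

# Assumes the status code is the three digits right after the quoted request.
STATUS = re.compile(r'"\s(\d{3})\s')


def status_of(line):
    """Return the status code as a string, or None if none is found."""
    match = STATUS.search(line)
    return match.group(1) if match else None


def status_counts(lines):
    """Count how many lines carry each status code."""
    return Counter(status_of(line) for line in lines)


def keep_status(lines, wanted="200"):
    """Keep only the lines whose status code equals `wanted`."""
    return [line for line in lines if status_of(line) == wanted]


if __name__ == "__main__":
    with open("access.log", encoding="utf-8", errors="replace") as fh:
        lines = fh.readlines()
    print(status_counts(lines).most_common())
    print(len(keep_status(lines)), "lines with status 200")
```

Looking at the status distribution first can help decide which codes are worth keeping before anything is deleted.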

To put our new Sublime skills to good use for incident response, let’s take a look at the Catalina log localhost_access_log.2018-12-03.txt. Once we load this file into Sublime, we will set the format type to Lua. The editor should look like below:

Upon further analysis, we can see that one IP address appears to be interacting with our Tomcat manager. Using the mass cursor and invert selection features, we can quickly remove all the other IP addresses to narrow down our view to just this one IP of interest.
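To spot which client deserves this kind of attention in the first place, it can help to count requests per source address. The sketch below assumes the client address is the first whitespace-separated field of each line, as in the common log format; the function names and the placeholder IP address are hypothetical.

```python
from collections import Counter


def client_of(line):
    """Return the first whitespace-separated field (assumed to be the client)."""
    parts = line.split(None, 1)
    return parts[0] if parts else ""


def requests_per_client(lines):
    """Count requests per client address, ignoring blank lines."""
    return Counter(client_of(line) for line in lines if line.strip())


def lines_for_client(lines, ip):
    """Keep only the lines whose client field equals `ip` exactly."""
    return [line for line in lines if client_of(line) == ip]


if __name__ == "__main__":
    with open("localhost_access_log.2018-12-03.txt",
              encoding="utf-8", errors="replace") as fh:
        lines = fh.readlines()
    for client, count in requests_per_client(lines).most_common(5):
        print(client, count)
    # 203.0.113.5 is a placeholder, not the address from this incident.
    print("".join(lines_for_client(lines, "203.0.113.5")))
```

Comparing the whole first field avoids a pitfall of plain substring matching, where searching for 10.0.0.1 would also select 10.0.0.10.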

We can clearly see from the logs that this attacker appears to have successfully authenticated to our Tomcat server and deployed a WAR (Web Application Archive) file. A WAR file is a ZIP archive and can be extracted with any zip utility; once Tomcat deploys it, the additional code runs on the web server and is accessible to the attacker. To confirm this WAR file was deployed, we can analyze the related log file called catalina.2018-12-03.log and review the log lines below:

Dec 03, 2018 4:14:28 AM org.apache.catalina.startup.HostConfig deployWAR
INFO: Deploying web application archive C:\xampp\tomcat\webapps\cmd.war

Dec 03, 2018 4:14:28 AM org.apache.catalina.startup.HostConfig deployWAR
INFO: Deployment of web application archive C:\xampp\tomcat\webapps\cmd.war has finished in 218 ms
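When the Catalina log is large, the deployment entries can be pulled out with a short script. This is a rough sketch based only on the two log entries quoted above: the regular expressions and the function name are assumptions, and it remembers the most recent timestamp so it works whether the timestamp and the message sit on one line or on two.

```python
import re

# Timestamp shape taken from the entries above, e.g. "Dec 03, 2018 4:14:28 AM".
TIMESTAMP = re.compile(r'^(\w{3} \d{2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M)\b')
DEPLOY = re.compile(r'Deploying web application archive (.+\.war)\b')


def find_war_deployments(lines):
    """Return (timestamp, war_path) pairs for each WAR deployment message."""
    found = []
    last_timestamp = None
    for line in lines:
        stamp = TIMESTAMP.match(line)
        if stamp:
            last_timestamp = stamp.group(1)
        deploy = DEPLOY.search(line)
        if deploy:
            found.append((last_timestamp, deploy.group(1)))
    return found


if __name__ == "__main__":
    with open("catalina.2018-12-03.log", encoding="utf-8", errors="replace") as fh:
        for timestamp, war_path in find_war_deployments(fh):
            print(timestamp, war_path)
```

Each hit gives a timestamp that can be lined up against the access log entries for the suspicious IP address when building the timeline.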

If we review the contents of this directory “C:\xampp\tomcat\webapps\”, we can see a new folder was created called “cmd”. The image below shows the contents of this folder. This is a typical JSP web shell, deployed by a WAR file.


While analysis of these files is out of scope for this post, we were able to use Sublime Text to quickly rip through these logs, leaving only the log lines of interest behind, and determine that our web server was indeed compromised by an attacker logging into our Apache Tomcat manager and deploying a WAR file. This WAR file included a web shell that allows execution of arbitrary commands, among many other capabilities. I hope this post was informative. Happy Hunting!

To learn more about web shells, check out my related post:

For Medium users, check out the posts on my profile here:

