
Apache log analysis with Sublime Text 3

Analyzing log files is generally a tedious task, especially when you are hunting for anomalies without an initial lead or indication of evil. Removing all the legitimate entries while leaving the malicious ones requires not only knowledge of common attacker techniques and an understanding of patterns, but also a flexible tool. In this post, we’re going to cover analysis of Apache Tomcat access logs and Catalina logs using a text editor called “Sublime Text 3” (https://www.sublimetext.com/).
The Scenario
To make things semi-realistic, I’ve deployed Apache Tomcat on top of Windows Server 2012 with ports 80, 443 and 8080 exposed. For now, we’re not going to deploy any apps such as WordPress, Drupal or Jenkins. In our scenario, the customer (who owns this Tomcat server) has tasked our team with analyzing both the Apache and Catalina logs to help identify some suspicious activity.

In many real-world cases, web applications sit in a DMZ on their own, behind a load balancer, inside a Docker container, or directly connected to the internet with very few protections, such as a Web Application Firewall (WAF), in front of them. Many applications are not kept up-to-date, resulting in web application compromises and often leading to web shells.

Sublime Text
Sublime Text is a cross-platform text editor that exposes a Python API and supports formatting and syntax highlighting for a variety of programming languages. Sublime also supports third-party plugins; there is a significant plugin community and plugins are relatively easy for users to write. While Sublime has many features, the main ones we will cover with respect to log file analysis are mass cursor, mass highlighting and invert selection. We will use these features to carve out malicious entries and form a timeline of events to better understand the scope of the incident. We can also use Sublime to help format the raw log into other formats like CSV.
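
For larger files, the CSV conversion can also be scripted outside the editor. Below is a minimal Python sketch, assuming each line starts with the Common Log Format fields (host, ident, user, time, request, status, size). The function name and field names are made up for illustration, and the pattern will need adjusting if your server logs a different layout.

```python
import csv
import io
import re

# Assumption: lines begin with Common Log Format fields:
#   host ident user [time] "request" status size
# Anything after the size field (e.g. referer, user agent) is ignored.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)
FIELDS = ["host", "ident", "user", "time", "request", "status", "size"]


def log_lines_to_csv(lines, out):
    """Write parseable lines to `out` as CSV and return the lines that did not match."""
    writer = csv.writer(out)
    writer.writerow(FIELDS)
    skipped = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            writer.writerow([match.group(name) for name in FIELDS])
        else:
            # Keep unparsed lines so nothing is silently lost during triage.
            skipped.append(line)
    return skipped
```

Returning the unmatched lines instead of dropping them is deliberate: odd-looking entries are often exactly the ones worth a second look.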


Digging into log analysis
After installing Sublime, we need to acquire copies of the log files from our target system. For our example, the Apache logs are located at “C:\xampp\apache\logs” and we will be analyzing the access.log. The Tomcat logs are located at “C:\xampp\tomcat\logs”. The image below outlines the common logs in this directory:

We will get started by copying all the access.log files from the target system, archiving them into a zip file and extracting them on our system. Depending on the size of the log files, we can do some pre-processing to limit our scope, such as grepping for a specific date/time of known activity or a keyword/phrase in the log files and writing any matches to a new file. After the new file is created, we can use Sublime to carve out more with some precision. To begin, let’s load the access.log into Sublime Text.
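
If grep is not available on the analysis machine, the same pre-processing step can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the match is a plain substring test rather than a regular expression.

```python
def filter_lines(lines, needle):
    """Yield only the lines that contain `needle` (plain substring, not a regex)."""
    for line in lines:
        if needle in line:
            yield line


def prefilter_file(src_path, dst_path, needle):
    """Stream src_path line by line, write matches to dst_path, return the match count."""
    count = 0
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(src_path, "r", encoding="utf-8", errors="replace") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in filter_lines(src, needle):
            dst.write(line)
            count += 1
    return count
```

A call such as prefilter_file("access.log", "access_filtered.log", "03/Dec/2018") would keep only the entries for that date (both file names here are placeholders). Because the file is streamed, this works on logs that are too large to open comfortably in an editor.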

By default, Sublime uses the “Plain Text” format, as seen in the bottom right-hand corner. We can change the syntax highlighting to make the log more readable. To do this, click on the bottom right corner where “Plain Text” is currently shown; once you click, a menu will appear showing all the syntax highlighting formats Sublime Text supports.

Since we don’t have an Apache formatter in Sublime by default, I’ve found that “Lua” formatting does a decent job of highlighting the right fields in the Apache logs, as shown below:

We can see this formatting significantly helps the readability of the log. We can also see the mini-map on the right-hand side. This mini-map is great when looking for patterns in log files based on the log line structure. Next, we may want to filter down our result set to just HTTP 200 requests. To do this, we can use the Mass Cursor, Expand Selection to Line and Invert Selection features to quickly reduce the log lines shown. To use the mass cursor, we first highlight the text “HTTP/1.1" 200” on any log line, then from the Find menu, select Quick Find All. Your editor should look like the image below, with every occurrence of “HTTP/1.1" 200” highlighted.

Once our entries are highlighted, from the top menu, select Selection > Expand Selection to Line, followed by Selection > Invert Selection. The selection now covers every log line that does not contain the text “HTTP/1.1" 200”. We can remove those lines by pressing the backspace/delete key. The end result takes our log file from 745 lines to 33 lines, as shown in the image below:
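
The select, expand and invert sequence amounts to splitting the file into lines that contain a string and lines that do not. A rough Python equivalent is sketched below; the function name is made up for illustration.

```python
def partition_lines(lines, needle):
    """Split lines into (kept, removed) depending on whether they contain `needle`."""
    kept, removed = [], []
    for line in lines:
        # "kept" mirrors the highlighted lines, "removed" mirrors the inverted selection.
        (kept if needle in line else removed).append(line)
    return kept, removed
```

Unlike pressing delete in the editor, this keeps the removed lines around, so they can be written to a second file and reviewed later if the first filter turns out to be too aggressive.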

To put our new Sublime skills to good use for incident response, let’s take a look at the Catalina log localhost_access_log.2018-12-03.txt. Once we load this file into Sublime, we will set the format type to Lua. The editor should look like the image below:

Upon further analysis, we can see that one IP address (118.24.100.xxx) appears to be interacting with our Tomcat manager. Using the mass cursor and invert selection features, we can quickly remove all the other IP addresses to narrow down our view to just this one IP of interest.
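
When no single IP stands out by eye, a quick count of requests per client against the manager paths can point to a candidate. The sketch below assumes the client address is the first field on each line, the request is the first quoted field, and manager URLs start with “/manager”; the function name is hypothetical.

```python
from collections import Counter


def manager_hits_by_ip(lines, path_prefix="/manager"):
    """Count requests per client IP whose request path starts with `path_prefix`."""
    hits = Counter()
    for line in lines:
        try:
            ip = line.split(" ", 1)[0]
            request = line.split('"')[1]   # e.g. 'GET /manager/html HTTP/1.1'
            path = request.split(" ")[1]
        except IndexError:
            continue  # skip lines that do not follow the assumed layout
        if path.startswith(path_prefix):
            hits[ip] += 1
    return hits
```

Calling hits.most_common(10) on the result lists the busiest clients first, which gives a short list of addresses to isolate with the mass cursor.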

We can clearly see from the logs that this attacker appears to have successfully authenticated to our Tomcat server and deployed a WAR (Web Application Archive) file. A WAR file is a ZIP file and can be extracted with any zip utility. Deploying one results in additional code being placed on the web server, where it is accessible to the attacker. To confirm this WAR file was deployed, we can analyze the related log file called catalina.2018-12-03.log and review the log lines below:

Dec 03, 2018 4:14:28 AM org.apache.catalina.startup.HostConfig deployWAR
INFO: Deploying web application archive C:\xampp\tomcat\webapps\cmd.war

Dec 03, 2018 4:14:28 AM org.apache.catalina.startup.HostConfig deployWAR
INFO: Deployment of web application archive C:\xampp\tomcat\webapps\cmd.war has finished in 218 ms
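
On a busy server, entries like these can be buried among thousands of other Catalina messages. A small Python sketch for pulling out every WAR deployment is shown below. It assumes the timestamp format and message wording seen in the excerpt above, which may differ between Tomcat versions and logging configurations, and the function name is made up for illustration.

```python
import re

# Assumption: timestamps look like "Dec 03, 2018 4:14:28 AM" at the start of a line.
TIMESTAMP = re.compile(r'^(\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M)\b')
# Assumption: deployments are logged as "Deploying web application archive <path>.war".
DEPLOY = re.compile(r'Deploying web application archive (.+?\.war)\s*$')


def find_war_deployments(lines):
    """Return (timestamp, war_path) pairs, using the most recent timestamp seen."""
    deployments = []
    last_timestamp = None
    for line in lines:
        ts = TIMESTAMP.match(line)
        if ts:
            last_timestamp = ts.group(1)
        dep = DEPLOY.search(line)
        if dep:
            deployments.append((last_timestamp, dep.group(1)))
    return deployments
```

Tracking the last timestamp separately means the function works whether the message sits on the same line as the timestamp or on the line after it.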

If we review the contents of this directory “C:\xampp\tomcat\webapps\”, we can see a new folder was created called “cmd”. The image below shows the contents of this folder. This is a typical JSP web shell, deployed by a WAR file.
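
Because a WAR file is a ZIP archive, we can also list what it contains without extracting it, which is safer on an analysis machine. Here is a minimal sketch using Python’s standard zipfile module; the function names are hypothetical, and the check only looks at file extensions, not at what the files actually do.

```python
import zipfile


def list_war_entries(war_path):
    """Return (name, size) for each entry in the archive without extracting anything."""
    with zipfile.ZipFile(war_path) as war:
        return [(info.filename, info.file_size) for info in war.infolist()]


def find_jsp_entries(war_path):
    """Return the names of entries ending in .jsp or .jspx (case-insensitive)."""
    return [name for name, _ in list_war_entries(war_path)
            if name.lower().endswith((".jsp", ".jspx"))]
```

A WAR that holds little more than a single JSP file is worth a closer look, although legitimate applications contain JSP files too, so this is a triage aid rather than a verdict.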


Summary

While analysis of these files is out of scope for this post, we were able to use Sublime Text to quickly rip through these logs, leaving only the log lines of interest behind, and determine that our web server was indeed compromised by an attacker who logged into our Apache Tomcat manager and deployed a WAR file. This WAR file included a web shell that allows execution of arbitrary commands, among many other capabilities. I hope this post was informative. Happy Hunting!

To learn more about web shells, check out my related post:
https://blog.stillztech.com/2018/08/analyzing-and-detecting-web-shells.html

For Medium users, check out the posts on my profile here: https://medium.com/@tstillz17

