Introduction

Apache is part of the LAMP stack of software for Linux (Linux, Apache, MySQL, PHP). Apache is responsible for serving web pages to people looking at your website.

The server works sort of like a door attendant at an apartment building. It grants access for visits to your website, and it keeps an access log. These records, or log files, can be a valuable source of information about your website, usage, and audience.

Prerequisites

  • A Linux computer running Apache web services
  • A user account with root (sudo) access

Tools/Software

  • A terminal window (Ctrl-Alt-T in Ubuntu; in CentOS, press Alt-F2 and run gnome-terminal)
  • cPanel (optional)

Viewing Apache Access Logs

Use cPanel to Download Raw Access Files

If you’re logged in to a web server with cPanel, you can download the Apache access logs through a graphical interface.

Scroll down to the section labeled Metrics.

Click Raw Access.

You should see a page that gives you basic options on when the system should archive or delete the log files.  Below that, it will provide you with the option to download the raw log files.  They will look like standard hyperlinks, labeled for the website you’re managing.

Clicking the hyperlink will prompt you to save or open the file.  These log files are compressed using gzip, so if you’re not using a Linux system, you might need a decompression utility.  Save the file to a location you can remember.

Locate the file in your graphical file manager, then right-click it and choose Extract.  A new file should appear without the .gz extension.  To view the contents, right-click the new file and open it in your favorite text editor.
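If you prefer the terminal, you can read a downloaded log without extracting it first. The following is a minimal sketch; the function name show_log is made up for this example, and it assumes the gzip utility is installed.

```shell
# Print a log file, decompressing it on the fly when the name ends in .gz.
# show_log is a hypothetical helper name, not part of cPanel or Apache.
show_log() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;   # -d decompress, -c write to standard output
        *)    cat "$1" ;;
    esac
}
```

For example, show_log followed by the name of the file you saved, piped into tail -n 20, would print the last 20 lines of the download.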

Using Terminal Commands to Display Local Access Logs

If you’re working on the machine that hosts Apache, or if you’re logged into that machine remotely, you can use the terminal to display and filter the contents of the access logs.  By default, the Apache access log is at one of the following paths, depending on your distribution and Apache version:

/var/log/apache/access.log

/var/log/apache2/access.log

/etc/httpd/logs/access_log

You can use the GUI or the terminal with the cd command to navigate your system to find where the logs are stored.
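Rather than checking each location by hand, you can let the shell test them in turn. This is a sketch only: find_access_log is a hypothetical helper, and it checks just the three paths listed above unless you pass other paths as arguments.

```shell
# Print the first candidate path that exists as a regular file.
# With no arguments, the candidates are the three default paths above.
find_access_log() {
    if [ "$#" -eq 0 ]; then
        set -- /var/log/apache/access.log \
               /var/log/apache2/access.log \
               /etc/httpd/logs/access_log
    fi
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

find_access_log || echo "No access log found in the default locations" >&2
```

On systems where only root can read the log directory, run this from a root shell; otherwise the test may fail even though the file exists.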

Step 1: Display the Last 100 Entries of the Access Log

In the terminal window, enter the following:

sudo tail -100 /var/log/apache2/access.log

The tail command tells the machine to read the last part of the file, and the -100 option directs it to display the last 100 lines (you can also write this as -n 100).  Each line is one log entry.  The final part, /var/log/apache2/access.log, tells the machine where to look for the log file.  If your log file is in a different place, make sure to substitute your machine’s path to the log files.
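If you run this command often, a small wrapper can make the line count adjustable. The sketch below is illustrative; last_entries is a made-up name, and the default of 100 simply mirrors the command above.

```shell
# Show the last N lines of a log file (N defaults to 100).
# Usage: last_entries LOG_FILE [N]
last_entries() {
    log_file=$1
    count=${2:-100}
    tail -n "$count" "$log_file"
}
```

To watch new entries as they arrive instead of reading a fixed number, tail also accepts the -f option, for example sudo tail -f followed by the path to your log.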

Step 2: Display a Specific Term From Access Logs

Sometimes, you only want to display a certain type of entry in the log.  You can use the grep command to filter your report by certain keywords.  For example, enter the following into a terminal:

sudo grep GET /var/log/apache2/access.log

Like the previous command, this reads the /var/log/apache2/access.log file.  The grep command tells the machine to display only the lines that contain the text GET, which in practice means requests made with the HTTP GET method.  You can substitute any other search term.  For example, if you’re looking to monitor access to .jpg images, you could substitute .jpg for GET.  Keep in mind that grep treats the search term as a regular expression, so the dot matches any character; add the -F option to match the text literally.  As before, use the actual path to your server’s log file.
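Two refinements of the grep approach are sketched below: counting matches instead of printing them, and matching a term literally. The function names are hypothetical, and the first one assumes the request line is logged inside double quotes, as in the default log format.

```shell
# Count lines whose request line starts with the given HTTP method.
# Anchoring on the opening quote avoids matches elsewhere in the line.
# Usage: count_method METHOD LOG_FILE
count_method() {
    grep -c "\"$1 " "$2"
}

# Print lines containing a literal string. -F disables regular-expression
# matching, so the dot in ".jpg" only matches a real dot.
# Usage: match_literal TEXT LOG_FILE
match_literal() {
    grep -F -- "$1" "$2"
}
```

For instance, count_method POST with your log path reports how many POST requests were logged, and match_literal .jpg lists only the lines that really contain ".jpg".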

Step 3: View Apache Error Log

In addition to the access log, you can use the previous terminal commands to view the error log. Enter the following command in the Terminal:

sudo tail -100 /var/log/apache2/error.log

If you found your access log file in another location, your error log file will usually be in the same directory, though its name may differ (for example, error_log instead of error.log).  Make sure you type the correct path.
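The grep filtering from Step 2 works on the error log too. The sketch below keeps only the messages logged at one severity level. It assumes the level appears in square brackets, either as [module:level] or as [level]; check a few lines of your own error log first, since the format is configurable. The name filter_level is made up for this example.

```shell
# Print error-log lines logged at the given level (error, warn, notice, ...).
# Assumes the level is written as "[module:level]" or "[level]".
# Usage: filter_level LEVEL LOG_FILE
filter_level() {
    grep -E "\[([A-Za-z0-9_]+:)?$1]" "$2"
}
```

Calling filter_level error with the path to your error log would then show only the entries marked as errors.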

Interpreting the Access Log in Apache

When you open your access log file for the first time, you may feel overwhelmed.  There’s a lot of information about HTTP requests, and some text editors (and the terminal) will wrap the text to the next line.  This can make it confusing to read, but each piece of information is displayed in a specific order.  A common format, known as the Combined Log Format, is defined by this format string:

"%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\""

This format string describes the fields in each line of the log.  Each % directive corresponds to one piece of information:

  • %h – The client’s IP address (the source of the access request).
  • %l – The remote logname, which is the result of checking identd on the client. This entry is usually just a hyphen, meaning no information was retrieved.
  • %u – The client’s user ID, if the access request required HTTP authentication.
  • %t – The timestamp of the incoming request.
  • \"%r\" – The request line that was used. This tells you the HTTP method (GET, POST, HEAD, etc.), the path to what was requested, and the HTTP protocol version being used.
  • %>s – The status code that was returned from the server to the client.
  • %b – The size of the response in bytes, not counting the HTTP headers.
  • \"%{Referer}i\" – The referring page. This tells you if the access came from clicking a link on another website, or other ways that the client was referred to your page.
  • \"%{User-agent}i\" – Information about the entity making the request, such as web browser, operating system, website source (in the case of a robot), etc.

Just read across the line in your log file, and decode each entry as above.  If a field has no information, the log displays a hyphen.  If you’re working on a preconfigured server, your log file may have more or less information.  You can also define a custom log format with the LogFormat and CustomLog directives of the mod_log_config module.  For more information about decoding log formats, see the Apache documentation for mod_log_config.
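Because the fields always appear in the same order, a tool such as awk can pick them out by position. The sketch below assumes the format string shown above and splits each line on whitespace, so %h is field 1, the requested path is field 7, the status code is field 9, and the size is field 10. Lines with a malformed request will not line up. The name summarize_fields is hypothetical.

```shell
# Print "client status bytes path" for each line of an access log.
# Field numbers assume the format string shown above, split on whitespace:
#   $1 = %h, $7 = path inside "%r", $9 = %>s, $10 = %b
summarize_fields() {
    awk '{ print $1, $9, $10, $7 }' "$1"
}
```

This gives a compact view of each request that is easier to scan than the full line.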

How To Use The Data In The Log Files

Apache log analysis gives you the opportunity to measure the ways that clients interact with your website.

For example, you might look at the timestamp to count how many access requests arrive per hour and measure traffic patterns.  You could look at the user ID field (%u) to find out which authenticated users are logging in to a website to access a database or create content.  You could even track failed authentications, which show up as 401 status codes, to monitor attacks against your system.
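Two of these measurements can be sketched with awk. Both functions below are illustrative rather than definitive: their names are made up, and they assume the default field order, where the timestamp is the fourth whitespace-separated field and the status code is the ninth.

```shell
# Count requests per hour. $4 looks like "[10/Oct/2023:13:55:36", so
# characters 2 through 15 give the date and hour ("10/Oct/2023:13").
requests_per_hour() {
    awk '{ hour = substr($4, 2, 14); count[hour]++ }
         END { for (h in count) print h, count[h] }' "$1" | sort
}

# Count 401 responses per client address, busiest client first.
failed_auth_by_client() {
    awk '$9 == 401 { count[$1]++ }
         END { for (ip in count) print count[ip], ip }' "$1" | sort -rn
}
```

Note that requests_per_hour sorts its output as text, which is chronological within a single day but not across months.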

The Apache error log can be used similarly.  Often, it’s simply used to look for requests that ended in a 404 error.  A 404 error happens when a client requests a missing resource, and this can alert you to broken links or other errors within the page.  Whether the error log records these requests depends on the configured log level; the 404 status code itself always appears in the access log.  The error log can also be used to find configuration glitches or even warnings about potential server problems.
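Since every access log entry carries its status code (%>s), one way to find broken links is to list the paths that most often returned 404. The sketch below assumes the default field order (path in field 7, status in field 9); top_missing is a hypothetical name, and the default of ten results is arbitrary.

```shell
# List the most frequently requested missing paths (status 404),
# most frequent first. Usage: top_missing LOG_FILE [HOW_MANY]
top_missing() {
    awk '$9 == 404 { print $7 }' "$1" |
        sort | uniq -c | sort -rn | head -n "${2:-10}"
}
```

Each output line shows a count followed by the path, so the first line is the missing resource that clients ask for most often.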

Conclusion

This guide showed how to find, view, and filter Apache access and error log files.

The access.log file is an excellent resource for measuring the ways that clients are interacting with your server. The error.log file can help you troubleshoot issues with your website.