Introduction

Elastic Stack, formerly known as the ELK stack, is a popular suite of tools for viewing and managing log files. It is open-source software that you can download and use for free (fee-based and cloud-hosted versions are also available).

This tutorial introduces basic ELK Stack usage and functionality.

Prerequisites

  • A system with Elasticsearch installed

What is the ELK Stack?

ELK stands for Elasticsearch, Logstash, and Kibana. In earlier versions of the suite, these were its three core components:

  • Elasticsearch – The core component of ELK. It works as a searchable database for log files.
  • Logstash – A pipeline to retrieve data. It can be configured to retrieve data from many different sources and then to send to Elasticsearch.
  • Kibana – A visualization tool. It uses a web browser interface to organize and display data.

In newer versions of Elastic Stack, additional software packages called Beats are used. These are smaller data collection applications, specialized for individual tasks.

There are many different Beats applications. For example, Filebeat is used to collect log files, while Packetbeat is used to analyze network traffic.


Note: Need to install the ELK stack to manage server log files? Follow this step-by-step guide and set up each layer of the stack (Elasticsearch, Logstash, and Kibana) on CentOS 8 or Ubuntu 18.04 / 20.04.


How Does Elastic Stack Work?

First, a computer or server creates log files. All computers have log files that document events on the system. Some systems, such as server clusters, generate massive amounts of log data. Elastic Stack is designed to help manage log data at scale.

Next, the log files are collected by a Beats application. Different Beats reach out to different parts of the server and read the log files. (Some setups skip Beats and have Logstash collect data from the sources directly.)

Then Logstash is configured to reach out and collect data from the different Beats applications (or directly from various sources). In larger configurations, Logstash can collect from multiple systems, and filter and collate the data into one location.

Elasticsearch stores the data in a scalable, searchable database. It is the warehouse into which Logstash pipes all the data.

Finally, Kibana provides a user-friendly interface for you to review the data that’s been collected. It is highly configurable, so you can adjust the metrics to fit your needs. It also provides graphs and other tools to visualize and interpret patterns in the data.
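
To make that flow concrete, here is a minimal Python sketch that does by hand what Beats, Logstash, and Kibana automate: it ships one structured log event to Elasticsearch over the REST API, then searches for it. It assumes a local Elasticsearch instance on the default port 9200, and the index name server-logs is purely illustrative.

    import json
    import requests

    ES = "http://localhost:9200"   # assumes a local Elasticsearch instance

    # Ship one structured log event (the step Beats and Logstash automate).
    event = {"host": "web-01", "level": "ERROR", "message": "disk usage at 95%"}
    requests.post(f"{ES}/server-logs/_doc", json=event)

    # Refresh so the document is searchable immediately, then query it.
    # Kibana issues searches like this behind its dashboards.
    requests.post(f"{ES}/server-logs/_refresh")
    query = {"query": {"match": {"level": "ERROR"}}}
    hits = requests.post(f"{ES}/server-logs/_search", json=query).json()
    print(json.dumps(hits["hits"]["hits"], indent=2))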

Supporting Applications

Some users configure additional third-party applications to enhance Elastic Stack.

Apache Kafka

Kafka is a distributed platform for real-time event streaming, which means it can read from multiple data sources at once. Kafka helps prevent data loss or interruption while streaming data quickly.


Note: Working with a Kubernetes cluster? Learn how to set up Apache Kafka on Kubernetes.


RabbitMQ

RabbitMQ is a messaging platform. Elastic Stack users might use this software to build a stable, buffered queue of log files.

Nginx

Nginx is best known as a web server, but it can also be set up as a reverse proxy. It can be used to manage network traffic or to create a security buffer between your server and the internet.

Elasticsearch Overview

Elasticsearch is the core of the Elastic Stack. It has two main jobs:

      • Storage and indexing of data.
      • Search engine to retrieve data.

Technical Elasticsearch details include:

      • Robust programming language support for clients (Java, PHP, Ruby, C#, Python).
      • Uses a REST API – Elasticsearch communicates over HTTP with JSON, so applications written for it integrate easily with web applications.
      • Responsive results – Users see data almost in real time.
      • Distributed architecture – Elasticsearch can run and connect between many different servers. The Elastic Stack can scale easily as infrastructure grows.
      • Inverted indexing – Elasticsearch indexes by keywords, much like the index in a book. This speeds up queries against large data sets (see the toy example after this list).
      • Shards – If your data set is too large for one server, Elasticsearch can break it into subsets called shards.
      • Non-relational (NoSQL) – Elasticsearch uses a non-relational data store, free of the constraints of structured/tabular storage.
      • Apache Lucene – The underlying search library on which Elasticsearch is built.
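
To see why inverted indexing is fast, here is a toy Python version of the idea (not Elasticsearch's actual implementation): rather than scanning every document for a keyword, the index maps each keyword to the documents containing it, so a query becomes a simple lookup.

    from collections import defaultdict

    docs = {
        1: "error disk full on web-01",
        2: "login failed for admin",
        3: "disk replaced on web-01",
    }

    # Build the inverted index: keyword -> set of document IDs.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[word].add(doc_id)

    # A query is now a dictionary lookup, not a scan of every document.
    print(index["disk"])    # {1, 3}
    print(index["login"])   # {2}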

Note: If you are working on an Ubuntu system, take a look at How to Install Elasticsearch on Ubuntu 18.04.


Logstash Overview

Logstash is a tool for gathering and sorting data from different sources. Logstash can reach out to a remote server, collect a specific set of logs, and import them into Elasticsearch.

It can sort, filter, and organize data. It also ships with several default configurations, or you can create your own. This is especially helpful for restructuring data into a uniform, readable format.
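
As a rough sketch of the kind of restructuring Logstash performs (in practice you configure it with pipeline files rather than writing code), the Python snippet below parses a raw web-server log line into uniform, named fields, similar in spirit to a Logstash grok filter. The pattern is simplified for illustration.

    import re

    # Simplified pattern for a common web-server access-log line.
    LOG_PATTERN = re.compile(
        r'(?P<client>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
        r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3})'
    )

    raw = '203.0.113.9 - - [10/Oct/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200'

    match = LOG_PATTERN.match(raw)
    if match:
        event = match.groupdict()    # {"client": "203.0.113.9", ...}
        event["status"] = int(event["status"])
        print(event)                 # uniform fields, ready for Elasticsearch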

Logstash technical features:

      • Accepts a wide range of data formats and sources – This helps consolidate different data sets into one central location.
      • Manipulates data in real-time – As data is read from sources, Logstash analyzes it and restructures it immediately.
      • Flexible output – Logstash is built for Elasticsearch, but like many open-source projects, it can be reconfigured to export to other utilities.
      • Plugin support – A wide range of add-ons is available to extend Logstash's features.

Kibana Overview

You can use Elasticsearch from the command line as soon as it is installed. But Kibana gives you a graphical interface to generate and display data more intuitively.

Here are some of the technical details:

      • Dashboard interface – Configure graphs, data sources, and at-a-glance metrics.
      • Configurable menus – Build data visualizations and menus to quickly navigate or explore data sets.
      • Plug-ins – Plug-ins such as Canvas let you add structured views and real-time monitoring to your graphical interface.

Beats Overview

Each Beat runs on the system it monitors. It collects data and ships it to a destination, such as Logstash or Elasticsearch.

You can use Beats to import data directly into Elasticsearch if you're working with a smaller data set.

Alternatively, Beats can break data up into manageable streams and send them to Logstash for parsing before the data is indexed in Elasticsearch.

For a single server, you can install Elasticsearch, Kibana, and a few Beats. Each Beat collects data and sends it to Elasticsearch. You then view the results in Kibana.

Alternatively, you can install Beats on several remote servers, then configure Logstash to collect and parse the data from the servers. That data is sent to Elasticsearch and then becomes visible in Kibana.
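
Conceptually, a Beat is a small agent that follows a data source and ships new entries onward. The Python sketch below mimics a bare-bones file shipper; real Beats are far more robust (they handle restarts, back-pressure, and structured parsing). The Elasticsearch address, the shipped-logs index name, and the log file path are placeholders for illustration.

    import time
    import requests

    ES = "http://localhost:9200"
    LOGFILE = "/var/log/syslog"     # example path; adjust for your system

    # Follow the file and ship each new line, roughly what Filebeat does.
    # Stop with Ctrl-C.
    with open(LOGFILE) as f:
        f.seek(0, 2)                # start at the end, like `tail -f`
        while True:
            line = f.readline()
            if not line:
                time.sleep(1.0)     # wait for new log entries
                continue
            requests.post(f"{ES}/shipped-logs/_doc",
                          json={"message": line.rstrip()})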

There are different Beats applications. You can install one or multiple Beats.

      • Filebeat – Reads and ships system logfiles. It is useful for server logs, such as hardware events or application logs.
      • Metricbeat – Reads metric data – CPU usage, memory, disk usage, network bandwidth. Use this as a supercharged system resource monitor.
      • Packetbeat – Analyzes network traffic. Use it to monitor latency and responsiveness, or usage and traffic patterns.
      • Winlogbeat – Ships data from the Windows Event Log. Track logon events, installation events, even hardware or application errors.
      • Auditbeat – A supercharged version of Linux auditd. It can interact directly with your Linux system in place of the auditd process. If you already have auditd rules in place, Auditbeat will read from your existing configuration.
      • Heartbeat – Displays uptime and response time. Use this to keep an eye on critical servers or other systems, to make sure they’re running and available.
      • Functionbeat – Ships data from serverless or cloud infrastructure. If you’re running a cloud-hosted service, use it to collate data from the cloud and export it to Elasticsearch.

ELK Stack Use Cases

So, what’s the point of collecting, sorting, and displaying all this data? Log monitoring gives insight into how your resources are performing.

      • For example, data might indicate that one server is overloaded with traffic. You could implement a load-balancing application (like Nginx) to shift traffic to other servers.
      • Alternatively, you might find that network traffic patterns favor some web pages more strongly than others. This helps you discover which web pages are more relevant or useful to visitors.
      • You might monitor user account activity. A recurring pattern of a user attaching storage devices might indicate an account permissions problem. Multiple failed login attempts might indicate an attempt to breach the system.
      • If you’re deploying a new application, you can monitor errors for that application. These can help quickly identify areas to fix bugs or improve application design.

Elastic Stack generates data that you can use to solve problems and make sound business decisions.

Conclusion

In this ELK Stack tutorial, you learned the basics of the Elastic Stack: what it is, how it works, and what its components do.

Review the use cases, and consider trying out the ELK stack.

