Make monitoring easier by adding an ELK stack to your dev environment

Kublr Team · May 16, 2017

Today there’s little debate about the importance of monitoring for every deployment, whether in your production or development environment. However, monitoring can turn into a full-time job. A smaller-scale monitoring stack can help, adding value to the development process and visibility to test cycles, while giving you a clear view of your load and performance test results.

Whether you’re deploying your solution on a vCenter or a public cloud provider, this tutorial will help you make monitoring easier in your Kubernetes environment. We’ll learn how to monitor logs with an Elasticsearch + Logstash + Kibana (ELK) stack, deployed for dev as a single pod in Kubernetes. This solution can serve one or more of your environments, provided you are comfortable using Elasticsearch indexes.

Getting Started

We’ll be deploying this stack in the cloud on Amazon Web Services (AWS). Before you start, ensure you have the following prerequisites:

  • An AWS account.
  • A dev machine with Kubernetes installed and kubectl available.
  • AWS CLI installed and configured with your access and secret keys. Alternatively, you can use IAM roles. In this tutorial, we’ll be using credentials.

We’ll be using Kubernetes v1.5.2, since deploying a cluster (master and minions) is straightforward with kube-up. (Kube-up is no longer supported in Kubernetes v1.6).

Setting Up the Dev Machine

We’re using an AWS Ubuntu 16.04 LTS t2.micro instance for this tutorial. You can use a PC, but note that this tutorial is Unix-oriented.

Install Kubernetes

For the Kubernetes install location, consider using a standard location such as /opt. However, if you don’t want to use sudo, you can install in your home or preferred work directory.

export RELEASE=v1.5.2
wget https://github.com/kubernetes/kubernetes/releases/download/$RELEASE/kubernetes.tar.gz -O ~/kubernetes.tar.gz
tar xvfz ~/kubernetes.tar.gz

Install and Configure awscli

Refer to the instructions for installing the AWS command line interface.

For the awscli configuration, you’ll need your access key and secret key. Refer to the AWS documentation for information on managing access keys for your AWS account.

Note: If you are not comfortable using key credentials, you can use an AWS dev instance with IAM roles. The role’s policies should provide full access to create, modify, and delete resources in the EC2, S3, Route53, IAM, and VPC services.

To configure, run:

$ aws configure

and enter your credentials when prompted.
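The prompts look like the following. The key values shown here are placeholders, and the region should match where you plan to deploy:

$ aws configure
AWS Access Key ID [None]: AKIAXXXXXXXXXXXXXXXX
AWS Secret Access Key [None]: xXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxX
Default region name [None]: sa-east-1
Default output format [None]: json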

Finally, create a bucket in S3. You can choose any name, but make a note of it, as you’ll need it later. Kubernetes uses this bucket to store your cluster state. Refer to the AWS documentation for details on creating a bucket.
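You can also create the bucket from the command line with the awscli; the bucket name below is a placeholder, and the region should match the one you will export in the next section:

$ aws s3 mb s3://YOUR_BUCKET_NAME --region sa-east-1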

Getting Started With the Cluster

Our cluster will consist of one master and one minion, and we’ll pass the configuration via environment variables.

On your shell, export the following variables:

export KUBERNETES_PROVIDER=aws
export KUBERNETES_SKIP_DOWNLOAD=1
export KUBE_AWS_ZONE=sa-east-1a # Choose whatever availability zone you want
export KUBE_OS_DISTRIBUTION=jessie # Debian
export AWS_S3_REGION=sa-east-1
export AWS_S3_BUCKET=[YOUR_BUCKET_NAME]
export NODE_ROOT_DISK_SIZE=10
export INSTANCE_PREFIX=kube
export NUM_NODES=1
export MASTER_SIZE=t2.large
export NODE_SIZE=t2.large

Note: According to the Kubernetes documentation, the minimal instance size to run applications in AWS is t2.large. This instance size is not free-tier-eligible, so you will be charged for the use of this kind of instance. You can attempt to use a smaller instance, but a smaller instance may not allow you to complete this tutorial.

Deploying the Kubernetes Cluster

Run the following command, and then wait for the return of the command prompt:

$ wget -q -O - https://get.k8s.io | bash

The following output should display:

Creating a kubernetes on aws…
… Starting cluster in sa-east-1a using provider aws
… calling verify-prereqs
… calling verify-kube-binaries
… calling kube-up
Starting cluster using os distro: jessie
Uploading to Amazon S3
+++ Staging server tars to S3 Storage: kublr-blog/devel
.
.
.
Kubernetes binaries at /opt/kubernetes/cluster/
You may want to add this directory to your PATH in $HOME/.profile
Installation successful!
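As the output suggests, you may want to put the installation directories on your PATH so kubectl and the cluster helper scripts are easy to invoke. The exact paths depend on where you unpacked the release; this sketch assumes the /opt location shown in the output above:

export PATH=$PATH:/opt/kubernetes/cluster:/opt/kubernetes/client/bin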

Let’s take a moment to review what we’ve done:

  • We’ve configured our dev machine (your computer or an AWS instance).
  • We’ve installed Kubernetes and configured the awscli.
  • We overrode the default Kubernetes configuration with environment variables and launched the deployment.

We now have a functional Kubernetes cluster running, and we’re ready to deploy applications and the ELK stack.

Of the three object management techniques that Kubernetes offers, we are going to follow imperative object configuration and create a Deployment from a YAML file. This technique will create a ReplicaSet that will in turn manage our pod. (ReplicaSets are the next-generation, and largely equivalent, replacement for Replication Controllers.)

Create a new YAML file, and add the following content. In this example, we are creating elk.yml.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: elk
spec:
  replicas: 1
  selector:
    matchLabels:
      stack: elk
  template:
    metadata:
      labels:
        app: elk
        stack: elk
    spec:
      containers:
      - name: elasticsearch
        image: elasticsearch
        ports:
        - containerPort: 9200
      - name: kibana
        image: kibana
        ports:
        - containerPort: 5601
        env:
        - name: ELASTICSEARCH_URL
          value: "http://localhost:9200"
      - name: logstash
        image: logstash
        ports:
        - containerPort: 8080
        args:
        - "-e"
        - "input { http { } } output { elasticsearch { hosts => ['localhost:9200'] } }"

This deployment will create a single pod (replicas: 1) with three containers.

Here are a few important notes about this process:

  • Ports are exposed in each container but are not mapped to the node; since the containers live in the same pod, they can communicate with each other directly.
  • The pod itself is automatically assigned an IP and is therefore seen as an endpoint. This means the containers in the same pod are considered one service and can talk to each other through localhost, as seen in the Elasticsearch endpoint in the Logstash configuration args.
  • Since we are not using an example application in this tutorial, we use Logstash’s HTTP input to insert some test data and visualize it in Kibana.

Enter the following:

$ kubectl create -f elk.yml
deployment "elk" created

Our deployment is created.

Next, we can run some checkpoints:

$ kubectl get deploy
NAME      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
elk       1         1         1            1           44m
$ kubectl get rs
NAME             DESIRED   CURRENT   READY   AGE
elk-4061787054   1         1         1       44m
$ kubectl get pod
NAME                   READY     STATUS    RESTARTS   AGE
elk-4061787054-zp8kp   3/3       Running   0          44m

Note that the pod name ends in a string of random characters. Make a note of this output, as we will need the pod’s name in a later step.

Provided your output looks like the previous examples, the stack is alive and you can proceed. Confirm that the deployment appears as AVAILABLE (1) and that 3/3 containers are running in the pod with very few or zero restarts.
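If a container fails to start or keeps restarting, its logs are the first place to look. For example, to inspect the elasticsearch container of the pod (substitute your pod’s name):

$ kubectl logs [POD_NAME] -c elasticsearch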

Want a stress-free K8S cluster management experience? Download our demo, Kublr-in-a-Box.

Expose the Stack

The following steps in this tutorial can vary depending on your deployment and your needs. We will expose the deployment to the cluster and make Kibana public so it is accessible through the Internet.

First, expose the deployment to the cluster by executing:

$ kubectl expose deployment/elk

This creates a service which is a network interface for the pod(s). You can see it by executing kubectl get svc:

$ kubectl get svc
NAME         CLUSTER-IP    EXTERNAL-IP   PORT(S)                      AGE
elk          10.0.163.83   <none>        9200/TCP,5601/TCP,8080/TCP   4s
kubernetes   10.0.0.1      <none>        443/TCP                      2d

Our services stack is now available to the Kubernetes cluster.

Next, we will expose the Kibana service so the dashboard is accessible to the admins. In this case, we will expose the service via LoadBalancer and use the ELB address to access Kubernetes’ public services. Alternatively, you can expose the service directly on the public IP of the instance and manage your own domain names for accessing the service.

Let’s create a service by exposing the previous service that maps port 80 of the ELB to the Kibana port of the elk service:

$ kubectl expose svc elk --port=80 --target-port=5601 --type=LoadBalancer --name=elk-public
service "elk-public" exposed

You can see it by executing:

$ kubectl get svc
NAME         CLUSTER-IP    EXTERNAL-IP      PORT(S)                      AGE
elk          10.0.163.83   <none>           9200/TCP,5601/TCP,8080/TCP   2m
elk-public   10.0.17.28    ad91ca2342146…   80/TCP                       6s
kubernetes   10.0.0.1      <none>           443/TCP                      2d

Now, you will want to access Kibana from your browser. To get the ELB address, you can go to the AWS console or describe the service and look for the line specifying the address:

$ kubectl describe svc elk-public

LoadBalancer Ingress:
a4df05b302a8611e7a3190xxxxxxxxxx-1209904062.sa-east-1.elb.amazonaws.com
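If you prefer a one-liner, a jsonpath query pulls out just the hostname, which is handy in scripts:

$ kubectl get svc elk-public -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'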

Be patient if the page does not load immediately; it takes some time for the binding between the ELB and the instance to reach the In Service state.
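If you would rather check from the command line than the AWS console, the classic ELB API reports the instance state. The load balancer name is the first segment of the ELB hostname shown above:

$ aws elb describe-instance-health --load-balancer-name [ELB_NAME]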

We are now up and running. Kibana is exposed for monitoring, and Logstash and Elasticsearch are available for receiving logs. We’ll now test the stack to see it in action.

Testing the Stack

First, we’ll insert some logs.

Remember that we declared Logstash’s input to be an HTTP API, so to insert a test log we need to access a node or container of the Kubernetes cluster and use localhost on port 8080. To log in to a container of the deployed pod, enter:

$ kubectl exec -it [POD_NAME] -- /bin/bash

You should get a prompt from a random container of your pod that looks similar to root@elk-xxxx$.

Next, run the following command to insert a test log into Logstash:

$ curl -XPUT 'http://localhost:8080/twitter/tweet/1' -d 'This is a test log'
ok

This should return an “ok”, which may appear tucked against the beginning of your next prompt.
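Before moving to the browser, you can confirm from the same shell that the log reached Elasticsearch. Logstash writes to daily logstash-* indexes by default:

$ curl 'http://localhost:9200/_cat/indices?v'
$ curl 'http://localhost:9200/logstash-*/_search?q=test&pretty'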

Access Kibana in the browser via the ELB URL. In the Management section, create an index pattern with the default config (logstash-*) so Kibana reads from that index in Elasticsearch. (You may need to deselect the Index contains time-based events option.)

Click Create, and go to the Discover section of the dashboard. You should see only the existing index selected and only one log — the log we inserted in the earlier step.

Now, you can customize your stack to better fit your needs. For example, you can change the Logstash configuration to input data from syslog, etc. You can also build your own custom images to minimize the external configuration, and you can expose the deployments or services using the resources you have available.
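As a sketch of one such customization, the Logstash pipeline could listen for syslog instead of HTTP by changing the args in the deployment. The port here is an arbitrary choice, and the matching containerPort would need to change as well:

logstash -e "input { syslog { port => 5514 } } output { elasticsearch { hosts => ['localhost:9200'] } }"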

Here’s an important point to remember from this tutorial: when containers are tied to each other in this manner, a pod can run several containers and serve more than one application.

You can further improve on this approach with good practices in security, performance, and usability. Consider these suggestions and potential next steps:

  • This tutorial focused on monitoring dev environments, but you should consider using this stack in every environment. You’ll want to analyze your requirements and develop capacity plans for each phase (dev, staging, prod) and its phase-specific configuration. Consider using Kubernetes ConfigMaps to abstract the deployment files and even the custom images from the different configurations; a sketch follows this list. Refer to the Kubernetes documentation on configuring containers using a ConfigMap for more information.
  • The Logstash container is in this pod as a proof of concept. If you use Logstash, you should run a log forwarder on each node of your solution (or at least on the nodes running the applications you want to monitor). DaemonSets are the Kubernetes solution for this type of implementation: they run a copy of a pod on every node, or on the nodes filtered by selectors. You can find more information in the Kubernetes documentation on DaemonSets.
  • If you are exposing Kibana to the Internet, you may want to use TLS for secure communication; setting this up on an AWS load balancer is straightforward. Alternatively, if you prefer to keep everything private, a good solution (besides TLS) is a bastion host with SSH tunneling for accessing the service(s) of your cluster.
  • Since Elasticsearch is a storage service, you will want to keep its data in volumes mounted from your computing infrastructure. For larger implementations, use a multi-node approach with a replication factor for reliability and high availability.
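As a minimal sketch of the ConfigMap suggestion in the first bullet, you could keep the Logstash pipeline in a file and load it into the cluster, then mount it into the container instead of passing -e args; the file and ConfigMap names here are hypothetical:

$ kubectl create configmap logstash-config --from-file=logstash.conf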

We hope this tutorial makes your monitoring efforts a bit easier. Share your thoughts and questions in the comments section below.

Need a user-friendly tool to set up and manage your K8S cluster? Check out Kublr-in-a-Box. To learn more, visit kublr.com.
