Prometheus Installation to Monitor a Server

Prometheus is an open source monitoring system and time series database. It addresses many aspects of monitoring, such as the generation and collection of metrics, graphing the resulting data on dashboards, and alerting on anomalies. One of its main features is a multi-dimensional data model in which time series are identified by a metric name and a set of key/value labels.

System Update


Before installing Prometheus, bring the installed packages up to date.
$ sudo yum update

Downloading Prometheus

Once your server is ready with all of the updates, proceed to the installation of Prometheus. First we will download its latest available package from GitHub. We will use the ‘curl’ command to download Prometheus into a newly created directory with the following commands.
$ mkdir ~/backup
$ cd ~/backup
$ curl -LO "https://github.com/prometheus/prometheus/releases/download/0.17.0/prometheus-0.17.0.linux-amd64.tar.gz"

Prometheus Installation


We will create a new directory to install Prometheus into, as it is best practice to keep all of the Prometheus components within one parent directory. Run the commands below to create the new directory in the home directory of your current user and extract the Prometheus package into it.
$ mkdir ~/Prometheus
$ cd ~/Prometheus
$ tar -zxvf ~/backup/prometheus-0.17.0.linux-amd64.tar.gz
Now run the following command to verify the installation and check the Prometheus version (the output also shows the Go version it was built with).
$ ~/Prometheus/prometheus-0.17.0.linux-amd64/prometheus -version


Installation of Node Exporter

Node Exporter is the Prometheus exporter for machine metrics. It is written in Go, uses pluggable metric collectors, and exports a wide range of metrics, such as disk I/O statistics, memory usage, network statistics, and CPU load, in a format that Prometheus understands.
The installation package for Node Exporter is also available on GitHub and can be downloaded from the Prometheus Node Exporter releases page. Copy the source link address and download the package using the ‘curl’ command.
$ cd ~/backup/
$ curl -LO "https://github.com/prometheus/node_exporter/releases/download/0.12.0rc3/node_exporter-0.12.0rc3.linux-amd64.tar.gz"
Then extract it with the ‘tar’ command into a new ‘node_exporter’ directory under ~/Prometheus using the following commands.
$ mkdir ~/Prometheus/node_exporter
$ cd ~/Prometheus/node_exporter/
$ tar -zxvf ~/backup/node_exporter-0.12.0rc3.linux-amd64.tar.gz 

Starting Node Exporter

Execute node_exporter from the directory where it was extracted to start its service, as shown below.
$ ./node_exporter

Node Exporter As a Service

Node Exporter has been placed under the home directory of our sudo user, in the ‘~/Prometheus/node_exporter’ directory. Now we are going to configure it as a systemd service so that we can easily start and stop Node Exporter when required.

$ sudo vim /etc/systemd/system/node_exporter.service
[Unit]
Description=Node Exporter
After=network.target

[Service]
# Replace 'vxx' with the user that owns the ~/Prometheus directory
User=vxx
ExecStart=/home/vxx/Prometheus/node_exporter/node_exporter

[Install]
WantedBy=default.target
Then reload the systemd daemon, enable the service, and start it (or reboot the server).
$ sudo systemctl daemon-reload
$ sudo systemctl enable node_exporter.service
$ sudo systemctl start node_exporter.service
$ sudo systemctl status node_exporter.service
Once the node_exporter service is running, open your browser to view Node Exporter’s web interface at the link below.
http://your_servers_ip:9100/metrics
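You can also confirm from the command line that metrics are being exposed. The commands below are only an illustrative check; the exact metric names depend on the Node Exporter version and the collectors it enables.
$ curl -s http://localhost:9100/metrics | head
$ curl -s http://localhost:9100/metrics | grep '^node_load'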

Starting Prometheus Server

We are now ready to start the Prometheus server. First, create a new configuration file in the Prometheus directory with the following content.
$ cd ~/Prometheus/prometheus-0.17.0.linux-amd64/
$ vim prometheus.yml


global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    target_groups:
      - targets: ['localhost:9090']

  # Scrape the Node Exporter we started earlier.
  - job_name: 'node'

    scrape_interval: 5s

    target_groups:
      - targets: ['localhost:9100']
Save the changes and close the file using ‘:wq!’.
This configuration defines two jobs under ‘scrape_configs’: ‘prometheus’, which scrapes the Prometheus server itself, and ‘node’, which scrapes the Node Exporter (the job names can be anything you wish). Then start the Prometheus server as a background process and redirect its output to a log file using the following command.
$ nohup ./prometheus > prometheus.log 2>&1 &
To view the logs, you can use the command below.
$ tail ~/Prometheus/prometheus-0.17.0.linux-amd64/prometheus.log
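Before opening the web console, you can also confirm from the command line that the server is up; Prometheus serves its own metrics over HTTP (the port assumes the default configuration above).
$ curl -s http://localhost:9090/metrics | head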

Prometheus Web Access

Now open your favourite web browser to access the Prometheus web console using your server’s IP address or FQDN as shown below.
http://your_servers_ip:9090
To make sure that the Prometheus server is fetching the data from the Node Exporter, click on ‘Graph’, insert any metric chosen from the drop-down, then click on the ‘Execute’ button to plot it.
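As an illustration, the expressions below can be typed into the query field. They assume the ‘node’ scrape job configured above and the metric names exposed by this Node Exporter release, so adjust them if your version uses different names.
node_load1
rate(node_cpu[1m])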
You can view the most commonly used metrics from the console templates that are available under the following path.
$ ls ~/Prometheus/prometheus-0.17.0.linux-amd64/consoles
http://your_servers_ip:9090/consoles/node.html
If you need any help with the configuration, please don't forget to comment.

Docker Swarm Mode Walkthrough

Service Discovery In The Swarm Cluster

The old (standalone) Swarm required a service registry so that all of its managers could have the same view of the cluster state. When instantiating the old Swarm nodes, we had to specify the address of a service registry. However, if you take a look at the setup instructions for the new Swarm (Swarm Mode, introduced in Docker 1.12), you’ll notice that nothing is required beyond Docker Engines. You will not find any mention of an external service registry or a key-value store.
Does that mean that Swarm does not need service discovery? Quite the contrary. The need for service discovery is as strong as ever, and Docker decided to incorporate it inside Docker Engine. It is bundled inside, just as Swarm is. The internal process is, essentially, still very similar to the one used by the standalone Swarm, only with fewer moving parts. Docker Engine now acts as a Swarm manager, a Swarm worker, and a service registry.
The decision to bundle everything inside the engine provoked a mixed response. Some thought that such a decision creates too much coupling and increases Docker Engine’s level of instability. Others think that such a bundle makes the engine more robust and opens the door to some new possibilities. While both sides have valid arguments, I am more inclined towards the opinion of the latter group. Docker Swarm Mode is a huge step forward, and it is questionable whether the same result could be accomplished without bundling the service registry inside the engine.
It is important to note that the service registry bundled inside the engine is for internal use only. We can neither access it nor use it for our own purposes.
Knowing how Docker Swarm works, especially networking, the question that might be on your mind is whether we need service discovery (beyond Swarm’s internal usage). In The DevOps 2.0 Toolkit, I argued that service discovery is a must and urged everyone to set up Consul or etcd as service registries, Registrator as a mechanism to register changes inside the cluster, and Consul Template or confd as a templating solution. Do we still need those tools?

Do We Need Service Discovery?

It is hard to provide a general recommendation on whether service discovery tools are needed when working inside a Swarm cluster. If we look at the need to find services as the main use case for those tools, the answer is usually no. We don’t need external service discovery for that. As long as all services that should communicate with each other are inside the same network, all we need is the name of the destination service. For example, all that the go-demo service needs to know to find the related database is that its DNS name is go-demo-db. The Docker Swarm Networking And Reverse Proxy chapter proved that proper networking usage is enough for most use cases.
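As a minimal sketch of that idea (the go-demo and go-demo-db service names come from the example above, while the images and the DB environment variable are assumptions made here for illustration), the database becomes reachable by its service name as soon as both services share an overlay network.
$ docker network create --driver overlay go-demo
$ docker service create --name go-demo-db --network go-demo mongo
$ docker service create --name go-demo --network go-demo -e DB=go-demo-db vfarcic/go-demo
Inside the go-demo containers, the name go-demo-db is resolved by Swarm’s internal DNS, so no external registry is involved.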
However, finding services and load balancing requests among them is not the only reason for service discovery. We might have other usages of service registries or key-value stores. We might have a need to store some information in a way that it is distributed and fault tolerant.
An example of the need for a key-value store can be seen inside the Docker Flow: Proxy project. It is based on HAProxy which is a stateful service. It loads the information from a configuration file into memory. Having stateful services inside a dynamic cluster represents a challenge that needs to be solved. Otherwise, we might lose a state when a service is scaled, rescheduled after a failure, and so on.

Problems When Scaling Stateful Instances

Scaling services inside a Swarm cluster is easy, isn’t it? Just execute docker service scale SERVICE_NAME=NUMBER_OF_INSTANCES and, all of a sudden, the service is running multiple copies.
The previous statement is only partly true. The more precise wording would be that scaling stateless services inside a Swarm cluster is easy.
The reason that scaling stateless services is easy lies in the fact that there is no state to think about. An instance is the same no matter how long it runs. There is no difference between a new instance and one that has run for a week. Since the state does not change over time, we can create new copies at any given moment, and they will all be exactly the same.
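For example (using the hypothetical go-demo service from the earlier sketch), scaling up and verifying the result takes two commands.
$ docker service scale go-demo=5
$ docker service ps go-demo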
However, the world is not stateless. State is an unavoidable part of our industry. As soon as the first piece of information is created, it needs to be stored somewhere. The place we store data must be stateful. It has a state that changes over time. If we want to scale such a stateful service, there are, at least, two things we need to consider.
  1. How do we propagate a change of state in one instance to the rest of the instances?
  2. How do we create a copy (a new instance) of a stateful service and make sure that its state is copied as well?
We usually combine stateless and stateful services into one logical entity. A back-end service could be stateless and rely on a database service as external data storage. That way, there is a clear separation of concerns and a different lifecycle for each of those services.
Before we proceed, I must state that there is no silver bullet that makes stateful services scalable and fault-tolerant. Throughout the book, I will go through a couple of examples that might, or might not, apply to your use case. An obvious, and very typical, example of a stateful service is a database. While there are some common patterns, almost every database provides a different mechanism for data replication. That, in itself, is enough to prevent us from having a definitive answer that would apply to all. We’ll explore the scalability of MongoDB later on in the book. We’ll also see an example with Jenkins that uses a file system for its state.
The first case we’ll tackle will be of a different type. We’ll discuss scalability of a service that has a state stored in its configuration file. To make things more complicated, the configuration is dynamic. It changes over time throughout the lifetime of the service. We’ll explore ways to make HAProxy scalable.
If we used the official HAProxy image, one of the challenges we would face is how to update the state of all the instances. How do we change the configuration and reload each copy of the proxy?
We could, for example, mount an NFS volume on each node in the cluster and make sure that the same host volume is mounted inside all HAProxy containers. At first glance, it might seem that this would solve the problem with the state, since all instances would share the same configuration file. Any change to the config on the host would be available inside all the instances. However, that, in itself, would not change the state of the service.
HAProxy loads the configuration file during initialization and is oblivious to any changes we might make to the configuration afterward. For a change in the file to be reflected in the state of the service, we need to reload it. The problem is that instances can run on any of the nodes inside the cluster. On top of that, if we adopt dynamic scaling (more on that later on), we might not even know how many instances are running. So, we’d need to discover how many instances we have, find out which nodes they are running on, get the ID of each of the containers, and, only then, send a signal to reload the proxy. While all of this can be scripted, it is far from an optimal solution. Moreover, mounting an NFS volume is a single point of failure. If the server that hosts the volume fails, data is lost. Sure, we can create backups, but they would only provide a way to restore lost data partially. That is, we can restore a backup, but the data generated between the moment the last backup was created and the node failure would be lost.
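As a rough sketch of that kind of scripting (not how Docker Flow: Proxy works), something like the following would have to run on every node; it assumes the proxy service is named proxy and that the image reloads its configuration when its main process receives a HUP signal.
$ docker ps -q --filter "label=com.docker.swarm.service.name=proxy" | xargs -r docker kill --signal HUP
Even then, the command only covers the containers on the node it runs on, which is exactly why the text above calls this far from an optimal solution.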
An alternative would be to embed the configuration into HAProxy images. We could create a new Dockerfile that would be based on haproxy and add the COPY instruction that would add the configuration. That would mean that every time we want to reconfigure the proxy, we’d need to change the config, build a new set of images (a new release), and update the proxy service currently running inside the cluster. As you can imagine, this is also not practical. It’s too big of a process for a simple proxy reconfiguration.
Docker Flow: Proxy uses a different, less conventional, approach to the problem. It stores a replica of its state in Consul. It also uses an undocumented Swarm networking feature (at least at the time of this writing).

The DevOps 2.1 Toolkit: Docker Swarm

If you liked this article, you might be interested in The DevOps 2.1 Toolkit: Docker Swarm book. Unlike the previous title in the series (The DevOps 2.0 Toolkit: Automating the Continuous Deployment Pipeline with Containerized Microservices), which provided a general overview of some of the latest DevOps practices and tools, this book is dedicated entirely to Docker Swarm and the processes and tools we might need to build, test, deploy, and monitor services running inside a cluster.
The book is still under “development”. You can get a copy from LeanPub. It is also available as The DevOps Toolkit Series bundle. If you download it now, before it is fully finished, you will get frequent updates with new chapters and corrections. More importantly, you will be able to influence the direction of the book by sending me your feedback.
I chose the lean approach to book publishing because I believe that early feedback is the best way to produce a great product. Please help me make this book a reference for anyone wanting to adopt Docker Swarm for cluster orchestration and scheduling.

Docker Security

Docker couldn’t make the default seccomp profile much stricter, because there are so many types of containers, and between them they use pretty much all of the remaining system calls.
The other significant novelty here is the ability to set Seccomp profiles individually for containers. This provides the necessary flexibility to further harden specific containers based on their actual needs. However, it requires intimate knowledge of container behavior and the system calls it uses, and it’s likely that most developers and Docker users simply won’t have the wherewithal to carry out such changes judiciously. From a programmer’s perspective, it’s better to leave things open rather than risk a bug in the application behavior by mistakenly barring access.
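As an illustration of how a per-container profile is applied, the sketch below uses a hypothetical profile path; only the --security-opt flag itself is a standard Docker option. The JSON profile lists the individual system calls to allow or deny, which is exactly where the intimate knowledge mentioned above becomes necessary.
$ docker run --rm -it --security-opt seccomp=/path/to/custom-seccomp-profile.json alpine sh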

Best Security Practices for Docker

Fortunately, we can enhance security without introducing much complexity or losing the advantages of containers. In order to do so, we need to put in place a number of best practices which can greatly simplify this task:
  1. First of all, don’t forget to apply all well known best practices to secure and harden your Linux server (regular and trusted updates, limit traffic with iptables, etc.). In this regard, a recurring and planned vulnerability scan would be of great help (at SecludIT we do vulnerability scanning, and we do it well);
  2. Security folks at Docker released a CIS Benchmark to address some security issues, and they also kindly provide a script (Docker Bench for Security) which automatically performs some of the most critical checks;
  3. Don’t run apps with root privileges, not even within a container. In addition to this, take advantage of namespaces, thanks to which you can map a user within a container to a user (with no privileges) on the host;
  4. Don’t pull images from untrusted repositories. A malicious image may compromise your host. In this regard, the Docker community is working on a project aimed at providing a tool to assess the authenticity and the integrity of an image;
  5. Use seccomp to disable unused but potentially harmful system calls (by default seccomp already disables privileged system calls);
  6. Use tools such as AppArmor and SELinux to confine the activity of your containers and stop any unusual and potentially malicious behavior;
  7. Carefully analyze Docker capabilities (the full list is in the Docker documentation) to minimize the privileges given to your containers. Be aware that Docker enables some capabilities by default (e.g. CHOWN, which allows the container to arbitrarily change file owners, bypassing all security controls) which are (mistakenly?) not considered harmful. However, it is always good practice to drop all capabilities that are not necessary for your container. This will reduce the attack surface in case your container is compromised (see the example after this list).
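To make points 3 and 7 concrete, here is a hedged sketch. The image name and UID are hypothetical, but the flags are standard Docker options for running as a non-root user and dropping every capability the workload does not need; the daemon-level --userns-remap option covers the user namespace mapping mentioned in point 3.
$ docker run -d --name my-app --user 1000:1000 --cap-drop ALL my-app-image
Because the process runs as an unprivileged user with no capabilities, the application must listen on a port above 1024 and avoid anything that requires root.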


DNS Setup in RHEL

The Domain Name System (DNS) is a standard technology for managing the public names of Web sites and other Internet domains. DNS allows you to type a name such as compnetworking.about.com into your Web browser and have your computer automatically find that address on the Internet. A key element of the DNS is a worldwide collection of DNS servers.

How DNS Works

The DNS is a distributed system: no single server holds the complete database of names and addresses. The 13 root server clusters sit at the top of the hierarchy, and all other DNS servers are installed at lower levels, each maintaining only certain pieces of the overall database.
Most lower level DNS servers are owned by businesses or Internet Service Providers (ISPs). For example, Google maintains various DNS servers around the world that manage the google.com, google.co.uk, and other domains. Your ISP also maintains DNS servers as part of your Internet connection setup.
DNS is based on the client/server network architecture. Your Web browser functions as a DNS client (also called a DNS resolver) and issues requests to your Internet provider's DNS servers when navigating between Web sites.
When a DNS server receives a request for a name that is not in its database (such as a geographically distant or rarely visited Web site), it temporarily acts as a DNS client itself. The server passes the request to another DNS server, or up to the next higher level in the server hierarchy, as needed. Eventually the request arrives at a server that has the matching name and IP address in its database (going all the way to the root level if necessary), and the response flows back through the chain of DNS servers to your client.
Publicly available DNS tools can be used to search for information related to Internet domains. Professional network administrators use these same basic tools on business networks.
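As an illustration (example.com is just a placeholder domain), standard tools such as dig and nslookup, available on RHEL from the bind-utils package, let you perform these lookups yourself.
$ sudo yum install bind-utils
$ dig example.com A
$ dig +short example.com
$ dig +trace www.example.com
$ nslookup example.com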
