Technical Stuff

Custom Operator (starting from zero)

Controller Runtime: The Kubernetes controller-runtime project is a set of Go libraries for building controllers.
Operator SDK: The Operator SDK is a framework that uses the controller-runtime library to make writing operators easier.

https://sdk.operatorframework.io/docs/building-operators/ansible/quickstart/
https://sdk.operatorframework.io/docs/building-operators/ansible/tutorial/
https://sdk.operatorframework.io/docs/building-operators/ansible/reference/dependent-watches/
https://docs.openshift.com/container-platform/4.7/operators/operator_sdk/ansible/osdk-ansible-inside-operator.html


https://itnext.io/a-practical-kubernetes-operator-using-ansible-an-example-d3a9d3674d5b
https://two-oes.medium.com/building-custom-ansible-based-operator-for-openshift-4-ec681fa0466d




Example:

operator-sdk init --domain operator.redhatgov.io --plugins ansible

operator-sdk create api --group workshops --version v1alpha1 --kind Workshop --generate-role


This scaffolds the project directory structure (including the role's defaults file).
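
The exact layout depends on the SDK version, but it typically includes a watches.yaml, a Dockerfile, a Makefile, the config/ manifests, and a roles/workshop/ Ansible role. As a minimal sketch (assuming we want a default for the cr_my_replicas field used in the custom resource further below), the role's defaults file could look like this:

# roles/workshop/defaults/main.yml (sketch - the default value is an assumption)
cr_my_replicas: 1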



Creating CRD:

kubectl create -f config/crd/bases/workshops.operator.redhatgov.io_workshops.yaml


Run custom Operator:
command: make deploy


Create the custom resource now:
-----

apiVersion: workshops.operator.redhatgov.io/v1alpha1
kind: Workshop
metadata:
  name: example-workshop
spec:
  # Add fields here
  cr_my_replicas: 1
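
In an Ansible-based operator, each field under spec is passed to the role as an Ansible variable (camelCase keys are converted to snake_case), so cr_my_replicas becomes available inside the role. Below is a minimal sketch of how the role might consume it; the task is an illustration, not part of the generated scaffold, and the module name (kubernetes.core.k8s vs. community.kubernetes.k8s) and the metadata variable (ansible_operator_meta vs. meta) depend on the SDK version:

# roles/workshop/tasks/main.yml (sketch)
- name: Ensure an example Deployment sized from the CR
  kubernetes.core.k8s:
    definition:
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: "{{ ansible_operator_meta.name }}-app"        # derived from the CR name
        namespace: "{{ ansible_operator_meta.namespace }}"
      spec:
        replicas: "{{ cr_my_replicas }}"                    # value taken from the CR spec
        selector:
          matchLabels:
            app: "{{ ansible_operator_meta.name }}"
        template:
          metadata:
            labels:
              app: "{{ ansible_operator_meta.name }}"
          spec:
            containers:
              - name: app
                image: nginx:latest                         # placeholder image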



kubectl create -f config/samples/workshops_v1alpha1_workshop.yaml

You can check the logs of the custom operator:

kubectl logs -f workshop-controller-manager-9f6ff675b-rbcqw -n workshop-system



In the RBAC role.yaml I added the rule below:

### added by prakash ###

  - apiGroups:
      - "*"
    resources:
      - "*"
    verbs:
      - "*"






GCP Data Life Cycle



The data lifecycle has four main stages:

1. Ingest - pull in the raw data
2. Store - keep the data in a format that is durable and easily accessible
3. Process and analyze - transform the data from its raw form into actionable information
4. Explore and visualize - convert the results of the analysis into a format that is easy to draw insights from




Ingest
1) Ingesting app data
2) Ingesting streaming data (see the Pub/Sub sketch below)
3) Ingesting batch data
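
A minimal sketch of streaming ingest with Pub/Sub (the topic, subscription, and message are placeholders for illustration):

$ gcloud pubsub topics create sensor-events
$ gcloud pubsub subscriptions create sensor-sub --topic=sensor-events
$ gcloud pubsub topics publish sensor-events --message='{"device":"d42","temp":21.5}'
$ gcloud pubsub subscriptions pull sensor-sub --auto-ack    # verify the message arrived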





















Store



Cloud Storage:

* Backing up and archiving data
* Storage and delivery of content
* Cloud Storage can be accessed by Dataflow for transformation and loading into other systems such as Bigtable or BigQuery.
* For Hadoop and Spark jobs, data in Cloud Storage can be accessed natively by using Dataproc.
* BigQuery natively supports importing CSV, JSON, and Avro files from a specified Cloud Storage bucket (see the bq sketch below).
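
A minimal sketch of that last point, loading a CSV file from a Cloud Storage bucket into BigQuery with the bq CLI (dataset, table, and bucket names are placeholders):

$ bq load --source_format=CSV --autodetect \
    mydataset.mytable gs://my-bucket/exports/data.csv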


Cloud Storage for Firebase:
good fit for storing and retrieving assets such as images, audio, video, and other user-generated content in mobile and web apps.



Cloud SQL:

* Fully managed, cloud-native RDBMS that offers both MySQL and PostgreSQL engines with built-in support for replication.
* Offers built-in backup and restoration, high availability, and read replicas.
* Cloud SQL supports RDBMS workloads up to 30 TB for both MySQL and PostgreSQL.
* Data stored in Cloud SQL is encrypted both in transit and at rest.
* Cloud SQL is appropriate for OLTP workloads; for OLAP workloads, consider BigQuery.
* If your workload requires dynamic schemas, consider Datastore.
* You can use Dataflow or Dataproc to create ETL jobs that pull data from Cloud SQL and insert it into other storage systems (see the export sketch below).
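
For simpler cases than a full Dataflow/Dataproc ETL job, data can also be exported from Cloud SQL straight to Cloud Storage (instance, database, bucket, and query are placeholders in this sketch):

$ gcloud sql export csv my-instance gs://my-bucket/exports/orders.csv \
    --database=shop --query="SELECT * FROM orders"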



Bigtable: Managed wide-column NoSQL
managed, high-performance NoSQL database service designed for terabyte- to petabyte-scale workloads

provides consistent, low-latency, and high-throughput storage for large-scale NoSQL data
Bigtable is built for real-time app serving workloads, as well as large-scale analytical workloads.
* Use cases:
  1) Real-time app data
  2) Stream processing (Pub/Sub => Dataflow (transform) => Bigtable)
  3) IoT time-series data (sensor/streamed data => Bigtable with a time-series schema - see the cbt sketch below)
  4) AdTech workloads (can be used to store and track ad impressions, which can then be processed and analyzed with Dataproc and Dataflow)
  5) Data ingestion (Cloud Storage => Dataflow/Dataproc => Bigtable)
  6) Analytical workloads (Bigtable => Dataflow (complex aggregation) => Dataproc, which can execute Hadoop or Spark processing and machine-learning tasks)
  7) Apache HBase replacement

note: 
While Bigtable is considered an OLTP system, it doesn't support multi-row transactions, SQL queries or joins. For those use cases, consider either Cloud SQL or Datastore.
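
As a minimal sketch of the IoT time-series use case above (project, instance, table, and row-key format are placeholders), the cbt CLI can create a table whose row keys combine a device ID and a timestamp so that a device's readings sort together:

$ cbt -project my-project -instance my-instance createtable sensor-data
$ cbt -project my-project -instance my-instance createfamily sensor-data metrics
$ cbt -project my-project -instance my-instance set sensor-data device42#20240101T1200 metrics:temp=21.5
$ cbt -project my-project -instance my-instance read sensor-data prefix=device42#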


Spanner:
Fully managed relational database service for mission-critical OLTP apps.
Like relational databases, Spanner supports schemas, ACID transactions, and SQL queries.
Spanner also performs automatic sharding while serving data with single-digit millisecond latencies.
* Use cases for Spanner:
   1) Financial services (strong consistency across read/write operations without sacrificing HA)
   2) Ad tech (low-latency querying without compromising scale or availability)
   3) Retail and global supply chain (Spanner offers automatic, global, synchronous replication with low latency, which means that data is always consistent and highly available)


Firebase Realtime Database:
* NoSQL database that stores JSON data.
* JSON data can be synchronized in real time to connected clients across different platforms.
* Lets you build a real-time experience serving millions of users without compromising responsiveness.
* Use cases:
  1) Chat and social (store and retrieve images, audio, video, and other user-generated content)
  2) Mobile games (keep track of game progress and statistics across devices and device platforms)


=============================================================
(Process and analyze)
To derive business value and insights from data, you must transform and analyze it.

 


GCP Helicopter Racing League

Helicopter Racing League

Overview

  • Regional league
  • Offers a paid service to stream the races all over the world with live telemetry and predictions throughout each race.

Solution Concept

  •  migrate their existing service to a new platform 
  • to expand their use of managed AI and ML services to facilitate race predictions.
  • they want to move the serving of their content, both real-time and recorded, closer to their users.

Existing Technical Environment

  •  public cloud-first company
  • core of their mission-critical applications runs on their current public cloud provider.
  • Video recording and editing is performed at the race tracks,
  •  the content is encoded and transcoded, where needed, in the cloud.
  • Enterprise-grade connectivity and local compute is provided by truck-mounted mobile data centers.
  • Their race prediction services are hosted exclusively on their existing public cloud provider.[CloudML/Tensorflow/MLWorkflow]
  • Existing content is stored in an object storage service on their existing public cloud provider.[Google Storage bucket]
  • Video encoding and transcoding is performed on VMs created for each job.
  • Race predictions are performed using TensorFlow running on VMs in the current public cloud provider.[CloudML/Tensorflow]

Business Requirements

HRL’s owners want to expand their predictive capabilities and reduce latency for their viewers in emerging markets. Their requirements are:

  • Support ability to expose the predictive models to partners.[Cloud Endpoint]
  • Increase predictive capabilities during and before races:(Race result/Mechanical Failure/Crowd sentiment) [Data Ingestion/Data storage/ Processing]
  • Increase telemetry and create additional insights.
  • Measure fan engagement with new predictions.
  • Enhance global availability and quality of the broadcasts.
  • Increase the number of concurrent viewers.
  • Minimize operational complexity.
  • Ensure compliance with regulations.
  • Create a merchandising revenue stream. [Cloud Endpoint]

Technical Requirements   

  • Maintain or increase prediction throughput and accuracy.
  • Reduce viewer latency.
  • Increase transcoding performance.
  • Create real-time analytics of viewer consumption patterns and engagement.
  • Create a data mart to enable processing of large volumes of race data

Executive Statement

  •  enhanced video streams [AutoML Video Intelligence/Video Intelligence API]
  • to include predictions of events within the race (e.g., overtaking).
  •  Our current platform allows us to predict race outcomes but lacks the facility to support real-time predictions during races and the capacity to process season-long results.

 

=======================Our Analysis==========================

  • Streaming ==>> Data Processing
  • Predictions ==>> Machine Learning

Enterprise-grade connectivity ==>> what "enterprise-grade" implies:

  • Holistic security: Enterprise-grade necessitates a holistic approach towards security, across products, processes, and applications.
  • Integration: Enterprise-grade expects tools and technologies to add to and extend existing tools, such that end users don't have to face any disruption in their routine work.
  • Productivity: Enterprise-grade requires technologies to be attuned to the idea of "more work in less time," such that users don't feel inclined to use consumer-grade alternatives for more convenience.
  • Support: Vendors need to support not only the internal stakeholder but also make sure the technology gels well into the larger distributed ecosystem.
  • Granular control: Enterprise-grade technology must offer companies deep control over policies for accessibility of content, and the ability to manage user environments suitable for specific user groups.




Reference Link:
[AutoML Video Intelligence/Video Intelligence API]

[Cloud Endpoint]

[Data Ingestion/Data storage/ Processing]











Docker registry understanding

Today I am going to explain what a Docker registry is and cover its basics.
Sometimes we need to understand things properly, from the basics, to get better results. With that in mind, I am explaining the concepts and workings of the Docker registry as I understood them from my current project.



Docker images are built on a layered architecture: a Docker image consists of many layers, and the same layer may be reused by another image. So we can say a Docker image is a group of layers, and an image name and tag refer to a specific group of layers. This concept becomes very important when it comes to saving space.

So here we are going to explain Docker's layering concept from scratch, using some examples.

Docker layers are stored in the Docker registry independently of one another; each layer stands on its own and can be used in any image. Which image uses which layers depends on the Dockerfile used to build that particular image.

                                                                  
(Figure: Docker Registry)

As you can see in the figure above, Docker layers are stored in the Docker registry and are used to build Docker images. We can say the Docker registry is made up of Docker layers.

Now we will explain how layers are combined to make Docker images. Sometimes the same layer is used in two or more images, but the end result is still a unique image.


(Figure: Layer combination for different images)

Here you can see how layers are combined to make an image; this is a bird's-eye view of the Docker registry.
When someone runs "docker pull <image name>:<tag>", Docker downloads only the layers that image needs, skipping any layers that are already present locally.
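
A minimal sketch of layer reuse (the image names and packages are arbitrary examples): two images built from the same base image share the base layers, so those layers are stored and pulled only once.

$ cat > Dockerfile.app1 <<'EOF'
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y curl
EOF
$ cat > Dockerfile.app2 <<'EOF'
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y vim
EOF
$ docker build -f Dockerfile.app1 -t app1 .
$ docker build -f Dockerfile.app2 -t app2 .
$ docker history app1    # the ubuntu:22.04 layers appear here...
$ docker history app2    # ...and here, but exist only once on disk and in the registry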



If you have any issues, please comment and share your feedback so that I can improve the quality of these posts.



Prometheus Installation to monitor server


Prometheus is an open source monitoring system and time series database. It addresses many aspects of monitoring, such as the generation and collection of metrics, graphing the resulting data on dashboards, and alerting on anomalies. One of its main features is a multi-dimensional data model, with time series identified by a metric name and key/value pairs.

System Update


$ sudo yum update

Downloading Prometheus

Once your server is up to date, proceed with the installation of Prometheus. To do so, we will first download its package from GitHub. We will be using the 'curl' command to download Prometheus into a newly created directory, using the following commands.
$ mkdir backup
$ cd backup
$ curl -LO "https://github.com/prometheus/prometheus/releases/download/0.17.0/prometheus-0.17.0.linux-amd64.tar.gz"
$ mv prometheus-0.17.0.linux-amd64.tar.gz prometheus-0.17.0rc1.linux-amd64.tar.gz

Prometheus Installation


We will create a new directory to install Prometheus into, as it is best practice to keep all the components of Prometheus within one parent directory. So, run the commands below to create a new directory in the home directory of your current user and extract the Prometheus package into it.
$ mkdir ~/Prometheus
$ cd ~/Prometheus
$ tar -zxvf ~/backup/prometheus-0.17.0rc1.linux-amd64.tar.gz
Now run the following command to verify the installation and check the Prometheus version (and the Go version it was built with).
$ ~/Prometheus/prometheus-0.17.0rc1.linux-amd64/prometheus -version


Installation of Node Exporter

Node Exporter is the Prometheus exporter for machine metrics, written in Go with pluggable metric collectors. It exports many metrics, such as disk I/O statistics, memory usage, network statistics, CPU load, and much more, in a format that Prometheus understands.
The installation package for Node Exporter is also available on GitHub and can be downloaded from the Prometheus Node Exporter releases page. Copy the source link address and download the package using the 'curl' command.
$ cd ~/backup/
$ curl -LO "https://github.com/prometheus/node_exporter/releases/download/0.12.0rc3/node_exporter-0.12.0rc3.linux-amd64.tar.gz"
Then extract it with the 'tar' command into a new 'node_exporter' directory under Prometheus, using the following commands.
$ mkdir ~/Prometheus/node_exporter
$ cd ~/Prometheus/node_exporter/
$ tar -zxvf ~/backup/node_exporter-0.12.0rc3.linux-amd64.tar.gz 

Starting Node Exporter

Execute node_exporter from the directory where it was extracted to start its service, as shown below.
$ ./node_exporter

Node Exporter As a Service

Node Exporter has been placed under the home directory of our sudo user, in the '~/Prometheus/node_exporter' directory. Now we are going to configure it as a service so that we can easily start and stop the Node Exporter service when required.

$ sudo vim /etc/systemd/system/node_exporter.service
[Unit]
Description=Node Exporter

[Service]
User=vxx
ExecStart=/home/vxx/Prometheus/node_exporter/node_exporter

[Install]
WantedBy=default.target
Then reload the systemd daemon and start the service (or reboot the server).
$ sudo systemctl daemon-reload
$ sudo systemctl enable node_exporter.service
$ sudo systemctl start node_exporter.service
$ sudo systemctl status node_exporter.service
Once the node_exporter service is running, open your browser to view Node Exporter’s web interface by following the link below.
http://your_servers_ip:9100/metrics

Starting Prometheus Server

We are now ready to set up the Prometheus server. Create a new configuration file in the Prometheus directory with the following content.
$ cd ~/Prometheus/prometheus-0.17.0rc1.linux-amd64/
$ vim prometheus.yml


global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    target_groups:
      - targets: ['localhost:9090']
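
  # (Assumed addition) A second job for the Node Exporter started earlier; it
  # matches the 'node' job referred to in the text below and scrapes the
  # exporter on port 9100 of this host.
  - job_name: 'node'

    scrape_interval: 5s

    target_groups:
      - targets: ['localhost:9100']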
Save the changes and close the file using ':wq!'.
These configurations define 'scrape_configs' with a 'prometheus' job and a 'node' job (the job names can be anything you wish). Then start the Prometheus server as a background process and redirect its output to a log file using the following command.
$ nohup ./prometheus > prometheus.log 2>&1 &
To view these logs you can use the command below.
$ tail ~/Prometheus/prometheus-0.17.0rc1.linux-amd64/prometheus.log

Prometheus Web Access

Now open your favourite web browser to access the Prometheus web console using your server’s IP address or FQDN as shown below.
http://your_servers_ip:9090
To make sure that the Prometheus server is fetching the data from the Node Exporter, click on the Graph and insert any metric chosen from the drop down, then click on the ‘Execute’ button to see the graph as shown below.
You can view the most commonly used metrics from the console templates that are available under the following path.
$ ls ~/Prometheus/prometheus-0.17.0rc1.linux-amd64/consoles
http://yourservers_ip:9090/consoles/node.html
For any help in configuration please don't forget to comment.

Docker Swarm Mode Walkthrough

Service Discovery In The Swarm Cluster

The old (standalone) Swarm required a service registry so that all its managers could have the same view of the cluster state. When instantiating the old Swarm nodes, we had to specify the address of a service registry. However, if you take a look at the setup instructions of the new Swarm (Swarm Mode, introduced in Docker 1.12), you'll notice that nothing is required beyond Docker Engines. You will not find any mention of an external service registry or a key-value store.
Does that mean that Swarm does not need service discovery? Quite the contrary. The need for service discovery is as strong as ever, and Docker decided to incorporate it inside Docker Engine. It is bundled inside just as Swarm is. The internal process is, essentially, still very similar to the one used by the standalone Swarm, only with less moving parts. Docker Engine now acts as a Swarm manager, Swarm worker, and service registry.
The decision to bundle everything inside the engine provoked a mixed response. Some thought that such a decision creates too much coupling and increases Docker Engine's level of instability. Others think that such a bundle makes the engine more robust and opens the door to some new possibilities. While both sides have valid arguments, I am more inclined towards the opinion of the latter group. Docker Swarm Mode is a huge step forward, and it is questionable whether the same result could be accomplished without bundling the service registry inside the engine.
It is important to note that service registry bundled inside the engine is for internal use only. We cannot access it nor use it for our own purposes.
Knowing how Docker Swarm works, especially networking, the question that might be on your mind is whether we need service discovery (beyond Swarm’s internal usage). In The DevOps 2.0 Toolkit, I argued that service discovery is a must and urged everyone to set up Consul or etcd as service registries, Registrator as a mechanism to register changes inside the cluster, and Consul Template or confd as a templating solution. Do we still need those tools?

Do We Need Service Discovery?

It is hard to provide a general recommendation whether service discovery tools are needed when working inside a Swarm cluster. If we look at the need to find services as the main use case for those tools, the answer is usually no. We don’t need external service discovery for that. As long as all services that should communicate with each other are inside the same network, all we need is the name of the destination service. For example, all that the go-demo service needs to know to find the related database is that its DNS is go-demo-db. The Docker Swarm Networking And Reverse Proxy chapter proved that the proper networking usage is enough for most use cases.
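
As a minimal sketch of that DNS-based discovery (the image names and the DB environment variable follow the book's go-demo examples; treat them as illustrative):

$ docker network create --driver overlay go-demo
$ docker service create --name go-demo-db --network go-demo mongo
$ docker service create --name go-demo --network go-demo \
    -e DB=go-demo-db vfarcic/go-demo
# Inside the go-demo containers, the database is reachable simply as "go-demo-db":
# Swarm's internal DNS resolves the service name to the service's virtual IP.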
However, finding services and load balancing requests among them is not the only reason for service discovery. We might have other usages of service registries or key-value stores. We might have a need to store some information in a way that it is distributed and fault tolerant.
An example of the need for a key-value store can be seen inside the Docker Flow: Proxy project. It is based on HAProxy which is a stateful service. It loads the information from a configuration file into memory. Having stateful services inside a dynamic cluster represents a challenge that needs to be solved. Otherwise, we might lose a state when a service is scaled, rescheduled after a failure, and so on.

Problems When Scaling Stateful Instances

Scaling services inside a Swarm cluster is easy, isn’t it? Just execute docker service scale SERVICE_NAME=NUMBER_OF_INSTANCES and, all of a sudden, the service is running multiple copies.
The previous statement is only partly true. The more precise wording would be that scaling stateless services inside a Swarm cluster is easy.
The reason that scaling stateless services is easy lies in the fact that there is no state to think about. An instance is the same no matter how long it runs. There is no difference between a new instance and one that has been running for a week. Since the state does not change over time, we can create new copies at any given moment, and they will all be exactly the same.
However, the world is not stateless. State is an unavoidable part of our industry. As soon as the first piece of information is created, it needs to be stored somewhere. The place we store data must be stateful. It has a state that changes over time. If we want to scale such a stateful service, there are, at least, two things we need to consider.
  1. How do we propagate a change of the state of one instance to the rest of the instances?
  2. How do we create a copy (a new instance) of a stateful service and make sure that its state is copied as well?
We usually combine stateless and stateful services into one logical entity. A back-end service could be stateless and rely on a database service as external data storage. That way, there is a clear separation of concerns and a different lifecycle for each of those services.
Before we proceed, I must state that there is no silver bullet that makes stateful services scalable and fault-tolerant. Throughout the book, I will go through a couple of examples that might, or might not apply to your use case. An obvious, and very typical example of a stateful service is a database. While there are some common patterns, almost every database provides a different mechanism for data replication. That, in itself, is enough to prevent us from having a definitive answer that would apply to all. We’ll explore scalability of a Mongo DB later on in the book. We’ll also see an example with Jenkins that uses a file system for its state.
The first case we’ll tackle will be of a different type. We’ll discuss scalability of a service that has a state stored in its configuration file. To make things more complicated, the configuration is dynamic. It changes over time throughout the lifetime of the service. We’ll explore ways to make HAProxy scalable.
If we’d use the official HAProxy image, one of the challenges we would face is how to update the state of all the instances. How to change the configuration and reload each copy of the proxy.
We can, for example, mount an NFS volume on each node in the cluster and make sure that the same host volume is mounted inside all HAProxy containers. At first glance, it might seem that this would solve the problem with the state, since all instances would share the same configuration file. Any change to the config on the host would be available inside all the instances. However, that, in itself, would not change the state of the service.
HAProxy loads the configuration file during initialization, and it is oblivious to any changes we might make to the configuration afterward. For the change of the state of the file to be reflected in the state of the service we need to reload it. The problem is that instances can run on any of the nodes inside the cluster. On top of that, if we adopt dynamic scaling (more on that later on), we might not even know how many instances are running. So, we’d need to discover how many instances we have, find out on which nodes they are running, get IDs of each of the containers, and, only then, send a signal to reload the proxy. While all this can be scripted, it is far from an optimum solution. Moreover, mounting an NFS volume is a single point of failure. If the server that hosts the volume fails, data is lost. Sure, we can create backups, but they would only provide a way to restore lost data partially. That is, we can restore a backup, but the data generated between the moment the last backup was created, and the node failure would be lost.
An alternative would be to embed the configuration into HAProxy images. We could create a new Dockerfile that would be based on haproxy and add the COPY instruction that would add the configuration. That would mean that every time we want to reconfigure the proxy, we’d need to change the config, build a new set of images (a new release), and update the proxy service currently running inside the cluster. As you can imagine, this is also not practical. It’s too big of a process for a simple proxy reconfiguration.
Docker Flow: Proxy uses a different, less conventional, approach to the problem. It stores a replica of its state in Consul. It also uses an undocumented Swarm networking feature (at least at the time of this writing).

The DevOps 2.1 Toolkit: Docker Swarm

If you liked this article, you might be interested in The DevOps 2.1 Toolkit: Docker Swarm book. Unlike the previous title in the series (The DevOps 2.0 Toolkit: Automating the Continuous Deployment Pipeline with Containerized Microservices), which provided a general overview of some of the latest DevOps practices and tools, this book is dedicated entirely to Docker Swarm and the processes and tools we might need to build, test, deploy, and monitor services running inside a cluster.
The book is still under “development”. You can get a copy from LeanPub. It is also available as The DevOps Toolkit Series bundle. If you download it now, before it is fully finished, you will get frequent updates with new chapters and corrections. More importantly, you will be able to influence the direction of the book by sending me your feedback.
I chose the lean approach to book publishing because I believe that early feedback is the best way to produce a great product. Please help me make this book a reference for anyone wanting to adopt Docker Swarm for cluster orchestration and scheduling.