Custom Operator (starting from scratch)
Controller Runtime: The Kubernetes controller-runtime project is a set of Go libraries for building controllers.
Operator SDK: The Operator SDK is a framework that uses the controller-runtime library to make writing operators easier.
https://sdk.operatorframework.io/docs/building-operators/ansible/tutorial/
https://itnext.io/a-practical-kubernetes-operator-using-ansible-an-example-d3a9d3674d5b
https://two-oes.medium.com/building-custom-ansible-based-operator-for-openshift-4-ec681fa0466d
Example:
operator-sdk init --domain operator.redhatgov.io --plugins ansible
operator-sdk create api --group workshops --version v1alpha1 --kind Workshop --generate-role
These commands scaffold the project directory structure, including an Ansible role for the Workshop kind (with a defaults file for role variables).
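A rough sketch of the generated layout (exact contents vary by operator-sdk version):
Dockerfile
Makefile
PROJECT
watches.yaml                # maps the Workshop kind to the Ansible role
requirements.yml
config/                     # CRDs, RBAC, manager manifests, sample CRs
roles/workshop/
  defaults/main.yml         # default values for role variables
  tasks/main.yml            # reconciliation logic run for each Workshop CR
playbooks/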
Creating CRD:
kubectl create -f config/crd/bases/workshops.operator.redhatgov.io_workshops.yaml
Run the custom operator:
command: make deploy
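make deploy installs the CRDs and runs the controller manager in the workshop-system namespace. If you build and push your own operator image first, the scaffolded Makefile accepts an IMG variable, roughly like this (the registry path below is a made-up example):
make docker-build docker-push IMG=quay.io/example/workshop-operator:v0.0.1
make deploy IMG=quay.io/example/workshop-operator:v0.0.1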
Create a custom resource now:
-----
apiVersion: workshops.operator.redhatgov.io/v1alpha1
kind: Workshop
metadata:
  name: example-workshop
spec:
  # Add fields here
  cr_my_replicas: 1
kubectl create -f config/samples/workshops_v1alpha1_workshop.yaml   # the generated sample CR (edit it to match the spec above)
You can check the logs of the custom operator:
kubectl logs -f workshop-controller-manager-9f6ff675b-rbcqw -n workshop-system
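When the CR is created, the operator runs the Ansible role against it, exposing spec fields as role variables (cr_my_replicas here). A minimal sketch of roles/workshop/tasks/main.yml that consumes the variable might look like this (the deployment name and image are hypothetical):
---
# Sketch only: create/update a Deployment sized by the CR's cr_my_replicas field
- name: Ensure example deployment matches the requested replica count
  kubernetes.core.k8s:
    definition:
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: "{{ ansible_operator_meta.name }}-app"
        namespace: "{{ ansible_operator_meta.namespace }}"
      spec:
        replicas: "{{ cr_my_replicas }}"
        selector:
          matchLabels:
            app: "{{ ansible_operator_meta.name }}-app"
        template:
          metadata:
            labels:
              app: "{{ ansible_operator_meta.name }}-app"
          spec:
            containers:
              - name: app
                image: nginx:latest
The operator logs from the previous command should show this role running whenever the CR changes.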
In the RBAC role.yaml (config/rbac/role.yaml) I added the rule below (wildcard permissions are convenient for testing, but should be scoped down for production):
### added by prakash ###
- apiGroups:
  - "*"
  resources:
  - "*"
  verbs:
  - "*"
GCP Data Life Cycle
The data lifecycle has four main steps:
1. Ingest - pull in the raw data
2. Store - persist the data in a durable, easily accessible format
3. Process and analyze - transform the data into actionable information
4. Explore and visualize - derive insights from the analysis and share them
Store
Cloud Storage:
* Backing up and archiving
* Storage and delivery of content
* For Hadoop and Spark jobs, data from Cloud Storage can be natively accessed by using Dataproc.
* BigQuery natively supports importing CSV, JSON, and Avro files from a specified Cloud Storage bucket.
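For example, a CSV export in a bucket can be loaded into a BigQuery table with the bq CLI (the dataset, table, and bucket names below are made up):
bq load --source_format=CSV --autodetect my_dataset.my_table gs://my-bucket/exports/data.csv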
Cloud Storage for Firebase:
* good fit for storing and retrieving assets such as images, audio, video, and other user-generated content in mobile and web apps.
Cloud SQL:
* fully managed, cloud-native RDBMS that offers both MySQL and PostgreSQL engines with built-in support for replication.
* offers built-in backup and restoration, high availability, and read replicas.
* Cloud SQL supports RDBMS workloads up to 30 TB for both MySQL and PostgreSQL
* Data stored in Cloud SQL is encrypted both in transit and at rest
* For OLTP workloads, Cloud SQL is appropriate
* For OLAP workloads, consider BigQuery
* If your workload requires dynamic schemas, consider Datastore.
* You can use Dataflow or Dataproc to create ETL jobs that pull data from Cloud SQL and insert it into other storage systems.
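As a quick illustration of the built-in high availability and read replicas mentioned above (instance names, tier, and region are arbitrary examples):
gcloud sql instances create my-primary --database-version=POSTGRES_14 --tier=db-custom-2-7680 --region=us-central1 --availability-type=REGIONAL
gcloud sql instances create my-replica --master-instance-name=my-primary --region=us-central1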
Bigtable: Managed wide-column NoSQL
* managed, high-performance NoSQL database service designed for terabyte- to petabyte-scale workloads
* Use case:
1) Real-time app data
2) Stream processing (pub/sub => dataflow(transform) => BigTable)
3) IoT time series data (sensor/streamed data => Bigtable (time series schema))
4) AdTech workloads (can be used to store and track ad impressions, which Dataproc and Dataflow can then process and analyze)
5) Data ingestion (Cloud Storage => Dataflow/Dataproc => Bigtable)
6) Analytical workloads (Bigtable => Dataflow (complex aggregation) => Dataproc, which can execute Hadoop or Spark processing and machine-learning tasks)
7) Apache HBase replacement
Note: While Bigtable is considered an OLTP system, it doesn't support multi-row transactions, SQL queries, or joins. For those use cases, consider either Cloud SQL or Datastore.
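As an illustration of the time-series pattern from use case 3, the cbt CLI can populate a table whose row keys combine a device ID and a timestamp (instance, table, and column names are made up):
cbt -instance=my-instance createtable sensor-data
cbt -instance=my-instance createfamily sensor-data readings
cbt -instance=my-instance set sensor-data device42#2024-01-01T00:00 readings:temperature=21.5
cbt -instance=my-instance read sensor-data prefix=device42#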
Spanner: Horizontally scalable relational database
1) Financial services (strong consistency across read/write operations without sacrificing high availability)
Firestore: Flexible, scalable NoSQL database
* to build a real-time experience serving millions of users without compromising responsiveness.
* Use Cases:
1) Chat and social (store and retrieve images, audio, video, and other user-generated content)
Storing data warehouse data
BigQuery: Managed data warehouse
* You can store data directly in BigQuery for analysis.
* Supports loading data through the web interface, command-line tools, and REST API calls.
* When loading data in bulk, the data should be in the form of CSV, JSON, or Avro files.
* For streaming data, you can use Pub/Sub and Dataflow in combination to process incoming streams and store the resulting data in BigQuery.
* In some workloads, however, it might be appropriate to stream data directly into BigQuery without additional processing.
* To derive business value and insights from data, you must transform and analyze it.
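As a quick illustration of the direct-streaming option mentioned above, newline-delimited JSON rows can be streamed into an existing table with the bq CLI (dataset, table, and file names are made up):
bq insert my_dataset.events events.json
bq insert is meant for small tests; high-volume streaming typically goes through Pub/Sub and Dataflow as noted above.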
Processing large-scale data
* Large-scale data processing typically involves reading data from source systems such as Cloud Storage, Bigtable, or Cloud SQL, and then conducting complex normalizations or aggregations of that data.
* In many cases, the data is too large to fit on a single machine so frameworks are used to manage distributed compute clusters and to provide software tools that aid processing.
Dataproc: Managed Apache Hadoop and Apache Spark
* Spark has gained popularity over the past few years as an alternative to Hadoop MapReduce
* With Dataproc, you can migrate your existing Hadoop or Spark deployments to a fully-managed service that automates cluster creation, simplifies configuration and management of your cluster, has built-in monitoring and utilization reports, and can be shut down when not in use.
* reduces the operational and cost overhead of managing a Spark or Hadoop deployment
* Dataproc provides the ease and flexibility to spin up Spark or Hadoop clusters on demand when they are needed, and to terminate clusters when they are no longer needed.
* simplifies operational activities such as installing software or resizing a cluster
* With Dataproc, you can natively read data and write results in Cloud Storage, Bigtable, or BigQuery, or the accompanying HDFS storage provided by the cluster.
Use cases:
* Log processing
* Reporting (Aggregate data into reports and store the data in BigQuery)
* On-demand Spark clusters
* Machine learning
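A rough sketch of the on-demand cluster pattern with the gcloud CLI (cluster name, region, and the example job are illustrative):
gcloud dataproc clusters create ephemeral-cluster --region=us-central1 --num-workers=2
gcloud dataproc jobs submit spark --cluster=ephemeral-cluster --region=us-central1 --class=org.apache.spark.examples.SparkPi --jars=file:///usr/lib/spark/examples/jars/spark-examples.jar -- 1000
gcloud dataproc clusters delete ephemeral-cluster --region=us-central1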
Dataflow: Serverless, fully managed batch and stream processing
* Lets you analyze streaming data and respond in real time.
* Handles both batch and streaming analytics.
* Historically, supporting both modes increased complexity by necessitating two different pipelines.
* Dataflow simplifies big data for both streaming and batch workloads by unifying the programming model and the execution model.
* Instead of having to specify a cluster size and manage capacity, Dataflow is a managed service where on-demand resources are created, autoscaled, and parallelized.
* As a true zero-ops service, workers are added or removed based on the demands of the job.
Use cases:
1) MapReduce replacement
2) User analytics (Analyze high-volume user-behavior data, such as in-game events, click stream data, and retail sales data.)
3) Data Science (Process large amounts of data to make scientific discoveries and predictions, such as genomics, weather, and financial data.)
4) ETL: Ingest, transform, and load data into a data warehouse, such as BigQuery.
5) Log processing: Process continuous event-log data processing to build real-time dashboards, app metrics, and alerts.
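As a small illustration, a Google-provided Dataflow template can be launched from the CLI with no cluster to manage (the job name and output bucket are made up):
gcloud dataflow jobs run wordcount-example --gcs-location=gs://dataflow-templates/latest/Word_Count --region=us-central1 --parameters=inputFile=gs://dataflow-samples/shakespeare/kinglear.txt,output=gs://my-bucket/results/output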
GCP Helicopter Racing League
Overview
- Regional league
- Offers a paid service to stream the races all over the world with live telemetry and predictions throughout each race.
Solution Concept
- migrate their existing service to a new platform
- to expand their use of managed AI and ML services to facilitate race predictions.
- they want to move the serving of their content, both real-time and recorded, closer to their users.
Existing Technical Environment
- public cloud-first company
- core of their mission-critical applications runs on their current public cloud provider.
- Video recording and editing is performed at the race tracks, and the content is encoded and transcoded, where needed, in the cloud.
- Enterprise-grade connectivity and local compute is provided by truck-mounted mobile data centers.
- Their race prediction services are hosted exclusively on their existing public cloud provider.[CloudML/Tensorflow/MLWorkflow]
- Existing content is stored in an object storage service on their existing public cloud provider.[Google Storage bucket]
- Video encoding and transcoding is performed on VMs created for each job.
- Race predictions are performed using TensorFlow running on VMs in the current public cloud provider.[CloudML/Tensorflow]
Business Requirement
HRL’s owners want to expand their predictive capabilities and reduce latency for their viewers in emerging markets. Their requirements are:
- Support ability to expose the predictive models to partners. [Cloud Endpoints]
- Increase predictive capabilities during and before races: (race result / mechanical failure / crowd sentiment) [Data Ingestion / Data Storage / Processing]
- Increase telemetry and create additional insights.
- Measure fan engagement with new predictions.
- Enhance global availability and quality of the broadcasts.
- Increase the number of concurrent viewers.
- Minimize operational complexity.
- Ensure compliance with regulations.
- Create a merchandising revenue stream. [Cloud Endpoints]
Technical Requirements
- Maintain or increase prediction throughput and accuracy.
- Reduce viewer latency.
- Increase transcoding performance.
- Create real-time analytics of viewer consumption patterns and engagement.
- Create a data mart to enable processing of large volumes of race data
Executive Statement
- enhanced video streams [AutoML Video Intelligence/Video Intelligence API]
- to include predictions of events within the race (e.g., overtaking).
- Our current platform allows us to predict race outcomes but lacks the facility to support real-time predictions during races and the capacity to process season-long results.
=======================Our Analysis==========================
- Streaming ==>> Data Processing
- Predictions ==>> Machine Learning
- Enterprise-grade connectivity ==>> Holistic security: enterprise-grade necessitates a holistic approach to security across products, processes, and applications.