GCP Data Life Cycle

 Data lifecycle


The data lifecycle has four main steps:

1. Ingest - (pull in the raw data)
2. Store - (store the data in a format that is durable and can be easily accessed)
3. Process and analyze - (transform the data from its raw form into actionable information)
4. Explore and visualize - (convert the results of the analysis into a format that is easy to draw insights from)
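
To make the four steps concrete, here is a minimal pure-Python sketch that walks a few in-memory records through each stage. It does not use any GCP service; the function names, the sample records, and the list used as "storage" are all made up for illustration.

```python
import json
from collections import defaultdict

def ingest(raw_lines):
    """Ingest: pull in raw records (JSON strings standing in for app events)."""
    return [json.loads(line) for line in raw_lines]

def store(records, storage):
    """Store: keep records somewhere durable and easy to read back (a list here)."""
    storage.extend(records)
    return storage

def process(storage):
    """Process and analyze: turn raw records into a total amount per user."""
    totals = defaultdict(float)
    for record in storage:
        totals[record["user"]] += record["amount"]
    return dict(totals)

def visualize(totals):
    """Explore and visualize: render the totals as a simple text bar chart."""
    return [f"{user:<6}{'#' * int(amount)}" for user, amount in sorted(totals.items())]

# Hypothetical sample data.
raw = [
    '{"user": "ana", "amount": 3}',
    '{"user": "bo", "amount": 2}',
    '{"user": "ana", "amount": 1}',
]
totals = process(store(ingest(raw), []))
chart = visualize(totals)
print("\n".join(chart))
```

In a real pipeline each function would be replaced by a managed service, but the order of the stages stays the same.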




Ingest
1) Ingesting app data
2) Ingesting streaming data
3) Ingesting batch data





















Store



Cloud Storage:

* Used for backing up and archiving data, and for storage and delivery of content.
* Data in Cloud Storage can be accessed by Dataflow for transformation and loading into other systems such as Bigtable or BigQuery.
* For Hadoop and Spark jobs, data from Cloud Storage can be natively accessed by using Dataproc.
* BigQuery natively supports importing CSV, JSON, and Avro files from a specified Cloud Storage bucket.


Cloud Storage for Firebase:
A good fit for storing and retrieving assets such as images, audio, video, and other user-generated content in mobile and web apps.



Cloud SQL:

* Fully managed, cloud-native RDBMS that offers both MySQL and PostgreSQL engines with built-in support for replication.
* Offers built-in backup and restoration, high availability, and read replicas.
* Cloud SQL supports RDBMS workloads up to 30 TB for both MySQL and PostgreSQL
* Data stored in Cloud SQL is encrypted both in transit and at rest
* Cloud SQL is appropriate for OLTP workloads. For OLAP workloads, consider BigQuery.
* If your workload requires dynamic schemas, consider Datastore.
* You can use Dataflow or Dataproc to create ETL jobs that pull data from Cloud SQL and insert it into other storage systems.
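
The ETL idea in the last bullet can be sketched locally. This is only an illustration of the extract / transform / load shape: `sqlite3` stands in for Cloud SQL, a plain list stands in for the target system, and the `orders` table and all function names are assumptions, not part of any GCP API.

```python
import sqlite3

def extract(conn):
    """Extract: pull raw transactional rows out of the relational store."""
    return conn.execute("SELECT order_day, amount_cents FROM orders").fetchall()

def transform(rows):
    """Transform: aggregate order amounts into revenue per day."""
    daily = {}
    for day, cents in rows:
        daily[day] = daily.get(day, 0) + cents
    return [{"day": day, "revenue": cents / 100} for day, cents in sorted(daily.items())]

def load(records, sink):
    """Load: write the transformed records into the target system (a list here)."""
    sink.extend(records)

# In-memory SQLite database as a stand-in for a Cloud SQL instance.
conn = sqlite3.connect(":memory:")
with conn:
    conn.execute("CREATE TABLE orders (order_day TEXT, amount_cents INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("2024-01-01", 1250), ("2024-01-01", 750), ("2024-01-02", 500)],
    )

analytics_sink = []
load(transform(extract(conn)), analytics_sink)
print(analytics_sink)
```

In practice a Dataflow or Dataproc job would play the role of `transform`, reading from Cloud SQL and writing to an analytics store.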



Bigtable: Managed wide-column NoSQL
* Managed, high-performance NoSQL database service designed for terabyte- to petabyte-scale workloads.
* Provides consistent, low-latency, and high-throughput storage for large-scale NoSQL data.
* Built for real-time app serving workloads, as well as large-scale analytical workloads.
* Use cases:
 1) Real-time app data
 2) Stream processing (Pub/Sub => Dataflow (transform) => Bigtable)
 3) IoT time series data (sensor/streamed data => Bigtable (time series schema))
 4) AdTech workloads (Bigtable can store and track ad impressions, which Dataproc and Dataflow can then process and analyze)
 5) Data ingestion (Cloud Storage => Dataflow/Dataproc => Bigtable)
 6) Analytical workloads (Bigtable => Dataflow for complex aggregation; Dataproc can be used to execute Hadoop or Spark processing and machine-learning tasks)
 7) Apache HBase replacement

Note: While Bigtable is considered an OLTP system, it doesn't support multi-row transactions, SQL queries, or joins. For those use cases, consider either Cloud SQL or Datastore.
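
For the IoT time series use case, much of the schema design comes down to the row key, because Bigtable keeps rows sorted by key. The sketch below imitates that with a sorted Python dict. The key layout (`sensor id` + `#` + reversed timestamp), the `MAX_TS` bound, and the helper names are assumptions chosen for illustration, and no Bigtable client library is used.

```python
# Assumed upper bound for millisecond timestamps (13 digits), used to reverse them.
MAX_TS = 10**13 - 1

def row_key(sensor_id: str, ts_ms: int) -> str:
    """Build a key that groups rows by sensor and sorts the newest reading first."""
    reversed_ts = MAX_TS - ts_ms
    return f"{sensor_id}#{reversed_ts:013d}"

# Hypothetical readings: (sensor id, timestamp in ms, temperature).
readings = [
    ("sensor-a", 1700000000000, 21.5),
    ("sensor-b", 1700000000500, 19.0),
    ("sensor-a", 1700000001000, 22.0),
]

# A dict plus a sorted key list imitates a table ordered lexicographically by row key.
table = {row_key(sensor, ts): value for sensor, ts, value in readings}
sorted_keys = sorted(table)

def scan_prefix(prefix: str):
    """Return (key, value) pairs whose key starts with the prefix, in key order."""
    return [(key, table[key]) for key in sorted_keys if key.startswith(prefix)]

# The first row of a prefix scan is the most recent reading for that sensor.
latest_a = scan_prefix("sensor-a#")[0][1]
print(latest_a)
```

Putting the sensor id first keeps one sensor's readings contiguous, and reversing the timestamp makes "latest reading" a cheap read of the first row in the range.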


Spanner:
* Fully managed relational database service for mission-critical OLTP apps.
* Like relational databases, Spanner supports schemas, ACID transactions, and SQL queries.
* Spanner also performs automatic sharding while serving data with single-digit millisecond latencies.
* Use cases:
   1) Financial services (strong consistency across read/write operations without sacrificing HA)
   2) Ad tech (low-latency querying without compromising scale or availability)
   3) Retail and global supply chain (Spanner offers automatic, global, synchronous replication with low latency, which means that data is always consistent and highly available)
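
To show what an ACID transaction buys you in the financial services case, here is a small sketch of an all-or-nothing transfer between two accounts. It uses Python's built-in `sqlite3` purely as a local stand-in and is not the Spanner client API; the `accounts` table and the `transfer` helper are hypothetical.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move funds atomically: both updates commit together, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
        return True
    except ValueError:
        return False

def balances(conn):
    """Read all balances as a dict of account id to balance."""
    return dict(conn.execute("SELECT id, balance FROM accounts"))

conn = sqlite3.connect(":memory:")
with conn:
    conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])

ok = transfer(conn, "alice", "bob", 30)        # succeeds
rejected = transfer(conn, "alice", "bob", 500)  # rolled back, nothing changes
print(ok, rejected, balances(conn))
```

The failed transfer leaves no partial debit behind, which is the property the use cases above depend on.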


Firebase Realtime Database:
* NoSQL database that stores JSON data.
* JSON data can be synchronized in real time to connected clients across different platforms, so you can build a real-time experience serving millions of users without compromising responsiveness.
* Use cases:
  1) Chat and social (store and retrieve images, audio, video, and other user-generated content)
  2) Mobile games (keep track of game progress and statistics across devices and device platforms)
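
The real-time synchronization described above can be pictured as a JSON tree plus listeners that fire whenever a value under a watched path changes. The class below is a toy in-memory model of that idea, not the Firebase SDK; `RealtimeTree`, `on_value`, and `set` are hypothetical names.

```python
def split_path(path):
    """Split 'games/g1/score' into ['games', 'g1', 'score'], ignoring empty parts."""
    return [part for part in path.split("/") if part]

class RealtimeTree:
    """Toy JSON tree that notifies listeners when data under their path changes."""

    def __init__(self):
        self.data = {}
        self.listeners = []  # list of (path segments, callback)

    def on_value(self, path, callback):
        """Register a callback for writes at or below the given path."""
        self.listeners.append((split_path(path), callback))

    def set(self, path, value):
        """Write a value into the tree and push it to every matching listener."""
        parts = split_path(path)
        node = self.data
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
        for watched, callback in self.listeners:
            if parts[: len(watched)] == watched:
                callback(path, value)

# Two "devices" watching the same game; both see the score update.
tree = RealtimeTree()
phone_events, tablet_events = [], []
tree.on_value("games/g1", lambda path, value: phone_events.append((path, value)))
tree.on_value("games/g1", lambda path, value: tablet_events.append((path, value)))

tree.set("games/g1/score", 42)
tree.set("games/g10/score", 7)  # different game, so no listener fires
print(phone_events, tablet_events)
```

Matching on whole path segments (rather than string prefixes) keeps `games/g10` from triggering listeners registered on `games/g1`.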


=============================================================
Process and analyze
To derive business value and insights from data, you must transform and analyze it.

 

