1-GCP-Data-Comparison

Table of contents
  1. 1-GCP-Data-Comparison
    1.1. Compare
    1.2. Load data
    1.3. Cloud storage
      1.3.1. Use case
      1.3.2. Compare


Compare

  1. BigQuery: an enterprise data warehouse, used for building reports and extracting insights.
  2. Firestore: a document database with real-time and offline queries. Use Native mode for mobile and web apps, and Datastore mode for server backends.
  3. Bigtable: a NoSQL database that delivers consistent low latency (under 10 ms), scales throughput seamlessly, and works at petabyte scale. It is ideal for IoT workloads.
  4. Spanner: a relational database with full SQL support.
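The list above can be read as a lookup from a workload's main need to a service. The sketch below is only a rule-of-thumb summary of that list: the need labels (such as `data_warehouse`) and the function name `suggest_service` are made up for illustration and are not part of any Google Cloud API.

```python
# Rule-of-thumb mapping taken from the comparison list above.
# The keys are hypothetical labels invented for this example.
SERVICE_BY_NEED = {
    "data_warehouse": "BigQuery",                      # reports and insights
    "mobile_or_web_app": "Firestore (Native mode)",    # real-time and offline queries
    "server_backend": "Firestore (Datastore mode)",    # server-side projects
    "low_latency_petabyte_nosql": "Bigtable",          # IoT, consistent low latency
    "relational_sql": "Spanner",                       # full SQL support
}


def suggest_service(need: str) -> str:
    """Return the service the list above associates with the given need."""
    try:
        return SERVICE_BY_NEED[need]
    except KeyError:
        known = ", ".join(sorted(SERVICE_BY_NEED))
        raise ValueError(f"unknown need {need!r}; expected one of: {known}")


if __name__ == "__main__":
    print(suggest_service("low_latency_petabyte_nosql"))
```

A real decision usually weighs several needs at once (cost, consistency, query patterns), so treat this as a memory aid rather than a selection tool.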

[Image: GCP data comparison]

Load data

  1. What is the difference between inserting data into Firestore in Native mode and in Datastore mode?
    • Use Firestore in Datastore mode for new server projects.
      Firestore in Datastore mode allows you to use established Datastore server architectures while removing fundamental Datastore limitations. Datastore mode can automatically scale to millions of writes per second.
    • Use Firestore in Native mode for new mobile and web apps.
      Firestore offers mobile and web client libraries with real-time and offline features. Native mode can automatically scale to millions of concurrent clients.
  2. Dataprep by Trifacta is an intelligent data service for visually exploring, cleaning, and preparing structured and unstructured data for analysis, reporting, and machine learning. Because Dataprep is serverless and works at any scale, there is no infrastructure to deploy or manage. Your next ideal data transformation is suggested and predicted with each UI input, so you don’t have to write code.
  3. Dataflow is a fully managed service that can be used to process both streams and batches of data.

Cloud storage

Use case

Cymbal Direct drones continuously send data during deliveries. You need to process and analyze the incoming telemetry data. After processing, the data should be retained, but it will only be accessed once every month or two. Your CIO has issued a directive to incorporate managed services wherever possible. You want a cost-effective solution to process the incoming streams of data. What should you do?

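Answer: Ingest data with ClearBlade IoT Core, and then publish to Pub/Sub. Use Dataflow to process the data, and store it in a Nearline Cloud Storage bucket. Pub/Sub and Dataflow are managed services that handle streaming data, and Nearline is cost-effective for data that is accessed about once a month or less.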
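To make the processing step concrete, here is a plain-Python stand-in for what the Dataflow stage might do with telemetry messages pulled from Pub/Sub. It does not use the Dataflow or Apache Beam APIs, and the message fields `drone_id` and `battery_pct` are assumptions, since the scenario does not say what the drones actually send.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable


def summarize_telemetry(messages: Iterable[bytes]) -> Dict[str, Dict[str, float]]:
    """Aggregate raw JSON telemetry messages into one summary per drone.

    Each message is assumed (hypothetically) to look like:
        {"drone_id": "d1", "battery_pct": 80}
    """
    totals = defaultdict(lambda: {"count": 0, "battery_sum": 0.0})
    for raw in messages:
        event = json.loads(raw)  # json.loads accepts bytes as well as str
        entry = totals[event["drone_id"]]
        entry["count"] += 1
        entry["battery_sum"] += event["battery_pct"]
    return {
        drone_id: {
            "readings": t["count"],
            "avg_battery_pct": t["battery_sum"] / t["count"],
        }
        for drone_id, t in totals.items()
    }


if __name__ == "__main__":
    sample = [
        b'{"drone_id": "d1", "battery_pct": 80}',
        b'{"drone_id": "d1", "battery_pct": 60}',
        b'{"drone_id": "d2", "battery_pct": 50}',
    ]
    print(summarize_telemetry(sample))
```

In the architecture from the answer, the summaries would then be written to the Nearline bucket; in a real pipeline this logic would live inside a Dataflow job rather than a standalone function.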

Compare

Each storage class has a minimum storage duration. Nearline suits data that users access no more than once a month.

  1. Standard: no minimum duration
  2. Nearline: at least 30 days
  3. Coldline: at least 90 days
  4. Archive: at least 365 days
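The minimum durations above can be turned into a small helper. This is a minimal sketch under two assumptions: that the expected number of days between accesses is a reasonable proxy for choosing a class, and that an object deleted early is billed as if it had been stored for the full minimum duration. The function names are hypothetical, and real pricing also includes retrieval and operation fees that this example ignores.

```python
# Minimum storage duration in days for each class, as listed above.
MIN_STORAGE_DAYS = {
    "STANDARD": 0,
    "NEARLINE": 30,
    "COLDLINE": 90,
    "ARCHIVE": 365,
}


def suggest_storage_class(days_between_accesses: float) -> str:
    """Pick the coldest class whose minimum duration fits the access interval."""
    coldest_first = sorted(
        MIN_STORAGE_DAYS.items(), key=lambda item: item[1], reverse=True
    )
    for name, min_days in coldest_first:
        if days_between_accesses >= min_days:
            return name
    return "STANDARD"


def billable_days(storage_class: str, days_stored: int) -> int:
    """Days charged for an object, assuming early deletion bills the minimum."""
    return max(days_stored, MIN_STORAGE_DAYS[storage_class])


if __name__ == "__main__":
    # Data accessed once every month or two, as in the drone use case.
    print(suggest_storage_class(45))
    # An object deleted from Nearline after 10 days.
    print(billable_days("NEARLINE", 10))
```

For the drone telemetry scenario, an access interval of 30 to 60 days lands on Nearline, which matches the answer given in the use case.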