Storage types Google Professional Data Engineer GCP
- Two storage types are available: solid-state drives (SSD) and hard disk drives (HDD).
  - SSD storage is the most efficient and cost-effective choice for most use cases.
  - HDD storage is sometimes appropriate for very large data sets (>10 TB) that are not latency-sensitive or are infrequently accessed.
  - HDD use cases
    - The instance will store at least 10 TB of data.
    - The data will not back a user-facing or latency-sensitive application.
    - The workload is a batch workload or data archival.
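The HDD criteria above can be read as a simple decision rule. The following is a minimal sketch only: the `Workload` class, its fields, and `suggest_storage_type` are hypothetical names invented for illustration, not part of any Bigtable API, and reading "10 TB" as decimal terabytes is an assumption.

```python
from dataclasses import dataclass

# Threshold from the notes above; decimal terabytes is an assumption.
HDD_MIN_BYTES = 10 * 10**12


@dataclass
class Workload:
    """Hypothetical description of a workload, for illustration only."""
    data_bytes: int
    latency_sensitive: bool
    user_facing: bool
    kind: str  # e.g. "batch", "archival", "serving"


def suggest_storage_type(w: Workload) -> str:
    """Return "HDD" only when every HDD condition from the notes holds."""
    hdd_ok = (
        w.data_bytes >= HDD_MIN_BYTES
        and not w.latency_sensitive
        and not w.user_facing
        and w.kind in {"batch", "archival"}
    )
    return "HDD" if hdd_ok else "SSD"


archive = Workload(50 * 10**12, False, False, "archival")
serving = Workload(2 * 10**12, True, True, "serving")
print(suggest_storage_type(archive))
print(suggest_storage_type(serving))
```

SSD is the fallback because the notes describe it as the right choice for most use cases; HDD is chosen only when all of the listed conditions hold.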
 
 
- Application profiles (app profiles)
  - For instances that use replication, app profiles control how applications connect to the instance’s clusters.
  - For instances without replication, app profiles still provide a separate identifier for each application.
 
- A cluster represents the Bigtable service in a specific location.
  - Each cluster belongs to a single instance.
  - An instance can have up to 4 clusters.
  - Each application request is handled by one of the clusters in the instance.
  - Each cluster is located in a single zone.
  - An instance’s clusters must each be in a unique zone.
  - Additional clusters can be created in any zone where Bigtable is available.
  - Instances with only 1 cluster do not use replication.
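The cluster rules above (at most 4 clusters, one zone per cluster, unique zones, replication only with more than 1 cluster) can be sketched as a small check. This is an illustrative model, not a Bigtable API call: `validate_clusters` and the cluster IDs are hypothetical, and the limit of 4 is taken from these notes as-is.

```python
from typing import Dict

MAX_CLUSTERS_PER_INSTANCE = 4  # limit as stated in the notes above


def validate_clusters(cluster_zones: Dict[str, str]) -> bool:
    """Check the cluster rules; return True if the layout uses replication.

    cluster_zones maps a (hypothetical) cluster ID to its zone,
    for example {"c1": "us-east1-b"}.
    """
    if not cluster_zones:
        raise ValueError("an instance needs at least one cluster")
    if len(cluster_zones) > MAX_CLUSTERS_PER_INSTANCE:
        raise ValueError("too many clusters for one instance")
    zones = list(cluster_zones.values())
    if len(set(zones)) != len(zones):
        raise ValueError("each cluster must be in a unique zone")
    # An instance with only 1 cluster does not use replication.
    return len(cluster_zones) > 1


print(validate_clusters({"c1": "us-east1-b"}))
print(validate_clusters({"c1": "us-east1-b", "c2": "us-west1-a"}))
```

Two clusters in the same zone, or a fifth cluster, would raise `ValueError` in this sketch.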
 
- Each cluster in an instance has 1 or more nodes.
  - Nodes are the compute resources used to manage data.
  - Bigtable splits all of the data in a table into smaller tablets.
  - Tablets are stored on disk, separate from the nodes but in the same zone as the nodes.
  - Each tablet is associated with a single node.
  - Each node:
    - Keeps track of specific tablets on disk.
    - Handles incoming reads and writes for its tablets.
    - Performs maintenance tasks on its tablets.
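The relationship between tablets and nodes can be pictured with a toy model. Everything here is an assumption made for illustration: the `ToyCluster` class, the sorted split keys, and the round-robin assignment are hypothetical and do not describe how Bigtable actually places or rebalances tablets. The sketch only shows the two facts from the notes: data is split into tablets, and each tablet is associated with exactly one node.

```python
import bisect


class ToyCluster:
    """Toy model: tablets are contiguous row-key ranges, each owned by one node."""

    def __init__(self, split_keys, node_count):
        # split_keys mark where one tablet ends and the next begins (assumed).
        self.split_keys = sorted(split_keys)
        tablet_count = len(self.split_keys) + 1
        # Round-robin assignment (an assumption): one node per tablet.
        self.tablet_to_node = {t: t % node_count for t in range(tablet_count)}

    def tablet_for(self, row_key):
        """Return the index of the tablet whose key range contains row_key."""
        return bisect.bisect_right(self.split_keys, row_key)

    def node_for(self, row_key):
        """Return the node that handles reads and writes for row_key."""
        return self.tablet_to_node[self.tablet_for(row_key)]


cluster = ToyCluster(split_keys=["g", "n", "t"], node_count=2)
for key in ["apple", "horse", "zebra"]:
    print(key, "-> tablet", cluster.tablet_for(key), "-> node", cluster.node_for(key))
```

Because the nodes only keep track of tablets that live on separate storage, reassigning a tablet in this model means updating the mapping, not moving the data.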
 
 
- Select or create a GCP project.
  - A project name must be between 4 and 30 characters.
  - A project ID is suggested automatically and can be edited. It must be 6 to 30 characters long, start with a lowercase letter, and not end with a hyphen.
- Make sure billing is enabled for the Google Cloud project.
- Enable the Cloud Bigtable and Cloud Bigtable Admin APIs.
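The naming rules above can be checked locally before creating a project. This is a minimal sketch: the function names are hypothetical, the name check covers only the length rule given in these notes, and the ID pattern additionally assumes that only lowercase letters, digits, and hyphens are allowed.

```python
import re

# 1 leading lowercase letter + 4..28 middle characters + 1 trailing
# letter or digit gives a total length of 6..30 characters.
PROJECT_ID_RE = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def is_valid_project_id(project_id: str) -> bool:
    """Check length, first character, and the no-trailing-hyphen rule."""
    return PROJECT_ID_RE.fullmatch(project_id) is not None


def is_valid_project_name(name: str) -> bool:
    """Check only the 4 to 30 character length rule from the notes."""
    return 4 <= len(name) <= 30


for candidate in ["my-bigtable-demo", "1project", "short", "ends-with-"]:
    print(candidate, is_valid_project_id(candidate))
```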
 
Labels –
- A label is a key-value pair.
  - Labels help you organize GCP resources.
  - A label can be attached to each resource.
  - Resources can then be filtered based on their labels.
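Filtering by label amounts to matching key-value pairs. The sketch below uses plain dictionaries; the resource names, the label keys, and `filter_by_labels` are hypothetical examples, not output from or calls to any Google Cloud API.

```python
def filter_by_labels(resources, required):
    """Return the resources whose labels contain every required key-value pair."""
    return [
        r for r in resources
        if all(r["labels"].get(k) == v for k, v in required.items())
    ]


# Hypothetical resources, each with a name and its attached labels.
resources = [
    {"name": "instance-a", "labels": {"env": "prod", "team": "analytics"}},
    {"name": "instance-b", "labels": {"env": "dev", "team": "analytics"}},
    {"name": "instance-c", "labels": {"env": "prod", "team": "ads"}},
]

prod_analytics = filter_by_labels(resources, {"env": "prod", "team": "analytics"})
print([r["name"] for r in prod_analytics])
```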
 
- By default, you can provision a maximum of 30 Cloud Bigtable nodes per zone in each Google Cloud project.
  - To provision more, use the node request form.
- After creating a Bigtable instance, you can update the following settings:
  - The number of nodes in each cluster
  - The number of clusters in the instance
  - The application profiles for the instance
  - The labels for the instance
  - The display name for the instance
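Because the default node quota applies per zone across a whole project, node counts from all clusters in the same zone add up. A minimal sketch of that arithmetic, assuming a hypothetical list of `(zone, node_count)` pairs and the default of 30 stated in these notes:

```python
from collections import Counter

DEFAULT_NODES_PER_ZONE = 30  # default quota as stated in the notes above


def zones_over_quota(clusters, limit=DEFAULT_NODES_PER_ZONE):
    """Sum nodes per zone and return the zones whose total exceeds the limit.

    clusters is a list of (zone, node_count) pairs covering every
    cluster in the project (hypothetical input format).
    """
    totals = Counter()
    for zone, nodes in clusters:
        totals[zone] += nodes
    return {zone: total for zone, total in totals.items() if total > limit}


planned = [("us-east1-b", 20), ("us-east1-b", 15), ("us-west1-a", 10)]
print(zones_over_quota(planned))
```

In this example the two clusters in `us-east1-b` total 35 nodes, so that zone would need a quota increase, while `us-west1-a` stays within the default.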
 
 
- Clusters can be added to an existing instance.
  - An instance can have a maximum of 4 clusters.
  - Clusters can be in any region where Bigtable is available.
 
- If needed, you can delete all but 1 of an instance’s clusters.
  - Deleting all but 1 cluster automatically disables replication.
 
- A Bigtable instance can be monitored using Cloud Console and Cloud Monitoring.
  - These give a high-level overview of the instance.
  - The Key Visualizer tool lets you drill down into the details.
 