Google Professional Data Engineer (GCP) Online Course

This course is a comprehensive guide to the Google Cloud Platform (GCP), with about 20 hours of content and about 60 demos. GCP is not currently the most popular cloud offering (that is AWS), but it is arguably the best choice for high-end machine learning applications, in part because TensorFlow, the widely used deep learning framework, also comes from Google.


Course Salient Features

  • Certification - covers most of the material you need to pass the Google Data Engineer and Cloud Architect certification exams
  • Compute and Storage - App Engine, Container Engine (aka Kubernetes) and Compute Engine
  • Big Data and Managed Hadoop - Dataproc, Dataflow, BigTable, BigQuery, Pub/Sub
  • TensorFlow on the Cloud - what neural networks and deep learning really are, how neurons work and how neural networks are trained
  • DevOps - Stackdriver Logging, Monitoring, Cloud Deployment Manager
  • Security - Identity and Access Management, Identity-Aware Proxy, OAuth, API keys, service accounts
  • Networking - Virtual Private Clouds, shared VPCs, load balancing at the network, transport and HTTP layers; VPN, Cloud Interconnect and CDN Interconnect
  • Hadoop Foundations - a quick look at the open-source cousins (Hadoop, Spark, Pig, Hive and HBase)


In this course, you will learn how to:

  • Deploy Managed Hadoop apps on the Google Cloud
  • Build deep learning models in the cloud using TensorFlow
  • Make informed decisions about Containers, VMs and AppEngine
  • Use big data technologies such as BigTable, Dataflow, Apache Beam and Pub/Sub


Course Curriculum

1. Introduction

  • Theory, Practice and Tests
  • Why Cloud?
  • Hadoop and Distributed Computing
  • On-premise, Colocation or Cloud?
  • Introducing the Google Cloud Platform
  • Lab: Setting Up A GCP Account
  • Lab: Using The Cloud Shell

2. Compute Choices

  • Compute Options
  • Google Compute Engine (GCE)
  • More GCE
  • Lab: Creating a VM Instance
  • Lab: Editing a VM Instance
  • Lab: Creating a VM Instance Using The Command Line
  • Lab: Creating And Attaching A Persistent Disk
  • Google Container Engine - Kubernetes (GKE)
  • More GKE
  • Lab: Creating A Kubernetes Cluster And Deploying A Wordpress Container
  • App Engine
  • Contrasting App Engine, Compute Engine and Container Engine
  • Lab: Deploy and Run An App Engine App

3. Storage

  • Storage Options
  • Quick Take
  • Cloud Storage
  • Lab: Working With Cloud Storage Buckets
  • Lab: Bucket And Object Permissions
  • Lab: Lifecycle Management On Buckets
  • Lab: Running a Program On a VM Instance And Storing Results on Cloud Storage
  • Transfer Service
  • Lab: Migrating Data Using the Transfer Service

4. Cloud SQL, Cloud Spanner ~ OLTP ~ RDBMS

  • Cloud SQL
  • Lab: Creating A Cloud SQL Instance
  • Lab: Running Commands On Cloud SQL Instance
  • Lab: Bulk Loading Data Into Cloud SQL Tables
  • Cloud Spanner
  • More Cloud Spanner
  • Lab: Working With Cloud Spanner

5. BigTable ~ HBase = Columnar Store

  • BigTable Intro
  • Columnar Store
  • Denormalised
  • Column Families
  • BigTable Performance
  • Lab: BigTable demo

6. Datastore ~ Document Database

  • Datastore
  • Lab: Datastore demo

7. BigQuery ~ Hive ~ OLAP

  • BigQuery Intro
  • BigQuery Advanced
  • Lab: Loading CSV Data Into BigQuery
  • Lab: Running Queries On BigQuery
  • Lab: Loading JSON Data With Nested Tables
  • Lab: Public Datasets In BigQuery
  • Lab: Using BigQuery Via The Command Line
  • Lab: Aggregations And Conditionals In Aggregations
  • Lab: Subqueries And Joins
  • Lab: Regular Expressions In Legacy SQL
  • Lab: Using The WITH Clause For Subqueries

8. Dataflow ~ Apache Beam

  • Dataflow Intro
  • Apache Beam
  • Lab: Running A Python Dataflow Program
  • Lab: Running A Java Dataflow Program
  • Lab: Implementing Word Count In Dataflow Java
  • Lab: Executing The Word Count Dataflow
  • Lab: Executing MapReduce In Dataflow In Python
  • Lab: Executing MapReduce In Dataflow In Java
  • Lab: Dataflow With Big Query As Source And Side Inputs
  • Lab: Dataflow With Big Query As Source And Side Inputs 2

9. Dataproc ~ Managed Hadoop

  • Dataproc
  • Lab: Creating And Managing A Dataproc Cluster
  • Lab: Creating A Firewall Rule To Access Dataproc
  • Lab: Running A PySpark Job On Dataproc
  • Lab: Running The PySpark REPL Shell And Pig Scripts On Dataproc
  • Lab: Submitting A Spark Jar To Dataproc
  • Lab: Working With Dataproc Using The gcloud CLI

10. Pub/Sub for Streaming

  • Pub/Sub
  • Lab: Working With Pub/Sub On The Command Line
  • Lab: Working With Pub/Sub Using The Web Console
  • Lab: Setting Up A Pub/Sub Publisher Using The Python Library
  • Lab: Setting Up A Pub/Sub Subscriber Using The Python Library
  • Lab: Publishing Streaming Data Into Pub/Sub
  • Lab: Reading Streaming Data From Pub/Sub And Writing To BigQuery
  • Lab: Executing A Pipeline To Read Streaming Data And Write To BigQuery
  • Lab: Pub/Sub Source BigQuery Sink

11. Datalab ~ Jupyter

  • Datalab
  • Lab: Creating And Working On A Datalab Instance
  • Lab: Importing And Exporting Data Using Datalab
  • Lab: Using the Charting API In Datalab

12. TensorFlow and Machine Learning

  • Introducing Machine Learning
  • Representation Learning
  • NN Introduced
  • Introducing TF
  • Lab: Simple Math Operations
  • Computation Graph
  • Tensors
  • Lab: Tensors
  • Linear Regression Intro
  • Placeholders and Variables
  • Lab: Placeholders
  • Lab: Variables
  • Lab: Linear Regression with Made-up Data
  • Image Processing
  • Images As Tensors
  • Lab: Reading and Working with Images
  • Lab: Image Transformations
  • Introducing MNIST
  • K-Nearest Neighbors as Unsupervised Learning
  • One-hot Notation and L1 Distance
  • Steps in the K-Nearest-Neighbors Implementation
  • Lab: K-Nearest-Neighbors
  • Learning Algorithm
  • Individual Neuron
  • Learning Regression
  • Learning XOR
  • XOR Trained

13. Regression in TensorFlow

  • Lab: Access Data from Yahoo Finance
  • Non TensorFlow Regression
  • Lab: Linear Regression - Setting Up a Baseline
  • Gradient Descent
  • Lab: Linear Regression
  • Lab: Multiple Regression in TensorFlow
  • Logistic Regression Introduced
  • Linear Classification
  • Lab: Logistic Regression - Setting Up a Baseline
  • Logit
  • Softmax
  • Argmax
  • Lab: Logistic Regression
  • Estimators
  • Lab: Linear Regression using Estimators
  • Lab: Logistic Regression using Estimators

14. Vision, Translate, NLP and Speech: Trained ML APIs

  • Lab: Taxicab Prediction - Setting up the dataset
  • Lab: Taxicab Prediction - Training and Running the model
  • Lab: The Vision, Translate, NLP and Speech API
  • Lab: The Vision API for Label and Landmark Detection

15. Networking

  • Virtual Private Clouds
  • VPC and Firewalls
  • XPN or Shared VPC
  • VPN
  • Types of Load Balancing
  • Proxy and Pass-through load balancing
  • Internal load balancing

16. Ops and Security

  • Stackdriver
  • Stackdriver Logging
  • Cloud Deployment Manager
  • Cloud Endpoints
  • Security and Service Accounts
  • Auth and End-user accounts
  • Identity and Access Management
  • Data Protection

17. Appendix: Hadoop Ecosystem

  • Introducing the Hadoop Ecosystem
  • Hadoop
  • HDFS
  • MapReduce
  • Yarn
  • Hive
  • Hive vs. RDBMS
  • HQL vs. SQL
  • OLAP in Hive
  • Windowing Hive
  • Pig
  • More Pig
  • Spark
  • More Spark
  • Streams Intro
  • Microbatches
  • Window Types
