Data Organization Google Professional Data Engineer GCP

  • Every row in Bigtable is indexed by a single row key.
  • Row keys are byte strings; in Cloud Bigtable a row key must be 4 KB or less.
  • Because rows are sorted by row key, related rows with adjacent keys can be retrieved quickly as a single, contiguous scan.
  • Each row may contain thousands of columns.
  • Only the row key is indexed, so lookups and scans are driven by the row key alone.
  • Reads and writes are atomic at the row level.
  • Data is stored in scalable tables, each of which is a sorted key/value map.
  • Each row describes a single entity.
  • Columns contain individual values for each row.
  • Columns that are related to one another are grouped into a column family.
  • Each column is identified by its column family and a column qualifier, which is unique within that family.
  • Each row/column intersection can contain multiple cells, or versions, at different timestamps.
  • Tables are sparse; if a cell does not contain any data, it does not take up any space.
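
The data model in the list above can be pictured as a sorted map from row key to columns to timestamped cells. The following is a minimal in-memory sketch of that idea only; ToyTable and its methods are hypothetical names invented for illustration and are not the Cloud Bigtable client API.

```python
import bisect


class ToyTable:
    """In-memory stand-in for a Bigtable-style table (illustrative only)."""

    def __init__(self):
        self._keys = []  # row keys, kept in sorted byte order
        self._rows = {}  # row key -> {(family, qualifier): {timestamp: value}}

    def write(self, row_key, family, qualifier, value, timestamp):
        """Store one cell version; columns that are never written take no space."""
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        cells = self._rows[row_key].setdefault((family, qualifier), {})
        cells[timestamp] = value

    def read_latest(self, row_key, family, qualifier):
        """Return the newest version of a cell, or None if the cell is empty."""
        cells = self._rows.get(row_key, {}).get((family, qualifier))
        if not cells:
            return None
        return cells[max(cells)]

    def scan_prefix(self, prefix):
        """Return row keys sharing a prefix as one contiguous run of the sorted keys."""
        start = bisect.bisect_left(self._keys, prefix)
        matches = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            matches.append(key)
        return matches


table = ToyTable()
table.write(b"user#bob", "profile", "email", b"old@example.com", timestamp=1)
table.write(b"user#bob", "profile", "email", b"new@example.com", timestamp=2)
table.write(b"user#alice", "profile", "email", b"alice@example.com", timestamp=1)
table.write(b"order#1", "info", "total", b"10", timestamp=1)
```

Here scan_prefix(b"user#") visits only the adjacent "user#" rows, and read_latest picks the cell with the highest timestamp. The row keys and values are made-up sample data.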

For example, suppose you’re building a social network for United States presidents—let’s call it Prezzy. Each president can follow posts from other presidents. The following illustration shows a Cloud Bigtable table that tracks who each president is following on Prezzy:

In the figure above:

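  • The table has a single column family, “follows”, which contains many column qualifiers.
  • The table uses column qualifiers as data: each qualifier is the username of a followed president. This works because Bigtable tables are sparse and new qualifiers can be added on the fly.
  • The row key is the username. Assuming usernames are evenly spread across the alphabet, reads and updates are distributed fairly uniformly across the table.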
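
The Prezzy layout can be sketched with plain Python dictionaries to show what "column qualifiers as data" means in practice. This is an illustration under assumptions, not Bigtable client code: the helper names follow and following, the usernames, and the placeholder cell value are all hypothetical.

```python
FAMILY = "follows"  # the table's single column family


def follow(table, follower, followee):
    """Record that `follower` follows `followee` by adding one column to one row."""
    row = table.setdefault(follower, {})
    # The qualifier (the followee's username) carries the information;
    # the stored value is only a placeholder.
    row[(FAMILY, followee)] = b"1"


def following(table, user):
    """Return the usernames `user` follows, read from that row's column qualifiers."""
    row = table.get(user, {})
    return sorted(qualifier for (family, qualifier) in row if family == FAMILY)


# Row key = username; sample usernames are made up for the example.
prezzy = {}
follow(prezzy, "gwashington", "jadams")
follow(prezzy, "gwashington", "tjefferson")
follow(prezzy, "jadams", "gwashington")
```

A president who follows nobody has no row content at all, which mirrors the sparseness point above: absent columns cost nothing, and following someone new is a single-row write that adds one qualifier.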