Exporting Data Google Professional Data Engineer GCP

  • BigQuery can export table data in several formats.
  • You can export up to 1 GB of data to a single file.
  • To export more than 1 GB of data, export to multiple files.
  • You need permission to access the BigQuery table you are exporting.
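
The single-file limit above suggests a simple rule: use one file name for small tables and a wildcard pattern for larger ones. The sketch below is only an illustration. The bucket and prefix names are hypothetical, the gs:// form is the usual Cloud Storage URI scheme, and treating 1 GB as 1024³ bytes is an assumption, since these notes do not say which definition applies.

```python
# Assumption: 1 GB is treated as 1024**3 bytes; the notes do not specify this.
ONE_GB = 1024 ** 3


def destination_uri(bucket, prefix, table_size_bytes, extension="csv"):
    """Return a single-file URI for small tables and a wildcard URI otherwise."""
    if table_size_bytes <= ONE_GB:
        return f"gs://{bucket}/{prefix}.{extension}"
    # Larger tables need several files, so the name carries one wildcard.
    return f"gs://{bucket}/{prefix}-*.{extension}"


# Hypothetical bucket and prefix names, for illustration only.
print(destination_uri("my-bucket", "orders", 200 * 1024 ** 2))
print(destination_uri("my-bucket", "orders", 5 * ONE_GB))
```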

 

Export limitations

  • You cannot export table data to a local file, Google Sheets, or Google Drive.
  • You can export up to 1 GB of table data to a single file.
  • You cannot export nested and repeated data in CSV format.
  • For nested and repeated data, use Avro or JSON exports.
  • If you export in JSON format, INT64 (integer) values are encoded as JSON strings.
  • You cannot export data from multiple tables in a single export job.
  • You cannot choose a compression type other than GZIP when exporting with the Cloud Console or the classic BigQuery web UI.
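
Because INT64 values arrive as JSON strings, a consumer of a JSON export usually has to convert them back before doing arithmetic. A minimal sketch, assuming a newline-delimited export; the sample row and its field names are hypothetical:

```python
import json

# Hypothetical line from a JSON export; note the integer is quoted.
exported_line = '{"order_id": "9007199254740993", "customer": "acme"}'


def parse_row(line, int_fields):
    """Parse one exported JSON row and convert the listed INT64 fields to int."""
    row = json.loads(line)
    for name in int_fields:
        # Leave missing or null fields untouched.
        if row.get(name) is not None:
            row[name] = int(row[name])
    return row


row = parse_row(exported_line, int_fields=["order_id"])
print(type(row["order_id"]).__name__, row["order_id"])
```

The caller has to know which fields are INT64 (for example, from the table schema), since the exported JSON alone does not distinguish them from ordinary strings.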

You cannot change the location of a dataset after it is created, and you cannot move a dataset directly from one location to another. You can, however, make a copy of the dataset or manually move it by recreating it in the new location.

Data format and compression

The data formats and compression types supported for exported data are:

 

Data format | Supported compression types | Details
CSV | GZIP | You can control the CSV delimiter in exported data by using the --field_delimiter CLI flag or the configuration.extract.fieldDelimiter extract job property. Nested and repeated data is not supported.
JSON | GZIP | Nested and repeated data is supported.
Avro | DEFLATE, SNAPPY | GZIP is not supported for Avro exports. Nested and repeated data is supported.
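
The table above can be encoded as a small lookup so that an export request is checked before it is submitted. This is only a sketch of the rules listed in these notes, not a BigQuery API; the function and constant names are made up for illustration.

```python
# Compression types per format, as listed in the table above.
SUPPORTED_COMPRESSION = {
    "CSV": {"GZIP"},
    "JSON": {"GZIP"},
    "AVRO": {"DEFLATE", "SNAPPY"},
}
# Whether each format can hold nested and repeated data.
SUPPORTS_NESTED = {"CSV": False, "JSON": True, "AVRO": True}


def check_export_options(data_format, compression=None, has_nested_data=False):
    """Return a list of problems with the requested export options."""
    fmt = data_format.upper()
    if fmt not in SUPPORTED_COMPRESSION:
        raise ValueError(f"unknown export format: {data_format}")
    problems = []
    if compression is not None and compression.upper() not in SUPPORTED_COMPRESSION[fmt]:
        problems.append(f"{compression} compression is not supported for {fmt} exports")
    if has_nested_data and not SUPPORTS_NESTED[fmt]:
        problems.append(f"{fmt} cannot hold nested and repeated data")
    return problems


print(check_export_options("Avro", "GZIP"))
print(check_export_options("CSV", "GZIP", has_nested_data=True))
```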

Exporting data stored in BigQuery

You can export table data by:

  • Using the Cloud Console or the classic BigQuery web UI
  • Using the bq extract CLI command
  • Submitting an extract job via the API or client libraries
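
For the API route, an extract job is described by a configuration.extract object. The sketch below only builds that request body as a plain dictionary and does not call the API. The project, dataset, table, and bucket names are hypothetical, and the property names other than destinationUris and fieldDelimiter (sourceTable, destinationFormat, compression) reflect the BigQuery REST job resource as I understand it, so check them against the current API reference before relying on them.

```python
import json


def build_extract_job(project, dataset, table, destination_uris,
                      destination_format="CSV", compression="NONE",
                      field_delimiter=None):
    """Build the request body for an extract job as a plain dictionary."""
    extract = {
        "sourceTable": {
            "projectId": project,
            "datasetId": dataset,
            "tableId": table,
        },
        "destinationUris": list(destination_uris),
        "destinationFormat": destination_format,
        "compression": compression,
    }
    # The delimiter only applies to CSV exports, so it is optional here.
    if field_delimiter is not None:
        extract["fieldDelimiter"] = field_delimiter
    return {"configuration": {"extract": extract}}


# Hypothetical identifiers, for illustration only.
job = build_extract_job(
    "my-project", "sales", "orders",
    ["gs://my-bucket/orders-*.csv"],
    compression="GZIP",
    field_delimiter="\t",
)
print(json.dumps(job, indent=2))
```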

Exporting data into one or more files

  • The destinationUris property specifies the location(s) and file name(s) of the exported files.
  • BigQuery supports a single wildcard operator (*) in each URI.
  • The wildcard can appear anywhere in the URI except as part of the bucket name.
  • With a wildcard, BigQuery creates multiple sharded files based on the supplied pattern.
  • The wildcard is replaced with a number (starting at 0), left-padded to 12 digits.
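
The wildcard rules above can be sketched in a few lines: validate the pattern, then expand it the way the notes describe. This is a local illustration of the naming scheme only. The bucket name is hypothetical, and the number of shards is chosen by BigQuery in a real export, so shard_count here is just an input to the example.

```python
def validate_destination_uri(uri):
    """Check a destination URI against the wildcard rules listed above."""
    if not uri.startswith("gs://"):
        raise ValueError("expected a gs:// URI")
    bucket, _, _object_name = uri[len("gs://"):].partition("/")
    if "*" in bucket:
        raise ValueError("the wildcard cannot be part of the bucket name")
    if uri.count("*") > 1:
        raise ValueError("only one wildcard is allowed per URI")


def shard_names(uri_pattern, shard_count):
    """Expand a wildcard URI into the file names the pattern would produce."""
    validate_destination_uri(uri_pattern)
    if "*" not in uri_pattern:
        return [uri_pattern]
    # Shard numbers start at 0 and are left-padded with zeros to 12 digits.
    return [uri_pattern.replace("*", f"{i:012d}") for i in range(shard_count)]


for name in shard_names("gs://my-bucket/orders-*.csv", 3):
    print(name)
```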