Prefixes and Delimiters

  • S3 supports prefix and delimiter parameters when listing key names.
  • These parameters help organize, browse, and retrieve objects within a bucket hierarchically.
  • Use a forward slash (/) or backslash (\) as a delimiter, and key names with embedded delimiters, to emulate a file-and-folder hierarchy within the flat object key namespace of a bucket.
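  • For example, a minimal boto3 sketch of a delimited listing (bucket name and prefix are hypothetical):

      import boto3

      s3 = boto3.client("s3")
      resp = s3.list_objects_v2(
          Bucket="example-bucket",
          Prefix="photos/2024/",   # list only keys under this "folder"
          Delimiter="/",           # roll deeper keys up into CommonPrefixes
      )
      for obj in resp.get("Contents", []):
          print(obj["Key"])              # "files" directly under the prefix
      for cp in resp.get("CommonPrefixes", []):
          print(cp["Prefix"])            # "subfolders"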

Storage Classes

  • S3 offers a range of storage classes suited to different use cases.
  • S3 Standard offers
    • high durability
    • high availability
    • low latency
    • high performance object storage
  • S3 Standard-Infrequent Access (Standard-IA) is useful for long-lived, less frequently accessed data.
  • S3 Reduced Redundancy Storage (RRS) – offers lower durability (four nines) at reduced cost. Suitable for derived data that can be reproduced.
  • Glacier storage – low-cost cloud storage for data that does not require real-time access, such as archives and long-term backups. Data retrieval takes several hours.
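  • A minimal boto3 sketch of setting a storage class at upload time (bucket and key names are placeholders; Glacier is reached via lifecycle transitions rather than a direct PUT):

      import boto3

      s3 = boto3.client("s3")
      s3.put_object(
          Bucket="example-bucket",
          Key="reports/2024/summary.csv",
          Body=b"col1,col2\n1,2\n",
          StorageClass="STANDARD_IA",   # or "REDUCED_REDUNDANCY"; default is "STANDARD"
      )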

Object Lifecycle Management

  • Similar to automated storage tiering.
  • Data typically follows a natural lifecycle:
    • it starts “hot” (i.e., frequently accessed),
    • moves to “warm” (i.e., less frequently accessed),
    • matures into “cold” (i.e., long-term backup or archive) data,
    • and is eventually deleted.
  • S3 provides lifecycle configuration rules to manage this lifecycle.
  • Rules help reduce storage costs through
    • automatic transition of data from one storage class to another
    • automatic deletion of data after a time period
  • For example, a rule set might be (sketched in boto3 below):
    • Store backup data initially in Amazon S3 Standard.
    • After 30 days, transition it to Amazon S3 Standard-IA.
    • After 90 days, transition it to Amazon Glacier.
    • After 3 years, delete it.
  • Lifecycle configurations are attached to a bucket and apply to
    • all objects in the bucket or
    • only to objects specified by a prefix.
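  • The example rule set above might look like this in boto3 (bucket name and prefix are placeholders):

      import boto3

      s3 = boto3.client("s3")
      s3.put_bucket_lifecycle_configuration(
          Bucket="example-bucket",
          LifecycleConfiguration={
              "Rules": [{
                  "ID": "backup-tiering",
                  "Filter": {"Prefix": "backups/"},   # or an empty prefix for the whole bucket
                  "Status": "Enabled",
                  "Transitions": [
                      {"Days": 30, "StorageClass": "STANDARD_IA"},
                      {"Days": 90, "StorageClass": "GLACIER"},
                  ],
                  "Expiration": {"Days": 1095},       # delete after ~3 years
              }]
          },
      )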

Encryption

  • For data in flight, use the Amazon S3 SSL API endpoints.
  • For Amazon S3 data at rest, use a variation of Server-Side Encryption (SSE).
  • With SSE, Amazon S3
    • encrypts data at the object level as it writes it to disks and
    • decrypts it when it is accessed.
  • SSE with S3-managed or Amazon KMS-managed keys uses the 256-bit Advanced Encryption Standard (AES).
  • For data on the client, use Client-Side Encryption: encrypt the data on the client before sending it to Amazon S3.
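  • boto3, for example, talks to the S3 SSL endpoint by default; the flag can be stated explicitly:

      import boto3

      # HTTPS is the default transport for the S3 endpoint.
      s3 = boto3.client("s3", use_ssl=True)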

AWS-Managed Keys (SSE-S3)

  • SSE-S3 is an integrated encryption solution in which AWS handles key management and key protection for Amazon S3.
  • Every object is encrypted with a unique key.
  • The object key itself is then encrypted with a separate master key.
  • A new master key is issued monthly, with AWS rotating the keys.
  • Encrypted data, encryption keys, and master keys are all stored separately on secure hosts.
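  • A minimal boto3 sketch of requesting SSE-S3 on upload (names are placeholders):

      import boto3

      s3 = boto3.client("s3")
      # "AES256" selects SSE-S3: S3 manages and protects the keys.
      s3.put_object(
          Bucket="example-bucket",
          Key="protected/data.bin",
          Body=b"...",
          ServerSideEncryption="AES256",
      )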

AWS KMS Keys (SSE-KMS)

  • AWS KMS handles key management and key protection for Amazon S3.
  • It offers additional benefits over SSE-S3:
    • separate permissions for the use of the master key that protects objects in S3, together with an audit trail of its use.
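  • A minimal boto3 sketch of requesting SSE-KMS; the key alias is a hypothetical placeholder:

      import boto3

      s3 = boto3.client("s3")
      s3.put_object(
          Bucket="example-bucket",
          Key="confidential/report.pdf",
          Body=b"...",
          ServerSideEncryption="aws:kms",
          SSEKMSKeyId="alias/example-master-key",  # omit to use the account's default S3 KMS key
      )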

Customer-Provided Keys (SSE-C)

  • Used when you want to supply and manage your own encryption keys
  • without having to build or maintain your own client-side encryption library.
  • With SSE-C, AWS performs the encryption and decryption of objects, while you retain full control of the keys.
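  • A minimal boto3 sketch of SSE-C; the 256-bit key below is a dummy placeholder:

      import boto3

      s3 = boto3.client("s3")
      key = b"0" * 32  # your own 256-bit key; you must keep it and resupply it
      s3.put_object(
          Bucket="example-bucket",
          Key="customer-keyed.bin",
          Body=b"...",
          SSECustomerAlgorithm="AES256",
          SSECustomerKey=key,   # boto3 encodes and checksums the key for the request
      )
      # The same key must be supplied again on every GET of this object.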

Client-Side Encryption

  • It involves encrypting data on the client side of the application before sending it to S3.
  • There are two options for the data encryption key:
    • an AWS KMS-managed customer master key, or
    • a client-side master key.
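  • A minimal sketch of the client-side master key option, using the third-party cryptography package purely for illustration (not an official AWS encryption client):

      import boto3
      from cryptography.fernet import Fernet

      # Encrypt locally with a client-side master key; S3 only ever sees ciphertext.
      key = Fernet.generate_key()          # keep this safe; losing it loses the data
      ciphertext = Fernet(key).encrypt(b"sensitive payload")

      boto3.client("s3").put_object(
          Bucket="example-bucket",
          Key="encrypted/payload.bin",
          Body=ciphertext,
      )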

Versioning

  • Versioning protects data against accidental or malicious deletion
  • by keeping multiple versions of each object in the bucket,
  • each identified by a unique version ID.
  • For every version of every object stored in the S3 bucket, versioning permits you to
    • preserve
    • retrieve
    • restore
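  • A minimal boto3 sketch of enabling versioning and listing versions (bucket name is a placeholder):

      import boto3

      s3 = boto3.client("s3")
      s3.put_bucket_versioning(
          Bucket="example-bucket",
          VersioningConfiguration={"Status": "Enabled"},
      )
      # Every stored version of every key, each with its own VersionId.
      for v in s3.list_object_versions(Bucket="example-bucket").get("Versions", []):
          print(v["Key"], v["VersionId"], v["IsLatest"])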

MFA Delete

  • MFA Delete adds another layer of data protection on top of bucket versioning.
  • It requires additional authentication to
    • permanently delete an object version or
    • change the versioning state of a bucket.
  • It can only be enabled by the root account.
  • It needs authentication code (a temporary, one-time password) generated by a hardware or virtual Multi-Factor Authentication (MFA) device.
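  • A boto3 sketch of enabling MFA Delete; the device serial and token are placeholders, and the call must be made with root credentials:

      import boto3

      s3 = boto3.client("s3")
      s3.put_bucket_versioning(
          Bucket="example-bucket",
          MFA="arn:aws:iam::123456789012:mfa/root-device 123456",  # "<device-serial> <code>"
          VersioningConfiguration={"Status": "Enabled", "MFADelete": "Enabled"},
      )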

Pre-Signed URLs

  • All Amazon S3 objects are private by default.
  • The object owner can share objects with others by creating a pre-signed URL.
  • The URL is signed with the owner’s credentials and grants time-limited permission to download the object.
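  • A minimal boto3 sketch (bucket and key names are placeholders):

      import boto3

      s3 = boto3.client("s3")
      # Signed with the caller's credentials; expires after one hour.
      url = s3.generate_presigned_url(
          "get_object",
          Params={"Bucket": "example-bucket", "Key": "private/report.pdf"},
          ExpiresIn=3600,
      )
      print(url)  # anyone with this link can GET the object until it expires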

Multipart Upload

  • S3 supports uploading or copying large objects with the Multipart Upload API.
  • It permits uploading a large object as a set of parts.
  • It provides
    • better network utilization via parallel transfers
    • ability to pause and resume
    • the ability to upload objects whose size is not yet known.
  • It is a three-step process
    • 1. Initiation
    • 2. Uploading the parts
    • 3. Completion (or abort).
  • Parts can be uploaded independently in arbitrary order, with re-transmission if needed.
  • After all parts are uploaded, Amazon S3 assembles them in order to create the object.
  • Use it for objects larger than 100 MB; it is required for objects larger than 5 GB.
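  • A minimal boto3 sketch of the three steps (names and part data are placeholders; every part except the last must be at least 5 MB):

      import boto3

      s3 = boto3.client("s3")
      bucket, key = "example-bucket", "big/archive.bin"

      # 1. Initiation
      mpu = s3.create_multipart_upload(Bucket=bucket, Key=key)

      # 2. Uploading the parts (independently, in any order)
      parts = []
      for num, chunk in enumerate([b"a" * 5 * 1024 * 1024, b"tail"], start=1):
          resp = s3.upload_part(Bucket=bucket, Key=key,
                                UploadId=mpu["UploadId"],
                                PartNumber=num, Body=chunk)
          parts.append({"PartNumber": num, "ETag": resp["ETag"]})

      # 3. Completion: S3 assembles the parts in PartNumber order
      # (or call s3.abort_multipart_upload(...) to abort instead)
      s3.complete_multipart_upload(Bucket=bucket, Key=key,
                                   UploadId=mpu["UploadId"],
                                   MultipartUpload={"Parts": parts})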

Range GETs

  • Used to download (GET) only a portion of an object in S3 or Glacier.
  • The range is passed as
    • the Range HTTP header in a GET request or
    • the equivalent parameter in one of the SDK wrapper libraries.
  • It is useful with
    • large objects over poor connectivity
    • downloading only a known portion of a large Glacier backup.
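  • A minimal boto3 sketch fetching only the first megabyte of a (hypothetical) large object:

      import boto3

      s3 = boto3.client("s3")
      resp = s3.get_object(
          Bucket="example-bucket",
          Key="big/archive.bin",
          Range="bytes=0-1048575",   # standard HTTP Range header syntax
      )
      first_mb = resp["Body"].read()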

Cross-Region Replication

  • Asynchronously replicates all new objects in a source bucket in one AWS region to a target bucket in another region.
  • Any metadata and ACLs associated with the object are also replicated.
  • Once set up, any changes to the source object’s data, metadata, or ACLs trigger replication to the destination bucket.
  • For cross-region replication, versioning must be enabled on both the source and destination buckets.
  • An IAM policy is also needed to give Amazon S3 permission to replicate objects on your behalf.
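  • A boto3 sketch of a replication configuration; bucket names and the IAM role ARN are placeholders, and both buckets must already be versioned:

      import boto3

      s3 = boto3.client("s3")
      s3.put_bucket_replication(
          Bucket="source-bucket",
          ReplicationConfiguration={
              "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
              "Rules": [{
                  "Prefix": "",    # empty prefix = replicate all new objects
                  "Status": "Enabled",
                  "Destination": {"Bucket": "arn:aws:s3:::destination-bucket"},
              }],
          },
      )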

Logging

  • Server access logging tracks requests to an S3 bucket.
  • Logging is off by default, but can be enabled.
  • When enabling logging for a bucket, choose a target bucket where the logs will be stored.
  • It is best practice to specify a prefix, such as logs/ or yourbucketname/logs/, to easily identify the logs.
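  • A minimal boto3 sketch of enabling access logging (bucket names are placeholders):

      import boto3

      s3 = boto3.client("s3")
      s3.put_bucket_logging(
          Bucket="example-bucket",
          BucketLoggingStatus={
              "LoggingEnabled": {
                  "TargetBucket": "example-logs",   # must grant the log delivery group write access
                  "TargetPrefix": "logs/",
              }
          },
      )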

Event Notifications

  • Notifications are sent in response to actions taken on objects uploaded or stored in S3.
  • Event notifications can be configured based on object name prefixes and suffixes.
  • Helps to
    • run workflows
    • send alerts
    • perform other actions in response to changes in the objects in S3.
  • Event notifications are set up at the bucket level
  • and can be configured through
    • Amazon S3 console
    • REST API
    • AWS SDK
  • S3 can publish notifications
    • when new objects are created (by a PUT, POST, COPY, or multipart upload completion)
    • when objects are removed (by a DELETE)
    • when Amazon S3 detects that an RRS object was lost.
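  • A boto3 sketch publishing object-created events to SNS; the topic ARN is a placeholder, and the topic policy must allow S3 to publish:

      import boto3

      s3 = boto3.client("s3")
      s3.put_bucket_notification_configuration(
          Bucket="example-bucket",
          NotificationConfiguration={
              "TopicConfigurations": [{
                  "TopicArn": "arn:aws:sns:us-east-1:123456789012:s3-events",
                  "Events": ["s3:ObjectCreated:*"],
                  "Filter": {"Key": {"FilterRules": [
                      {"Name": "prefix", "Value": "images/"},
                      {"Name": "suffix", "Value": ".jpg"},
                  ]}},
              }]
          },
      )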

Best Practices, Patterns, and Performance

  • Use Amazon S3 in hybrid IT environments: back up data from on-premises file systems, databases, and compliance archives over the Internet to Amazon S3 or Amazon Glacier, while the primary application or database storage remains on-premises.
  • Use Amazon S3 as bulk “blob” storage, keeping an index to that data in another service such as DynamoDB or RDS. This enables quick searches and complex queries on key names without continually listing keys.
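  • A minimal sketch of the blob-plus-index pattern, with hypothetical bucket, table, and attribute names:

      import boto3

      s3 = boto3.client("s3")
      ddb = boto3.client("dynamodb")

      # Store the blob in S3, then index its key and attributes in DynamoDB.
      s3.put_object(Bucket="example-bucket", Key="docs/invoice-001.pdf", Body=b"...")
      ddb.put_item(
          TableName="s3-index",
          Item={
              "DocId": {"S": "invoice-001"},
              "S3Key": {"S": "docs/invoice-001.pdf"},
              "Customer": {"S": "acme"},
          },
      )
      # Searches and queries now hit DynamoDB instead of repeatedly listing S3 keys.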