S3 Advanced
Replication
Cross-Region Replication (CRR) vs Same-Region Replication (SRR)
| Feature | CRR | SRR |
|---|---|---|
| Regions | Source → different region | Source → same region |
| Use case | DR, latency, compliance | Aggregate logs, dev/prod copies |
| Versioning | Required on both buckets | Required on both buckets |
| Existing objects | Not replicated by default (use S3 Batch Replication) | Same as CRR |
| Delete marker | Not replicated by default (opt-in) | Not replicated by default |
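By default, replication does not copy delete markers to the destination bucket, which guards against accidental cross-region deletes. Enable Delete Marker Replication explicitly if you want them copied. Deletions that target a specific version ID are never replicated.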
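The delete behavior described above can be illustrated with a toy in-memory model. This is only a sketch: `ReplicationSketch`, `VersionedKey`, and the `deleteMarkerReplication` flag are hypothetical names invented for the example and are not part of the AWS SDK. It tracks a single key and assumes replication is instantaneous.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of one key in a replicated, versioned bucket pair (not the AWS SDK). */
final class ReplicationSketch {
    static final String DELETE_MARKER = "DELETE_MARKER";

    /** Version stack for a single key; the last entry is the current version. */
    static final class VersionedKey {
        final List<String> versions = new ArrayList<>();

        /** A key looks deleted when its current version is a delete marker. */
        boolean looksDeleted() {
            return !versions.isEmpty()
                    && DELETE_MARKER.equals(versions.get(versions.size() - 1));
        }
    }

    final VersionedKey source = new VersionedKey();
    final VersionedKey destination = new VersionedKey();
    final boolean deleteMarkerReplication; // hypothetical flag standing in for the opt-in setting

    ReplicationSketch(boolean deleteMarkerReplication) {
        this.deleteMarkerReplication = deleteMarkerReplication;
    }

    /** New object versions are always replicated in this model. */
    void put(String content) {
        source.versions.add(content);
        destination.versions.add(content);
    }

    /** A DELETE without a version ID adds a delete marker on the source only, unless opted in. */
    void delete() {
        source.versions.add(DELETE_MARKER);
        if (deleteMarkerReplication) {
            destination.versions.add(DELETE_MARKER);
        }
    }

    public static void main(String[] args) {
        ReplicationSketch byDefault = new ReplicationSketch(false);
        byDefault.put("v1");
        byDefault.delete();
        System.out.println("source deleted: " + byDefault.source.looksDeleted());           // true
        System.out.println("destination deleted: " + byDefault.destination.looksDeleted()); // false
    }
}
```

With the flag set to `true`, the same sequence leaves the key looking deleted in both buckets, which is the opt-in behavior.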
S3 Transfer Acceleration
Client → CloudFront Edge Location → AWS Backbone → S3 Bucket
- Speeds up uploads from distant clients
- Uses CloudFront edge network as an entry point
- Extra cost per GB transferred
- Separate endpoint:
bucket.s3-accelerate.amazonaws.com
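As a rough sketch of how the separate endpoint is formed, the helper below builds the accelerate hostname from a bucket name. `AccelerateEndpoint`, `hostFor`, and `urlFor` are hypothetical names for illustration. The code assumes the object key is already URL-safe, and it rejects bucket names containing periods because Transfer Acceleration does not support them.

```java
/** Illustrative helper for building Transfer Acceleration endpoints (names are hypothetical). */
final class AccelerateEndpoint {
    private static final String SUFFIX = ".s3-accelerate.amazonaws.com";

    /** Returns the accelerate hostname for a bucket. */
    static String hostFor(String bucket) {
        if (bucket == null || bucket.isEmpty()) {
            throw new IllegalArgumentException("bucket name is required");
        }
        if (bucket.contains(".")) {
            // Transfer Acceleration requires DNS-compliant bucket names without periods.
            throw new IllegalArgumentException("bucket name must not contain periods");
        }
        return bucket + SUFFIX;
    }

    /** Builds an HTTPS URL; assumes the key needs no further URL encoding. */
    static String urlFor(String bucket, String key) {
        return "https://" + hostFor(bucket) + "/" + key;
    }

    public static void main(String[] args) {
        // prints https://my-data-lake.s3-accelerate.amazonaws.com/orders/2024-01.csv
        System.out.println(urlFor("my-data-lake", "orders/2024-01.csv"));
    }
}
```

In practice an SDK client would normally be configured to use acceleration instead of building URLs by hand; the sketch only shows how the endpoint name relates to the bucket name.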
S3 Select & Glacier Select
Query data inside S3 objects without downloading the whole file:
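import software.amazon.awssdk.services.s3.model.CSVInput;
import software.amazon.awssdk.services.s3.model.CSVOutput;
import software.amazon.awssdk.services.s3.model.CompressionType;
import software.amazon.awssdk.services.s3.model.ExpressionType;
import software.amazon.awssdk.services.s3.model.FileHeaderInfo;
import software.amazon.awssdk.services.s3.model.InputSerialization;
import software.amazon.awssdk.services.s3.model.OutputSerialization;
import software.amazon.awssdk.services.s3.model.SelectObjectContentRequest;

// Ask S3 to return only the rows whose status column is FAILED.
SelectObjectContentRequest request = SelectObjectContentRequest.builder()
    .bucket("my-data-lake")
    .key("orders/2024-01.csv")
    .expressionType(ExpressionType.SQL)
    .expression("SELECT * FROM S3Object s WHERE s.status = 'FAILED'")
    .inputSerialization(InputSerialization.builder()
        // USE treats the first line as a header so columns can be referenced by name.
        .csv(CSVInput.builder().fileHeaderInfo(FileHeaderInfo.USE).build())
        .compressionType(CompressionType.NONE)
        .build())
    .outputSerialization(OutputSerialization.builder()
        .csv(CSVOutput.builder().build())
        .build())
    .build();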
Because the filter runs inside S3, only the matching rows are returned, which reduces data transfer and client-side processing cost.
Object Lambda
Transform S3 objects on the fly during a GET request:
Client GET request
↓
S3 Object Lambda Access Point
↓
Lambda function (transform: redact PII, resize image, convert format)
↓
Transformed response to client
Use cases: redact SSN/email from CSV, resize images, add watermarks.
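To make the redaction use case concrete, here is a minimal sketch of the kind of transform such a Lambda function might apply to CSV text. `PiiRedactor` and `redact` are hypothetical names, and the regular expressions are simplified assumptions (US-style SSNs, common email shapes) rather than a complete PII detector. The Object Lambda request and response plumbing is deliberately left out.

```java
import java.util.regex.Pattern;

/** Illustrative text transform for redacting SSNs and emails (patterns are simplified). */
final class PiiRedactor {
    // Matches SSNs written as 123-45-6789.
    private static final Pattern SSN = Pattern.compile("\\b\\d{3}-\\d{2}-\\d{4}\\b");
    // Matches common email shapes; not a full RFC 5322 validator.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    /** Replaces every SSN and email address in the text with a placeholder. */
    static String redact(String text) {
        String withoutSsn = SSN.matcher(text).replaceAll("[REDACTED_SSN]");
        return EMAIL.matcher(withoutSsn).replaceAll("[REDACTED_EMAIL]");
    }

    public static void main(String[] args) {
        // prints alice,[REDACTED_EMAIL],[REDACTED_SSN]
        System.out.println(redact("alice,alice@example.com,123-45-6789"));
    }
}
```

Keeping the transform as a pure function of the object body makes it easy to unit test separately from the Lambda handler.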
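MFA Delete
- Requires MFA to permanently delete an object version or to change the bucket's versioning state (for example, suspending versioning)
- Only the bucket owner (root account) can enable MFA Delete
- Can be enabled only through the AWS CLI, SDK, or REST API, not the console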
Lifecycle Rules
{
  "Rules": [{
    "ID": "ArchiveOldLogs",
    "Status": "Enabled",
    "Filter": { "Prefix": "logs/" },
    "Transitions": [
      { "Days": 30, "StorageClass": "STANDARD_IA" },
      { "Days": 90, "StorageClass": "GLACIER_IR" },
      { "Days": 365, "StorageClass": "DEEP_ARCHIVE" }
    ],
    "Expiration": { "Days": 2555 },
    "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 7 }
  }]
}
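One way to read the rule above is as a mapping from object age to storage class. The sketch below encodes that mapping under two assumptions: objects under logs/ are uploaded as STANDARD, and transitions take effect on exactly the configured day (in practice S3 applies lifecycle actions asynchronously, so there can be a delay). `LifecycleSketch` and `storageClassAt` are hypothetical names, and "EXPIRED" is a label used only in this example.

```java
/** Illustrative mapping of object age to storage class for the ArchiveOldLogs rule. */
final class LifecycleSketch {
    /** Returns the storage class expected at the given age in days, or "EXPIRED". */
    static String storageClassAt(int ageDays) {
        if (ageDays < 0) {
            throw new IllegalArgumentException("age must be non-negative");
        }
        if (ageDays >= 2555) return "EXPIRED";      // Expiration: roughly 7 years
        if (ageDays >= 365) return "DEEP_ARCHIVE";
        if (ageDays >= 90) return "GLACIER_IR";
        if (ageDays >= 30) return "STANDARD_IA";
        return "STANDARD";                          // assumed upload storage class
    }

    public static void main(String[] args) {
        int[] ages = {0, 30, 90, 365, 2555};
        for (int age : ages) {
            System.out.println(age + " days: " + storageClassAt(age));
        }
    }
}
```

The AbortIncompleteMultipartUpload action is independent of object age and is not modeled here; it cleans up multipart uploads that were started but never completed within 7 days.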
S3 Object Ownership & ACLs
- Bucket owner enforced (recommended): ACLs disabled, bucket owner owns all objects
- ACLs are legacy; use bucket policies and IAM policies instead
🧪 Practice Questions
Q1. A company replicates S3 objects from us-east-1 to eu-west-1 for compliance. A user deletes an object in us-east-1. Will the object be deleted in eu-west-1?
A) Yes, deletions are always replicated
B) No, delete markers are not replicated by default
C) Yes, if the bucket policy allows cross-region deletes
D) No, only new objects are replicated, not modifications
Answer & Explanation
B. By default, S3 replication does not replicate delete markers, which protects against accidental or malicious cross-region deletes. The delete in us-east-1 adds a delete marker there, and the object remains available in eu-west-1 unless Delete Marker Replication is explicitly enabled.
Q2. A developer has a 10 GB CSV file in S3. They only need rows where status = 'ERROR'. What is the MOST cost-effective approach?
A) Download the file and filter locally
B) Use Lambda to stream and filter the file
C) Use S3 Select with a SQL expression
D) Use Athena to query the file
Answer & Explanation
C. S3 Select runs the filter server-side and returns only the matching rows, so you pay for the data scanned and returned and avoid downloading the full file. Athena is better suited to complex analytics across many objects; S3 Select is simpler for single-object queries.