
Amazon S3

Key exam themes: Encryption types, presigned URLs, CORS, event notifications, storage classes, bucket policies.


🔰 What Is Amazon S3?

Amazon S3 (Simple Storage Service) is an object storage service offering virtually unlimited storage with 99.999999999% (11 nines) durability. Objects are stored in buckets and accessed via unique keys.

Analogy: S3 is like an infinite filing cabinet in the cloud. Each drawer is a bucket, each file inside is an object. You can organize files with "folders" (key prefixes), lock them with encryption, and set rules to automatically archive old files.

S3 vs Other Storage

| Feature | S3 (Object) | EBS (Block) | EFS (File) |
| --- | --- | --- | --- |
| Access | HTTP/HTTPS API | Attached to EC2 | NFS mount |
| Scalability | Unlimited | Fixed size (up to 64 TB) | Auto-scaling |
| Sharing | Any number of clients | Single EC2 (or multi-attach io2) | Multiple EC2 |
| Use case | Static files, backups, data lake | Databases, OS volumes | Shared file system |
| Durability | 11 nines | Volume-level | 11 nines |

Storage Classes

| Class | Use Case | Min Storage Duration | Retrieval | Availability |
| --- | --- | --- | --- | --- |
| Standard | Frequently accessed | None | Instant | 99.99% |
| Standard-IA | Infrequent, fast retrieval needed | 30 days | Instant | 99.9% |
| One Zone-IA | Infrequent, recreatable data | 30 days | Instant | 99.5% |
| Glacier Instant | Archive, quarterly access | 90 days | Instant | 99.9% |
| Glacier Flexible | Archive, hours acceptable | 90 days | Minutes to 12 hours (depends on retrieval option) | 99.99% |
| Glacier Deep Archive | Long-term (7-10 yr) | 180 days | 12-48 hours | 99.99% |
| Intelligent-Tiering | Unknown access patterns | None | Instant | 99.9% |

Glacier Flexible Retrieval Options

| Option | Speed | Cost |
| --- | --- | --- |
| Expedited | 1-5 minutes | Highest |
| Standard | 3-5 hours | Medium |
| Bulk | 5-12 hours | Lowest |

Exam: Storage Class Selection
  • "Rarely accessed but needs millisecond retrieval" → Glacier Instant Retrieval
  • "Unknown access pattern" → Intelligent-Tiering (auto-moves between tiers)
  • "Can lose one AZ, infrequent access" → One Zone-IA (cheapest IA)
  • "Legal compliance, 7+ year retention" → Glacier Deep Archive
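
The selection rules above can be written down as a small decision function. This is only a sketch of the exam heuristics, not an AWS API: the class `StorageClassPicker`, its enums, and the four input flags are hypothetical names chosen for illustration, and real decisions also depend on pricing and object size.

```java
// Hypothetical helper that encodes the exam heuristics above; not an AWS API.
class StorageClassPicker {
    enum Access { FREQUENT, INFREQUENT, ARCHIVE, UNKNOWN }

    enum StorageClass {
        STANDARD, STANDARD_IA, ONE_ZONE_IA, GLACIER_INSTANT_RETRIEVAL,
        GLACIER_FLEXIBLE_RETRIEVAL, GLACIER_DEEP_ARCHIVE, INTELLIGENT_TIERING
    }

    // Maps a described workload to the storage class the exam rules point at.
    static StorageClass pick(Access access,
                             boolean needsMillisecondRetrieval,
                             boolean canLoseOneAz,
                             boolean retainSevenPlusYears) {
        switch (access) {
            case UNKNOWN:
                // Unknown access pattern: let S3 move objects between tiers.
                return StorageClass.INTELLIGENT_TIERING;
            case FREQUENT:
                return StorageClass.STANDARD;
            case INFREQUENT:
                // Recreatable data that tolerates losing an AZ can use One Zone-IA.
                return canLoseOneAz ? StorageClass.ONE_ZONE_IA : StorageClass.STANDARD_IA;
            default: // ARCHIVE
                if (needsMillisecondRetrieval) {
                    return StorageClass.GLACIER_INSTANT_RETRIEVAL;
                }
                return retainSevenPlusYears
                        ? StorageClass.GLACIER_DEEP_ARCHIVE
                        : StorageClass.GLACIER_FLEXIBLE_RETRIEVAL;
        }
    }

    public static void main(String[] args) {
        // "Rarely accessed but needs millisecond retrieval"
        System.out.println(pick(Access.ARCHIVE, true, false, false));
        // "Legal compliance, 7+ year retention"
        System.out.println(pick(Access.ARCHIVE, false, false, true));
    }
}
```

Reading an exam question as "which branch am I in?" (unknown, frequent, infrequent, archive) usually narrows the answer to one or two classes before the remaining flags decide.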

Versioning

  • Enabled per bucket: each object version gets a VersionId
  • DELETE without specifying version → adds a delete marker (old versions preserved)
  • DELETE with VersionId → permanently deletes that specific version
  • Once enabled, versioning can be suspended but never fully disabled
  • MFA Delete: requires MFA to permanently delete a version or to suspend versioning

Versioning Behavior

PUT object.txt (v1) → { VersionId: "abc", Content: "Hello" }
PUT object.txt (v2) → { VersionId: "def", Content: "World" }
DELETE object.txt → { VersionId: "ghi", DeleteMarker: true }

GET object.txt → 404 (latest is delete marker)
GET object.txt?versionId=abc → "Hello" (still exists!)
DELETE object.txt?versionId=ghi → Removes delete marker, v2 is latest again
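
The sequence above can be replayed with a tiny in-memory model. This is a sketch of the semantics only and does not use the AWS SDK: `VersionedBucketSketch` and its methods are hypothetical names, version IDs are simple counters, and a `null` return stands in for the 404 response.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory model of a versioned bucket; illustrative only, not the AWS SDK.
class VersionedBucketSketch {
    // One entry in a key's version stack; content == null marks a delete marker.
    static final class Version {
        final String versionId;
        final String content;

        Version(String versionId, String content) {
            this.versionId = versionId;
            this.content = content;
        }

        boolean isDeleteMarker() {
            return content == null;
        }
    }

    private final Map<String, List<Version>> versions = new HashMap<>();
    private int counter = 0;

    private String nextId() {
        return "v" + (++counter);
    }

    // PUT: pushes a new version and returns its ID (content is assumed non-null).
    String put(String key, String content) {
        String id = nextId();
        versions.computeIfAbsent(key, k -> new ArrayList<>()).add(new Version(id, content));
        return id;
    }

    // DELETE without a version ID: pushes a delete marker, older versions stay.
    String delete(String key) {
        String id = nextId();
        versions.computeIfAbsent(key, k -> new ArrayList<>()).add(new Version(id, null));
        return id;
    }

    // DELETE with a version ID: permanently removes that version or marker.
    boolean deleteVersion(String key, String versionId) {
        List<Version> stack = versions.get(key);
        return stack != null && stack.removeIf(v -> v.versionId.equals(versionId));
    }

    // GET without a version ID: null models the 404 when the latest entry is a marker.
    String get(String key) {
        List<Version> stack = versions.get(key);
        if (stack == null || stack.isEmpty()) {
            return null;
        }
        Version latest = stack.get(stack.size() - 1);
        return latest.isDeleteMarker() ? null : latest.content;
    }

    // GET with a version ID: returns that version's content, or null if absent or a marker.
    String getVersion(String key, String versionId) {
        List<Version> stack = versions.get(key);
        if (stack == null) {
            return null;
        }
        for (Version v : stack) {
            if (v.versionId.equals(versionId)) {
                return v.content;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        VersionedBucketSketch bucket = new VersionedBucketSketch();
        String v1 = bucket.put("object.txt", "Hello");
        bucket.put("object.txt", "World");
        String marker = bucket.delete("object.txt");

        System.out.println(bucket.get("object.txt"));            // null (latest is a delete marker)
        System.out.println(bucket.getVersion("object.txt", v1)); // Hello
        bucket.deleteVersion("object.txt", marker);
        System.out.println(bucket.get("object.txt"));            // World
    }
}
```

The key point the model makes visible is that a plain DELETE only adds to the stack, while a DELETE with a VersionId is the only operation that removes anything.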

Encryption

| Type | Key Management | Who Manages? | Audit Trail |
| --- | --- | --- | --- |
| SSE-S3 | AWS-managed (AES-256) | AWS | ❌ No CloudTrail |
| SSE-KMS | KMS key (CMK or AWS-managed) | You + KMS | ✅ CloudTrail |
| SSE-C | Customer-provided key | You send key per request | ❌ (your responsibility) |
| Client-Side | Key never leaves client | You | ❌ (your responsibility) |

SSE-KMS Considerations

PUT request → S3 → KMS:GenerateDataKey → encrypted object stored
GET request → S3 → KMS:Decrypt → decrypted object returned

Each request counts toward KMS API quotas!
- The shared request quota is 5,500 requests/sec by default; some Regions have higher defaults (10,000/sec or more)
- High-throughput buckets with SSE-KMS may need to request a quota increase

Encryption Exam Rules
  • SSE-KMS → audit trail in CloudTrail + KMS quota limit
  • SSE-C → you send key with every request (HTTPS required!)
  • SSE-S3 → default encryption, no extra cost, no audit
  • Bucket keys (with SSE-KMS) → reduce KMS request traffic and cost by up to 99%

Force Encryption via Bucket Policy

{
  "Effect": "Deny",
  "Principal": "*",
  "Action": "s3:PutObject",
  "Resource": "arn:aws:s3:::my-bucket/*",
  "Condition": {
    "StringNotEquals": {
      "s3:x-amz-server-side-encryption": "aws:kms"
    }
  }
}

Default Encryption

{
  "ServerSideEncryptionConfiguration": {
    "Rules": [{
      "ApplyServerSideEncryptionByDefault": {
        "SSEAlgorithm": "aws:kms",
        "KMSMasterKeyID": "arn:aws:kms:us-east-1:123:key/abc-def"
      },
      "BucketKeyEnabled": true
    }]
  }
}

Presigned URLs

Generate time-limited URLs that grant temporary access to private objects:

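import java.net.URL;
import java.time.Duration;
import java.util.UUID;
import software.amazon.awssdk.services.s3.presigner.S3Presigner;
import software.amazon.awssdk.services.s3.presigner.model.PresignedGetObjectRequest;
import software.amazon.awssdk.services.s3.presigner.model.PresignedPutObjectRequest;

// Generate presigned GET URL (download), valid for 1 hour
S3Presigner presigner = S3Presigner.create();
PresignedGetObjectRequest presigned = presigner.presignGetObject(b -> b
    .signatureDuration(Duration.ofHours(1))
    .getObjectRequest(r -> r.bucket("my-bucket").key("reports/2024-Q4.pdf")));
URL downloadUrl = presigned.url();

// Generate presigned PUT URL (upload directly from client browser), valid for 15 minutes
// The client must send the same Content-Type header that was signed here
PresignedPutObjectRequest presignedPut = presigner.presignPutObject(b -> b
    .signatureDuration(Duration.ofMinutes(15))
    .putObjectRequest(r -> r
        .bucket("my-bucket")
        .key("uploads/" + UUID.randomUUID() + ".jpg")
        .contentType("image/jpeg")));
URL uploadUrl = presignedPut.url();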

| Property | Details |
| --- | --- |
| Permissions | Inherits permissions of the signer (IAM user/role) |
| Expiry | Configurable; max 7 days when signed with IAM user credentials (SigV4). With temporary credentials (STS, roles), the URL stops working when those credentials expire |
| Use case | Client-side uploads/downloads without exposing AWS credentials |
| Security | If the signer's permissions are revoked, the URL stops working immediately |

Exam: Direct Upload Pattern

"Allow browser to upload directly to S3 without going through your server" → Generate a presigned PUT URL on your backend, return it to the client.


Event Notifications

| Destination | Use Case | Setup Complexity |
| --- | --- | --- |
| SNS | Fan-out to multiple subscribers | Low |
| SQS | Queue for async processing | Low |
| Lambda | Direct serverless processing | Low |
| EventBridge | Complex routing, filtering, replay | Medium |

Event Types

s3:ObjectCreated:* - PUT, POST, COPY, CompleteMultipartUpload
s3:ObjectRemoved:* - DELETE, DeleteMarkerCreated
s3:ObjectRestore:* - Glacier restore initiated/completed
s3:Replication:* - Replication success/failure
s3:LifecycleExpiration:* - Object expired by lifecycle
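
The trailing `*` in these event types acts as a wildcard over the specific events in that family. A minimal sketch of that matching rule follows; `S3EventFilterSketch` and `matches` are hypothetical names, and the code models how a configured event type relates to a concrete event name rather than reproducing S3's internal implementation.

```java
// Illustrative model of event-type wildcard matching; not part of the AWS SDK.
class S3EventFilterSketch {
    // Returns true when a configured event type (optionally ending in "*")
    // covers the concrete event name.
    static boolean matches(String configured, String actual) {
        if (configured.endsWith("*")) {
            String prefix = configured.substring(0, configured.length() - 1);
            return actual.startsWith(prefix);
        }
        return configured.equals(actual);
    }

    public static void main(String[] args) {
        System.out.println(matches("s3:ObjectCreated:*", "s3:ObjectCreated:Put"));                // true
        System.out.println(matches("s3:ObjectCreated:*", "s3:ObjectRemoved:Delete"));             // false
        System.out.println(matches("s3:ObjectRemoved:DeleteMarkerCreated",
                                   "s3:ObjectRemoved:DeleteMarkerCreated"));                      // true
    }
}
```

Subscribing to `s3:ObjectCreated:*` therefore also catches multipart completions and copies, which matters when a consumer must not process the same logical upload twice.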

EventBridge Integration

{ "EventBridgeConfiguration": {} }

EventBridge provides filtering, multiple targets, archive & replay, and a schema registry, which makes it considerably more flexible than native S3 notifications.


CORS (Cross-Origin Resource Sharing)

When a browser at domain-a.com requests resources from S3 at domain-b.com:

<CORSConfiguration>
  <CORSRule>
    <AllowedOrigin>https://myapp.example.com</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <AllowedMethod>PUT</AllowedMethod>
    <AllowedHeader>*</AllowedHeader>
    <MaxAgeSeconds>3000</MaxAgeSeconds>
    <ExposeHeader>x-amz-request-id</ExposeHeader>
  </CORSRule>
</CORSConfiguration>

CORS Misconception

CORS is NOT a security control; it only tells browsers whether to allow cross-origin responses. Direct API calls (curl, SDK) bypass CORS entirely. Use bucket policies and IAM for actual access control.


Multipart Upload

| Property | Value |
| --- | --- |
| Recommended | Objects >100 MB |
| Required | Objects >5 GB |
| Max parts | 10,000 |
| Part size | 5 MB – 5 GB |
| Parallelism | Parts uploaded in parallel |

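import java.nio.file.Paths;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;
import software.amazon.awssdk.transfer.s3.S3TransferManager;
import software.amazon.awssdk.transfer.s3.model.FileUpload;
import software.amazon.awssdk.transfer.s3.model.UploadFileRequest;

// SDK v2 handles multipart automatically with TransferManager
S3TransferManager transferManager = S3TransferManager.create();
FileUpload upload = transferManager.uploadFile(UploadFileRequest.builder()
    .putObjectRequest(PutObjectRequest.builder()
        .bucket("my-bucket")
        .key("large-file.zip")
        .build())
    .source(Paths.get("/path/to/large-file.zip"))
    .build());
// Block until the upload (all parts) has finished
upload.completionFuture().join();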

Best Practice

Always create a lifecycle rule to abort incomplete multipart uploads after N days: orphaned parts incur storage costs!
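
To see how the part-size and part-count limits interact, the arithmetic can be sketched as below. This is an illustrative calculation, not what the SDK does internally: `MultipartPlanSketch` and its methods are hypothetical names, sizes are treated as binary units (MiB/GiB), and the policy of picking the smallest legal part size is an assumption made for the example.

```java
// Illustrative part-size arithmetic for multipart uploads; not the SDK's own logic.
class MultipartPlanSketch {
    static final long MIB = 1024L * 1024L;
    static final long MIN_PART_SIZE = 5 * MIB;        // minimum part size (last part may be smaller)
    static final long MAX_PART_SIZE = 5 * 1024 * MIB; // maximum part size
    static final int MAX_PARTS = 10_000;              // maximum number of parts per upload

    // Smallest part size (in bytes) that keeps the upload within MAX_PARTS.
    static long partSize(long objectSize) {
        if (objectSize <= 0) {
            throw new IllegalArgumentException("objectSize must be positive");
        }
        long neededForPartLimit = (objectSize + MAX_PARTS - 1) / MAX_PARTS; // ceiling division
        long size = Math.max(MIN_PART_SIZE, neededForPartLimit);
        if (size > MAX_PART_SIZE) {
            throw new IllegalArgumentException("object does not fit in 10,000 parts");
        }
        return size;
    }

    // Number of parts that result from the chosen part size.
    static long partCount(long objectSize) {
        long size = partSize(objectSize);
        return (objectSize + size - 1) / size; // ceiling division
    }

    public static void main(String[] args) {
        long hundredMib = 100 * MIB;
        System.out.println(partSize(hundredMib) / MIB); // 5
        System.out.println(partCount(hundredMib));      // 20

        long oneTib = 1024L * 1024L * MIB;
        System.out.println(partCount(oneTib) <= MAX_PARTS); // true
    }
}
```

For small objects the 5 MB minimum decides the part size; for very large objects the 10,000-part cap forces larger parts.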


S3 Access Points

Simplify bucket policies for large teams:

Bucket "data-lake"
├── Access Point "finance-ap" → /finance/* (finance team only)
├── Access Point "analytics-ap" → /analytics/* (data scientists)
└── Access Point "public-ap" → /public/* (read-only, anyone)

  • Each access point has its own DNS name and its own access point policy
  • Can restrict to a specific VPC (VPC-only access point)
  • Simplifies managing complex bucket policies with many principals

Bucket Policies vs IAM Policies

| Feature | Bucket Policy | IAM Policy |
| --- | --- | --- |
| Attached to | S3 bucket | IAM user/role/group |
| Scope | Cross-account, anonymous | Same account only |
| Use case | Public access, cross-account | User-level permissions |
| Deny | Can explicitly deny any principal | Applies to attached principal |

Common Bucket Policy Patterns

// Force HTTPS only
{
  "Effect": "Deny",
  "Principal": "*",
  "Action": "s3:*",
  "Resource": ["arn:aws:s3:::bucket/*", "arn:aws:s3:::bucket"],
  "Condition": { "Bool": { "aws:SecureTransport": "false" } }
}

// Cross-account access
{
  "Effect": "Allow",
  "Principal": { "AWS": "arn:aws:iam::987654321098:root" },
  "Action": ["s3:GetObject", "s3:PutObject"],
  "Resource": "arn:aws:s3:::my-bucket/*"
}

🏆 Best Practices

Security

  1. Block all public access by default; enable it only when explicitly needed
  2. Use SSE-KMS with bucket keys for audit + cost optimization
  3. Enable versioning + MFA Delete for critical data
  4. Force HTTPS via bucket policy condition

Cost

  1. Use Lifecycle Rules to transition to cheaper tiers automatically
  2. Abort incomplete multipart uploads via lifecycle rule
  3. Use Intelligent-Tiering when access patterns are unknown
  4. S3 Select: filter data server-side instead of downloading entire objects

Performance

  1. Multipart upload for files >100 MB: parts upload in parallel
  2. Transfer Acceleration for distant clients (uses CloudFront edge locations)
  3. Prefix partitioning: distribute objects across prefixes for high request rates
  4. S3 supports at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix, so use multiple prefixes to scale beyond that
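
As a rough worked example of the per-prefix limits, the number of prefixes needed for a target request rate can be estimated with ceiling division. This is a back-of-the-envelope sketch: `PrefixPlanSketch` and `prefixesNeeded` are hypothetical names, and it assumes load is spread evenly across prefixes and that the write and read limits apply independently.

```java
// Back-of-the-envelope prefix estimate; assumes evenly distributed load.
class PrefixPlanSketch {
    static final int WRITES_PER_PREFIX = 3_500; // PUT-type requests per second per prefix
    static final int READS_PER_PREFIX = 5_500;  // GET-type requests per second per prefix

    // Minimum number of prefixes so neither the write nor the read rate exceeds the limit.
    static int prefixesNeeded(int writesPerSecond, int readsPerSecond) {
        int forWrites = (writesPerSecond + WRITES_PER_PREFIX - 1) / WRITES_PER_PREFIX;
        int forReads = (readsPerSecond + READS_PER_PREFIX - 1) / READS_PER_PREFIX;
        return Math.max(1, Math.max(forWrites, forReads));
    }

    public static void main(String[] args) {
        // 10,000 writes/s needs 3 prefixes, 20,000 reads/s needs 4: the larger wins.
        System.out.println(prefixesNeeded(10_000, 20_000)); // 4
    }
}
```

The estimate is a lower bound; skewed key distributions can still create a hot prefix even when the total count looks sufficient.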

🎯 DVA-C02 Exam Tips

S3 Exam Cheat Sheet
  1. SSE-KMS = CloudTrail audit trail but KMS quota limits
  2. SSE-C = you provide key with every request, HTTPS mandatory
  3. Presigned URL = temporary access inheriting signer's permissions
  4. CORS = browser-only, not a security mechanism
  5. Versioning DELETE = adds delete marker (object not actually deleted)
  6. Multipart = required >5 GB, recommended >100 MB
  7. S3 Event → EventBridge gives more filtering than native notifications
  8. Lifecycle rule = auto-transition storage class + abort multipart
  9. Bucket keys with SSE-KMS reduce KMS requests by up to 99%
  10. S3 Access Points simplify complex multi-team bucket policies

🧪 Practice Questions

Q1. Allow client browser to directly upload to S3 without going through your server. Best approach?

A) API Gateway proxy to S3
B) Presigned PUT URL
C) Make bucket public
D) Transfer Acceleration

✅ Answer & Explanation

B: A presigned PUT URL lets the client upload directly with time-limited, credential-free access. No server in the upload path.


Q2. All objects must use customer-managed KMS key with audit trail. Which encryption?

A) SSE-S3
B) SSE-KMS
C) SSE-C
D) Client-Side

✅ Answer & Explanation

B: SSE-KMS uses a CMK in KMS, and every encrypt/decrypt is logged in CloudTrail.


Q3. Versioning enabled. User deletes a file. What happens?

A) Permanently deleted
B) All versions deleted
C) Delete marker added; previous versions preserved
D) Moved to Glacier

✅ Answer & Explanation

C: DELETE without VersionId adds a delete marker. All previous versions remain intact.


Q4. High-throughput application using SSE-KMS encryption starts getting ThrottlingException. What should you do?

A) Switch to SSE-S3
B) Request KMS quota increase
C) Enable S3 Bucket Keys
D) Both B and C

✅ Answer & Explanation

D: S3 Bucket Keys reduce KMS API calls by up to 99% (S3 uses a bucket-level key to derive per-object keys). Also request a KMS quota increase if needed.


Q5. A React app on app.example.com fetches images from S3. Requests fail with CORS error. What to configure?

A) IAM policy on the React app
B) S3 bucket policy allowing the domain
C) S3 CORS configuration allowing app.example.com
D) CloudFront distribution

✅ Answer & Explanation

C: CORS is a browser mechanism. Configure S3 CORS rules to allow the origin domain. Bucket policies control access, not CORS headers.

