# S3 Batch Operations

## 📦 What is S3 Batch Operations?

Amazon S3 Batch Operations lets you perform bulk actions on large numbers of S3 objects (from thousands to billions) with a single job.

Instead of writing a loop or running manual scripts, Batch Operations lets you:
- Copy objects
- Delete objects (by invoking a Lambda function per object; there is no built-in delete operation)
- Restore archived objects
- Modify tags
- Modify ACLs
- Invoke a Lambda function on each object
## 🎯 Real-World Use Cases

| Use Case | Operation Type |
|---|---|
| Migrate objects to a new bucket | Copy |
| Remove outdated data | Invoke Lambda (per-object delete) |
| Restore objects from Glacier | Restore |
| Tag archived objects for compliance | Set Object Tags |
| Apply new ACLs or permissions | Set Object ACLs |
| Run virus scans or compression using Lambda | Invoke Lambda |
| Trigger replication for existing objects | Batch Replication |
## 🔧 How It Works

1. Create a manifest file → a CSV list or S3 Inventory report of the objects you want to operate on.
2. Define the job settings → which operation to run (e.g., copy, tag, invoke Lambda).
3. Choose an IAM role → used to execute the actions on your behalf.
4. Submit the job → AWS runs it asynchronously.
5. Monitor status → track progress and logs via the AWS Console or CloudWatch.
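The steps above map directly onto the `create-job` API. Below is a minimal Python sketch (all account IDs, bucket names, and ARNs are hypothetical placeholders) that assembles the keyword arguments you would pass to boto3's `s3control.create_job`:

```python
def build_job_request(account_id, manifest_arn, manifest_etag, role_arn, operation):
    """Assemble the create_job arguments for an S3 Batch Operations job.

    All ARNs and bucket names used here are hypothetical placeholders.
    """
    return {
        "AccountId": account_id,
        "ConfirmationRequired": False,        # start without manual confirmation
        "Operation": operation,               # e.g. {"S3PutObjectTagging": {...}}
        "Manifest": {
            "Spec": {
                "Format": "S3BatchOperations_CSV_20180820",
                "Fields": ["Bucket", "Key"],  # columns of the CSV manifest
            },
            "Location": {"ObjectArn": manifest_arn, "ETag": manifest_etag},
        },
        "Report": {
            "Bucket": "arn:aws:s3:::my-batch-reports",  # hypothetical report bucket
            "Format": "Report_CSV_20180820",
            "Enabled": True,
            "Prefix": "batch-reports/",
            "ReportScope": "AllTasks",
        },
        "Priority": 10,
        "RoleArn": role_arn,
    }

request = build_job_request(
    account_id="123456789012",
    manifest_arn="arn:aws:s3:::my-batch-manifests/manifests/tag-list.csv",
    manifest_etag="example-etag",
    role_arn="arn:aws:iam::123456789012:role/S3BatchOpsExecutionRole",
    operation={"S3PutObjectTagging": {"TagSet": [{"Key": "archived", "Value": "true"}]}},
)
# With AWS credentials, boto3.client("s3control").create_job(**request) submits the job.
```

Keeping the request construction in a plain function makes it easy to validate job settings before anything touches AWS.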
## 📄 Sample Manifest File (CSV Format)

- Stored in an S3 bucket
- Must be in the same region as the job
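Each row of a CSV manifest is `bucket,key` (optionally followed by a version ID), with no header row. A minimal example with hypothetical bucket and key names:

```csv
my-source-bucket,photos/2021/image001.jpg
my-source-bucket,photos/2021/image002.jpg
my-source-bucket,logs/2020/app.log
```

Note that AWS expects object keys in the manifest to be URL-encoded.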
## ✅ Supported Operations

| Operation | Description |
|---|---|
| Copy | Copy objects to a new bucket/prefix |
| Delete | Not offered natively; use Invoke Lambda or a lifecycle expiration rule |
| Set Object Tags | Add/update tags |
| Set Object ACLs | Change object-level permissions |
| Restore | Restore Glacier/Deep Archive objects |
| Invoke Lambda | Custom logic per object (e.g., scan, resize, delete) |
| Batch Replication | Replicate pre-existing objects |
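Because per-object deletes go through the Invoke Lambda operation, the Lambda function receives a batch of tasks (one per object) and must return a result code for each. Below is a sketch of such a handler, assuming the documented Batch Operations invocation schema; the delete call is injectable so the logic can be exercised without AWS credentials:

```python
def handler(event, context, delete_object=None):
    """S3 Batch Operations Lambda handler that deletes each object in the event.

    `delete_object` is injectable for local testing; inside Lambda it would
    default to boto3's s3.delete_object.
    """
    if delete_object is None:
        import boto3  # only needed when running inside Lambda
        s3 = boto3.client("s3")
        delete_object = lambda Bucket, Key: s3.delete_object(Bucket=Bucket, Key=Key)

    results = []
    for task in event["tasks"]:
        bucket = task["s3BucketArn"].split(":::")[-1]  # arn:aws:s3:::bucket -> bucket
        try:
            delete_object(Bucket=bucket, Key=task["s3Key"])
            code, msg = "Succeeded", "deleted"
        except Exception as exc:  # report failures so the job can record/retry them
            code, msg = "TemporaryFailure", str(exc)
        results.append({"taskId": task["taskId"], "resultCode": code, "resultString": msg})

    return {
        "invocationSchemaVersion": "1.0",
        "treatMissingKeysAs": "PermanentFailure",
        "invocationId": event["invocationId"],
        "results": results,
    }

# Local dry run with a stub in place of a real S3 client:
deleted = []
event = {
    "invocationId": "example",
    "tasks": [{"taskId": "t1", "s3BucketArn": "arn:aws:s3:::my-bucket", "s3Key": "old/data.csv"}],
}
out = handler(event, None, delete_object=lambda Bucket, Key: deleted.append((Bucket, Key)))
print(out["results"][0]["resultCode"], deleted)
# → Succeeded [('my-bucket', 'old/data.csv')]
```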
## 🔐 IAM Role Requirements

The execution role must include permissions for the objects the job touches, for example:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::my-bucket/*"
    }
  ]
}
```

Also needs:

- `s3:GetObject` on the manifest bucket (to read the object list)
- `s3:PutObject` on the report bucket (to write completion reports)
- A trust policy allowing the S3 Batch Operations service to assume the role
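The trust relationship grants the `batchoperations.s3.amazonaws.com` service principal permission to assume the role. A minimal sketch:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "batchoperations.s3.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
```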
## 🛠️ Example: Batch Job to Delete Objects (AWS CLI)

The Terraform AWS provider does not ship a resource for S3 Batch Operations jobs, so jobs are typically created with the AWS CLI or an SDK. The example below (hypothetical account ID, bucket names, and Lambda ARN) invokes a delete Lambda on every object in the manifest:

```shell
# All account IDs, bucket names, and ARNs below are hypothetical placeholders.
aws s3control create-job \
  --account-id 123456789012 \
  --operation '{"LambdaInvoke":{"FunctionArn":"arn:aws:lambda:us-east-1:123456789012:function:DeleteObject"}}' \
  --manifest '{"Spec":{"Format":"S3BatchOperations_CSV_20180820","Fields":["Bucket","Key"]},"Location":{"ObjectArn":"arn:aws:s3:::my-batch-manifests/manifests/delete-list.csv","ETag":"<manifest-etag>"}}' \
  --report '{"Bucket":"arn:aws:s3:::my-batch-reports","Format":"Report_CSV_20180820","Enabled":true,"Prefix":"batch-reports/","ReportScope":"AllTasks"}' \
  --priority 10 \
  --role-arn arn:aws:iam::123456789012:role/S3BatchOpsExecutionRole \
  --no-confirmation-required
```

The manifest ETag comes from running `aws s3api head-object` against the manifest file, and `--no-confirmation-required` starts the job without manual activation.
## 🧾 Pricing

| Item | Cost |
|---|---|
| Batch job fee | $0.25 per job + $1.00 per million object operations |
| Lambda invocation | Standard Lambda pricing applies |
| S3 PUT/GET/DELETE requests | Billed at the usual request rates |
| Glacier restore | Retrieval charges apply on top of the batch fees |
## 📊 Monitoring & Reporting

| Tool | Purpose |
|---|---|
| CloudWatch | View job status, failures, duration |
| Completion reports | CSV of success/failure per object |
| AWS Console | Visualize job progress |
## ⚠️ Limitations

| Limitation | Detail |
|---|---|
| Manifest must be in S3 | Cannot be uploaded from a local machine |
| Region-locked | The job must run in the same region as the manifest bucket |
| Not real-time | Jobs run asynchronously |
| No native delete | Per-object deletes require Invoke Lambda or lifecycle rules |
| Scale | Jobs can span billions of objects, but very large manifests take time to process |
## ✅ TL;DR Summary

| Feature | Description |
|---|---|
| What is it? | Bulk processing of S3 objects via a job |
| Key operations | Copy, Restore, Tag, ACLs, Invoke Lambda, Replicate |
| Manifest file | CSV list or S3 Inventory report of objects |
| Job control | An IAM role executes the batch actions |
| Use cases | Migration, cleanup, tagging, replication |
| Cost | $0.25 per job + $1.00 per million operations, plus usage fees |
## 🔁 Example Use Case: Replicate Existing Objects

Want to replicate objects uploaded before your replication rules were added? Use:

- S3 Inventory → to generate the object list
- S3 Batch Operations → with the Replicate operation
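If you have a plain object list rather than an S3 Inventory report, a small helper (hypothetical names) can turn it into CSV manifest rows, URL-encoding the keys as the manifest format expects:

```python
from urllib.parse import quote

def manifest_rows(bucket, keys):
    """Build S3 Batch Operations CSV manifest rows: one 'bucket,key' line per object."""
    # quote() URL-encodes the key but leaves '/' path separators intact.
    return [f"{bucket},{quote(key)}" for key in keys]

rows = manifest_rows("my-source-bucket", ["photos/cat 1.jpg", "logs/app.log"])
print("\n".join(rows))
# my-source-bucket,photos/cat%201.jpg
# my-source-bucket,logs/app.log
```

Upload the resulting file to a manifest bucket in the job's region, then reference it from the job's manifest location.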