Amazon S3 Object Lambda – Overview
Amazon S3 Object Lambda allows twtech to transform, filter, or modify data as it is retrieved from S3 without changing the underlying object.
This is achieved by integrating AWS Lambda with S3 Access Points.
What Does It Do
With S3 Object Lambda, twtech can:
· Redact or mask sensitive data.
· Convert formats (e.g., XML to JSON).
· Add custom headers or metadata.
· Filter rows/columns from CSV or JSON files.
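As an illustration of the last item, column filtering on a CSV payload can be sketched in plain Python. This is a minimal sketch, not part of the original walkthrough: the function name `filter_csv_columns` and the sample data are made up for the example, and a real Object Lambda function would apply this to the object body it fetched from S3.

```python
import csv
import io


def filter_csv_columns(text, keep):
    """Return CSV text containing only the columns named in `keep`."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    # extrasaction="ignore" silently drops columns that are not in `keep`
    writer = csv.DictWriter(
        out, fieldnames=keep, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample data: drop the "ssn" column entirely
sample = "name,ssn,city\nAda,123-45-6789,London\n"
filtered = filter_csv_columns(sample, ["name", "city"])
```

Dropping a column outright is a stricter alternative to masking its values when the consumer never needs that field.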
How It Works
1. Create a Lambda function that processes S3 object requests.
2. Create an Object Lambda Access Point, associated with:
   o The S3 Access Point (used as source data).
   o The Lambda function.
3. Clients access objects via the Object Lambda Access Point hostname.
4. Lambda is invoked on each request for a configured action (for example GetObject or HeadObject). For GetObject, the function returns the transformed object by calling WriteGetObjectResponse.
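To make the flow above more concrete, here is a minimal sketch of how a function might pull the relevant fields out of the event it receives for a GetObject request. The field names (`getObjectContext`, `inputS3Url`, `outputRoute`, `outputToken`, `userRequest`) follow the Object Lambda event format; the helper name `parse_object_lambda_event` and all sample values are placeholders invented for the example.

```python
def parse_object_lambda_event(event):
    """Extract the fields a GetObject transformation typically needs."""
    ctx = event["getObjectContext"]
    return {
        # Presigned URL for fetching the original object
        "input_url": ctx["inputS3Url"],
        # Route and token identify where the transformed response must be sent
        "route": ctx["outputRoute"],
        "token": ctx["outputToken"],
        # URL the client originally requested
        "requested_url": event["userRequest"]["url"],
    }


# Placeholder event with made-up values, shaped like an Object Lambda event
sample_event = {
    "getObjectContext": {
        "inputS3Url": "https://example.invalid/presigned-url",
        "outputRoute": "example-route",
        "outputToken": "example-token",
    },
    "userRequest": {"url": "https://example.invalid/data.csv"},
}
parsed = parse_object_lambda_event(sample_event)
```

The function fetches the original bytes from `input_url`, transforms them, and passes `route` and `token` back when it writes the response.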
Sample Use Case: Redact SSNs from a CSV
1. Lambda function
Python example (redacting SSNs):
# python
import re
import urllib.request

import boto3

s3 = boto3.client('s3')
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def lambda_handler(event, context):
    ctx = event['getObjectContext']
    # Get the original object through the presigned URL that S3 Object Lambda supplies
    with urllib.request.urlopen(ctx['inputS3Url']) as response:
        data = response.read().decode('utf-8')
    # Redact SSNs (str.replace only matches literal text, so use re.sub)
    redacted_data = SSN_PATTERN.sub("***-**-****", data)
    # Return the transformed data to the caller via WriteGetObjectResponse
    s3.write_get_object_response(
        Body=redacted_data.encode('utf-8'),
        RequestRoute=ctx['outputRoute'],
        RequestToken=ctx['outputToken'],
    )
    return {'status_code': 200}
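The redaction step itself can be exercised locally, without S3 or Lambda. The sketch below assumes SSNs appear in the NNN-NN-NNNN form; `redact_ssns` and the sample rows are illustrative only. It also shows why a regular expression needs `re.sub`: `str.replace` treats its first argument as literal text, so a pattern passed to it never matches.

```python
import re

# Word boundaries keep longer digit runs (e.g. 1234-56-7890) from matching
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_ssns(text):
    """Replace every SSN-shaped token with a fixed mask."""
    return SSN_PATTERN.sub("***-**-****", text)


# Hypothetical sample rows
sample = "name,ssn\nAda,123-45-6789\nBob,987-65-4321\n"
redacted = redact_ssns(sample)

# str.replace looks for the literal pattern text, so nothing changes here
unchanged = sample.replace(r"\d{3}-\d{2}-\d{4}", "***-**-****")
```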
2. Create S3 Access Point
# bash
aws s3control create-access-point \
--account-id 12345678xxxx \
--name source-ap \
--bucket twtech-s3bucket
3. Create Object Lambda Access Point
# bash
aws s3control create-access-point-for-object-lambda \
--account-id 12345678xxxx \
--name twtech-object-lambda-ap \
--configuration file://lambda-config.json
lambda-config.json
# json
{
  "SupportingAccessPoint": "arn:aws:s3:us-east-2:12345678xxxx:accesspoint/source-ap",
  "TransformationConfigurations": [
    {
      "Actions": ["GetObject"],
      "ContentTransformation": {
        "AwsLambda": {
          "FunctionArn": "arn:aws:lambda:us-east-2:12345678xxxx:function:redact-ssn"
        }
      }
    }
  ]
}
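For readers who prefer the SDK over the CLI, steps 2 and 3 can be sketched with boto3's `s3control` client, using its `create_access_point` and `create_access_point_for_object_lambda` operations. This is a sketch under assumptions: the helper names `build_object_lambda_config` and `provision` are invented here, the client is passed in as a parameter, and the account ID is the same masked placeholder used above.

```python
import json


def build_object_lambda_config(supporting_ap_arn, function_arn, actions=("GetObject",)):
    """Build the same structure as lambda-config.json."""
    return {
        "SupportingAccessPoint": supporting_ap_arn,
        "TransformationConfigurations": [
            {
                "Actions": list(actions),
                "ContentTransformation": {
                    "AwsLambda": {"FunctionArn": function_arn}
                },
            }
        ],
    }


def provision(s3control, account_id, bucket, source_ap, olap_name, config):
    """Create the supporting access point, then the Object Lambda Access Point."""
    s3control.create_access_point(
        AccountId=account_id, Name=source_ap, Bucket=bucket
    )
    s3control.create_access_point_for_object_lambda(
        AccountId=account_id, Name=olap_name, Configuration=config
    )


config = build_object_lambda_config(
    "arn:aws:s3:us-east-2:12345678xxxx:accesspoint/source-ap",
    "arn:aws:lambda:us-east-2:12345678xxxx:function:redact-ssn",
)
config_json = json.dumps(config, indent=2)  # same content as lambda-config.json
```

With boto3 installed and credentials configured, passing `boto3.client("s3control", region_name="us-east-2")` as the first argument to `provision` would issue both calls.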
4. Access Object Through Object Lambda
Use this hostname format:
twtech-object-lambda-ap-12345678xxxx.s3-object-lambda.us-east-2.amazonaws.com
Perform an S3 GET using the SDK or a signed request to retrieve the transformed object.
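With the SDK, the usual approach is to pass the Object Lambda Access Point ARN where a bucket name would normally go. The sketch below assumes boto3; the helper names and the object key `customers.csv` are hypothetical, and the S3 client is passed in as a parameter.

```python
def object_lambda_arn(region, account_id, name):
    """Build an Object Lambda Access Point ARN."""
    return f"arn:aws:s3-object-lambda:{region}:{account_id}:accesspoint/{name}"


def get_transformed_object(s3_client, arn, key):
    """Fetch an object through the Object Lambda Access Point and decode it."""
    # The access point ARN is passed in place of the bucket name
    response = s3_client.get_object(Bucket=arn, Key=key)
    return response["Body"].read().decode("utf-8")


# Masked placeholder account ID, as used elsewhere in this post
arn = object_lambda_arn("us-east-2", "12345678xxxx", "twtech-object-lambda-ap")
```

For example, `get_transformed_object(boto3.client("s3"), arn, "customers.csv")` would return the redacted CSV text, assuming that key exists in the source bucket.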
Security Notes
· twtech must allow the caller to invoke the Lambda function (lambda:InvokeFunction), and the function's execution role must allow s3-object-lambda:WriteGetObjectResponse.
· The twtech Lambda function should be idempotent and handle errors gracefully.
· Set proper timeouts and memory, because the Lambda function's run time adds directly to object retrieval latency.
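One way to handle errors gracefully is to report a failed transformation back to the caller through WriteGetObjectResponse, which accepts `StatusCode`, `ErrorCode`, and `ErrorMessage` in place of a body. The following is a sketch under assumptions: `safe_transform`, the `fetch` and `transform` callables, and the error code `TransformFailed` are hypothetical names chosen for the example.

```python
def error_response_params(ctx, status_code, error_code, message):
    """Build WriteGetObjectResponse arguments for a failed request."""
    return {
        "RequestRoute": ctx["outputRoute"],
        "RequestToken": ctx["outputToken"],
        "StatusCode": status_code,
        "ErrorCode": error_code,
        "ErrorMessage": message,
    }


def safe_transform(event, fetch, transform, s3_client):
    """Fetch and transform the object; report failures instead of crashing."""
    ctx = event["getObjectContext"]
    try:
        body = transform(fetch(ctx["inputS3Url"]))
    except Exception as exc:
        # Tell the caller what went wrong rather than letting the request hang
        s3_client.write_get_object_response(
            **error_response_params(ctx, 500, "TransformFailed", str(exc))
        )
        return {"status_code": 500}
    s3_client.write_get_object_response(
        Body=body,
        RequestRoute=ctx["outputRoute"],
        RequestToken=ctx["outputToken"],
    )
    return {"status_code": 200}
```

Passing `fetch`, `transform`, and the client as parameters keeps the error path easy to test locally with stand-ins.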
twtech Ideal Use Cases
· Serving dynamically transformed content.
· Multi-tenant content filtering.
· Data redaction for compliance (PII, PHI).
· Generating virtual views over raw S3 data.