Amazon S3 Object Lambda – Overview
Scope:
- Intro,
- Architecture,
- What it does,
- How It Works,
- Sample Use Case: Redact SSNs from a CSV,
- Step-by-Step deployment (CLI),
- Access Object Through Object Lambda,
- Security,
- Use Cases.
Intro:
- Amazon S3 Object Lambda allows twtech to transform, filter, or modify data as it is retrieved from its S3 bucket without changing the underlying object.
- This is achieved by integrating AWS Lambda with S3 Access Points.
Architecture
What It Does
With S3 Object Lambda, twtech can:
- Redact or mask sensitive data.
- Convert formats (e.g., XML to JSON).
- Add custom headers or metadata.
- Filter rows/columns from CSV or JSON files.
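As a rough illustration of two of these transformations, the sketch below masks SSN-shaped values and keeps only selected CSV columns. The helper names `mask_ssns` and `select_columns` are hypothetical, and the pattern assumes SSNs are written as `NNN-NN-NNNN`. Neither helper is part of any AWS API; they only show the kind of logic a transforming function might contain.

```python
import csv
import io
import re

# Assumption: SSNs appear in the form 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_ssns(text: str) -> str:
    """Replace every SSN-shaped value with a fixed mask."""
    return SSN_PATTERN.sub("***-**-****", text)


def select_columns(csv_text: str, keep: list[str]) -> str:
    """Return the CSV with only the named columns, in the order given."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=keep, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # columns not listed in `keep` are dropped
    return out.getvalue()
```

Both helpers work on plain strings, so they can be tried locally before being wired into a Lambda function.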
How It Works
1. Create a Lambda function that processes S3 object requests.
2. Create an Object Lambda Access Point, associated with:
   - The S3 Access Point (used as source data).
   - The Lambda function.
3. Clients access objects via the Object Lambda Access Point hostname.
4. Lambda is invoked on each GET/HEAD request, and the function returns the transformed object.
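When the function is invoked for a GET request, the event carries a presigned URL for the original object plus a route and token that identify where the response must be sent. The sketch below pulls those fields out of the event. The `GetObjectRequest` class and `parse_event` helper are hypothetical names, and the key extraction assumes the object key is the path portion of the caller's URL.

```python
from dataclasses import dataclass
from urllib.parse import unquote, urlparse


@dataclass
class GetObjectRequest:
    input_s3_url: str  # presigned URL for the original object
    output_route: str  # passed back to S3 as RequestRoute
    output_token: str  # passed back to S3 as RequestToken
    key: str           # object key taken from the caller's URL


def parse_event(event: dict) -> GetObjectRequest:
    """Extract the fields a GetObject transformation typically needs."""
    ctx = event["getObjectContext"]
    path = urlparse(event["userRequest"]["url"]).path
    return GetObjectRequest(
        input_s3_url=ctx["inputS3Url"],
        output_route=ctx["outputRoute"],
        output_token=ctx["outputToken"],
        key=unquote(path.lstrip("/")),
    )
```

Keeping the parsing separate from the transformation makes it possible to test the function with a hand-built event before deploying it.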
Sample Use Case: Redact SSNs from a CSV
1. Lambda function
twtech Python Sample (redacting SSNs):
# python
import re
import urllib.request

import boto3

# SSNs written as NNN-NN-NNNN
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")

def lambda_handler(event, context):
    s3 = boto3.client('s3')
    ctx = event['getObjectContext']

    # Get original object through the presigned URL supplied in the event
    with urllib.request.urlopen(ctx['inputS3Url']) as response:
        data = response.read().decode('utf-8')

    # Redact SSNs (str.replace matches literally, so a regex is needed)
    redacted_data = SSN_PATTERN.sub("***-**-****", data)

    # Return transformed data to the caller through S3 Object Lambda
    s3.write_get_object_response(
        Body=redacted_data.encode('utf-8'),
        RequestRoute=ctx['outputRoute'],
        RequestToken=ctx['outputToken'],
    )
    return {'status_code': 200}
2. Step-by-Step deployment (CLI)
Create S3 Access Point
# bash
aws s3control create-access-point \
  --account-id accountID \
  --name twtechsource-ap \
  --bucket twtech-s3bucket
3. Create Object Lambda Access Point
# bash
aws s3control create-object-lambda-access-point \
  --account-id accountID \
  --name twtech-object-lambda-ap \
  --configuration file://lambda-config.json
# lambda-config.json
# json
{
  "SupportingAccessPoint": "arn:aws:s3:us-east-2:accountID:accesspoint/twtechsource-ap",
  "TransformationConfigurations": [
    {
      "Actions": ["GetObject"],
      "ContentTransformation": {
        "AwsLambda": {
          "FunctionArn": "arn:aws:lambda:us-east-2:accountID:function:redact-ssn"
        }
      }
    }
  ]
}
4. Access Object Through Object Lambda
- Use this hostname format:
twtech-object-lambda-ap-accountID.s3-object-lambda.us-east-2.amazonaws.com
NB:
- twtech performs an S3 GET, using the SDK or a signed request, to retrieve the transformed object.
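With the SDK, one common way to perform that GET is to pass the Object Lambda Access Point ARN where a bucket name would normally go. The sketch below assumes a boto3 S3 client is supplied by the caller. The helper names and the `data.csv` key are hypothetical, and `accountID` stands in for a real account number as elsewhere in this post.

```python
def object_lambda_arn(region: str, account_id: str, name: str) -> str:
    """Build the ARN of an Object Lambda Access Point."""
    return f"arn:aws:s3-object-lambda:{region}:{account_id}:accesspoint/{name}"


def object_lambda_host(region: str, account_id: str, name: str) -> str:
    """Build the hostname used for signed HTTP requests."""
    return f"{name}-{account_id}.s3-object-lambda.{region}.amazonaws.com"


def get_transformed_object(s3_client, arn: str, key: str) -> bytes:
    """Fetch an object through the Object Lambda Access Point."""
    # The ARN is passed in place of a bucket name.
    response = s3_client.get_object(Bucket=arn, Key=key)
    return response["Body"].read()


# Usage (requires AWS credentials; not run here):
#   import boto3
#   s3 = boto3.client("s3", region_name="us-east-2")
#   arn = object_lambda_arn("us-east-2", "accountID", "twtech-object-lambda-ap")
#   print(get_transformed_object(s3, arn, "data.csv").decode("utf-8"))
```

Because the client is passed in, the same function can be exercised with a stub client in tests.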
Security
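- twtech must grant s3-object-lambda.amazonaws.com permission to invoke the Lambda function.
- The function's execution role needs s3-object-lambda:WriteGetObjectResponse so that it can return the transformed object.
- The twtech Lambda function should be idempotent and handle errors gracefully.
- twtech should set appropriate timeouts and memory, because Lambda execution time adds to object retrieval latency.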
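For the error-handling point, one possible shape is to catch failures inside the function and report them to the caller through `write_get_object_response`, which accepts `StatusCode`, `ErrorCode` and `ErrorMessage` alongside the route and token. In the sketch below, the `respond` helper, the injected `fetch` and `transform` callables, and the `TransformFailed` error code are all hypothetical choices made for illustration.

```python
def respond(s3_client, ctx: dict, fetch, transform) -> int:
    """Fetch, transform and return the object; report failures to the caller.

    ctx is event["getObjectContext"]; fetch(url) returns the original bytes;
    transform(data) returns the transformed bytes.
    """
    route = {
        "RequestRoute": ctx["outputRoute"],
        "RequestToken": ctx["outputToken"],
    }
    try:
        body = transform(fetch(ctx["inputS3Url"]))
    except Exception as exc:
        # Send an explicit error instead of letting the request time out.
        s3_client.write_get_object_response(
            StatusCode=500,
            ErrorCode="TransformFailed",  # hypothetical error code
            ErrorMessage=str(exc)[:200],  # truncated; avoid leaking details
            **route,
        )
        return 500
    s3_client.write_get_object_response(Body=body, **route)
    return 200
```

Returning an explicit error keeps the caller from waiting on a response that will never arrive, which matters because the function sits directly in the retrieval path.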
twtech Ideal Use Cases
- Serving dynamically transformed content.
- Multi-tenant content filtering.
- Data redaction (hiding or removing sensitive data) for compliance (PII, PHI).
- Generating virtual views over raw S3 data, comparable to a view in a database management system (DBMS).