Sunday, November 23, 2025

AWS Schema Conversion Tool (SCT) | Overview.


An Overview of AWS Schema Conversion Tool (SCT).

 Scope:

  •        Tailored for DevOps/Cloud/DevSecOps/SRE/Platform focus Engineers,
  •        Architecture,
  •        Workflows,
  •        Edge cases,
  •        Performance optimizations,
  •        Operational best practices.

Breakdown:

  •        SCT – Internal Architecture Overview,
  •        SCT Workflow – End-to-End,
  •        SCT Conversion Internals,
  •        SCT Agent,
  •        Failure Modes,
  •        Resiliency & Operational Best Practices,
  •        Top SCT Optimization Tips (Production Grade),
  •        SCT Cheat Sheet – Quick Reference.

Intro:

  •        AWS Schema Conversion Tool (SCT) helps transform database schemas, objects, SQL, and sometimes ETL code from one engine to another.
  •        AWS Schema Conversion Tool (SCT) helps transform most commonly commercial open-source/AWS-native migrations such as:

Source

          Target

Oracle

Aurora PostgreSQL / RDS PostgreSQL

SQL Server

Aurora MySQL/PostgreSQL

Teradata

Amazon Redshift

Netezza

Amazon Redshift

On-prem ETL

AWS Glue

Oracle PL/SQL packages

PostgreSQL PL/pgSQL functions

NB:

  • Schema Conversion Tool (SCT)  performs automated conversion where possible and flags manual actions via detailed reports.

1. SCT – Internal Architecture Overview

A. Logical Components:

1.     Metadata Collector

  •    Introspects source DB
  •    Reads tables, columns, views, constraints, sequences, stored procedures, triggers, user-defined types
  •    Builds an internal intermediate representation (IR)

2.     Conversion Engine

  •    Pattern-matching engine
  •    PL/SQL PL/pgSQL converters
  •    SQL dialect transformers
  •    Procedural logic rewriter
  •    Function rewriting (ex: Oracle VARCHAR2 TEXT/VARCHAR)
  •    Dependency graph resolution
  •    Unsupported feature annotator

3.     Assessment Engine

  •    Generates “complexity scores”
  •    Flags incompatible constructs
  •    Produces “estimated manual effort” hours
  •    Creates detailed assessment reports (HTML/PDF)

4.     Data Extractors (optional)

  •    For homogeneous or heterogeneous migrations
  •    Direct integration with AWS DMS for data movement
  •    Supports schema copy, seed data, and slow-changing data extraction

5.     ETL Conversion Module (Optional)

  •    Converts Informatica, SSIS, Oracle ODI, DataStage into Glue scripts
  •    Creates PySpark/Glue ETL jobs

6.     SCT Agent (for large workloads)

  •    Runs conversions at scale
  •    Offloads extraction from the UI client
  •    Supports parallel extract
  •    Runs in VPC or on-prem
  •    EBS-backed storage for extraction

2. SCT Workflow – End-to-End

1. Environment Setup

  •         Install SCT client (Mac/Windows/Linux)
  •         Install SCT Agent (optional)
  •         Configure DB connectivity (on-prem VPN/Direct Connect)

2. Run Schema Assessment

Generates:

  •         DB feature parity report
  •         Complexity score
  •         Incompatibility lists
  •         Workload breakdown (automated vs manual conversion)

3. Perform Schema Conversion

  •         Auto-create converted schema
  •         Manually fix complex code
  •         Validate dependency resolution
  •         Export SQL and apply to target

4. Apply Converted Schema to Target

  •         Push converted DDL to:
    •    Aurora PG/MySQL
    •    RDS PostgreSQL/MySQL
    •    Redshift

5. Connect SCT to AWS DMS

  •         DMS migrates actual data
  •         SCT validates schema compatibility for DMS tasks
  •         Post-migration: SCT may run SQL-based data validation scripts

6. Post-Conversion Verification

  •         Object comparison
  •         SQL validation
  •         Application integration tests

3. SCT Conversion Internals

 SQL Dialect Translation

Examples:

Oracle

PostgreSQL

NVL(x, y)

COALESCE(x, y)

SYSDATE

NOW()

CONNECT BY

recursive CTE

NUMBER(p,s)

NUMERIC(p,s)

RAW

BYTEA

NB:

Triggers, sequences, synonyms, and specific PL/SQL constructs often require manual rewriting.

B. Stored Procedures & Functions

SCT rewrites:

  •         Variable declarations
  •         Cursor operations
  •         Exception handling
  •         Package structures
  •         LOOP and EXIT logic
  •         FORALL batch INSERT
  •         BULK COLLECT arrays
  •         Oracle collections PostgreSQL arrays
  •         DBMS_OUTPUT RAISE NOTICE

Limitations:

  •         Oracle packages don’t map 1:1 to PostgreSQL packages
  •         Autonomous transactions require redesign
  •         Oracle REF CURSOR types require rewrite

C. Schema Differences / Object Mappings

SCT handles:

  •         Primary Key differences
  •         Foreign key cascades
  •         Constraint naming issues
  •         Collation/encoding differences
  •         Partitioning (Oracle “Range/Hash” PostgreSQL native partitioning)

D. Performance Optimization Conversions

  •         Oracle hints removed or replaced
  •         Oracle indexes mapped to PostgreSQL/B-tree
  •         Oracle function-based indexes often require materialized columns
  •         Oracle partitioning PG native or sharding
  •         Temp table behavior differences handled explicitly

4. SCT Agent

Purpose

  •        Used when the dataset is >1TB, or when migrating from on-prem databases that are slow to extract.

Capabilities

  •         Parallel schema extract
  •         Parallel data unload
  •         WAN-friendly extraction (throttling, resume, compression)
  •         Writes data in chunked Parquet files
  •         Transfers to S3 bucket
  •         Integrates with AWS DMS (full load)

Flow

Source DB  SCT Agent (Parquet Extracts)  S3  DMS  Target DB

Agent Tuning Tips

  •         Increase worker threads based on CPU
  •         Tune fetch size for Oracle
  •         Prefer dedicated network path (DX/VPN)
  •         Use ephemeral SSD to speed up extract
  •         Do not place agent behind multiple NAT layers

5. Failure Modes

1. Unsupported DB Features

Common problematic objects:

  •         Autonomous transactions
  •         Custom PL/SQL packages
  •         Complex views with analytic functions
  •         Materialized views with refresh logic
  •         Oracle XMLTYPE / JSON handling
  •         Oracle Spatial and Graph
  •         Oracle AQ / Streams
  •         SQL Server CLR objects
  •         Advanced partitioning templates

2. Dependency Errors

Occurs when objects reference unconverted items.

Fix:

  •         Dependency graph validation
  •         Reorder or manually script the DDL

3. Large Object Failures (LOB/BLOB/CLOB)

Causes:

  •         Incompatible data types
  •        Incorrect character set mappings
  •        Corrupt blobs
  •        Timeout during extract

4. Network/Connectivity & Agent Failures

  •         SCT Agent pauses mid-transfer
  •         SCT UI freezes (common on Windows)
  •         Extract stalls on specific table due to locking

5. DMS + SCT Integration Failures

  •         Schema mismatch
  •         Missing transformations
  •         Unsupported key types
  •         Large tables requiring chunking

6. Resiliency & Operational Best Practices

A. Before Conversion

  •         Enable binary logging on target
  •         Enable row-based replication if using PostgreSQL
  •         Pre-size target storage with growth expectations
  •         Validate source DB for corruption
  •         Ensure source DB stats are updated (Oracle ANALYZE/DBMS_STATS)

B. During Conversion

  •         Use SCT Agent for datasets >1TB
  •         Break schema migration into slices (per schema, per module)
  •         Exclude unused legacy schema objects
  •         Pin SCT to a dedicated VM for stability
  •         Increase SCT JVM heap memory for large conversions (command-line flag)

C. During Data Migration (DMS Phase)

Recommended:

  •         Full load + CDC strategy
  •         Parallel table loading
  •         Disable foreign keys during load
  •         Disable heavy-index tables during load
  •         Enable retries with exponential backoff

D. After Conversion

  •         Run SCT data validation queries
  •         Compare row counts
  •         Compare checksums
  •         Test stored procedures using unit tests
  •         Run application read/write tests
  •         Validate concurrency patterns (especially Oracle PostgreSQL)

7. Top SCT Optimization Tips (Production Grade)

 Increase SCT Performance

  •         Run SCT on a Linux workstation (Windows is slower)
  •         Use SCT Agent for large extractions
  •         Avoid network latency (run closer to DB)
  •         Increase conversion memory (JVM flags: -Xms4G -Xmx16G)

 Reduce Manual Conversion Work

  •         Prioritize code-based objects (functions/packages) first
  •         Use automated pre-check rules
  •         Convert materialized views into regular views temporarily
  •         Pre-rewrite Oracle CONNECT BY to recursive CTEs
  •        Migrate sequences carefully and map ownership properly

 Improve Migration Accuracy

  •         Freeze source schema before extraction
  •         Always version-control converted SQL
  •         Run automated regression tests on PL/pgSQL code
  •         Validate numeric precision and date/time semantics

8. SCT Cheat Sheet – Quick Reference

Supported Conversion Targets

  •         Aurora PostgreSQL / MySQL
  •         RDS PostgreSQL / MySQL
  •         Amazon Redshift
  •         AWS Glue (ETL)

Best Use Cases

  •         Commercial DB Open-source
  •         Oracle to PostgreSQL (most common)
  •         SQL Server to MySQL/PostgreSQL
  •         Netezza/Teradata Redshift

Not a Good Fit When

  •         Heavy Oracle PL/SQL packages with advanced features
  •         Complex proprietary DB extensions
  •         OLTP systems relying on vendor-specific features
  •         Legacy applications using Oracle AQ, Spatial, InterMedia, etc.

No comments:

Post a Comment

Amazon EventBridge | Overview.

Amazon EventBridge - Overview. Scope: Intro, Core Concepts, Key Benefits, Link to official documentation, Insights. Intro: Amazon EventBridg...