Sunday, November 23, 2025

AWS Schema Conversion Tool (SCT) | Overview.

AWS Schema Conversion Tool (SCT) - Overview.

 Scope:

  • Intro,
  • SCT – Internal Architecture Overview,
  • SCT Workflow – End-to-End,
  • SCT Conversion Internals (SQL Dialect Translation Samples),
  • SCT Agent,
  • Failure Modes,
  • Resiliency & Operational Best Practices,
  • Top SCT Optimization Tips (Production Grade),
  • SCT Cheat Sheet – Quick Reference.

Intro:

    •  AWS Schema Conversion Tool (SCT) helps transform:
      • Database schemas, 
      • Objects, 
      • SQL
      • Sometimes ETL code from one engine to another.
    •  AWS Schema Conversion Tool (SCT) helps transform most commonly commercial open-source/AWS-native migrations such as:

Source

          Target

Oracle

Aurora PostgreSQL / RDS PostgreSQL

SQL Server

Aurora MySQL/PostgreSQL

Teradata

Amazon Redshift

Netezza

Amazon Redshift

On-prem ETL

AWS Glue

Oracle PL/SQL packages

PostgreSQL PL/pgSQL functions

NB:

    • Schema Conversion Tool (SCT) performs automated conversion where possible and flags manual actions via detailed reports.

1. SCT – Internal Architecture Overview

A. Logical Components:

1.     Metadata Collector

    •    Introspects source DB
    •    Reads tables, columns, views, constraints, sequences, stored procedures, triggers, user-defined types
    •    Builds an internal intermediate representation (IR)

2.     Conversion Engine

    •    Pattern-matching engine
    •    PL/SQL PL/pgSQL converters
    •    SQL dialect transformers
    •    Procedural logic rewriter
    •    Function rewriting (ex: Oracle VARCHAR2 TEXT/VARCHAR)
    •    Dependency graph resolution
    •    Unsupported feature annotator

3.     Assessment Engine

    •    Generates “complexity scores”
    •    Flags incompatible constructs
    •    Produces “estimated manual effort” hours
    •    Creates detailed assessment reports (HTML/PDF)

4.     Data Extractors (optional)

    •    For homogeneous or heterogeneous migrations
    •    Direct integration with AWS DMS for data movement
    •    Supports schema copy, seed data, and slow-changing data extraction

5.     ETL Conversion Module (Optional)

    • Converts Informatica, SSIS, Oracle ODI, DataStage into Glue scripts
    • Creates PySpark/Glue ETL jobs

6.     SCT Agent (for large workloads)

    •    Runs conversions at scale
    •    Offloads extraction from the UI client
    •    Supports parallel extract
    •    Runs in VPC or on-prem
    •    EBS-backed storage for extraction

2. SCT Workflow – End-to-End

1. Environment Setup

    • Install SCT client (Mac/Windows/Linux)
    • Install SCT Agent (optional)
    • Configure DB connectivity (on-prem VPN/Direct Connect)

2. Run Schema Assessment Generates:

    • DB feature parity report
    • Complexity score
    • Incompatibility lists
    • Workload breakdown (automated vs manual conversion)

3. Perform Schema Conversion

    • Auto-create converted schema
    • Manually fix complex code
    • Validate dependency resolution
    • Export SQL and apply to target

4. Apply Converted Schema to Target

    •  Push converted DDL to:
      •    Aurora PG/MySQL
      •    RDS PostgreSQL/MySQL
      •    Redshift

5. Connect SCT to AWS DMS

    • DMS migrates actual data
    • SCT validates schema compatibility for DMS tasks
    • Post-migration: SCT may run SQL-based data validation scripts

6. Post-Conversion Verification

    • Object comparison
    • SQL validation
    • Application integration tests

3. SCT Conversion Internals (SQL Dialect Translation Samples):

Oracle

PostgreSQL

NVL(x, y)

COALESCE(x, y)

SYSDATE

NOW()

CONNECT BY

recursive CTE

NUMBER(p,s)

NUMERIC(p,s)

RAW

BYTEA

NB:

    • Triggers, sequences, synonyms, and specific PL/SQL constructs often require manual rewriting.

B. Stored Procedures & Functions

SCT rewrites:

    • Variable declarations
    • Cursor operations
    • Exception handling
    • Package structures
    • LOOP and EXIT logic
    • FORALL batch INSERT
    • BULK COLLECT arrays
    • Oracle collections PostgreSQL arrays
    • DBMS_OUTPUT RAISE NOTICE

Limitations:

    • Oracle packages don’t map 1:1 to PostgreSQL packages
    • Autonomous transactions require redesign
    • Oracle REF CURSOR types require rewrite

C. Schema Differences / Object Mappings

SCT handles:

    • Primary Key differences
    • Foreign key cascades
    • Constraint naming issues
    • Collation/encoding differences
    • Partitioning (Oracle “Range/Hash” PostgreSQL native partitioning)

D. Performance Optimization Conversions

    • Oracle hints removed or replaced
    • Oracle indexes mapped to PostgreSQL/B-tree
    • Oracle function-based indexes often require materialized columns
    • Oracle partitioning PG native or sharding
    • Temp table behavior differences handled explicitly

4. SCT Agent

Purpose

    • Used when the dataset is >1TB, 
    • or when migrating from on-prem databases that are slow to extract.

Capabilities

    • Parallel schema extract
    • Parallel data unload
    • WAN-friendly extraction (throttling, resume, compression)
    • Writes data in chunked Parquet files
    • Transfers to S3 bucket
    • Integrates with AWS DMS (full load)

Flow

Source DB  SCT Agent (Parquet Extracts)  S3  DMS  Target DB

Agent Tuning Tips

    • Increase worker threads based on CPU
    • Tune fetch size for Oracle
    • Prefer dedicated network path (DX/VPN)
    • Use ephemeral SSD to speed up extract
    • Do not place agent behind multiple NAT layers

5. Failure Modes

1. Unsupported DB Features

  • Common problematic objects:
    •         Autonomous transactions
    •         Custom PL/SQL packages
    •         Complex views with analytic functions
    •         Materialized views with refresh logic
    •         Oracle XMLTYPE / JSON handling
    •         Oracle Spatial and Graph
    •         Oracle AQ / Streams
    •         SQL Server CLR objects
    •         Advanced partitioning templates

2. Dependency Errors

    • Occurs when objects reference unconverted items.

Fix:

    • Dependency graph validation
    • Reorder or manually script the DDL

3. Large Object Failures (LOB/BLOB/CLOB) Causes:

    • Incompatible data types
    • Incorrect character set mappings
    • Corrupt blobs
    • Timeout during extract

4. Network/Connectivity & Agent Failures

    • SCT Agent pauses mid-transfer
    • SCT UI freezes (common on Windows)
    •  Extract stalls on specific table due to locking

5. DMS + SCT Integration Failures

    •  Schema mismatch
    •  Missing transformations
    •  Unsupported key types
    •  Large tables requiring chunking

6. Resiliency & Operational Best Practices

A. Before Conversion

    •  Enable binary logging on target
    •  Enable row-based replication if using PostgreSQL
    •  Pre-size target storage with growth expectations
    •  Validate source DB for corruption
    •  Ensure source DB stats are updated (Oracle ANALYZE/DBMS_STATS)

B. During Conversion

    • Use SCT Agent for datasets >1TB
    • Break schema migration into slices (per schema, per module)
    • Exclude unused legacy schema objects
    • Pin SCT to a dedicated VM for stability
    • Increase SCT JVM heap memory for large conversions (command-line flag)

C. During Data Migration (DMS Phase) Recommendation:

    • Full load + CDC strategy
    • Parallel table loading
    • Disable foreign keys during load
    • Disable heavy-index tables during load
    • Enable retries with exponential backoff

D. After Conversion

    • Run SCT data validation queries
    • Compare row counts
    • Compare checksums
    • Test stored procedures using unit tests
    • Run application read/write tests
    • Validate concurrency patterns (especially Oracle PostgreSQL)

7. Top SCT Optimization Tips (Production Grade)

 Increase SCT Performance

    •         Run SCT on a Linux workstation (Windows is slower)
    •         Use SCT Agent for large extractions
    •         Avoid network latency (run closer to DB)
    •         Increase conversion memory (JVM flags: -Xms4G -Xmx16G)

 Reduce Manual Conversion Work

    •  Prioritize code-based objects (functions/packages) first
    •  Use automated pre-check rules
    •  Convert materialized views into regular views temporarily
    •  Pre-rewrite Oracle CONNECT BY to recursive CTEs
    •  Migrate sequences carefully and map ownership properly

 Improve Migration Accuracy

    • Freeze source schema before extraction
    • Always version-control converted SQL
    • Run automated regression tests on PL/pgSQL code
    • Validate numeric precision and date/time semantics

8. SCT Cheat Sheet – Quick Reference

Supported Conversion Targets

    •  Aurora PostgreSQL / MySQL
    •  RDS PostgreSQL / MySQL
    •  Amazon Redshift
    •  AWS Glue (ETL)

Best Use Cases

    •  Commercial DB Open-source
    •  Oracle to PostgreSQL (most common)
    •  SQL Server to MySQL/PostgreSQL
    •  Netezza/Teradata Redshift

Not a Good Fit When

    •  Heavy Oracle PL/SQL packages with advanced features
    •  Complex proprietary DB extensions
    •  OLTP systems relying on vendor-specific features
    •  Legacy applications using Oracle AQ, Spatial, InterMedia, etc.




No comments:

Post a Comment

Amazon EventBridge | Overview.

Amazon EventBridge - Overview. Scope: Intro, Core Concepts, Key Benefits, Link to official documentation, What EventBridge  Really  Is (Deep...