AWS Media Billing Platform

Objectives

  • Codebase for Programmatic Ad Sales delivery and billing reporting
  • Fix and refactor the legacy codebase for speed, bug fixes, and feature enhancements

Existing Challenges

  • Debugging BI logic on big data required live debugging of Python pandas DataFrames, i.e., analysts couldn't view or debug reporting issues without a developer
  • Pipeline execution time was too long

Solutions

  • GitHub Actions/Terraform CI/CD, EKS containers, and Python pandas ETL to/from incremental S3-backed Glue database tables, with Airflow orchestration
  • Reworked the business dataset analysis workflow from Python pandas DataFrames to Glue/Athena tables
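
The "incremental" part of the ETL above can be sketched roughly as follows. This is an illustration only, not the project's code: the source does not describe the partitioning scheme, so daily Hive-style `dt=` partitions are assumed, and `TABLE_PREFIX`, `partition_key`, and `pending_partitions` are hypothetical names. The idea is that each run processes only the days whose partition is not already present in S3.

```python
from datetime import date, timedelta

# Hypothetical table prefix, for illustration only.
TABLE_PREFIX = "delivery_daily"


def partition_key(day: date) -> str:
    """Return the assumed Hive-style S3 key prefix for one daily partition."""
    return f"{TABLE_PREFIX}/dt={day.isoformat()}/"


def pending_partitions(start: date, end: date, existing: set[str]) -> list[date]:
    """List the days in [start, end] whose partition is not yet in `existing`."""
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    return [d for d in days if partition_key(d) not in existing]


# Partitions already written by earlier runs (hard-coded here; in practice
# this set would come from listing S3 or the Glue catalog).
existing = {partition_key(date(2024, 1, 1)), partition_key(date(2024, 1, 2))}
todo = pending_partitions(date(2024, 1, 1), date(2024, 1, 4), existing)
print(todo)  # only 2024-01-03 and 2024-01-04 remain to be processed
```

Skipping partitions that already exist is one common way to keep reruns short, since a run touches only new days rather than the full history.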

Benefits

  • Analysts can now work directly with the data via SQL on Glue database tables. Once a fix is found, a developer integrates the logic through the normal sprint workflow
  • Pipelines are reliable and fast
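
As a rough illustration of the SQL-first debugging described above, the sketch below runs a reconciliation query that flags campaigns whose delivered and billed impressions disagree. It is not taken from the project: an in-memory SQLite database stands in for Athena so the example runs locally, and the `delivery` and `billing` tables, their columns, and the sample rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite is a local stand-in for Athena; tables and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE delivery (campaign_id TEXT, impressions INTEGER);
    CREATE TABLE billing (campaign_id TEXT, billed_impressions INTEGER);
    INSERT INTO delivery VALUES ('c1', 1000), ('c2', 500);
    INSERT INTO billing VALUES ('c1', 1000), ('c2', 450);
    """
)

# The kind of query an analyst could run without a developer:
# list campaigns where delivered and billed impressions differ.
MISMATCH_SQL = """
SELECT d.campaign_id, d.impressions, b.billed_impressions
FROM delivery d
JOIN billing b ON b.campaign_id = d.campaign_id
WHERE d.impressions <> b.billed_impressions
ORDER BY d.campaign_id
"""

mismatches = conn.execute(MISMATCH_SQL).fetchall()
print(mismatches)  # [('c2', 500, 450)]
```

Because the intermediate datasets live in tables rather than in in-process DataFrames, a check like this needs no Python debugger session.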