Category: Media & Entertainment

  • DE Staff Re-org Strategy and Workflow

    Objectives

    • New DE and Analytics department workflow
    • Re-structure Stakeholder Analytics requests workflow and implementation
    • Re-structure Data Modeling and Development for reporting workflow and implementation
    Existing Challenges

    • Balancing new business objectives and priorities with existing workloads and operations
    • Inter-business-unit technical bureaucracies and politics
    Solutions

    • Evaluated existing ETL operations for new initiatives, i.e. bronze, silver and gold datasets discoverable by executives
    • Formed dev squads, funneled stakeholder requests through analysts with PMO oversight
    • Ran Jira Agile sprints to estimate and triage the work
    Benefits

    • Visibility of all operations across BI units
    • Accountability of resources across BI units
    • Efficient priority management relative to available resources

  • AWS Matillion, SQS, Lambda, Redshift, CDK

    Objectives

    • Low-code rapid prototype ETL ecosystem for pilot data analyst effort
    Existing Challenges

    • Existing dev resources are tied up
    • Access and ops bureaucracy takes a while
    Solutions

    • Services, assets and CDK IaC, IAM setup and workflow
    • Data-Mart solution for self-service isolated dataset BI work
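
    As a minimal sketch of how the SQS and Lambda pieces of such an ecosystem might fit together, the handler below reads an SQS-triggered event and lists the files that have arrived. It assumes each SQS message body is an S3 event notification in JSON; the function names and the choice to return file URIs are illustrative, not taken from the project.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an SQS-triggered Lambda event.

    Assumption: each SQS message body is an S3 event notification in JSON.
    """
    objects = []
    for record in event.get("Records", []):
        body = json.loads(record["body"])
        for s3_record in body.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # Object keys arrive URL-encoded in S3 notifications.
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            objects.append((bucket, key))
    return objects


def handler(event, context):
    """Lambda entry point: report which files are ready to load."""
    objects = extract_s3_objects(event)
    # A real pipeline would start the warehouse load here;
    # this sketch only returns what it found.
    return {"files": [f"s3://{bucket}/{key}" for bucket, key in objects]}
```

    In a setup like this, the returned list would feed whatever starts the Redshift or Matillion load. That step is left out because it depends on the project's own job definitions.
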
    Benefits

    • Enclosed ETL ecosystem analysts can track and use
    • Big data capable

  • AWS Media Billing Platform

    Objectives

    • Programmatic Ad Sales delivery and billing reporting codebase
    • Fix and Refactor legacy codebase for speed, bug fixes and feature enhancements
    Existing Challenges

    • Debugging BI logic on big data required live debugging of Python pandas DataFrames, i.e. analysts can’t view and debug reporting issues without a developer
    • Pipeline execution time is too long
    Solutions

    • GitHub Actions/Terraform CI/CD, EKS containers, Python pandas ETL to/from incremental S3 Glue DB tables with Airflow orchestration
    • Reworked the business dataset analysis workflow from Python pandas DataFrames to Glue/Athena tables
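
    A minimal sketch of the incremental idea, assuming the tables are partitioned by day with Hive-style dt=YYYY-MM-DD names. The partition scheme and the pending_partitions helper are assumptions for illustration: each run processes only the days after the last loaded partition.

```python
from datetime import date, timedelta


def pending_partitions(last_loaded, today):
    """Return partition names for each day after last_loaded, up to today."""
    days = (today - last_loaded).days
    return [
        f"dt={(last_loaded + timedelta(days=n)).isoformat()}"
        for n in range(1, days + 1)
    ]


# Example: the last successful load covered 2024-01-01.
print(pending_partitions(date(2024, 1, 1), date(2024, 1, 3)))
# prints ['dt=2024-01-02', 'dt=2024-01-03']
```

    An orchestrator such as Airflow could call a helper like this at the start of each run and skip the job when the list is empty.
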
    Benefits

    • Analysts can now work directly with data via SQL in Glue DB tables. Once a fix is found, a developer can integrate the logic via the normal sprint workflow
    • Pipelines are reliable and fast
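
    As a rough illustration of that analyst workflow, the sketch below runs the kind of reconciliation query an analyst might write. It uses Python's built-in sqlite3 as a local stand-in for Athena, and the delivery and billing tables, their columns and the sample rows are all hypothetical.

```python
import sqlite3

# In-memory database standing in for the Glue/Athena tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE delivery (campaign_id TEXT, impressions INTEGER);
    CREATE TABLE billing  (campaign_id TEXT, billed_impressions INTEGER);
    INSERT INTO delivery VALUES ('c1', 1000), ('c2', 500);
    INSERT INTO billing  VALUES ('c1', 1000), ('c2', 450);
""")

# Find campaigns where delivered and billed impressions disagree.
MISMATCH_SQL = """
    SELECT d.campaign_id, d.impressions, b.billed_impressions
    FROM delivery d
    JOIN billing b ON b.campaign_id = d.campaign_id
    WHERE d.impressions <> b.billed_impressions
    ORDER BY d.campaign_id
"""

mismatches = conn.execute(MISMATCH_SQL).fetchall()
print(mismatches)  # prints [('c2', 500, 450)]
```

    A query like this can be written and rerun without touching the pipeline code, which is the point of moving the analysis from DataFrames to tables.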

  • AWS Big Data Media Platform

    Objectives

    • Goal was to replace the Databricks vendor platform and Redshift-centric ecosystem with a native AWS EMR, Jupyter Notebooks, Hive-centric and Redshift-datamart ecosystem
    • Improve data analysis access, and ETL speed, reliability and cost-effectiveness
    • Medallion (bronze, silver, gold) data triage for engagement, ad-sales and content data domains
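
    The medallion idea can be sketched in plain Python, under the assumption that bronze holds raw records, silver holds cleaned and typed records, and gold holds reporting aggregates. The field names, cleaning rules and sample rows are hypothetical, and a production version would run as Spark or dbt models rather than in-memory lists.

```python
from collections import defaultdict


def to_silver(bronze_rows):
    """Clean raw rows: drop incomplete records and normalize types."""
    silver = []
    for row in bronze_rows:
        if not row.get("title") or row.get("minutes") is None:
            continue  # skip rows missing a required field
        silver.append({
            "title": row["title"].strip().lower(),
            "minutes": float(row["minutes"]),
        })
    return silver


def to_gold(silver_rows):
    """Aggregate cleaned rows into total minutes watched per title."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["title"]] += row["minutes"]
    return dict(totals)


# Hypothetical raw engagement records (bronze layer).
bronze = [
    {"title": " Show A ", "minutes": "30"},
    {"title": "show a", "minutes": 12.5},
    {"title": None, "minutes": 5},
    {"title": "Show B", "minutes": None},
]

gold = to_gold(to_silver(bronze))
print(gold)  # prints {'show a': 42.5}
```
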
    Existing Challenges

    • Disparate Data Access: Can’t easily gain access to and query data across domains
    • Database Issues: Slow queries, queries that lock up or time out, DB load competition, usage limits
    Solutions

    • AWS EMR with dbt Spark ETL. EMR Jupyter Notebooks for analysis. GitHub Actions with Terraform CI/CD.
    • S3 Glue DBs for lake storage. Redshift datamart for reporting storage
    Benefits

    • Long-term Flexibility, Reliability, Robustness
    • Cost Manageable and Flexible