Big Data and Financials Analysis Platform
Objectives
- AWS hosted platform focused on the storage, processing, and presentation of customer and product data
- Implementation and workflow built on Databricks as the Spark vendor
Existing Challenges
- Database storage costs
- Long pipeline runtimes
- Data accessibility and separation across multiple business units
- Regulatory compliance (PII, SOX, etc.)
- IT and infosec compliance for a public enterprise
Solutions
- Databricks for Spark big data processing, Jupyter Notebooks analysis, Hive tables on S3 for SQL
- S3 as the Data Lake source of truth, plus utility storage for staging and processing
- Redshift as the Data Warehouse serving dashboards and reports for different business units
- ECS Containers for bespoke native codebase applications
- Terraform for infrastructure as code (IaC)
- Jenkins for code releases and environment separation
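
One way the S3 layout could support business unit separation is to make the business unit the top-level key prefix and to keep the Data Lake and staging areas as distinct zones beneath it. The sketch below is illustrative only: the zone names, the bucket name, the DatasetLocation class, and the can_read rule are all assumptions rather than details of the actual platform, and in practice access would be enforced by IAM policies instead of application code.

```python
from dataclasses import dataclass

# Hypothetical zone names; the outline only distinguishes the Data Lake
# (source of truth) from utility staging/processing storage.
ZONES = ("lake", "staging")


@dataclass(frozen=True)
class DatasetLocation:
    business_unit: str  # hypothetical identifier, e.g. "finance"
    zone: str           # expected to be one of ZONES
    dataset: str        # hypothetical dataset name, e.g. "orders"

    def s3_uri(self, bucket: str) -> str:
        """Build an s3:// URI with the business unit as the top-level prefix."""
        if self.zone not in ZONES:
            raise ValueError(f"unknown zone: {self.zone!r}")
        return f"s3://{bucket}/{self.business_unit}/{self.zone}/{self.dataset}/"


def can_read(reader_unit: str, location: DatasetLocation) -> bool:
    """Toy access rule: a unit may read only data under its own prefix."""
    return reader_unit == location.business_unit


if __name__ == "__main__":
    loc = DatasetLocation("finance", "lake", "orders")
    print(loc.s3_uri("example-bucket"))
    print(can_read("marketing", loc))
```

Keeping the business unit at the top of the key makes it straightforward to scope a prefix-based policy to a single unit, while the zone level keeps source-of-truth data apart from transient staging output.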
Benefits
- Accessible to analysts and data engineers (DE)
- Scalable
- Enterprise compliant
- Cost manageable
- Flexible