Streamlining Data Management with AWS for a Global Food Manufacturer

Summary

A major consumer packaged goods (CPG) client depended on data downloads from various vendor partners, multiple Excel files, and a Microsoft SQL Server instance to power its dashboards. This setup took substantial effort to manage and maintain, which delayed insight delivery and limited user experience, scalability, and automation.

Since the client was already using a data warehouse on AWS, it made sense to centralize their data processing and reporting there. Leveraging AWS services provided advantages like seamless scalability, faster processing times, robust logging, and efficient error handling.

Goal

The goal was to design a solution that gives the client a dashboard whose data is fully refreshed every week and that combines descriptive and predictive analytics. This would allow the client to focus on business operations instead of data maintenance.

Solution Architecture:

 

Approach

The process began with data ingestion from various sources, utilizing the client's existing data ingestion framework. Tredence designed and developed the Revenue Growth Management (RGM) warehouse within the Redshift cluster and created the data science pipeline and web application using AWS serverless architecture.

  1. Data Science Pipeline
    • The pipeline runs on a serverless SageMaker instance. An Amazon EventBridge rule invokes a Lambda function on a schedule, and the Lambda function triggers SageMaker. Each run retrains the model on the latest data so that the predictive outputs stay current.
    • The data science model is integrated into the web application via an Amazon SageMaker endpoint. When users want to simulate a result in the frontend, they adjust the relevant parameters and submit them for simulation. This action triggers an Amazon CloudWatch event, which subsequently activates the Amazon SageMaker model. The model then returns the result to the web application through the same SageMaker endpoint.
  2. Web Application Architecture
    • The web application features a data layer consisting of a Redshift warehouse for analytical queries and an RDS instance to store the web application's metadata.
    • The application is hosted using Elastic Beanstalk, positioned behind an AWS load balancer. The backend APIs are developed using Node.js, while the frontend is built with React.
    • For security and user management, the application is integrated with Azure Active Directory (AD) to provide Single Sign-On (SSO) for the client.
  3. Development and Deployment
    • AWS CodeBuild and CodePipeline were used to build and deploy the solution on AWS, and CloudFormation was used to provision the underlying resources.
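
The scheduled retraining step in the data science pipeline can be sketched as a small Lambda handler. This is a minimal illustration under stated assumptions, not the client's actual implementation: it assumes the training data has been exported to S3, that the model trains from a container image, and that the Lambda function reads its settings from environment variables. The job name prefix, environment variable names, and instance type are hypothetical. An EventBridge rule with a weekly schedule expression such as cron(0 6 ? * MON *) would be one way to match the weekly refresh.

```python
import datetime
import os


def build_training_job_request(now, role_arn, image_uri, input_s3_uri, output_s3_uri):
    """Build the keyword arguments for SageMaker's CreateTrainingJob call."""
    # A timestamp suffix keeps the job name unique for every scheduled run.
    job_name = "rgm-train-" + now.strftime("%Y%m%d-%H%M%S")  # hypothetical prefix
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": input_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        # Instance type and limits are placeholders, not values from the case study.
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 30,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


def handler(event, context, sagemaker_client=None):
    """Lambda entry point, invoked by the scheduled EventBridge rule."""
    if sagemaker_client is None:
        import boto3  # bundled with the AWS Lambda Python runtime

        sagemaker_client = boto3.client("sagemaker")

    # Environment variable names below are assumptions made for this sketch.
    request = build_training_job_request(
        now=datetime.datetime.now(datetime.timezone.utc),
        role_arn=os.environ["TRAINING_ROLE_ARN"],
        image_uri=os.environ["TRAINING_IMAGE_URI"],
        input_s3_uri=os.environ["TRAINING_INPUT_S3_URI"],
        output_s3_uri=os.environ["TRAINING_OUTPUT_S3_URI"],
    )
    response = sagemaker_client.create_training_job(**request)
    return {
        "trainingJobName": request["TrainingJobName"],
        "trainingJobArn": response["TrainingJobArn"],
    }
```

Keeping the request construction in a pure function makes the schedule-driven logic testable without AWS credentials; the optional sagemaker_client argument exists only so a stub can be passed in during tests.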

AWS Services Used

  • Redshift: Data warehouse for analytical queries.
  • SageMaker: Serverless instance for training and deploying machine learning models.
  • API Gateway: Facilitates access to the SageMaker endpoint.
  • Lambda: Executes code in response to triggers, such as EventBridge rules.
  • AWS Glue: Automates the process of extracting, transforming, and loading data from various sources into a data store or warehouse.
  • EventBridge: Schedules events to trigger Lambda functions.
  • CloudWatch: Monitors events and triggers the SageMaker model.
  • RDS: Stores metadata for the web application.
  • Elastic Beanstalk: Hosts the web application, providing load balancing.
  • Load Balancer: Distributes incoming traffic to Elastic Beanstalk instances.
  • CodeBuild: Builds the application.
  • CodePipeline: Manages the continuous integration/continuous deployment (CI/CD) pipeline.
  • CloudFormation: Provisions and manages AWS resources.
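
To illustrate how a simulation request could reach the model through the SageMaker endpoint listed above, here is a minimal sketch. It is an assumption-laden example rather than the delivered code: the case study's backend is written in Node.js, and Python is used here only for illustration. The endpoint name, the parameter names in the payload, and the JSON request and response formats are all hypothetical.

```python
import json


def simulate(params, endpoint_name, runtime_client=None):
    """Send simulation parameters to a SageMaker endpoint and return the parsed result."""
    if runtime_client is None:
        import boto3  # requires AWS credentials at call time

        runtime_client = boto3.client("sagemaker-runtime")

    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",  # assumes the model container accepts JSON
        Accept="application/json",
        Body=json.dumps(params),
    )
    # The response body is a stream, so read it fully before decoding the JSON.
    return json.loads(response["Body"].read())


# Example call with hypothetical parameter names and endpoint name:
# result = simulate({"price_change_pct": -5, "promo_weeks": 2}, "rgm-simulator")
```

In the architecture described here, a call like this would sit behind API Gateway, so the frontend never talks to the SageMaker endpoint directly.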

Key Benefits

  1. Centralized Data Management: Simplified data processing and reporting by consolidating information within a central AWS data warehouse.
  2. Enhanced Scalability: Effortlessly scale data processing and analytics capabilities through AWS services.
  3. Improved Performance: Accelerated data processing and analytics with Amazon Redshift and SageMaker, ensuring rapid insights.
  4. Robust Logging and Monitoring: Reliable system performance and error handling through comprehensive monitoring and logging with Amazon CloudWatch.
  5. Automated Model Training: Regular updates and accurate predictive analytics through automated model training with SageMaker and EventBridge.
  6. User-Friendly Interface: An intuitive web application that enables users to easily simulate results and visualize data.
  7. Secure Access: Enhanced security and user management with Single Sign-On (SSO) integration using Azure AD.
  8. Efficient Deployment: Streamlined development, build, and deployment processes with AWS CodeBuild, CodePipeline, and CloudFormation.
  9. Reduced Costs: Cost-effective infrastructure management using AWS serverless architecture and on-demand resource scaling.

Results

Operational Efficiency:
Significantly reduced overhead in data management, allowing the client to concentrate on core business operations.

Predictive Analytics:
With the web application and ML models, the client can simulate outcomes before acting, which positions it better in a competitive market.

Dynamic ML Framework:
Delivered a fully dynamic and configurable, ready-to-use, end-to-end machine learning framework.
