
Unifying Cruise Line Reservations with Microservices and Amazon MSK

Summary

A global leader in the cruise line industry decided to address the limitations of its legacy reservation systems by developing a custom reservation system based on a microservices architecture. This transition aimed to replace brand-specific reservation systems with a unified approach, leveraging the scalability and fault tolerance of microservices over a monolithic design.

As part of this transformation, downstream systems needed to adopt an event-based message consumption pattern, facilitated by a real-time platform such as Amazon MSK.

Tredence played a crucial role in implementing a near real-time data pipeline, helping the client transition to an event-driven model. This shift replaced batch processing with real-time streaming and enabled the client to derive actionable insights in near real time, at the finest data granularity, with Amazon MSK.

Goal

Faced with maintaining multiple batch-oriented data pipelines due to their legacy reservation system, our client recognized the need to eliminate silos and consolidate into a unified model.

Their vision was to develop a unified, fault-tolerant, and scalable near real-time data ingestion pipeline from their microservices-based system to the ODS layer in Snowflake.

To achieve this transformation, Tredence partnered with the client to leverage Amazon MSK on AWS. Together, we built a fully dynamic, fault-tolerant, scalable, and configurable near real-time streaming pipeline. 

A dynamic and real-time stream processing framework using Amazon MSK

Amazon MSK offers a fully managed service for building and running applications that use Apache Kafka to process streaming data.

The new pipeline aimed to leverage Amazon MSK's capabilities with the following core functionalities:

  • Automatic discovery of new topics
  • Dynamic onboarding of new topics by launching connectors
  • Automatic schema evolution
  • Deserialization capability
  • Dynamic flattening of complex/nested message structures to the N-th level
  • Self-healing with automatic relaunch of failed connectors
  • Consumer lag reporting
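
The dynamic flattening capability listed above can be illustrated with a small sketch. The function below is a hypothetical illustration, not the client's actual code: it recursively flattens a nested record up to a configurable depth, leaving anything deeper than the N-th level intact.

```python
def flatten(record, max_depth, sep="_", _prefix="", _depth=0):
    """Recursively flatten a nested dict up to max_depth levels.

    Values nested deeper than max_depth are kept as-is under their
    flattened key, mirroring 'flattening to the N-th level'.
    """
    flat = {}
    for key, value in record.items():
        new_key = f"{_prefix}{sep}{key}" if _prefix else key
        if isinstance(value, dict) and _depth < max_depth:
            # Still within the allowed depth: descend one level.
            flat.update(flatten(value, max_depth, sep, new_key, _depth + 1))
        else:
            # Leaf value, or nesting beyond max_depth: keep as-is.
            flat[new_key] = value
    return flat
```

For example, flattening a hypothetical booking event to two levels keeps any third-level structure as a nested value under its flattened key:

```python
event = {"booking": {"id": 42, "guest": {"name": "Ann", "loyalty": {"tier": "gold"}}}}
flatten(event, max_depth=2)
# {"booking_id": 42, "booking_guest_name": "Ann",
#  "booking_guest_loyalty": {"tier": "gold"}}
```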

These enhancements not only modernized the pipeline to a truly distributed, fault-tolerant, and scalable design but also enabled the client to derive actionable insights from near real-time data. Additionally, the pipeline provided access to data at the finest event level for deeper analysis.

Approach

Understanding the need, Tredence designed and developed a dynamic, fault-tolerant, and fully scalable real-time ingestion framework. This framework established an end-to-end streaming ingestion pipeline, seamlessly connecting a microservices-based source application to an Operational Data Store (ODS) layer within Snowflake.

The solution harnessed the robust streaming capabilities of Amazon MSK, complemented by various other AWS services tailored to meet the pipeline's specific requirements.
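
The self-healing behavior mentioned earlier can be sketched as a polling loop against connector status. The function below is a simplified, hypothetical illustration: the status check and restart actions are injected as callables, standing in for the Kafka Connect REST API's `GET /connectors/{name}/status` and `POST /connectors/{name}/restart` endpoints.

```python
def heal_connectors(connector_names, get_state, restart):
    """Relaunch any connector whose reported state is FAILED.

    get_state(name) -> str: returns the connector state, e.g.
        "RUNNING" or "FAILED" (in a live deployment, fetched from
        the Kafka Connect REST API).
    restart(name): relaunches the connector.

    Returns the list of connectors that were restarted.
    """
    restarted = []
    for name in connector_names:
        if get_state(name) == "FAILED":
            restart(name)
            restarted.append(name)
    return restarted
```

In a deployment, a scheduler would invoke this loop periodically; injecting the two callables keeps the relaunch logic independent of any particular cluster or REST client.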

Below is an architectural diagram illustrating the platform's solution.

Integrating our solution into the customer's ingestion landscape effectively addressed their key challenges:

  • Implemented end-to-end real-time ingestion capability, replacing traditional batch flows.
  • Established a unified pipeline to enhance code maintainability.
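
Automatic topic discovery, one of the framework's core functionalities, reduces to a set difference between the topics currently visible on the cluster and those already onboarded. The sketch below is hypothetical (the topic names and naming prefix are illustrative); in a live pipeline the cluster's topic list would come from the Kafka AdminClient.

```python
def discover_new_topics(cluster_topics, onboarded_topics, prefix="reservation."):
    """Return topics present on the cluster but not yet onboarded.

    Only topics matching the expected naming convention are considered,
    so internal topics (e.g. __consumer_offsets) are ignored.
    """
    candidates = {t for t in cluster_topics if t.startswith(prefix)}
    return sorted(candidates - set(onboarded_topics))
```

Each discovered topic can then be onboarded by launching a connector for it, which is what makes the pipeline fully dynamic.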

AWS Services Used

The solution is primarily built around AWS services, with a few exceptions like the Schema Registry and Snowflake.

The following components were used to build the pipeline:

  • Amazon MSK: Event message communication
  • Confluent Schema Registry: Schema management
  • Amazon S3: Message persistence
  • Amazon EC2: Jumphost instance and host for automation scripts
  • AWS IAM: Identity and access management
  • Amazon CloudWatch: Monitoring
  • Amazon SNS: Notifications and alerts
  • Snowflake: Building the target ODS layer
  • Shell scripts: Automation

 

Key Benefits

The introduction of the real-time ingestion framework into the client's ingestion landscape enabled the evolution from batch-oriented ingestion to real-time processing. This advancement enables actionable analytics in near real time, a capability not achievable with the traditional batch-oriented approach.

The framework delivers the following high-value capabilities:

  • End-to-end near real-time processing
  • Dynamic topic discovery and onboarding of new topics
  • Automatic schema evolution
  • Enhanced metadata capability
  • Self-healing through automatic failure handling
  • Consumer lag reporting for proactive resolution of slowness on the consumer side
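
Consumer lag reporting, listed above, boils down to comparing each partition's log-end offset with the consumer group's last committed offset. The function below is a minimal, hypothetical sketch; in practice, both offset maps would be fetched from the Kafka AdminClient.

```python
def consumer_lag(end_offsets, committed_offsets):
    """Compute per-partition lag for a consumer group.

    end_offsets: {(topic, partition): log-end offset}
    committed_offsets: {(topic, partition): last committed offset}

    A partition with no committed offset is reported as fully
    lagging, since the group has not consumed from it yet.
    """
    return {
        tp: end - committed_offsets.get(tp, 0)
        for tp, end in end_offsets.items()
    }
```

Publishing these per-partition numbers to a monitoring channel is what allows consumer-side slowness to be spotted and resolved proactively.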

 

Results

  • A fully dynamic, configurable, ready-to-use end-to-end streaming framework
  • Automated scale-up of data consumption pipelines
  • Low maintenance and operations costs
