Gareth du Toit

Cloud Solutions Architect

Deploying a Scalable Microservices Architecture for MTA Application Modernization

Solution Overview

The new architecture is composed of several core components, each distributed across availability zones to maximize resilience:

  • Frontend Application: This application processes and visualizes data from Logger instances, which capture output from the backend MTA application instances. The Frontend provides users with insights into mail flow and usage metrics.
  • MTA Backend Application Instances: These instances handle the core MTA functions and are designed to scale within the AKS cluster based on demand.
  • Logger Instances: The dedicated Logger services capture logs from the backend MTA instances, providing essential data for monitoring and analysis.
  • CMS Instance: This instance manages reporting flows, delivering insights to users on platform metrics, including mail flow statistics and usage metrics.

All inter-service communication occurs within the AKS cluster. External requests are secured with TLS/SSL certificates, which are used for authentication and to encrypt session connections so that data protection standards are met.
Additionally, AKS network policies are leveraged as an Intrusion Prevention System (IPS) to further secure the external inbound communication channels.

Security and Data Integrity

Security is a foundational aspect of this architecture. TLS/SSL certificates secure all service-to-service communications, and AKS network policies add an effective IPS layer for internal traffic. Additionally, Azure Traffic Manager’s integration with SQL traffic allows for seamless failover between the primary and replica PostgreSQL instances, minimizing downtime and enhancing data reliability.

Standardized Global Deployment with Terraform

With IaC through Terraform, this solution is both consistent and scalable across regions. Standardized configurations enable rapid deployment, reduce manual intervention, and ensure that every deployment follows best practices. This approach allows for uniform, reliable infrastructure, regardless of where the solution is deployed.

High Availability and Scalability Features

To enhance availability and handle varying traffic loads, the architecture incorporates several advanced configurations:

Availability Zones:
The core components are distributed across availability zones. This zonal distribution increases fault tolerance, minimizing the impact of failures within a single zone.

Dynamic Horizontal Pod Autoscaler (HPA):
The solution leverages Kubernetes CronJobs running in the AKS cluster to adjust HPA configurations on a schedule, based on anticipated traffic patterns. This allows the system to scale more efficiently by modifying the HPA's scaling limits according to the time of day or day of the week, ensuring that the infrastructure meets expected demand without excessive resource allocation.
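
The scheduling logic such a CronJob might apply can be sketched as follows. This is a minimal illustration, assuming the values being adjusted are the HPA's minimum and maximum replica counts. The time windows, replica numbers, and function names are hypothetical and are not taken from the actual deployment.

```python
from datetime import datetime

# Hypothetical traffic windows:
# (weekdays, start_hour, end_hour, min_replicas, max_replicas).
# Weekdays follow datetime.weekday(): Monday is 0, Sunday is 6.
SCHEDULE = [
    (range(0, 5), 7, 19, 6, 20),   # weekday business hours: higher floor and ceiling
    (range(0, 5), 19, 24, 3, 10),  # weekday evenings
]
DEFAULT_BOUNDS = (2, 6)  # nights and weekends (assumed values)


def hpa_bounds(now: datetime) -> tuple:
    """Return (min_replicas, max_replicas) for the window containing `now`."""
    for days, start, end, min_r, max_r in SCHEDULE:
        if now.weekday() in days and start <= now.hour < end:
            return (min_r, max_r)
    return DEFAULT_BOUNDS


def hpa_patch(now: datetime) -> dict:
    """Build a patch body that sets spec.minReplicas and spec.maxReplicas on an HPA."""
    min_r, max_r = hpa_bounds(now)
    return {"spec": {"minReplicas": min_r, "maxReplicas": max_r}}
```

A scheduled job could serialize the result of `hpa_patch(datetime.now())` to JSON and apply it to the target HorizontalPodAutoscaler, for example with `kubectl patch`. Keeping the schedule in one table makes the anticipated traffic pattern easy to review and change.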

Standby Nodes:
To reduce scaling time during high load periods, standby nodes are stopped but kept in a “ready” state rather than being deleted when node pools scale down. This approach minimizes the time needed to bring nodes online, although it does incur storage costs for nodes in standby.

Node Pool Availability Zones:
Each application stack (e.g., MTA/Logger-01 and MTA/Logger-02) is paired with its own node pool assigned to a separate availability zone.
This zonal distribution enables load balancing across the application stack while enhancing resilience in case of zonal outages.

Azure Traffic Manager:
This service plays a critical role in the architecture, providing traffic management for two key areas: MTA resolution and SQL connectivity.
Azure Traffic Manager manages incoming connections for the MTA application through a weighted distribution strategy, enabling seamless traffic control and minimal service interruption during maintenance.
It also supports database failover by routing SQL traffic from the application stack to the appropriate PostgreSQL instance (primary or replica) based on availability. In case of a zonal outage or scheduled maintenance, traffic can be rerouted to the replica instance to ensure continuous service.
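
The two routing behaviors described above can be illustrated with a small simulation. This sketch does not call any Azure API. It only models the selection logic: weighted distribution across available MTA endpoints, and primary-then-replica selection for PostgreSQL. The endpoint names, weights, and availability flags are hypothetical, and modeling the database failover as a priority order is an assumption.

```python
import random
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    weight: int = 1         # relative share of traffic (weighted routing)
    priority: int = 1       # lower value is preferred (failover ordering)
    available: bool = True  # in a real setup this would come from health probes


def pick_weighted(endpoints, rng=random):
    """Pick an available endpoint with probability proportional to its weight."""
    candidates = [e for e in endpoints if e.available]
    if not candidates:
        raise RuntimeError("no available endpoint")
    weights = [e.weight for e in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


def pick_failover(endpoints):
    """Pick the available endpoint with the lowest priority value."""
    candidates = [e for e in endpoints if e.available]
    if not candidates:
        raise RuntimeError("no available endpoint")
    return min(candidates, key=lambda e: e.priority)


# Hypothetical endpoints, for illustration only.
mta = [Endpoint("mta-01", weight=50), Endpoint("mta-02", weight=50)]
sql = [Endpoint("pg-primary", priority=1), Endpoint("pg-replica", priority=2)]

# Maintenance on mta-01: take it out of rotation so new connections use mta-02.
mta[0].available = False
print(pick_weighted(mta).name)

# Primary outage: SQL traffic falls back to the replica.
sql[0].available = False
print(pick_failover(sql).name)
```

Because Traffic Manager operates at the DNS level, the time a real failover takes also depends on DNS TTL and health probe settings, which this sketch does not model.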

Preparing for the Future: Enhanced Log Shipping

Currently, dedicated Logger instances manage log data. However, future iterations of this solution will incorporate a messaging layer, enabling faster, more efficient log shipping. This will improve data transfer, streamline logging, and support advanced monitoring capabilities, paving the way for enhanced operational insights.

Conclusion

This architecture exemplifies modern best practices for scalable, resilient, and secure cloud deployments. By leveraging Kubernetes, Terraform, and Azure Traffic Manager, it transforms the MTA platform into a flexible and globally deployable solution, ensuring smooth operations and readiness for future growth.