Cloudelligent Helps Customer Slash Costs by up to 20%

Executive Summary

Our customer offers cloud-based auditing software that uses machine learning for the wholesale finance audit industry. To achieve their objectives, they needed help supporting their Amazon Web Services (AWS) infrastructure and modernizing their non-production and production environments. They also faced a lack of workload segregation and a broken continuous deployment pipeline. Cloudelligent’s services helped the customer solve these challenges, paving the way for their internal team to work on product development more efficiently and reliably, and reducing costs by up to 20%.

AWS Services Implemented

  • EC2
  • RDS
  • Athena
  • Lambda
  • S3
  • Redshift
  • Elastic Beanstalk
  • DynamoDB
  • CloudTrail

About the Customer

The customer is part of the SaaS industry that provides quick and efficient auditing software to the wholesale financial services sector. Their applications offer comprehensive support for multiple auditing and verification processes, enabling businesses to assess and mitigate issues quickly.

Customer Challenge

Our customer faced the following IT challenges:

  • They needed to test and validate the Multi-AZ deployment of the primary database instance to ensure that their database was highly available.
  • They needed to redefine the backup retention on the read replica and validate their existing Amazon RDS disaster recovery (DR) setup.
  • They also needed to automate their time-consuming administrative tasks such as resource provisioning, patching, database setup, and backups.

Had these issues not been addressed, the customer would have incurred additional charges on their monthly bill and experienced disruption to day-to-day operations within their AWS environment.
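The first two checks in the list above can be sketched as a small audit function. This is a minimal illustration, not the tooling used in the engagement: it assumes each database is described by a dictionary shaped like one entry of the `DBInstances` list returned by the boto3 `describe_db_instances` call, and the function name and the 7-day retention threshold are hypothetical.

```python
from typing import Any


def audit_rds_instance(desc: dict[str, Any], min_retention_days: int = 7) -> list[str]:
    """Return a list of findings for one DB instance description."""
    findings = []
    name = desc.get("DBInstanceIdentifier", "<unknown>")
    # A read replica reports the identifier of its source instance.
    is_replica = bool(desc.get("ReadReplicaSourceDBInstanceIdentifier"))

    # The primary should run as a Multi-AZ deployment for high availability.
    if not is_replica and not desc.get("MultiAZ", False):
        findings.append(f"{name}: primary is not Multi-AZ")

    # A retention period of 0 means automated backups are disabled.
    retention = desc.get("BackupRetentionPeriod", 0)
    if retention < min_retention_days:
        findings.append(
            f"{name}: backup retention {retention}d is below {min_retention_days}d"
        )

    return findings
```

In practice the dictionaries would come from the RDS API in each Region; here they can be built by hand, which keeps the check easy to test without AWS access.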

Why Amazon Web Services?

The customer chose AWS as their cloud provider because of the benefits the platform offers, such as high performance, resiliency, scalability, and agility. Moreover, leveraging the latest AWS tools and technologies would streamline their development and delivery workflows and reduce time to market.

Why the Customer Chose Cloudelligent?

As an AWS Advanced Consulting Partner, Cloudelligent was well equipped to take on the customer’s challenges and design well-architected solutions for backup, restore, and disaster recovery use cases. Cloudelligent also had experience modifying, upgrading, and testing Terraform stacks that deploy AWS Elastic Beanstalk and serverless resources. They could also help the customer create and extend an Azure DevOps pipeline to deploy code to a Region designated for Disaster Recovery (DR). Replicating the FTP and VPN servers in the DR Region would complete the customer’s DR setup.

Cloudelligent’s Solution

Cloudelligent was able to successfully solve the customer’s challenges by:

  1. Setting up a Landing Zone
  2. Implementing Infrastructure as Code (IaC) for new environments as well as maintaining existing ones
  3. Implementing DevOps best practices around deployment automation, GitOps, CI/CD, observability, and security
  4. Implementing Disaster Recovery strategy based on RTO/RPO

Multiple AWS accounts were set up. Two of them contain all the infrastructure for the deployed applications: one for dev/staging and the other for production. Both accounts have the same infrastructure deployed in us-east-1 and us-west-2, with the us-west-2 infrastructure designated for disaster recovery.

Services Implemented

The primary Amazon Web Services used to solve the customer challenges are listed below:

  • AWS Elastic Beanstalk: Serves as the environment for deploying applications, built by Azure DevOps, onto EC2 instances.
  • Amazon VPC: A single VPC in each Region, with 6 subnets (3 per Availability Zone).
  • Application Load Balancer (ALB): Spans the Availability Zones and directs traffic to Auto Scaling groups.
  • Amazon EC2: Used as a bastion host for the RDS servers, for VPN and SFTP gateway connectivity, and to run applications deployed via Elastic Beanstalk.
  • Amazon RDS: Serves as the database for applications and has a cross-Region read replica.
  • Amazon S3: Stores the static content of applications, logs, and Terraform state.
  • Amazon API Gateway: Integrated with AWS Lambda and mainly returns data from DynamoDB tables in response to incoming requests.
  • AWS Lambda: Imports data from DynamoDB tables for multiple audits.
  • Amazon SQS: Queues requests received from API Gateway so that they are handled asynchronously and responded to accordingly.
  • Amazon SNS: Sends email notifications to topic subscribers to alert them in case of warnings.
  • Amazon DynamoDB: Stores data in tables that Lambda functions then use to produce audit reports.
  • AWS IAM: Provides fine-grained control over which users can access the various AWS accounts and resources.
  • Amazon Kinesis: Serves to analyze data, audio, and video in real time.
  • AWS CloudFormation: Elastic Beanstalk and Control Tower are deployed via CloudFormation stacks.

Third Party Tools and Services

  • Azure DevOps: All the code repositories and CI/CD are managed by Azure DevOps until the code is built. It then triggers Elastic Beanstalk to deploy the code to the EC2 instances.
  • New Relic: The entire AWS infrastructure and all applications are monitored in real time with detailed analysis using New Relic and New Relic APM respectively.
  • nOps: Integrated with the AWS environment via an IAM role and helps to improve the architecture using best practices and the AWS Well-Architected Framework.

Initially, the code was deployed with Terraform, but later upgrades to the architecture were made manually.

Workflow

A multi-account setup was created by using AWS Control Tower, details of which are as follows:

  • 2 organizational units: one for the shared accounts and one for the individual accounts that users can provision.
  • 3 shared accounts: the management account and isolated accounts for log archive and security audit.
  • A native cloud directory with preconfigured groups and single sign-on access.
  • 20 preventive guardrails to enforce policies and 2 detective guardrails to detect configuration violations.

The Landing Zone was set up with a directory to manage user identities and single sign-on to provide the customer’s users with federated access across accounts. It offers preconfigured user groups and permission sets for the customer to easily manage specialized roles within their organization.

Applications for asset management and asset reporting run on EC2 instances. Each application sits behind its own load balancer and is auto scaled. All application data is stored in an RDS MySQL database that has a read replica in us-west-2. The applications are deployed to the EC2 instances through an Elastic Beanstalk environment, which is triggered by Azure DevOps, a third-party service. Azure DevOps manages all the code repositories and CI/CD until the code is built; it then triggers Elastic Beanstalk to deploy the code to the EC2 instances. The Elastic Beanstalk environment is stored in an S3 bucket along with its resources. A further 30 S3 buckets were created to store Terraform state, RDS backups, zipped code for Lambda functions, and exported .csv files of audits. All the buckets are private and encrypted.

The customer makes extensive use of AWS Lambda, integrated with API Gateway, DynamoDB, SQS, and SNS. The customer can reuse the majority of these Lambda functions across multiple audits. The Lambda functions are integrated with API Gateway to fetch data from, or post data to, the DynamoDB tables based on the incoming HTTP request. All requests are managed asynchronously by SQS, and in case of a failure or warning the customer is notified through an SNS topic.
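The fetch-or-post pattern described above can be sketched as a Lambda handler. This is an illustrative outline under stated assumptions, not the customer's code: it assumes the API Gateway REST proxy event format (with `httpMethod`, `queryStringParameters`, and `body` fields) and a hypothetical `audit_id` partition key. The table is injected so the sketch runs without AWS access; `InMemoryTable` is a stand-in that mimics the `get_item` and `put_item` methods of a boto3 DynamoDB Table resource. The SQS and SNS parts of the flow are not shown.

```python
import json


def _response(status, payload):
    # default=str avoids serialization errors on the Decimal values
    # that DynamoDB returns for numbers.
    return {"statusCode": status, "body": json.dumps(payload, default=str)}


def make_handler(table):
    """Build a Lambda handler bound to a DynamoDB-table-like object."""

    def handler(event, context=None):
        method = event.get("httpMethod")
        if method == "GET":
            params = event.get("queryStringParameters") or {}
            audit_id = params.get("audit_id")
            if not audit_id:
                return _response(400, {"error": "audit_id is required"})
            # get_item returns no "Item" key when the record does not exist.
            item = table.get_item(Key={"audit_id": audit_id}).get("Item")
            if item is None:
                return _response(404, {"error": "not found"})
            return _response(200, item)
        if method == "POST":
            item = json.loads(event.get("body") or "{}")
            if "audit_id" not in item:
                return _response(400, {"error": "audit_id is required"})
            table.put_item(Item=item)
            return _response(201, item)
        return _response(405, {"error": "method not allowed"})

    return handler


class InMemoryTable:
    """Stand-in for a DynamoDB table (hypothetical, for local runs only)."""

    def __init__(self):
        self._items = {}

    def get_item(self, Key):
        item = self._items.get(Key["audit_id"])
        return {"Item": item} if item is not None else {}

    def put_item(self, Item):
        self._items[Item["audit_id"]] = Item
        return {}
```

Injecting the table keeps the handler logic testable on its own; in a deployed function the same `make_handler` could be given a real table object instead of the in-memory stand-in.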

The complete infrastructure was also deployed in us-west-2 for DR. The DynamoDB tables have multi-Region replicas there, and the S3 buckets are replicated across Regions as well. For monitoring, the customer uses Amazon CloudWatch together with New Relic, a third-party tool, for deeper diagnosis. New Relic is integrated using CloudFormation, which creates a data firehose for real-time diagnosis. In addition, applications are monitored using APM and synthetics on New Relic.

Results and Benefits

Teaming up with Cloudelligent proved beneficial for the customer because Cloudelligent was well equipped to provide customized, well-architected solutions for their IT challenges. The collaboration yielded several benefits:

Enhanced Reliability and Scalability
With Cloudelligent’s help, the customer can automatically scale their applications and cloud operations to accommodate any sudden spike in usage and better serve their clients.

Cost Optimization
Cloudelligent reviewed the customer’s existing AWS resources and stopped or right-sized any resource with low utilization, which lowered costs by up to 20%.
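A utilization review of this kind can be sketched as a simple classification rule. The thresholds below (2% peak CPU for "stop", 20% average CPU for "right-size") and the `Utilization` and `recommend` names are hypothetical, chosen only to illustrate the idea; the case study does not state which metrics or cut-offs were actually used.

```python
from dataclasses import dataclass


@dataclass
class Utilization:
    resource_id: str
    avg_cpu_percent: float
    max_cpu_percent: float


def recommend(u: Utilization, idle_below: float = 2.0, low_below: float = 20.0) -> str:
    """Classify a resource as 'stop', 'right-size', or 'keep'."""
    # Effectively idle even at peak: candidate for stopping.
    if u.max_cpu_percent < idle_below:
        return "stop"
    # Consistently low average load: candidate for a smaller size.
    if u.avg_cpu_percent < low_below:
        return "right-size"
    return "keep"
```

A real review would likely consider memory, network, and storage metrics over a longer window as well; CPU alone is used here to keep the sketch short.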

Improved Operational Efficiency
Cloudelligent designed a new testing infrastructure and helped build automated test configurations. Serverless computing resources enabled the customer to automate time-consuming administrative tasks such as resource provisioning, patching, database setup, and backups.