Open to new opportunities

Hi, I'm
Lê Minh Trung

Senior SRE Engineer & Infrastructure Leader

I build cloud infrastructure that holds under pressure: 99.99% uptime, zero-downtime deployments, and systems that scale to tens of millions of monthly users. I have spent 8+ years hardening infrastructure, from bare metal to fully managed AWS.

🏆
AWS Certified
Solutions Architect Pro
8+ Years
Cloud & SRE
8+
Years of Experience
99.99%
Uptime SLA Achieved
5k–10k+
Requests Per Second
30–40% ↓
AWS Cost Saved

Where I've Shipped Things

From petabyte-scale on-prem storage to multi-region AWS architectures serving real traffic.

Vietlink Ads
Sr. SRE Engineer — SRE Leader  ·  Team: 4–5 SREs
Feb 2023 — Present  ·  3 yrs
99.99% Uptime SLA
5k–10k+ RPS
30–40% AWS Cost ↓
30–40% Deploy Time ↓
System Architecture
  • Designed end-to-end AWS system architecture — multi-AZ, service mesh, network topology, VPC design
  • Architected infrastructure supporting tens of millions of monthly visits
Reliability Engineering
  • Managed EKS/ECS clusters with autoscaling; maintained 99.99% uptime SLA every quarter
  • Defined and owned SLOs / SLIs and error budget policies across critical services
  • Benchmarked & tuned services to sustain 5,000–10,000+ RPS; cut AWS spend 30–40%
IaC, CI/CD & DevSecOps
  • Full IaC via Terraform: EKS, ECS, EC2, RDS, SQS, Lambda, Step Functions, API Gateway
  • CI/CD with GitHub Actions, CodeBuild, CodeDeploy, CodePipeline & ArgoCD — integrated SAST (SonarQube) & DAST (OWASP ZAP) security gates; cut deployment time 30–40%
  • Embedded integration & load testing (Locust, k6) as pipeline gates before prod promotion
  • Optimized Docker images with multi-stage builds and lean base images; reduced build time 30–40%
  • Led migrations from monolith to microservices and from VMs to containerized workloads
Capacity Planning & Observability
  • SRE workload estimation, capacity planning, infra cost forecasting & budget alignment
  • Investigated & resolved production incidents end-to-end; authored runbooks and RCA reports
  • Observability: CloudWatch + Datadog + New Relic; on-call rotation with structured post-mortem culture
EKS · ECS · Terraform · GitHub Actions · CodeBuild · CodeDeploy · CodePipeline · ArgoCD · SonarQube · OWASP ZAP · Locust · k6 · Datadog · New Relic · CloudWatch
VNG Corporation
System / DevOps Engineer
2018 — Feb 2023  ·  5 yrs
300+ VMs managed
50+ Physical servers
PB-scale Ceph storage
On-Prem Cloud & Scale
  • Built & operated on-premises CloudStack cloud; designed network topology from scratch
  • Managed 50+ physical servers and 300+ VMs via Ansible & Terraform
  • Engineered Ceph distributed storage reaching petabyte scale
Container Migration & Orchestration
  • Deployed bare-metal Kubernetes clusters; migrated Docker workloads from 300+ VMs to K8s
Automation, CI/CD & Security
  • CI/CD pipelines with Jenkins & GitLab CI across multiple internal product teams
  • Monitoring with Prometheus + Grafana + ELK; applied ISO 27001 security standards
Kubernetes · CloudStack · Ceph · Ansible · Terraform · Jenkins · GitLab CI · Prometheus · ELK

Core Competencies

From bare-metal on-prem to multi-region AWS — covering architecture, reliability, security, and delivery at production scale.

  Cloud Infrastructure (AWS)
EKS · ECS · EC2 · S3 · RDS · Lambda · Route 53 · ALB/NLB · SQS/SNS · Step Functions · API Gateway · CloudWatch
  Container Orchestration
Kubernetes · Docker · Helm · ArgoCD · HPA / VPA · Multi-stage Build · CloudStack
  Infrastructure as Code
Terraform · CloudFormation · Ansible · Workload Estimation · Cost Forecasting · Capacity Planning
  CI/CD & DevSecOps
CodePipeline · GitHub Actions · GitLab CI · Jenkins · SonarQube (SAST) · OWASP ZAP (DAST) · Locust · k6 · ISO 27001
  Observability & APM
Prometheus · Grafana · Datadog · New Relic · CloudWatch · ELK Stack · Loki · APM
  Networking, Storage & Data
VPC / Subnets · SG / NACLs · Nginx · KrakenD · MySQL · PostgreSQL · MongoDB · Redis · Ceph

Certifications

AWS dual-certified and Anthropic-trained — cloud infrastructure from architecture to AI tooling.

🏆
AWS Certified Solutions Architect
Amazon Web Services  ·  Professional Level
Verified on Credly
☁️
AWS Certified Solutions Architect
Amazon Web Services  ·  Associate Level
Verified on Credly
🤖
Claude 101
Anthropic  ·  Certificate of Completion
Verified by Skilljar
Claude Code in Action
Anthropic  ·  Certificate of Completion
Verified by Skilljar

The Person Behind the Infra

I'm a Senior SRE Engineer and Infrastructure Leader based in Ho Chi Minh City, Vietnam. Over 8 years in IT — 4 building on-premises infrastructure from scratch, and 5 deep in AWS — I've developed a focus on systems that are genuinely reliable, not just SLA-compliant on paper.

At Vietlink Ads, I lead the SRE team and own the AWS infrastructure serving tens of millions of users monthly. Before that, at VNG Corporation — Vietnam's largest internet company — I worked on everything from bare-metal Kubernetes to petabyte-scale Ceph clusters.

My philosophy: infrastructure should be invisible. Fast, resilient, and cost-efficient — so engineering teams can ship features without worrying about the floor beneath them.

I've operated across diverse engagement models — in-house product teams, outsourced projects, and hybrid setups — and I'm equally comfortable in Agile sprints or Waterfall delivery cycles.

English: Professional working proficiency — technical documentation, cross-functional communication, and async collaboration with international teams.

🎓  VNUHCM — University of Science
Aug 2013 – Jun 2017
Bachelor's Degree in Physics — Informatics
Thesis: Installation and Deployment of Ceph Distributed Storage System  ·  Grade 9/10
🏆  Key Achievements
  • Maintained 99.99% uptime SLA — every single quarter
  • Cut AWS infrastructure costs by 30–40% while improving throughput
  • Reduced deployment time 30–40% with automated CI/CD
  • Led full migration: monolith → microservices, VMs → containers
  • Benchmarked systems to sustain 5,000–10,000+ RPS under load

Let's build something
rock-solid together.

Open to Senior SRE, Platform Engineering, and Cloud Architecture roles. I bring 8+ years of battle-tested experience in infrastructure that scales, saves money, and simply stays up.

Send me an email