April 4, 2026 · Tim Fraser, Cloud Operations Lead
A Simple AWS Monitoring Setup for Agency-Hosted Client Sites
Enterprise monitoring stacks — Datadog, New Relic, Prometheus with Grafana — are powerful. They're also expensive, complex to configure, and overkill for an agency running 10 client WordPress sites on AWS.
Agencies don't need 200 metrics per server. They need to know when something needs attention. Here's a minimum viable monitoring setup using tools already in your AWS account.
Uptime checks
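Route 53 health checks handle this natively. Create an HTTP or HTTPS check for each client site with a 30-second interval and a failure threshold of 3. Then create a CloudWatch alarm on the check's HealthCheckStatus metric and point it at an SNS topic that emails your team. Route 53 publishes health check metrics only in us-east-1, so the alarm has to be created in that region. Cost: $0.50-$0.75/month per endpoint for a basic check; optional features such as HTTPS are billed on top of that.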
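As a rough sketch of the setup above, the following Python builds the request parameters for boto3's route53.create_health_check and cloudwatch.put_metric_alarm. The function names, the alarm naming scheme, and the 60-second alarm period are illustrative assumptions, not part of any AWS convention. The boto3 calls are wrapped in a function that is not invoked here, since they need credentials and create billable resources.

```python
import uuid


def health_check_request(domain: str, path: str = "/", https: bool = True) -> dict:
    """Keyword arguments for route53.create_health_check (30 s interval, 3 failures)."""
    config = {
        "Type": "HTTPS" if https else "HTTP",
        "FullyQualifiedDomainName": domain,
        "Port": 443 if https else 80,
        "ResourcePath": path,
        "RequestInterval": 30,  # seconds between checks
        "FailureThreshold": 3,  # consecutive failures before the check is unhealthy
    }
    if https:
        config["EnableSNI"] = True
    return {"CallerReference": str(uuid.uuid4()), "HealthCheckConfig": config}


def uptime_alarm(site: str, health_check_id: str, sns_topic_arn: str) -> dict:
    """Keyword arguments for cloudwatch.put_metric_alarm on HealthCheckStatus."""
    return {
        "AlarmName": f"{site}-uptime",  # hypothetical naming scheme
        "Namespace": "AWS/Route53",
        "MetricName": "HealthCheckStatus",  # 1 = healthy, 0 = unhealthy
        "Dimensions": [{"Name": "HealthCheckId", "Value": health_check_id}],
        "Statistic": "Minimum",
        "Period": 60,  # assumed evaluation window
        "EvaluationPeriods": 1,
        "Threshold": 1.0,
        "ComparisonOperator": "LessThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


def create_uptime_monitor(site: str, domain: str, sns_topic_arn: str) -> str:
    """Create the check and its alarm; returns the health check ID."""
    import boto3  # third-party; requires AWS credentials

    route53 = boto3.client("route53")
    check = route53.create_health_check(**health_check_request(domain))
    check_id = check["HealthCheck"]["Id"]
    # Route 53 health check metrics live in us-east-1 only.
    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
    cloudwatch.put_metric_alarm(**uptime_alarm(site, check_id, sns_topic_arn))
    return check_id
```

Calling create_uptime_monitor once per client site would give each site its own check and alarm, all feeding the same SNS topic.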
For sites behind CloudFront, also monitor 5xxErrorRate on the distribution — this catches backend failures that CloudFront might mask by serving cached content.
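A minimal sketch of the CloudFront alarm, again as parameters for boto3's cloudwatch.put_metric_alarm. The 5% threshold, the two 5-minute evaluation periods, and the function name are assumptions chosen for illustration; the source text does not prescribe them. CloudFront metrics are published in us-east-1 with a Region dimension of "Global".

```python
def cloudfront_5xx_alarm(
    site: str,
    distribution_id: str,
    sns_topic_arn: str,
    threshold_percent: float = 5.0,  # assumed threshold; tune per site
) -> dict:
    """Keyword arguments for cloudwatch.put_metric_alarm on 5xxErrorRate."""
    return {
        "AlarmName": f"{site}-cloudfront-5xx",  # hypothetical naming scheme
        "Namespace": "AWS/CloudFront",
        "MetricName": "5xxErrorRate",  # percentage of requests returning 5xx
        "Dimensions": [
            {"Name": "DistributionId", "Value": distribution_id},
            {"Name": "Region", "Value": "Global"},
        ],
        "Statistic": "Average",
        "Period": 300,
        "EvaluationPeriods": 2,
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        # A quiet distribution may report no datapoints; do not treat that as an outage.
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }


# Usage (requires boto3 and credentials):
#   boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(
#       **cloudfront_5xx_alarm("acme", "E1234567890ABC", topic_arn))
```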
SSL certificate expiry
ACM certificates auto-renew, but auto-renewal depends on DNS validation records existing. When a client migrates DNS or someone "cleans up" old records, the validation CNAME disappears and renewal silently fails.
Set a CloudWatch alarm on the DaysToExpiry metric that fires when a certificate drops below 30 days. ACM publishes the metric per certificate, so each certificate needs its own alarm. An expired certificate doesn't just show a browser warning: modern browsers effectively block access. For clients, the experience is "my site is down."
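Because DaysToExpiry carries a CertificateArn dimension, one way to cover every certificate is to generate an alarm definition per ARN. The sketch below does that for boto3's cloudwatch.put_metric_alarm; the function name, alarm naming scheme, and one-day period are assumptions for illustration.

```python
def certificate_expiry_alarms(
    certificate_arns: list[str],
    sns_topic_arn: str,
    min_days: int = 30,
) -> list[dict]:
    """One put_metric_alarm parameter set per ACM certificate."""
    alarms = []
    for arn in certificate_arns:
        cert_id = arn.rsplit("/", 1)[-1]  # trailing ID of the certificate ARN
        alarms.append(
            {
                "AlarmName": f"acm-expiry-{cert_id}",  # hypothetical naming scheme
                "Namespace": "AWS/CertificateManager",
                "MetricName": "DaysToExpiry",
                "Dimensions": [{"Name": "CertificateArn", "Value": arn}],
                "Statistic": "Minimum",
                "Period": 86400,  # one day; assumed window
                "EvaluationPeriods": 1,
                "Threshold": float(min_days),
                "ComparisonOperator": "LessThanThreshold",
                "AlarmActions": [sns_topic_arn],
            }
        )
    return alarms


# Usage (requires boto3 and credentials; create the alarms in the
# region where each certificate lives):
#   cw = boto3.client("cloudwatch")
#   for params in certificate_expiry_alarms(arns, topic_arn):
#       cw.put_metric_alarm(**params)
```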
Disk usage
Disks fill up with log files, uploads, database binary logs, and package caches. Without proper log rotation, a 20GB root volume on a t3.small can fill within 6-18 months.
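For EC2, install the CloudWatch agent and publish disk_used_percent. Alarm at 80%. For RDS, monitor FreeStorageSpace: an RDS instance that runs out of storage stops accepting writes, which looks like an application crash. FreeStorageSpace is reported in bytes, not as a percentage, so set the threshold at 20% of the instance's allocated storage (about 20 GiB free on a 100 GiB instance).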
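The two disk alarms differ in units: disk_used_percent is a percentage, while FreeStorageSpace is a byte count, so "20% free" has to be converted from the instance's allocated storage. The sketch below shows that conversion and both alarm definitions for boto3's cloudwatch.put_metric_alarm. Function names and alarm names are illustrative. The EC2 dimensions are passed in by the caller because they must match exactly what your CloudWatch agent configuration publishes.

```python
GIB = 1024**3


def rds_free_storage_threshold_bytes(allocated_gib: int, free_percent: int = 20) -> int:
    """Bytes of free space that correspond to `free_percent` of allocated storage."""
    return allocated_gib * GIB * free_percent // 100


def rds_storage_alarm(db_instance_id: str, allocated_gib: int, sns_topic_arn: str) -> dict:
    """Alarm when free storage falls to 20% of the allocated size."""
    return {
        "AlarmName": f"{db_instance_id}-low-storage",  # hypothetical naming scheme
        "Namespace": "AWS/RDS",
        "MetricName": "FreeStorageSpace",  # reported in bytes
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "Statistic": "Minimum",
        "Period": 300,
        "EvaluationPeriods": 1,
        "Threshold": float(rds_free_storage_threshold_bytes(allocated_gib)),
        "ComparisonOperator": "LessThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],
    }


def ec2_disk_alarm(name: str, dimensions: list[dict], sns_topic_arn: str) -> dict:
    """Alarm when the CloudWatch agent reports the disk at 80% used or more."""
    return {
        "AlarmName": f"{name}-disk-used",
        "Namespace": "CWAgent",  # the agent's default namespace
        "MetricName": "disk_used_percent",
        "Dimensions": dimensions,  # must match the agent's published dimensions
        "Statistic": "Maximum",
        "Period": 300,
        "EvaluationPeriods": 1,
        "Threshold": 80.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],
    }
```

For a 100 GiB instance, rds_free_storage_threshold_bytes(100) gives 21474836480 bytes, which is 20 GiB.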
Error rates
Monitor HTTP 5xx errors at the load balancer. ALBs publish HTTPCode_Target_5XX_Count to CloudWatch automatically. Set an alarm for more than 10 errors in a 5-minute period, adjusted for traffic volume.
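A sketch of the ALB alarm described above, as parameters for boto3's cloudwatch.put_metric_alarm. The LoadBalancer dimension takes the trailing part of the load balancer ARN, which the helper extracts. Function names and the alarm name are illustrative, and treating missing data as not breaching is a design choice made here, on the assumption that the count metric has no datapoints in periods with no errors.

```python
def alb_dimension_from_arn(load_balancer_arn: str) -> str:
    """Return the 'app/<name>/<id>' suffix used as the LoadBalancer dimension."""
    return load_balancer_arn.split(":loadbalancer/", 1)[1]


def alb_5xx_alarm(
    site: str,
    load_balancer_arn: str,
    sns_topic_arn: str,
    max_errors: int = 10,  # raise for high-traffic sites
) -> dict:
    """Alarm on more than `max_errors` target 5xx responses in 5 minutes."""
    return {
        "AlarmName": f"{site}-alb-5xx",  # hypothetical naming scheme
        "Namespace": "AWS/ApplicationELB",
        "MetricName": "HTTPCode_Target_5XX_Count",
        "Dimensions": [
            {"Name": "LoadBalancer", "Value": alb_dimension_from_arn(load_balancer_arn)}
        ],
        "Statistic": "Sum",
        "Period": 300,  # 5-minute window
        "EvaluationPeriods": 1,
        "Threshold": float(max_errors),
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",  # no datapoints means no errors
        "AlarmActions": [sns_topic_arn],
    }


# Usage (requires boto3 and credentials):
#   boto3.client("cloudwatch").put_metric_alarm(
#       **alb_5xx_alarm("acme", alb_arn, topic_arn, max_errors=25))
```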
A sudden spike usually means a deployment problem or database connection issue. A gradual increase suggests a memory leak or growing data the application handles poorly.
Don't alarm on 4xx errors — bots probing for vulnerabilities generate constant 404s. Alarm fatigue defeats the purpose.
Cost alerts
Set up an AWS Budget per client at expected monthly cost plus 20% headroom, and configure email notifications at 80%, 100%, and 120% of the budgeted amount. At $0.02/day per budget, even 20 clients cost under $15/month.
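The budget arithmetic and notification thresholds can be sketched as parameters for boto3's budgets.create_budget. The function name and budget naming scheme are assumptions, and the sketch leaves out how the budget is scoped to one client (for example by cost allocation tag or linked account), since that depends on how your accounts are organised.

```python
from decimal import Decimal


def client_budget_request(
    account_id: str,
    client: str,
    expected_monthly_usd: str,
    email: str,
) -> dict:
    """Keyword arguments for budgets.create_budget: expected cost plus 20% headroom."""
    limit = (Decimal(expected_monthly_usd) * Decimal("1.20")).quantize(Decimal("0.01"))
    notifications = [
        {
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": float(percent),  # percent of the budget limit
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [{"SubscriptionType": "EMAIL", "Address": email}],
        }
        for percent in (80, 100, 120)
    ]
    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": f"{client}-monthly",  # hypothetical naming scheme
            "BudgetLimit": {"Amount": str(limit), "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": notifications,
    }


# Usage (requires boto3 and credentials):
#   boto3.client("budgets").create_budget(
#       **client_budget_request("123456789012", "acme", "150", "ops@example.com"))
```

With an expected cost of $150, the budget limit comes out at $180.00, so the three notifications fire at $144, $180, and $216 of actual spend.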
This catches: instances left oversized after troubleshooting, traffic spikes generating data transfer charges, storage crossing a pricing tier, and test resources never cleaned up.
What not to do
Don't build dashboards. Dashboards only work if someone checks them daily. Alerts that arrive in your inbox don't need you to remember anything.
Don't collect every metric. CloudWatch detailed monitoring (1-minute intervals) costs extra and provides no benefit when the question is "is it working?" rather than "what's the p99 latency?"
Don't add third-party tools. CloudWatch alarms plus Route 53 health checks cover the essentials. Another tool means another subscription, another login, another thing to maintain.
How plainfra replaces the manual check
The setup above catches acute problems — things broken right now. It doesn't catch slow-building issues: disk at 70% and growing, certificates at 45 days and not yet renewed, an instance oversized for six months.
plainfra fills this gap with a weekly review across all connected accounts. It checks uptime, certificates, disk trends, error rates, costs, plus security groups, backup status, and unused resources. The output is a prioritised report: urgent items at the top, informational at the bottom.
For an agency managing multiple client sites, this is 5 minutes of reading per week versus an hour of clicking through consoles per client per month. The monitoring stack handles emergencies. plainfra handles everything else.