Metrics that Matter for your Systems
Here are some of the metrics we've found most effective for optimizing a system:
How long does it take...
- for a new engineer to understand the system?
- for a new engineer to comfortably work on the system?
- to add capacity to support increased load?
- to restore production data?
- to securely off-board an engineer?
For each important part of the system, how long does it take to...
- be notified of the part's failure?
- replace the part?
- discover and apply a security patch?
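As a rough illustration of the "how long" questions above, here is a minimal sketch that computes notification delay from incident records. The record layout (the `failed_at` and `notified_at` keys) and the function names are assumptions made up for this example, not part of any particular monitoring tool.

```python
from datetime import datetime
from statistics import median

def notification_delays(incidents):
    """Return the delay in seconds between failure and notification.

    `incidents` is a list of dicts with hypothetical keys
    `failed_at` and `notified_at`, both datetime objects.
    """
    return [
        (incident["notified_at"] - incident["failed_at"]).total_seconds()
        for incident in incidents
    ]

def median_notification_delay(incidents):
    """Median delay in seconds; less sensitive to one bad incident than the mean."""
    return median(notification_delays(incidents))
```

The same shape works for time to replace a part or time to apply a patch: record two timestamps per event and track the difference over time.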
Of all the important parts of the system, what percentage are...
- regularly reviewed for security issues?
What is the percentage of...
- application uptime?
- system-related bugs?
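To make the uptime percentage concrete, here is a minimal sketch of one way to compute it over a reporting window. It assumes outages are recorded as non-overlapping (start, end) pairs; the function name and input format are hypothetical.

```python
from datetime import datetime, timedelta

def uptime_percentage(window_start, window_end, outages):
    """Return uptime as a percentage of the reporting window.

    `outages` is a list of (start, end) datetime pairs,
    assumed not to overlap one another.
    """
    window = (window_end - window_start).total_seconds()
    if window <= 0:
        raise ValueError("window_end must be after window_start")
    down = 0.0
    for start, end in outages:
        # Count only the part of each outage that falls inside the window.
        clipped_start = max(start, window_start)
        clipped_end = min(end, window_end)
        if clipped_end > clipped_start:
            down += (clipped_end - clipped_start).total_seconds()
    return 100.0 * (window - down) / window
```

For example, a single outage of 43 minutes and 12 seconds in a 30-day window works out to roughly 99.9% uptime.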
Why they're awesome
You know what's really great about using these metrics? They force us to do the right things in order to improve them.
Each metric is generally improved by one of these:
- more simplification
- more automation
- better documentation
- better monitoring
- better security
You'll notice that the metrics don't measure anything about how "cool" the technologies are or how well they allow the engineers to show off their super-genius skills. All we care about is optimizing your business value so that your company can meet its goals as fast as possible.
While your competitors mindlessly throw money and engineers at their systems (HUGE mistake), you can achieve much better results by thoughtfully Simplifying, Automating, Documenting, Monitoring, and Securing your systems.
Original: 08 Sep 2013