Writing Custom Metrics to Stackdriver in Golang

Posted on Wed 13 February 2019 in golang • Tagged with gcp, stackdriver, monitoring, tutorial

Instrumentation is a critical part of any application. Along with system counters like CPU, heap, and free disk, it's important to create application-level metrics so that health is measured closer to your customers' experience.

Example metrics could be user-registration, password-change, profile-change, etc. A major spike or dip in these metrics could indicate a wider problem.

For this example a custom metric was needed, and no infrastructure (e.g. collectd) was in place for harvesting it. Golang is handy for creating an easy-to-install daemon that performs the measurement and periodically writes the data to Stackdriver.

The …

Continue reading

Delegating Admin Credentials using IAM Roles and CloudWatch Alerts

Posted on Sat 12 December 2015 in aws • Tagged with aws, cloudwatch, alerts, monitoring

It's hard to strike the right balance with admin rights: either the rights are too strict and people can't get work done, or they're too lenient and you have security issues.

As a compromise, AWS provides the AssumeRole feature which lets admins temporarily escalate their role to perform a task.

When setting this up, it's important to alert the team whenever the role is assumed. Here we'll talk about how to set up the roles, give teams access to the roles, and create an alert system for when the roles are assumed.

Create The Temporary Admin Role

Use the IAM console to create …

Continue reading

On Software Scaffolding

Posted on Thu 09 July 2015 in aws • Tagged with monitoring, software

A new light rail line is being built in my city with bridges passing over the major boulevards. Seeing the elaborate scaffolding evoked comparisons to software engineering. What does scaffolding look like in software? Does software need to be erected like a bridge via scaffolding? Without a doubt: yes.

Here are some elements of software “scaffolding”:

  • Error log instrumentation with a formal error log schema (i.e. errors are uniquely identifiable in a MECE schema)
  • Operational instrumentation with reports, dashboards and alerts
  • Performance profiling on methods, database calls, REST calls, system calls and any blocking IO
  • Client-side performance instrumentation and sampling …
Continue reading