Writing Custom Metrics to Stackdriver in Golang
Instrumentation is a critical part of any application. Along with system counters like CPU, heap, and free disk, it’s important to create application-level metrics so that health is measured closer to your customer’s experience.
Example metrics could be user-registration, password-change, profile-change, etc. A major spike or dip in one of these metrics can indicate a wider problem.
For this example a custom metric was needed, and no infrastructure (e.g. collectd) was in place for harvesting it. Golang is handy for creating an easy-to-install daemon which performs the measurement and periodically sends the data to Stackdriver.
The overall flow for custom metrics in Stackdriver goes like this:
- Create a custom metric with the Stackdriver Metric Descriptor API. This assigns a data type, dimensions, name and resource category to the metric.
- Push values to the metric, recording each value along with its resource and any additional dimensions.
In the following example, we wanted to measure the number of files being uploaded by a backup agent. The number of files was indicated by lines in a logfile, which we used as our indicator (see LineCounter and WatchFile).
By sending the line count to Stackdriver every 5 seconds, we were able to create a dashboard and alert monitoring backup activity.
1. Create the Metric Descriptor
For your first metric, it’s easiest to use the API Explorer, which will auto-complete the definition. As the project matures, call metricDescriptors.create via gcloud.
- Visit the API Explorer
- name: backup-count
- valueType: INT64
- Remove labels. We will assign a relevant resource when writing the time series (see writeTimeSeriesValue below).
2. Collect and Harvest Custom Metric with Golang
For this example, we’re trivially tracking the line count. It can be expanded to parse success vs error lines and send those as two metrics, e.g. backup-count vs error-backup-count.
The code below creates a 5-second timer which counts the file’s lines and sends that count to Stackdriver. Stackdriver reporting then lets you run statistics such as avg() and max() and see the backup-count growth over time.
package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
	"log"
	"os"
	"time"

	monitoring "cloud.google.com/go/monitoring/apiv3"
	timestamp "github.com/golang/protobuf/ptypes/timestamp"
	metricpb "google.golang.org/genproto/googleapis/api/metric"
	monitoredres "google.golang.org/genproto/googleapis/api/monitoredres"
	monitoringpb "google.golang.org/genproto/googleapis/monitoring/v3"
)

const metricType = "custom.googleapis.com/photos/backup-count"

const (
	projectID  = "YOUR-GCP-PROJECT-ID"
	instanceID = "INSTANCE-ID-FOR-TESTING"
	INTERVAL   = 5 // seconds between samples
	zone       = "us-east1-c"
)

// writeTimeSeriesValue writes a single int64 point for metricType to Stackdriver.
func writeTimeSeriesValue(projectID, metricType string, value int) error {
	ctx := context.Background()
	c, err := monitoring.NewMetricClient(ctx)
	if err != nil {
		return err
	}
	// Release the client's connection when this call returns.
	defer c.Close()
	now := &timestamp.Timestamp{
		Seconds: time.Now().Unix(),
	}
	req := &monitoringpb.CreateTimeSeriesRequest{
		Name: "projects/" + projectID,
		TimeSeries: []*monitoringpb.TimeSeries{{
			Metric: &metricpb.Metric{
				Type: metricType,
			},
			// The resource the measurement is attributed to.
			Resource: &monitoredres.MonitoredResource{
				Type: "gce_instance",
				Labels: map[string]string{
					"project_id":  projectID,
					"instance_id": instanceID,
					"zone":        zone,
				},
			},
			Points: []*monitoringpb.Point{{
				Interval: &monitoringpb.TimeInterval{
					StartTime: now,
					EndTime:   now,
				},
				Value: &monitoringpb.TypedValue{
					Value: &monitoringpb.TypedValue_Int64Value{
						Int64Value: int64(value),
					},
				},
			}},
		}},
	}
	log.Printf("writeTimeseriesRequest: %+v\n", req)
	if err := c.CreateTimeSeries(ctx, req); err != nil {
		return fmt.Errorf("could not write time series value: %v", err)
	}
	return nil
}

// LineCounter returns the number of newline characters read from r.
func LineCounter(r io.Reader) (int, error) {
	buf := make([]byte, 32*1024)
	count := 0
	lineSep := []byte{'\n'}
	for {
		c, err := r.Read(buf)
		count += bytes.Count(buf[:c], lineSep)
		switch {
		case err == io.EOF:
			return count, nil
		case err != nil:
			return count, err
		}
	}
}
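
// WatchFile counts the lines in filename once per interval and sends each
// count on the returned channel.
func WatchFile(interval time.Duration, filename string) <-chan int {
	handle, err := os.Open(filename)
	if err != nil {
		log.Fatalf("could not open %s: %v", filename, err)
	}
	ticker := time.NewTicker(interval)
	sizeChan := make(chan int)
	go func() {
		for range ticker.C {
			// Rewind so that every tick counts the whole file.
			if _, err := handle.Seek(0, io.SeekStart); err != nil {
				log.Printf("could not rewind %s: %v", filename, err)
				continue
			}
			lineCount, err := LineCounter(handle)
			if err != nil {
				log.Printf("could not count lines in %s: %v", filename, err)
				continue
			}
			sizeChan <- lineCount
		}
	}()
	return sizeChan
}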

func main() {
	if len(os.Args) < 2 {
		fmt.Printf("usage: %s FILENAME\n", os.Args[0])
		os.Exit(1)
	}
	sizeChan := WatchFile(INTERVAL*time.Second, os.Args[1])
	// Receive each count exactly once and forward it to Stackdriver.
	for curSize := range sizeChan {
		fmt.Printf("Lines in %s : %d\n", os.Args[1], curSize)
		if err := writeTimeSeriesValue(projectID, metricType, curSize); err != nil {
			log.Printf("write failed: %v", err)
		}
	}
}