Conversation
This removes tests for valid metrics from the TestInvalidMetricsLexer test. It also codifies handling of "foo:" and ":foo" for tags as valid tags. This may not be the design we want, but it's the code we have, so now we enforce it with tests.
MovieStoreGuy
left a comment
Just some questions. I would like to enforce compile-time interface checks where possible.
| Prometheus Remote Writer Backend | ||
| -------------------------------- | ||
| The `promremotewriter` backend supports sending metrics to a Prometheus Remote Writer compatible backend. It currently | ||
| drops events. At present there is no authentication supported. |
What should be expected with dropped events?
Prometheus doesn't do events; they need to be managed by something external (e.g. Grafana). So that will probably end up as a grafana-events backend which drops metrics, or something similar. Haven't really thought about it too hard.
| 29.0.2 | ||
| ------ | ||
| - New Relic backend handles infinities and NaN better |
Is this change part of this change set?
No, it was an "unreleased" change in 29.0.2, this just documents when the behavior went in.
| @@ -0,0 +1,229 @@ | |||
| // Code generated by protoc-gen-go. DO NOT EDIT. | |||
Is it okay to check in the auto generated code?
Yep, we do it already with the other PB stuff. It simplifies the need to deal with protoc.
| @@ -0,0 +1,29 @@ | |||
| syntax = "proto3"; | |||
Is there a reference that this proto file was built from?
Yeah, it came from the Prometheus remote write spec. I can add a ref to it.
- this
Yes please, just in case we need to chase it up.
| ) | ||
| // flush represents a send operation. | ||
| type flush struct { |
Does this struct implement an interface?
| } | ||
| func (f *flush) maybeFlush() { | ||
| if uint(len(f.writeRequest.Timeseries))+20 >= f.metricsPerBatch { // flush before it reaches max size and grows the slice |
It's inherited from datadog, but it's meant to be the maximum cardinality of a single metric. That is, a timer is going to have like 10 metrics it emits. 20 just gives it a buffer.
The intent is len(...) < metricsPerBatch, rather than len(...) > metricsPerBatch+epsilon, because if there's a limit on the backend, then it might be a hard limit, so going 1 over is a hard fail.
Could I ask you to make the number a constant within the file, so the code is easier to understand in future?
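For illustration, the requested change could look roughly like the sketch below. The constant name `maxMetricFanOut` and the simplified `batch` type are hypothetical stand-ins, not the names used in the PR; only the `+20` headroom and the `>=` comparison come from the diff above.

```go
package main

import "fmt"

// maxMetricFanOut is a hypothetical name for the magic number 20: headroom for
// the most time series a single metric is expected to emit (a timer emits
// around 10), so a batch is flushed before it can exceed the limit.
const maxMetricFanOut = 20

// batch is a simplified stand-in for the flush struct in the diff.
type batch struct {
	timeseries      []string
	metricsPerBatch uint
	flushed         int
}

// shouldFlush reports whether adding one more metric's worth of series
// could push the batch past metricsPerBatch.
func (b *batch) shouldFlush() bool {
	return uint(len(b.timeseries))+maxMetricFanOut >= b.metricsPerBatch
}

// maybeFlush "sends" the batch (here it only counts) and resets the slice,
// keeping its capacity so it does not need to grow again.
func (b *batch) maybeFlush() {
	if b.shouldFlush() {
		b.flushed++
		b.timeseries = b.timeseries[:0]
	}
}

func main() {
	b := &batch{metricsPerBatch: 100}
	for i := 0; i < 200; i++ {
		b.timeseries = append(b.timeseries, fmt.Sprintf("metric.%d", i))
		b.maybeFlush()
	}
	fmt.Println("flushes:", b.flushed)
}
```

Naming the headroom keeps the "stay strictly under a possibly hard backend limit" intent next to the number itself.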
| const ( | ||
| // BackendName is the name of this backend. | ||
| BackendName = "promremotewriter" | ||
| defaultUserAgent = "gostatsd" // TODO: Add version |
Worth raising an issue for?
Nah, pretty sure it's a theme.
| ) | ||
| // Client represents a Prometheus Remote Writer client. | ||
| type Client struct { |
Is there an interface that this implements?
Yup, gostatsd.Backend. It's not formally checked, but the code won't compile if Client doesn't implement it (NewClientFromViper would no longer satisfy its return type).
Not worth doing the compile-time interface check within the file?
We do, NewClientFromViper will fail if it doesn't match.
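To make the two positions in this thread concrete, here is a minimal sketch of both styles of check. `Backend` is a trimmed-down, hypothetical stand-in for gostatsd.Backend (the real interface has more methods), and `NewClient` stands in for NewClientFromViper.

```go
package main

import "fmt"

// Backend is a hypothetical, reduced stand-in for gostatsd.Backend.
type Backend interface {
	Name() string
}

// Client is a stand-in for the Prometheus Remote Writer client.
type Client struct{}

// Name returns the backend name.
func (c *Client) Name() string { return "promremotewriter" }

// Explicit compile-time check: this line stops compiling as soon as
// *Client no longer satisfies Backend, and it documents the intent.
var _ Backend = (*Client)(nil)

// Implicit check: returning *Client where a Backend is declared also fails
// to compile if the interface is not satisfied, which is the argument made
// for NewClientFromViper in the thread.
func NewClient() Backend {
	return &Client{}
}

func main() {
	fmt.Println(NewClient().Name())
}
```

Both forms are enforced by the compiler; the explicit `var _` line mainly helps readers find which interface a type is meant to implement.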
| now := clock.FromContext(ctx).Now().UnixNano() / 1_000_000 | ||
| prw.processMetrics(now, metrics, func(writeReq *pb.PromWriteRequest) { | ||
| atomic.AddUint64(&prw.batchesCreated, 1) | ||
| go func() { |
Any concern if this is called multiple times?
I don't love it, but SendMetricsAsync needs to complete fast: if it blocks, everything blocks. MetricFlusher.flushData ensures that data keeps being received, and enforces that we won't initiate a second flush before the first one completes.
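A rough sketch of the pattern described here, under the assumption that each batch is posted in its own goroutine and a callback fires once all of them are done. `sendAsync`, `post`, and `errRejected` are hypothetical names for illustration; this is not the code from the PR.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errRejected is a hypothetical error used by the demo post function.
var errRejected = errors.New("batch rejected")

// sendAsync starts one goroutine per batch and returns immediately, so the
// caller is never blocked on the network. done is called exactly once, after
// every batch has been posted, with the errors that occurred.
func sendAsync(batches []string, post func(string) error, done func([]error)) {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		errs []error
	)
	wg.Add(len(batches))
	for _, b := range batches {
		b := b // capture the loop variable for the goroutine
		go func() {
			defer wg.Done()
			if err := post(b); err != nil {
				mu.Lock()
				errs = append(errs, err)
				mu.Unlock()
			}
		}()
	}
	// Wait in the background so sendAsync itself stays non-blocking.
	go func() {
		wg.Wait()
		done(errs)
	}()
}

func main() {
	finished := make(chan []error, 1)
	post := func(batch string) error {
		if batch == "bad" {
			return errRejected
		}
		return nil
	}
	sendAsync([]string{"a", "bad", "c"}, post, func(errs []error) { finished <- errs })
	fmt.Println("sendAsync returned; waiting for batches")
	fmt.Println("failed batches:", len(<-finished))
}
```

In this sketch the caller decides when the next flush may start by waiting for `done`, which mirrors the role attributed to MetricFlusher.flushData above.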
| bufCompressed := snappy.Encode(bufBorrowed, raw) | ||
| if cap(bufBorrowed) < cap(bufCompressed) { | ||
| // it overflowed the buffer and had to allocate, so we'll keep the larger buffer for next time | ||
| *buffer = bytes.NewBuffer(bufCompressed) |
Is it worth looking at https://pkg.go.dev/sync#Pool for this functionality instead of doing the memory management for Go?
snappy creates or grows the byte array if needed; the best we can do is detect whether it needed to grow, and re-use the result. The bytes.Buffer is used in the semaphore, so we'll reach a local maximum and then stop allocating more data. There's not a large number of byte slices allocated.
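The buffer re-use described in this reply can be sketched as follows, using only the standard library. The `encode` function is a hypothetical stand-in that mimics how snappy.Encode re-uses `dst` when it is large enough and allocates otherwise (it copies instead of compressing), and `bufferPool` is an assumed name for a channel of buffers acting as the semaphore.

```go
package main

import (
	"bytes"
	"fmt"
)

// encode is a stand-in for a compressor: it writes into dst when dst has
// enough capacity and allocates a new slice otherwise.
func encode(dst, src []byte) []byte {
	if cap(dst) < len(src) {
		dst = make([]byte, len(src))
	}
	dst = dst[:len(src)]
	copy(dst, src)
	return dst
}

// bufferPool doubles as a semaphore: its capacity bounds both the number of
// concurrent compressions and the number of buffers ever kept alive.
type bufferPool chan *bytes.Buffer

func newBufferPool(n int) bufferPool {
	p := make(bufferPool, n)
	for i := 0; i < n; i++ {
		p <- &bytes.Buffer{}
	}
	return p
}

// compress borrows a buffer, encodes raw into it, and hands back the larger
// backing array if the encoder had to allocate. It returns the encoded size.
func (p bufferPool) compress(raw []byte) int {
	buffer := <-p // acquire; blocks while all buffers are in use
	borrowed := buffer.Bytes()
	borrowed = borrowed[:cap(borrowed)]
	compressed := encode(borrowed, raw)
	if cap(borrowed) < cap(compressed) {
		// The borrowed buffer was too small, so keep the bigger one for next time.
		buffer = bytes.NewBuffer(compressed)
	}
	p <- buffer // release
	return len(compressed)
}

func main() {
	p := newBufferPool(2)
	fmt.Println(p.compress(make([]byte, 64)))
	fmt.Println(p.compress(make([]byte, 8)))
}
```

Compared with sync.Pool, a fixed-size channel never drops buffers under GC pressure and also caps concurrency, at the cost of blocking when every buffer is in use.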
Once again we need to ignore coveralls, I think it's just sad because there's new code without sufficient coverage to actively increase the %.
Oh hey I never got this merged. I'll clean it up later.
Soz
See #388 (comment)
Probably easiest to just look at 6ca4ea3, it's based off the Datadog backend, with a lot of reused stuff.