Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sources should report stream latency of stuck events #104

Merged
merged 2 commits into from
Jan 9, 2025

Conversation

istreeter
Copy link
Contributor

Currently in common-streams apps, the application (i.e. not this library) is responsible for tracking latency of events consumed from the stream. Because of the way it is implemented in the apps, if an event gets stuck (e.g. cannot be written to destination) then the latency drops to zero for the period when the app is retrying the stuck event.

This commit aims to fix the problem where the latency_millis metric wrongly drops to zero.

  1. The source itself now tracks the latency of the event which is currently being processed by the downstream application.
  2. The Metrics class allows metric starting values to be wrapped in and Effect, i.e. F[Metrics.State]. This means the application can pass in the latency reported by the Source.

istreeter added a commit to snowplow-incubator/snowplow-lake-loader that referenced this pull request Jan 3, 2025
Common streams 0.10.0 brings significant to changes to the Kinesis and
Pubsub sources:

- PubSub source completely re-written to be a wrapper around UnaryPull
  snowplow-incubator/common-streams#101
- Kinesis source is more efficient when the stream is re-sharded
  snowplow-incubator/common-streams#102
- Kinesis source better tuned for larger deployments
  snowplow-incubator/common-streams#99

And improvements to latency metrics:
- Sources should report stream latency of stuck events
  snowplow-incubator/common-streams#104
istreeter added a commit to snowplow-incubator/snowplow-bigquery-loader that referenced this pull request Jan 3, 2025
Common streams 0.10.0 brings significant to changes to the Kinesis and
Pubsub sources:

- PubSub source completely re-written to be a wrapper around UnaryPull
  snowplow-incubator/common-streams#101
- Kinesis source is more efficient when the stream is re-sharded
  snowplow-incubator/common-streams#102
- Kinesis source better tuned for larger deployments
  snowplow-incubator/common-streams#99

And improvements to latency metrics:
- Sources should report stream latency of stuck events
  snowplow-incubator/common-streams#104
Currently in common-streams apps, the _application_ (i.e. not this
library) is responsible for tracking latency of events consumed from the
stream.  Because of the way it is implemented in the apps, if an event
gets stuck (e.g. cannot be written to destination) then the latency
drops to zero for the period when the app is retrying the stuck event.

This commit aims to fix the problem where the `latency_millis` metric
wrongly drops to zero.

1. The source itself now tracks the latency of the event which is
   currently being processed by the downstream application.
2. The Metrics class allows metric starting values to be wrapped in and
   Effect, i.e. `F[Metrics.State]`. This means the application can pass
   in the latency reported by the Source.
The `SourceAndAck` is now responsible for pushing the latency metric.  This
consolidates the metric across multiple apps into one place.

The `TokenedEvents` now does not contain a stream timestamp.  The downstream
app has no business knowing the stream timestamp, now that latency is
calculated inside common-streams lib.

The `LowLevelSource` no longer provides a `lastLiveness`.  Instead, the low
level source is expected to periodically emit a `None` whenever it is
healthy. The higher level `SourceAndAck` then takes responsibility for
monitoring whether the low level source is healthy.

As a result of these changes, the `SourceAndAck` emits a latency metric
of zero whenever the source is genuinely healthy but there are no events
to be pulled. This means the app's latency metric is only zero if there
are no events.  This is better than the previous situation, where the
app might mis-leadingly emit a zero latency when the app is stuck, e.g.
cannot connect to warehouse.
@istreeter istreeter merged commit 78da1d5 into develop Jan 9, 2025
1 check passed
@istreeter istreeter deleted the fix-latency-metric branch January 9, 2025 17:46
istreeter added a commit to snowplow-incubator/snowplow-lake-loader that referenced this pull request Jan 10, 2025
Common streams 0.10.0 brings significant to changes to the Kinesis and
Pubsub sources:

- PubSub source completely re-written to be a wrapper around UnaryPull
  snowplow-incubator/common-streams#101
- Kinesis source is more efficient when the stream is re-sharded
  snowplow-incubator/common-streams#102
- Kinesis source better tuned for larger deployments
  snowplow-incubator/common-streams#99

And improvements to latency metrics:
- Sources should report stream latency of stuck events
  snowplow-incubator/common-streams#104
istreeter added a commit to snowplow-incubator/snowplow-bigquery-loader that referenced this pull request Jan 14, 2025
Common streams 0.10.0 brings significant to changes to the Kinesis and
Pubsub sources:

- PubSub source completely re-written to be a wrapper around UnaryPull
  snowplow-incubator/common-streams#101
- Kinesis source is more efficient when the stream is re-sharded
  snowplow-incubator/common-streams#102
- Kinesis source better tuned for larger deployments
  snowplow-incubator/common-streams#99

And improvements to latency metrics:
- Sources should report stream latency of stuck events
  snowplow-incubator/common-streams#104
istreeter added a commit to snowplow-incubator/snowplow-bigquery-loader that referenced this pull request Jan 14, 2025
Common streams 0.10.0 brings significant to changes to the Kinesis and
Pubsub sources:

- PubSub source completely re-written to be a wrapper around UnaryPull
  snowplow-incubator/common-streams#101
- Kinesis source is more efficient when the stream is re-sharded
  snowplow-incubator/common-streams#102
- Kinesis source better tuned for larger deployments
  snowplow-incubator/common-streams#99

And improvements to latency metrics:
- Sources should report stream latency of stuck events
  snowplow-incubator/common-streams#104
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants