groupBy not grouping by a tag created with eval #1765
Comments
It seems that the exact feature was added in #731. @nathanielc does the usage look correct to you?
@petetnt Your script and usage look fine. The groupBy node has to buffer some data in order to ensure it recreates the groups correctly. Could that be the issue? Does it still happen if you simplify the script and just have an
Seems that you might be right @nathanielc, simplified the script to the following and the issue still persists:
var out_temp = batch
|query('SELECT mean(temperature) FROM xxx')
.every(10s)
.period(120d)
.groupBy(time(1h))
.fill(0)
|httpOut('1')
|eval(lambda: string(floor("mean")))
.as('bucket')
.tags('bucket')
.keep('mean')
|httpOut('2')
|groupBy('bucket')
|httpOut('3')
Endpoints 1 and 2 give expected outputs, 3 is again Also ruled out regressions by trying with Kapacitor 1.3.4 and 1.2.0, which both had the same behaviour (except 1.2.0 returned Not quite sure what to try next 🤔
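For reference, the grouping this script is expected to produce can be sketched outside Kapacitor. This is only a rough Python model of `eval(lambda: string(floor("mean")))` followed by `groupBy('bucket')`, not Kapacitor code: the function names are hypothetical, and the sample points are the distinct values from the output posted later in this thread.

```python
import math
from collections import defaultdict


def bucket_tag(mean):
    """Model string(floor("mean")): floor the value and render it as a string tag."""
    return str(math.floor(mean))


def group_by_bucket(points):
    """Group (time, mean) points by their derived 'bucket' tag."""
    groups = defaultdict(list)
    for time, mean in points:
        groups[bucket_tag(mean)].append((time, mean))
    return dict(groups)


# Distinct hourly means taken from the thread's sample output.
points = [
    ("2017-09-21T10:00:00Z", 0),
    ("2017-09-21T11:00:00Z", 14),
    ("2017-09-21T12:00:00Z", 14.1),
    ("2017-09-21T13:00:00Z", 13.8),
    ("2017-09-21T14:00:00Z", 14),
]
grouped = group_by_bucket(points)
```

Under this model, endpoint 3 would be expected to return three series (buckets `0`, `13` and `14`), each point appearing once, rather than an empty result.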
Can you change out the
Here you go, it seems that
var out_temp = batch
|query('SELECT mean(temperature) FROM "xxxxx" LIMIT 5')
.every(60s)
.period(120d)
.groupBy(time(1h))
.fill(0)
|log()
.prefix('ghissue1')
|eval(lambda: string(floor("mean")))
.as('bucket')
.tags('bucket')
.keep('mean')
|log()
.prefix('ghissue2')
|groupBy('bucket')
|log()
.prefix('ghissue3')
@petetnt Can you let the script run for a few minutes so that ghissue1 and 2 each have a few batches of data being processed? My guess at this point is that the groupBy is just buffering the data more than expected. After the run can you share the logs and the
Sure thing. We thought about the buffering thing too, since a couple of times the data seemed to show up after coming back from a coffee break or two, only never to be seen again. I let it run for 20 minutes just now; here are the results:
The logs showed nothing new, basically this repeated for 15 minutes:
I'll let it run in the background with Thanks a lot for your help, really appreciated!
@petetnt Thanks for all these details. At this point it definitely looks like a bug. I should have enough to go on. I'll take a look at the code in a bit.
Awesome! Do ping back if you need any additional details or need something to be tested, much appreciated!
Okay, eventually the
No idea how long the script had been running before it appeared, though. Some hours maybe 🤔 (Should have written the logs to a file 🗄)
Possibly related to #1249 (comment). Not sure why though, the data always matches 🤔
Here's an output of the data when it's working. The grouping works correctly, but it also quadruples all the values for some reason??
{
"series": [{
"name": "outdoor_temperatures",
"tags": {
"bucket": "14"
},
"columns": [
"time",
"mean"
],
"values": [
[
"2017-09-21T11:00:00Z",
14
],
[
"2017-09-21T11:00:00Z",
14
],
[
"2017-09-21T11:00:00Z",
14
],
[
"2017-09-21T11:00:00Z",
14
],
[
"2017-09-21T12:00:00Z",
14.1
],
[
"2017-09-21T12:00:00Z",
14.1
],
[
"2017-09-21T12:00:00Z",
14.1
],
[
"2017-09-21T12:00:00Z",
14.1
],
[
"2017-09-21T14:00:00Z",
14
],
[
"2017-09-21T14:00:00Z",
14
],
[
"2017-09-21T14:00:00Z",
14
],
[
"2017-09-21T14:00:00Z",
14
]
]
},
{
"name": "outdoor_temperatures",
"tags": {
"bucket": "13"
},
"columns": [
"time",
"mean"
],
"values": [
[
"2017-09-21T13:00:00Z",
13.8
],
[
"2017-09-21T13:00:00Z",
13.8
],
[
"2017-09-21T13:00:00Z",
13.8
],
[
"2017-09-21T13:00:00Z",
13.8
]
]
},
{
"name": "outdoor_temperatures",
"tags": {
"bucket": "0"
},
"columns": [
"time",
"mean"
],
"values": [
[
"2017-09-21T10:00:00Z",
0
],
[
"2017-09-21T10:00:00Z",
0
],
[
"2017-09-21T10:00:00Z",
0
],
[
"2017-09-21T10:00:00Z",
0
]
]
}
]
}
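One way to confirm the quadrupling mechanically is to count repeated rows per series in the `httpOut` payload. The following is a minimal Python sketch, assuming the JSON shape shown above; the helper names are hypothetical and the payload is abbreviated to the `bucket: "13"` series.

```python
import json
from collections import Counter


def duplicate_counts(http_out):
    """Return {bucket: Counter of (time, value) rows} for one httpOut payload."""
    report = {}
    for series in http_out["series"]:
        bucket = series["tags"]["bucket"]
        report[bucket] = Counter(tuple(row) for row in series["values"])
    return report


def unique_rows(values):
    """Drop repeated rows while keeping first-seen order."""
    return list(dict.fromkeys(tuple(row) for row in values))


# Abbreviated copy of the payload above: only the bucket "13" series.
payload = json.loads("""
{"series": [{"name": "outdoor_temperatures",
             "tags": {"bucket": "13"},
             "columns": ["time", "mean"],
             "values": [["2017-09-21T13:00:00Z", 13.8],
                        ["2017-09-21T13:00:00Z", 13.8],
                        ["2017-09-21T13:00:00Z", 13.8],
                        ["2017-09-21T13:00:00Z", 13.8]]}]}
""")
report = duplicate_counts(payload)
```

Here `report["13"]` counts the single 13:00 row four times, and `unique_rows` reduces the series back to one point per hour.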
We tried this with even smaller data sets
I have spent a while debugging this now with @valstu (first time figuring out Golang code at scale 😂) and here are some findings: In these lines: https://github.com/influxdata/kapacitor/blob/master/group_by.go#L135-L137 The https://github.com/influxdata/kapacitor/blob/master/group_by.go#L137 is only called once an hour, meaning that data comes through only once an hour. The https://github.com/influxdata/kapacitor/blob/master/group_by.go#L76-L80 Possible fix pending at #1773
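The symptom described in these findings can be illustrated with a toy model. This Python sketch is not Kapacitor's implementation and every name in it is hypothetical; it only assumes that a regrouping step holds points for the newest timestamp and releases them when a strictly later timestamp arrives.

```python
class BufferedRegrouper:
    """Toy model: buffer points and release them only when time advances."""

    def __init__(self):
        self.current_time = None  # newest timestamp seen so far
        self.buffer = []          # points waiting for time to advance

    def push(self, time, tags, value):
        """Accept one point; return the points released by its arrival.

        Assumes points arrive in non-decreasing time order.
        """
        released = []
        if self.current_time is not None and time > self.current_time:
            # Time moved forward: everything buffered so far can be emitted.
            released, self.buffer = self.buffer, []
        self.current_time = time
        self.buffer.append((time, tags, value))
        return released


regrouper = BufferedRegrouper()
first = regrouper.push("2017-09-21T10:00:00Z", {"bucket": "0"}, 0)
second = regrouper.push("2017-09-21T11:00:00Z", {"bucket": "14"}, 14)
```

In this model the 11:00 point stays buffered until a 12:00 point shows up. With `groupBy(time(1h))` a new timestamp only appears once an hour, which would match data showing up only after a long wait.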
Spent hours debugging this and am facing a similar issue to "groupBy not grouping by a tag created with eval"
We are running into an issue with TICKscript and its groupBy behaviour.
We have two sets of measurements, indoor_temperatures and outdoor_temperatures, which we query with a batch.
The queries look as follows:
If we HTTP out both of them, they create the following sets of data:
Now we do a full join of them, which gives us expected results
httpOut:
Which looks perfect. The issue arises when we want to round out_temp_mean.mean down and groupBy it. So we go ahead and extend the script:
After which the output STILL looks as it should:
Now the only thing left is to group the values by the new tag bucket:
After which everything goes awry and we are greeted with
series: null
Is this expected behaviour? A bug? Or something else?