Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

periodic compaction seconds not working as expected #12165

Closed
zaidoon1 opened this issue Dec 19, 2023 · 15 comments
Closed

periodic compaction seconds not working as expected #12165

zaidoon1 opened this issue Dec 19, 2023 · 15 comments
Assignees

Comments

@zaidoon1
Copy link
Contributor

My understanding is that with periodic compaction seconds, we can schedule files to be compacted. And when compaction runs, tombstones are removed to keep db size small among other things.

My db size is small (around 5gb) so leaving all the defaults in place for rocksdb will mean I will accumulate a lot of tombstones so before trying to use periodic compactions seconds (because it wasn't exposed via c api so I couldn't use it), I used less optimal settings to force compaction:

opts.set_max_bytes_for_level_base(25 * 1024 * 1024);
opts.set_target_file_size_base(32 * 1024 * 1024);

which was fine but it's not ideal. After periodic compaction seconds was exposed via c api, I removed the two settings above and set periodic compaction seconds to run every 3 hours (i delete data on a cron every 4 hours).

As we can see, periodic compaction seconds doesn't seem to force compaction? or at least doesn't delete tombstones like I expected:

Screenshot 2023-12-19 at 11 34 18 AM

Here is the LOG file as well in case it helps:
LOG.txt

Any idea why periodic compaction seconds is not working? Or at least is not doing the same thing as setting the "bad" options

@cbi42
Copy link
Member

cbi42 commented Dec 19, 2023

periodic_compaction_seconds compacts file to the same level. If a file with tombstones is not in the bottommost level, it may not be helpful. You may want to set ttl too which compacts old files down.

You may want to consider using delete triggered compaction if the goal is to remove files with many tombstones:

class CompactOnDeletionCollectorFactory

i delete data on a cron every 4 hours

Another thing to consider to is do manual compaction after you issue deletions.

@cbi42 cbi42 self-assigned this Dec 20, 2023
@zaidoon1
Copy link
Contributor Author

zaidoon1 commented Dec 20, 2023

oh so I did misunderstand what this feature does, what does compacts file to the same level achieve?

also I did read in the docs about setting ttl to compact old files down but I'm not sure what exact feature that is. Can you link me to the code for the compaction ttl? (specifically, I rely on the C api since i'm using rust rocksdb so hopefully that's exposed)

delete triggered compaction i

regarding this, I initially considered it but after asking around in the community, it seems this feature is best suited for when deletion is concentrated in a specific range where as my deletes are all over the place. Is this accurate?

Another thing to consider to is do manual compaction after you issue deletions.

This definitely sounds like a great idea and I'll likely go with this approach and maybe combine it with the ttl based compaction.

Also regarding https://github.com/facebook/rocksdb/wiki/Manual-Compaction#compactrange, do you recommend I change any of the defaults to achieve what I want of removing tombstones? I'm guessing I need to set CompactRangeOptions::bottommost_level_compaction to BottommostLevelCompaction::KforceOptimized since I don't have a custom compaction filter setup and I want to compact the bottommost level?

@cbi42
Copy link
Member

cbi42 commented Dec 20, 2023

what does compacts file to the same level achieve

It lets files go through compaction filter periodically. The option has a different meaning for universal compaction.

Can you link me to the code for the compaction ttl?

I don't see it in C API. Feel free to submit a PR to add it.

delete triggered compaction

You can specify a file to be compacted if it has more than a D tombstones in a window of X entries, or that the ratio of tombstones exceed some value.

I'm guessing I need to set CompactRangeOptions::bottommost_level_compaction to BottommostLevelCompaction::KforceOptimized

Yeah, setting it to kForceOptimized should be enough.

@ajkr
Copy link
Contributor

ajkr commented Dec 21, 2023

I wonder if we could get away with the following behavior change: in leveled compaction, also trigger compaction to the next level for upper-level files whose data age exceeds periodic_compaction_seconds. (Is periodic_compaction_seconds < ttl in leveled compaction a common/useful setting today?)

@git-hulk
Copy link
Contributor

I wonder if we could get away with the following behavior change: in leveled compaction, also trigger compaction to the next level for upper-level files whose data age exceeds periodic_compaction_seconds.

As a user, I believe it would be better if we could trigger compaction to the next level. I can submit a PR to resolve this if it sounds good to maintainers.

@zaidoon1
Copy link
Contributor Author

I also agree that it would be great to trigger compaction to the next level.

@zaidoon1
Copy link
Contributor Author

zaidoon1 commented Dec 21, 2023

I've created a pr to expose options ttl via c api #12170

@git-hulk
Copy link
Contributor

Is periodic_compaction_seconds < ttl in leveled compaction a common/useful setting today?

I see, seems periodic_compaction_seconds will perform the same way with TTL in level compaction if allowed to trigger next-level compaction.

@zaidoon1
Copy link
Contributor Author

zaidoon1 commented Dec 21, 2023

right, based on my understanding, I completely misunderstood how periodic_compaction_seconds works, what I was looking for is the TTL option

@git-hulk
Copy link
Contributor

I checked the implementation, and the TTL option only triggers the next-level compaction without caring about the input file level. So I also think it's good to switch to the TTL option since we're using the level compaction.

@cbi42
Copy link
Member

cbi42 commented Dec 21, 2023

Is periodic_compaction_seconds < ttl in leveled compaction a common/useful setting today?

Probably not, this can also prevent ttl compaction from happening.

trigger compaction to the next level for upper-level files whose data age exceeds periodic_compaction_seconds

I'm fine with this change. It should not affect users who configure ttl <= periodic_compaction_seconds. The change may not make sense when level_compaction_dynamic_level_bytes=false, but neither does ttl.

@ajkr
Copy link
Contributor

ajkr commented Dec 21, 2023

As a user, I believe it would be better if we could trigger compaction to the next level. I can submit a PR to resolve this if it sounds good to maintainers.

It sounds good to me

@git-hulk
Copy link
Contributor

I'm fine with this change. It should not affect users who configure ttl <= periodic_compaction_seconds. The change may not make sense when level_compaction_dynamic_level_bytes=false, but neither does ttl.

Sure, thank you!

@kishorenc
Copy link

I just stumbled on this exact same gotcha. Perhaps the documentation / comments around periodic_compaction_seconds could have been clearer in this regard. Like others on this thread, I fully expected compaction to trigger to the next level and that would be a welcome change.

kishorenc added a commit to typesense/typesense that referenced this issue Dec 27, 2023
This is required until RocksDB fixes the behavior of periodic compaction seconds: facebook/rocksdb#12165
facebook-github-bot pushed a commit that referenced this issue Dec 28, 2023
…_compaction_seconds (#12175)

Summary:
Currently, the data are always compacted to the same level if exceed periodic_compaction_seconds which may confuse users, so we change it to allow trigger compaction to the next level here. It's a behavior change to users, and may affect users
who have disabled their ttl or ttl > periodic_compaction_seconds.

Relate issue: #12165

Pull Request resolved: #12175

Reviewed By: ajkr

Differential Revision: D52446722

Pulled By: cbi42

fbshipit-source-id: ccd3d2c6434ed77055735a03408d4a62d119342f
@cbi42
Copy link
Member

cbi42 commented Jan 2, 2024

periodic_compaction_seconds was working as expected and improvement is made in #12175.

@cbi42 cbi42 closed this as completed Jan 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants