-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add tightening_ratio support #26
Conversation
Can this PR be rebased? The digest change is merged in and looks like this PR contains commits from digest (which should be removed after rebasing) |
src/bloom/command_handler.rs
Outdated
let tightening_ratio = match input_args[curr_cmd_idx].to_string_lossy().parse::<f64>() { | ||
Ok(num) if num > TIGHTENING_RATIO_MIN && num < TIGHTENING_RATIO_MAX => num, | ||
Ok(num) if !(num > TIGHTENING_RATIO_MIN && num < TIGHTENING_RATIO_MAX) => { | ||
return Err(ValkeyError::Str(utils::ERROR_RATIO_RANGE)); | ||
} | ||
_ => { | ||
return Err(ValkeyError::Str(utils::BAD_ERROR_RATIO)); | ||
} | ||
}; | ||
curr_cmd_idx += 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will not be provided as a command argument for the BF.RESERVE command. We can remove this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure will remove this
src/bloom/command_handler.rs
Outdated
@@ -327,6 +341,18 @@ pub fn bloom_filter_insert(ctx: &Context, input_args: &[ValkeyString]) -> Valkey | |||
} | |||
}; | |||
} | |||
"RATIO" => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Renaming to
TIGHTENING
makes this clearer - Let's add a check to only allow this argument if the command is replicated. If it is not replicated, we should not accept it as command args. This means, if the command is on a primary, it will use the constant from the configs.rs file
src/bloom/command_handler.rs
Outdated
@@ -228,6 +230,16 @@ pub fn bloom_filter_reserve(ctx: &Context, input_args: &[ValkeyString]) -> Valke | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will need to update the arity check at the beginning of the fn to handle the extra TIGHTENING argument
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will update this
src/configs.rs
Outdated
@@ -32,6 +32,8 @@ lazy_static! { | |||
// Tightening ratio used during scale out for the calculation of fp_rate of every new filter within a bloom object to | |||
// maintain the bloom object's overall fp_rate to the configured value. | |||
pub const TIGHTENING_RATIO: f64 = 0.5; | |||
pub const TIGHTENING_RATIO_MIN: f64 = 0.001; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be 0, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will update
src/wrapper/bloom_callback.rs
Outdated
@@ -2,6 +2,7 @@ use crate::bloom; | |||
use crate::bloom::data_type::ValkeyDataType; | |||
use crate::bloom::utils::BloomFilterType; | |||
use crate::configs; | |||
use crate::wrapper::digest::Digest; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are updating RDB Load, we need to update RDB Save to add the tightening ratio at the right place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will update here in rdb_save
src/bloom/data_type.rs
Outdated
fn debug_digest(&self, mut dig: Digest) { | ||
dig.add_long_long(self.expansion.into()); | ||
dig.add_string_buffer(&self.fp_rate.to_le_bytes()); | ||
// dig.add_string_buffer(&self.tightening_ratio.to_le_bytes()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add the tightening ratio to the digest. Why is this commented out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was testing out something else and commented this briefly, will uncomment in future revision
6bec972
to
7c25468
Compare
99acf6f
to
078534f
Compare
src/bloom/command_handler.rs
Outdated
@@ -299,14 +298,15 @@ pub fn bloom_filter_reserve(ctx: &Context, input_args: &[ValkeyString]) -> Valke | |||
pub fn bloom_filter_insert(ctx: &Context, input_args: &[ValkeyString]) -> ValkeyResult { | |||
let argc = input_args.len(); | |||
// At the very least, we need: BF.INSERT <key> ITEMS <item> | |||
if argc < 4 { | |||
if argc < 5 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This depends on ctx.get_flags().contains(ContextFlags::REPLICATED)
, right?
If replica, < 5. If primary, < 4. Can we update it?
src/bloom/utils.rs
Outdated
@@ -94,9 +98,11 @@ impl BloomFilterType { | |||
// Create the bloom filter and add to the main BloomFilter object. | |||
let bloom = BloomFilter::new(fp_rate, capacity); | |||
let filters = vec![bloom]; | |||
let tightening_ratio = 0.5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will need to be passed in as arguments from new_reserved
. This should not be hard coded
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had added it in this change but since you mentioned above:
not be provided as a command argument for the BF.RESERVE command
I had to remove it wherever we use new_reserved
src/bloom/command_handler.rs
Outdated
@@ -227,7 +227,6 @@ pub fn bloom_filter_reserve(ctx: &Context, input_args: &[ValkeyString]) -> Valke | |||
return Err(ValkeyError::Str(utils::BAD_ERROR_RATE)); | |||
} | |||
}; | |||
curr_cmd_idx += 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add this back. It might have been accidentally removed in your PR
@@ -779,6 +794,7 @@ mod tests { | |||
|
|||
// verify new_bf and bf | |||
assert_eq!(bf.fp_rate, new_bf.fp_rate); | |||
assert_eq!(bf.tightening_ratio, new_bf.tightening_ratio); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the verify_restored_items function, can we assert on this?
You can also do this assert for the false positive rate
assert_eq!(
restored_bloom_filter_type.tightenting_ratio,
original_bloom_filter_type.tightenting_ratio
);
c992016
to
1fb640c
Compare
src/bloom/command_handler.rs
Outdated
@@ -298,15 +302,17 @@ pub fn bloom_filter_reserve(ctx: &Context, input_args: &[ValkeyString]) -> Valke | |||
|
|||
pub fn bloom_filter_insert(ctx: &Context, input_args: &[ValkeyString]) -> ValkeyResult { | |||
let argc = input_args.len(); | |||
let validate_size_limit = ctx.get_flags().contains(ContextFlags::REPLICATED); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let validate_size_limit = ctx.get_flags().contains(ContextFlags::REPLICATED); | |
let replicated_cmd = ctx.get_flags().contains(ContextFlags::REPLICATED); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will update
src/bloom/command_handler.rs
Outdated
"TIGHTENING" => { | ||
if !validate_size_limit { | ||
return Err(ValkeyError::Str(utils::ERROR)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"TIGHTENING" => { | |
if !validate_size_limit { | |
return Err(ValkeyError::Str(utils::ERROR)); | |
} | |
"TIGHTENING" if replicated_cmd => { | |
if idx >= (argc - 1) { | |
return Err(ValkeyError::WrongArity); | |
} |
src/bloom/command_handler.rs
Outdated
// At the very least, we need: BF.INSERT <key> ITEMS <item> | ||
if argc < 4 { | ||
if !validate_size_limit && argc < 4 || validate_size_limit && argc < 5 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if !validate_size_limit && argc < 4 || validate_size_limit && argc < 5 { | |
if argc < 4 { |
This should handle it already. It is not required to contain tightening ratio on the replicated command - it is optional and we should handle cases where it provided and not provided.
The check above is sufficient
src/bloom/utils.rs
Outdated
@@ -95,9 +100,11 @@ impl BloomFilterType { | |||
// Create the bloom filter and add to the main BloomFilter object. | |||
let bloom = BloomFilter::new(fp_rate, capacity); | |||
let filters = vec![bloom]; | |||
let tightening_ratio = 0.5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this needed? Removing it
let tightening_ratio = 0.5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed to remove this, it is not required
tests/test_replication.py
Outdated
@@ -34,7 +34,7 @@ def test_replication_behavior(self): | |||
('BF.ADD', 'BF.ADD key item', 'BF.ADD key item1', 2), | |||
('BF.MADD', 'BF.MADD key item', 'BF.MADD key item1', 2), | |||
('BF.RESERVE', 'BF.RESERVE key 0.001 100000', 'BF.ADD key item1', 1), | |||
('BF.INSERT', 'BF.INSERT key items item', 'BF.INSERT key items item1', 2), | |||
('BF.INSERT', 'BF.INSERT key items item tightening 0.5', 'BF.INSERT key items item1 tightening 0.5', 2), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
('BF.INSERT', 'BF.INSERT key items item tightening 0.5', 'BF.INSERT key items item1 tightening 0.5', 2), | |
('BF.INSERT', 'BF.INSERT key items item', 'BF.INSERT key items item1', 2), |
This is an invalid test. The tightening ratio is coming after ITEM
so it is treated as an item. It is not treated as a property. Also, we cannot use this command on the primary, since we only allow this arg during replication :D
Let's just use a normal BF.INSERT command.
Once we support the tightening ratio config, we can then (1) update the config on the primary node, (2) Use BF.INSERT (3) Validate Digest matches on the replica even though the tightening ratio config is different
For now, let us leave it as it is because it is essentially untested code (unless we use a monitor on replica during bloom creation on the primary node)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah got it, will revert this change accordingly
src/bloom/utils.rs
Outdated
@@ -642,12 +672,18 @@ mod tests { | |||
let rand_prefix = random_prefix(7); | |||
// 1 in every 1000 operations is expected to be a false positive. | |||
let expected_fp_rate: f64 = 0.001; | |||
let expected_tightening_ratio: f64 = 0.5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let expected_tightening_ratio: f64 = 0.5; | |
let tightening_ratio: f64 = 0.5; |
src/bloom/utils.rs
Outdated
@@ -584,13 +608,19 @@ mod tests { | |||
let rand_prefix = random_prefix(7); | |||
// 1 in every 1000 operations is expected to be a false positive. | |||
let expected_fp_rate: f64 = 0.001; | |||
let expected_tightening_ratio: f64 = 0.5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let expected_tightening_ratio: f64 = 0.5; | |
let tightening_ratio: f64 = 0.5; |
We are not asserting against this. so there is no need to call it "expected"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will change accordingly
Signed-off-by: Nihal Mehta <[email protected]>
e4521ea
to
de6ec44
Compare
@@ -22,6 +25,8 @@ pub const BAD_EXPANSION: &str = "ERR bad expansion"; | |||
pub const BAD_CAPACITY: &str = "ERR bad capacity"; | |||
pub const BAD_ERROR_RATE: &str = "ERR bad error rate"; | |||
pub const ERROR_RATE_RANGE: &str = "ERR (0 < error rate range < 1)"; | |||
pub const BAD_ERROR_RATIO: &str = "ERR bad error ratio"; | |||
pub const ERROR_RATIO_RANGE: &str = "ERR (0 < error ratio range < 1)"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would having a better error message i.e. something like ratio must be in the range (0, 1)
or is following convention for error rate range better?
Add tightening_ratio support for tuning of the
TIGHENTING_RATIO
to allow scalable bloom objects to have a higher capacity at cost of higher false positive rate.We allow the
TIGHTENING_RATIO
argument only if the command is replicated.