Adding RegNets to tf.keras.applications #15702
Is there a reason this needs to be updated? Do things break otherwise? We still run these test cases in TF1 without eager mode enabled, and this line is breaking a number of our tests.
@mattdangerw
Yes, the change is necessary. Grouped convolutions are not yet fully supported on CPUs. We see that `model.predict(X_test)` breaks, whereas `model(X_test)` works fine. There are a number of issues discussing this in the TF repo.
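A minimal sketch of the failure mode being described (the model, shapes, and data here are hypothetical, not taken from the PR): a direct call runs eagerly, while `predict` traces a `tf.function` and dispatches to the fused CPU kernels, which is the path that used to break for grouped convolutions.

```python
import numpy as np
import tensorflow as tf

# Hypothetical small model using a grouped convolution (groups > 1),
# the layer type that RegNet depends on.
inputs = tf.keras.Input(shape=(8, 8, 4))
outputs = tf.keras.layers.Conv2D(8, 3, groups=2, padding="same")(inputs)
model = tf.keras.Model(inputs, outputs)

X_test = np.random.rand(2, 8, 8, 4).astype("float32")

# Direct call: executes eagerly.
eager_out = model(X_test)

# predict() traces and compiles the forward pass; on the affected CPU
# builds this path raised an error for grouped convolutions even though
# the eager call above succeeded.
predict_out = model.predict(X_test)
```

On a current TF build both calls succeed and produce the same output shape; the divergence was specific to the missing fused grouped-convolution CPU kernel.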
I think you could trigger the failures by adding a call to `tf.compat.v1.disable_v2_behavior()` before the call to `tf.test.main` in `applications_load_weight_test`. We can't submit this if we are breaking all these application tests in a TF1 context. We would need to find a change that does not rely on eager-mode behavior (`.numpy()` is eager-only). This might mean we need to dig into the difference between a direct call vs. `predict` here. It sounds like this is an issue with grouped convolutions on CPU that will only appear when compiling a `tf.function`, is that right?
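A sketch of what disabling v2 behavior does, for reference (assumption: run in a fresh process, since the switch is global; in the real suite the call would go just before `tf.test.main()`):

```python
import tensorflow as tf

# Disabling v2 behavior puts the whole process into graph (TF1) mode.
# Anything that runs after this point cannot rely on eager-only APIs
# such as Tensor.numpy(), which is why the fallback breaks these tests.
tf.compat.v1.disable_v2_behavior()

print(tf.executing_eagerly())  # False
```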
@mattdangerw
I have found a small solution to this. I have tested the change both with TF2 and with `tf.compat.v1.disable_v2_behavior()`, and it works on my end. Could you please take a look and run the workflow again? (See `keras/keras/applications/applications_load_weight_test.py`, lines 127 to 132 in 851ca16.)
I will take a look next week! Sorry for the delayed reply; most of the team is out this week. The proposed change would still not be submittable, because that fallback (the `.numpy()` call) would still run in a TF1 context for the RegNet load-weights test unless we disable it.
Overall, I think our options are... (for option 3, `jit_compile=True` may allow CPU to work, which might give us a way forward). I would say 3) would be the way to go if we can make it work. We really do want the load-weights tests to test the compiled `predict` function (that's how these models will often be used!), and shipping RegNets such that `predict` fails on CPU by default is not a great out-of-box experience.
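A sketch of the `jit_compile=True` idea, assuming the workaround amounts to requesting XLA for the traced forward pass (this is illustrative, not the actual change that landed):

```python
import numpy as np
import tensorflow as tf

inputs = tf.keras.Input(shape=(8, 8, 4))
outputs = tf.keras.layers.Conv2D(8, 3, groups=2, padding="same")(inputs)
model = tf.keras.Model(inputs, outputs)

# jit_compile=True asks TF to compile the function with XLA, which
# lowers the grouped convolution itself rather than dispatching to the
# fused CPU kernel that was missing at the time.
compiled_call = tf.function(model, jit_compile=True)
out = compiled_call(np.random.rand(2, 8, 8, 4).astype("float32"))
```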
Will follow up next week when people are back in office. Thanks!
@mattdangerw
Thanks for the detailed explanation.
I guess (2) is not something that can be done in the Keras codebase, as the error is thrown in `tensorflow/tensorflow/core/kernels/conv_ops_fused_impl.h`. I'll open an issue in the TF repo regarding this. So I agree that (3) might be the best option.
Lastly, I wish you a very happy new year!
@mattdangerw
Could you please take a look at this one? TIA
/cc @fchollet @qlzh727
Still here on this! We think we have found a good workaround (option 3), forcing XLA compilation for grouped convolutions: #15868
Once that lands (assuming it doesn't run into roadblocks), we can submit this without modifying the `predict` call in the load-weights tests.
@mattdangerw
Thanks a lot for this! Really appreciate it.