
Adding RegNets to tf.keras.applications #15702

Closed
wants to merge 12 commits into from

Conversation

AdityaKane2001 (Contributor) commented Nov 25, 2021

Please see #15240 and #15419.

Progress so far:

X variant:

| Model | Paper (%) | Ours (%) | Diff (%) | Comments |
|-------|-----------|----------|----------|----------|
| X002  | 68.9      | 67.15    | 1.75     | adamw, area_factor=0.25 |
| X004  | 72.6      | 71.22    | 1.38     | adamw, area_factor=0.08 |
| X006  | 74.1      | 72.37    | 1.73     | adamw, area_factor=0.08 |
| X008  | 75.2      | 73.45    | 1.75     | adamw, area_factor=0.08 |
| X016  | 77.0      | 75.55    | 1.45     | adamw, area_factor=0.08, mixup=0.2 |
| X032  | 78.3      | 77.09    | 1.21     | adamw, area_factor=0.08, mixup=0.2 |
| X040  | 78.6      | 77.87    | 0.73     | adamw, area_factor=0.08, mixup=0.2 |
| X064  | 79.2      | 78.22    | 0.98     | adamw, area_factor=0.08, mixup=0.3 |
| X080  | 79.3      | 78.41    | 0.89     | adamw, area_factor=0.08, mixup=0.3 |
| X120  | 79.7      | 79.09    | 0.61     | adamw, area_factor=0.08, mixup=0.4 |
| X160  | 80.0      | 79.53    | 0.47     | adamw, area_factor=0.08, mixup=0.4 |
| X320  | 80.5      | 80.35    | 0.15     | adamw, area_factor=0.08, mixup=0.4 |

Y variant:

| Model | Paper (%) | Ours (%) | Diff (%) | Comments |
|-------|-----------|----------|----------|----------|
| Y002  | 70.3      | 68.51    | 1.79     | adamw, WD=1e-5, area_factor=0.16, mixup=0.2 |
| Y004  | 74.1      | 72.11    | 1.99     | adamw, area_factor=0.16, mixup=0.2, WD=1e-5 |
| Y006  | 75.5      | 73.52    | 1.98     | adamw, area_factor=0.16, mixup=0.2 |
| Y008  | 76.3      | 74.48    | 1.82     | adamw, area_factor=0.16, mixup=0.2 |
| Y016  | 77.9      | 76.95    | 0.95     | adamw, area_factor=0.08, mixup=0.2 |
| Y032  | 78.9      | 78.05    | 0.85     | adamw, area_factor=0.08, mixup=0.2 |
| Y040  | 79.4      | 78.20    | 1.20     | adamw, area_factor=0.08, mixup=0.2 |
| Y064  | 79.9      | 78.95    | 0.95     | adamw, area_factor=0.08, mixup=0.3 |
| Y080  | 79.9      | 79.11    | 0.69     | adamw, area_factor=0.08, mixup=0.3 |
| Y120  | 80.3      | 79.45    | 0.85     | adamw, area_factor=0.08, mixup=0.4 |
| Y160  | 80.4      | 79.71    | 0.69     | adamw, area_factor=0.08, mixup=0.4 |
| Y320  | 80.9      | 80.12    | 0.78     | adamw, area_factor=0.08, mixup=0.4 |

/cc @fchollet @sayakpaul @qlzh727
/auto Closes #15240.

innat commented Nov 25, 2021

@AdityaKane2001
Wondering, looks like it's the first time we're going to have lots of variants of the same model in tf.keras.applications. Great job anyway.

AdityaKane2001 (Contributor, Author)

@innat

Yes, that is the case. Thank you :)

MrinalTyagi

> @AdityaKane2001 Wondering, looks like it's the first time we're going to have lots of variants of the same model in tf.keras.applications. Great job anyway.

@innat Isn't the addition of ResNet-18 and ResNet-34 possible in a similar fashion?

innat commented Nov 26, 2021

@MrinalTyagi
I agree, but I don't know why those models aren't there. There are some pending requests (1, 2, 3, 4). Maybe there are some criteria that I'm unaware of.

@gbaned gbaned added the keras-team-review-pending Pending review by a Keras team member. label Nov 26, 2021
lgeiger (Contributor) left a comment

Thanks again for opening this PR! I have two tiny suggestions to slightly improve readability.

keras/applications/regnet.py — 5 resolved review comments (outdated)
AdityaKane2001 (Contributor, Author)

@lgeiger Always appreciated! Thanks for the changes. I'll merge them tomorrow.

mattdangerw (Member) left a comment

Thanks for the PR! Left a few comments. Do you have the weights for the applications available online somewhere?

keras/applications/regnet.py — 9 resolved review comments (outdated)
AdityaKane2001 (Contributor, Author) commented Dec 1, 2021

@mattdangerw

Thanks for the review! Made requested changes.

> Do you have the weights for the applications available online somewhere?

I am still training these models, and I'm updating the tables at the start of the thread accordingly. I'll test the weight-loading code in parallel.

@mattdangerw mattdangerw removed the keras-team-review-pending Pending review by a Keras team member. label Dec 2, 2021
AdityaKane2001 (Contributor, Author) commented Dec 10, 2021

@fchollet @mattdangerw

I have completed training of all the models. I have updated the tables at the start of the thread accordingly. There are a couple of things I want to bring to your notice:

  1. model.predict(X_test) does not work with grouped convolutions on CPU. For some reason, model(X_test) works flawlessly. I have therefore updated the applications_load_weight_test.py file accordingly. If needed, I'll open an issue for this.
  2. All models are within 2% of the accuracies mentioned in the paper. Larger models are within 1%.
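The distinction in (1) can be sketched with a toy grouped-convolution model (illustrative only, not the RegNet code from this PR): a direct call runs eagerly, while predict goes through Keras's traced/compiled execution path, which is where the CPU grouped-convolution kernel was failing.

```python
import tensorflow as tf

# Toy model with a grouped convolution (groups > 1), standing in for RegNet.
inputs = tf.keras.Input(shape=(8, 8, 4))
outputs = tf.keras.layers.Conv2D(
    filters=8, kernel_size=3, groups=2, padding="same")(inputs)
model = tf.keras.Model(inputs, outputs)

x = tf.random.normal((1, 8, 8, 4))
eager_out = model(x)          # direct call: eager execution, returns a Tensor
graph_out = model.predict(x)  # predict: traced execution, returns a NumPy array
print(eager_out.shape, graph_out.shape)
```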

/cc @sayakpaul

@google-ml-butler google-ml-butler bot removed the ready to pull Ready to be merged into the codebase label Dec 16, 2021
AdityaKane2001 (Contributor, Author) commented Dec 16, 2021

@mattdangerw @fchollet

I have made the requested changes. Please run the workflow again.

I wanted to provide performance metrics for all the models, as on keras.io/applications. However, I am not able to spin up a VM instance with the given specs [1] on Google Cloud.

Could you please tell me how to go about this?

/cc @sayakpaul

Footnotes

  [1] CPU: AMD EPYC Processor (with IBPB), 92 cores; RAM: 1.7 TB; GPU: Tesla A100

AdityaKane2001 (Contributor, Author)

@fchollet @mattdangerw

Could you please take a look at this one? TIA

@google-ml-butler google-ml-butler bot added kokoro:force-run ready to pull Ready to be merged into the codebase labels Dec 21, 2021
AdityaKane2001 (Contributor, Author)

Thanks for the approval!

sayakpaul (Contributor)

@mattdangerw could you also provide an update on what we should do about this?

> I wanted to provide performance metrics for all the models, as on keras.io/applications. However, I am not able to spin up a VM instance with the given specs on Google Cloud.

mattdangerw (Member)

Yeah, re ideal machine for metrics, particularly the performance per step numbers, I am not sure. This is probably a question for @fchollet. We might take a bit to get back to you on this given it's the holidays and a lot of the team is out.

In the meantime, I think we can move ahead and try to land this PR, as the table update will be a separate change to keras.io anyway.

mattdangerw (Member) left a comment

I've uploaded the weights and generated the API files for this PR; things are looking good overall. However, the change to applications_load_weight_test.py is breaking right now. Commented on the line. Is that change necessary? Thanks!

```diff
@@ -115,7 +125,7 @@ def test_application_pretrained_weights_loading(self):
     self.assertShapeEqual(model.output_shape, (None, _IMAGENET_CLASSES))
     x = _get_elephant(model.input_shape[1:3])
     x = app_module.preprocess_input(x)
-    preds = model.predict(x)
+    preds = model(x).numpy()
```
Member:

Is there a reason this needs to be updated? Do things break otherwise? We still run these test cases in TF1 without eager mode enabled, and this line is breaking a number of our tests.

Contributor Author:

@mattdangerw

Yes, the change is necessary. Grouped convolutions are not yet fully supported on CPUs. We see that model.predict(X_test) breaks whereas model(X_test) works fine.

There are a number of issues discussing this in the TF repo.

Member:

I think you could trigger the failures by adding a call to tf.compat.v1.disable_v2_behavior() before the call to tf.test.main in applications_load_weight_test.py. We can't submit this if we are breaking all these application tests in a TF1 context. We would need to find a change that does not rely on eager-mode behavior (.numpy() is eager-only).

This might mean we need to dig into the difference between a direct call vs. predict here. It sounds like this is an issue with grouped convolutions on CPU that only appears when compiling a tf.function, is that right?
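The reproduction suggested above can be sketched as a minimal standalone snippet (not the actual test file): tf.compat.v1.disable_v2_behavior() switches the process into TF1 graph mode, after which eager-only APIs such as Tensor.numpy() are unavailable on symbolic tensors.

```python
import tensorflow as tf

# Switch the whole process into TF1-style graph mode, as suggested above.
# This must run before any other TF ops are created.
tf.compat.v1.disable_v2_behavior()

# Eager execution is now off, so the .numpy() fallback in the load-weights
# test would fail in this context.
print(tf.executing_eagerly())
```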

Contributor Author:

@mattdangerw

I have found a small solution to this. I have tested the change both with TF2 and using tf.compat.v1.disable_v2_behavior() and it works on my end. Could you please take a look and run the workflow again?

```python
x = app_module.preprocess_input(x)
try:
    preds = model.predict(x)  # Works in TF1
except Exception:
    preds = model(x).numpy()  # Works in TF2
names = [p[1] for p in app_module.decode_predictions(preds)[0]]
```

Member:

I will take a look next week! Sorry for the delayed reply; most of the team is out this week. The proposed change would still not be submittable, because that fallback (the numpy call) would still run in a TF1 context for the RegNet load-weights test unless we disable it.

Overall, I think our options are...

  1. Disable the load weights test for regnet (without removing the predict call here), and follow up with a fix.
  2. Fix the underlying CPU/compiled function/grouped convolution issue, and then land this PR.
  3. Work around the bug for regnets somehow (the conversation here suggests that using jit_compile=True may allow CPU to work, which might give us a way forward).

I would say 3) would be the way to go if we can make it work. We really do want the load-weights tests to exercise the compiled predict function (that's how these models will often be used!), and shipping RegNets such that predict fails on CPU by default is not a great out-of-the-box experience.

Will follow up next week when people are back in office. Thanks!
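Option 3) can be sketched with a toy grouped-convolution model (illustrative only, not the actual fix): wrapping the forward pass in a tf.function with jit_compile=True forces XLA compilation, which is the route the discussion above suggests for making grouped convolutions run on CPU.

```python
import tensorflow as tf

# Toy grouped-convolution model standing in for RegNet.
inputs = tf.keras.Input(shape=(8, 8, 4))
outputs = tf.keras.layers.Conv2D(8, 3, groups=2, padding="same")(inputs)
model = tf.keras.Model(inputs, outputs)

# Force XLA compilation of the forward pass (the jit_compile=True route).
compiled_fn = tf.function(model, jit_compile=True)
out = compiled_fn(tf.random.normal((1, 8, 8, 4)))
print(out.shape)
```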

AdityaKane2001 (Contributor, Author) commented Dec 31, 2021

@mattdangerw

Thanks for the detailed explanation.

I guess (2) is not something that can be done in the Keras codebase, as the error is thrown in tensorflow/tensorflow/core/kernels/conv_ops_fused_impl.h. I'll open an issue in the TF repo about this. So I agree that (3) might be the best option.

Lastly, I wish you a very happy new year!

Contributor Author:

@mattdangerw

Could you please take a look at this one? TIA

/cc @fchollet @qlzh727

Member:

Still working on this! We think we have found a good workaround (option 3): forcing XLA compilation for grouped convolutions. #15868

Once that lands (assuming it doesn't run into roadblocks), we can submit this without modifying the predict call in the load-weights tests.

Contributor Author:

@mattdangerw

Thanks a lot for this! Really appreciate it.

@google-ml-butler google-ml-butler bot removed the ready to pull Ready to be merged into the codebase label Dec 24, 2021
@google-ml-butler google-ml-butler bot added the keras-team-review-pending Pending review by a Keras team member. label Dec 25, 2021
@qlzh727 qlzh727 removed the keras-team-review-pending Pending review by a Keras team member. label Jan 4, 2022
AdityaKane2001 (Contributor, Author)

@mattdangerw

Thanks a ton for #15868!

I have tested the code on my end and rolled back 851ca16 in cf25748.

copybara-service bot pushed a commit that referenced this pull request Jan 12, 2022
AdityaKane2001 (Contributor, Author)

Today these models were pushed to the official docs. I sincerely thank the Keras team for allowing me to add these models. Huge thanks to the TPU Research Cloud (TRC) for providing TPUs for the entire duration of this project, without which this would not have been possible. Thanks a lot to @fchollet for allowing this and guiding me throughout the process. Thanks to @qlzh727 for his guidance in building Keras from source on TPU VMs. Thanks to @mattdangerw for his support regarding grouped convolutions. Special thanks to @lgeiger for his contributions to the code. Last but not least, thanks a ton to @sayakpaul for his continuous guidance and encouragement.

mattdangerw (Member)

Congrats on getting it landed and thanks for all the hard work on this! This is great to have!


Successfully merging this pull request may close these issues.

Interested in adding RegNets to tf.keras.applications