About "Pretext + Kmeans' #126

Open
fi591 opened this issue Oct 31, 2022 · 7 comments

@fi591

fi591 commented Oct 31, 2022

Hi, I see that simple 'Pretext + Kmeans' achieves 65.2% on CIFAR-10 on average. I downloaded your model and tried it, but I only get 33.3%. Can you share your settings, or anything special you used? (I didn't see it in your code.)

@spdj2271

I have the same question. This is what I get after running simclr.py and then performing K-means on the learned representations on the CIFAR-10 dataset. The clustering performance is much lower than Pretext + K-means (ACC = 65.9) in Table 1 of the paper.
[image: K-means clustering results on CIFAR-10]

@wvangansbeke
Owner

If you don't get ~65%, you're not applying K-means correctly. You need to normalize the features, fit K-means on the train set, and report the results on the validation set. You also need to average the results over multiple runs. The K-means clustering code is mostly the same as what I provided in my other repositories, e.g. the one for semantic segmentation.
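
For reference, a minimal sketch of that protocol (function and variable names here are illustrative, not from the codebase; it assumes the 512-d pretext features are already extracted as NumPy arrays):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize

def kmeans_accuracy(train_feats, val_feats, val_labels, n_clusters=10, n_runs=5):
    """Fit K-means on L2-normalized train features, predict on the val set,
    and average the Hungarian-matched accuracy over several runs."""
    train_feats = normalize(train_feats)  # L2-normalize each feature vector
    val_feats = normalize(val_feats)
    accs = []
    for seed in range(n_runs):
        km = KMeans(n_clusters=n_clusters, random_state=seed).fit(train_feats)
        pred = km.predict(val_feats)
        # Count co-occurrences of predicted clusters and ground-truth labels,
        # then find the best one-to-one matching (Hungarian algorithm).
        counts = np.zeros((n_clusters, n_clusters), dtype=np.int64)
        for p, t in zip(pred, val_labels):
            counts[p, t] += 1
        rows, cols = linear_sum_assignment(counts.max() - counts)
        accs.append(counts[rows, cols].sum() / len(val_labels))
    return float(np.mean(accs))
```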

@spdj2271

Dear author, I obtained a result similar to the one in Table 3 of the paper (ACC = 65.9, NMI = 59.8, ARI = 50.9):

Evaluate with hungarian matching algorithm ...
{'ACC': 0.6829, 'ARI': 0.4890204494614667, 'NMI': 0.5742197581324099, 'ACC Top-5': 0.9585,

However, I obtained this result by performing K-means on the outputs of the clustering network (i.e., $\Phi_\eta(X) \in \mathbb{R}^{10}$) before clustering training starts, not on the outputs of the pretext network (i.e., $\Phi_\theta(X) \in \mathbb{R}^{512}$). This seems like a bug.

I believe performing K-means on the pretext network's features is important, since it often serves as a baseline, even if the clustering network you propose achieves good performance.

@wvangansbeke
Owner

You should indeed cluster the pretext features, not the class vectors. Yes, we use K-means as a baseline in our paper.
It's not clear to me what you believe is a 'bug', though. I was able to get the same numbers for K-means with this codebase and the provided models. Most issues are caused by not normalizing properly or not using the correct pretext features.
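
For concreteness, a minimal sketch of which outputs to cluster (the attribute names are hypothetical and may differ from the actual model definition):

```python
import torch

@torch.no_grad()
def pretext_features(model, images):
    """Return the pretext features Phi_theta(X) in R^512 for K-means,
    not the cluster-head outputs Phi_eta(X) in R^10.
    (`backbone` / `cluster_head` are hypothetical attribute names.)"""
    model.eval()
    feats = model.backbone(images)       # Phi_theta(X): [N, 512] -> cluster these
    # model.cluster_head(feats)          # Phi_eta(X):   [N, 10]  -> not these
    return feats
```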

@spdj2271

I mean it would be clearer to report the results of K-means on $\Phi_\theta(X)$ rather than K-means on $\Phi_\eta(X)$ in the 'Pretext [7] + K-means' row of Table 1. But you seem to report the result of K-means on $\Phi_\eta(X)$, which is the 'bug' I meant. This is easy to misunderstand, at least for me.

Maybe 'bug' wasn't the right word (I'm not a native English speaker). I apologize for any confusion.

@wvangansbeke
Owner

wvangansbeke commented Mar 23, 2023

No, you are misunderstanding something. We cluster the features of $\Phi_\theta$, not $\Phi_\eta$. The latter does not make sense.

@gihanjayatilaka

gihanjayatilaka commented Jul 11, 2023

Edit: I was able to replicate @wvangansbeke's result. My code is not general enough to share (it is part of something more convoluted).

A few issues you might be running into (see the sketch after this list):

  1. Make sure you extract the pretext features with the model in the eval() state.
  2. Compute the mean and standard deviation on the train set (the full dataset) and normalize both the train and test sets using these statistics.
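
A minimal sketch of both points (model, loader, and attribute names are hypothetical; here the train-set statistics are applied to the extracted features, though they could equally be applied to the input images):

```python
import torch

@torch.no_grad()
def extract_features(model, loader, device="cuda"):
    model.eval()  # point 1: eval() state (running BN statistics, no dropout)
    return torch.cat([model.backbone(x.to(device)).cpu() for x, _ in loader])

train_feats = extract_features(model, train_loader)
test_feats = extract_features(model, test_loader)

# point 2: statistics from the train set only, applied to both splits
mean, std = train_feats.mean(dim=0), train_feats.std(dim=0)
train_feats = (train_feats - mean) / (std + 1e-8)
test_feats = (test_feats - mean) / (std + 1e-8)
```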
