About "Pretext + Kmeans' #126

Open
fi591 opened this issue Oct 31, 2022 · 7 comments

@fi591

fi591 commented Oct 31, 2022

Hi, I see that simple 'Pretext + Kmeans' achieves 65.2% on CIFAR-10 on average. I downloaded your model and tried it, but I only get 33.3%. Can you share your settings, or anything special you used? (I didn't see it in your code.)

@spdj2271

I have the same question. This is what I get after running simclr.py and then performing K-means on the learned representations on the CIFAR-10 dataset. The clustering performance is much lower than Pretext + K-means (ACC = 65.9) in Table 1 of the paper.
[image: K-means clustering results on CIFAR-10]

@wvangansbeke
Owner

If you don't get ~65%, you're not applying K-means correctly. You need to normalize the features, fit K-means on the train set, and report the results on the validation set. You also need to average the results over multiple runs. The K-means clustering code is mostly the same as what I provided in my other repositories, e.g. the one for semantic segmentation.
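
For reference, a minimal sketch of that protocol (function and variable names here are illustrative, not from the codebase; it assumes the 512-d pretext features are already extracted as NumPy arrays):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize

def kmeans_accuracy(train_feats, val_feats, val_labels, n_clusters=10, n_runs=5):
    """Fit K-means on L2-normalized train features, predict on the val set,
    and average the Hungarian-matched accuracy over several runs."""
    train_feats = normalize(train_feats)  # L2-normalize each feature vector
    val_feats = normalize(val_feats)
    accs = []
    for seed in range(n_runs):
        km = KMeans(n_clusters=n_clusters, random_state=seed).fit(train_feats)
        pred = km.predict(val_feats)
        # Count co-occurrences of predicted clusters and ground-truth labels,
        # then find the best one-to-one matching (Hungarian algorithm).
        counts = np.zeros((n_clusters, n_clusters), dtype=np.int64)
        for p, t in zip(pred, val_labels):
            counts[p, t] += 1
        rows, cols = linear_sum_assignment(counts.max() - counts)
        accs.append(counts[rows, cols].sum() / len(val_labels))
    return float(np.mean(accs))
```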

@spdj2271

Dear author, I obtained a result similar to the one in Table 3 of the paper (ACC = 65.9, NMI = 59.8, ARI = 50.9):

Evaluate with hungarian matching algorithm ...
{'ACC': 0.6829, 'ARI': 0.4890204494614667, 'NMI': 0.5742197581324099, 'ACC Top-5': 0.9585,

However, I obtained this result by performing K-means on the outputs of the clustering network (i.e., $\Phi_\eta(X) \in \mathbb{R}^{10}$) before clustering training starts, not on the outputs of the pretext network (i.e., $\Phi_\theta(X) \in \mathbb{R}^{512}$). This seems like a bug.

I believe performing K-means on the pretext network's features is important, since it often serves as a baseline, even if the clustering network you propose achieves good performance.

@wvangansbeke
Owner

You should indeed cluster the pretext features, not the class vectors. Yes, we use K-means as a baseline in our paper.
It's not clear to me what you believe is a 'bug', though. I was able to get the same numbers for K-means with this codebase and the provided models. Most issues are caused by not normalizing properly or not using the correct pretext features.
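
For concreteness, a minimal sketch of which outputs to cluster (the attribute names are hypothetical and may differ from the actual model definition):

```python
import torch

@torch.no_grad()
def pretext_features(model, images):
    """Return the pretext features Phi_theta(X) in R^512 for K-means,
    not the cluster-head outputs Phi_eta(X) in R^10.
    (`backbone` / `cluster_head` are hypothetical attribute names.)"""
    model.eval()
    feats = model.backbone(images)       # Phi_theta(X): [N, 512] -> cluster these
    # model.cluster_head(feats)          # Phi_eta(X):   [N, 10]  -> not these
    return feats
```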

@spdj2271

I mean it would be clearer to report the results of K-means on $\Phi_\theta(X)$ rather than K-means on $\Phi_\eta(X)$ in the 'Pretext [7] + K-means' row of Table 1. But you seem to report the result of K-means on $\Phi_\eta(X)$, which is the 'bug' I meant. This is easy to misunderstand, at least for me.

Maybe 'bug' wasn't the right word (I'm not a native English speaker). I apologize for any confusion.

@wvangansbeke
Owner

wvangansbeke commented Mar 23, 2023

No, you are misunderstanding something. We cluster the features of $\Phi_\theta$, not $\Phi_\eta$. The latter does not make sense.

@gihanjayatilaka

gihanjayatilaka commented Jul 11, 2023

Edit: I was able to replicate @wvangansbeke's result. My code is not general enough to share (it is part of something more convoluted).

A few issues you might be running into (see the sketch after this list):

  1. Make sure you extract the pretext features with the model in the eval() state.
  2. Compute the mean and standard deviation on the train set (the full dataset) and normalize both the train and test sets using these statistics.
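
A minimal sketch of both points (model, loader, and attribute names are hypothetical; here the train-set statistics are applied to the extracted features, though they could equally be applied to the input images):

```python
import torch

@torch.no_grad()
def extract_features(model, loader, device="cuda"):
    model.eval()  # point 1: eval() state (running BN statistics, no dropout)
    return torch.cat([model.backbone(x.to(device)).cpu() for x, _ in loader])

train_feats = extract_features(model, train_loader)
test_feats = extract_features(model, test_loader)

# point 2: statistics from the train set only, applied to both splits
mean, std = train_feats.mean(dim=0), train_feats.std(dim=0)
train_feats = (train_feats - mean) / (std + 1e-8)
test_feats = (test_feats - mean) / (std + 1e-8)
```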
