
Set baseline expectations for categorical cross-entropy #472

Open · wants to merge 1 commit into main

Conversation

qualiaMachine
Collaborator

In the same spirit as the regression episode, I thought it might be useful to establish a baseline expectation in terms of the categorical cross-entropy loss metric. You can calculate the expected loss of a model that guesses uniformly at random as log(n), where n is the number of classes.
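To illustrate the idea, here is a rough sketch (not code from the lesson; the class count and variable names are my own assumptions): a model that predicts uniform probabilities for every sample scores a categorical cross-entropy of exactly log(n).

```python
# Hypothetical sketch: verify that uniform guessing yields a loss of log(n).
import math
import numpy as np

n_classes = 10   # assumed class count, purely for illustration
n_samples = 100

rng = np.random.default_rng(0)
# One-hot true labels and uniform predicted probabilities.
labels = np.eye(n_classes)[rng.integers(0, n_classes, size=n_samples)]
preds = np.full((n_samples, n_classes), 1.0 / n_classes)

# Categorical cross-entropy: mean over samples of -sum(y * log(p)).
loss = -np.mean(np.sum(labels * np.log(preds), axis=1))
baseline = math.log(n_classes)
print(loss, baseline)  # both ≈ 2.3026
```

A trained model should beat this number early in training; hovering near it is a sign something is wrong.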


github-actions bot commented Jun 4, 2024

Thank you!

Thank you for your pull request 😃

🤖 This automated message can help you check the rendered files in your submission for clarity. If you have any questions, please feel free to open an issue in {sandpaper}.

If you have files that automatically render output (e.g. R Markdown), then you should check for the following:

  • 🎯 correct output
  • 🖼️ correct figures
  • ❓ new warnings
  • ‼️ new errors

Rendered Changes

🔍 Inspect the changes: https://github.com/carpentries-incubator/deep-learning-intro/compare/md-outputs..md-outputs-PR-472

The following changes were observed in the rendered markdown documents:

 2-keras.md | 10 +++++++++-
 md5sum.txt |  4 ++--
 2 files changed, 11 insertions(+), 3 deletions(-)
What does this mean?

If you have source files that require output and figures to be generated (e.g. R Markdown), then it is important to make sure the generated figures and output are reproducible.

This output provides a way for you to inspect the output in a diff-friendly manner so that it's easy to see the changes that occur due to new software versions or randomisation.

⏱️ Updated at 2024-06-04 16:57:15 +0000

github-actions bot pushed a commit that referenced this pull request Jun 4, 2024
Collaborator

@svenvanderburg svenvanderburg left a comment


I like the idea @qualiaMachine. IMO, it makes sense to introduce baselines in episode 3. I see episode 2 as quickly going through the deep learning cycle once, without going into details. Episode 3 then greatly expands on the cycle with more advanced concepts.

If you do want to introduce baselines in episode 2, I would suggest we add a callout box and compute the baseline accuracy instead of categorical cross-entropy loss, to make things a little bit more intuitive for people without a mathematical background.
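To make the suggestion concrete, a rough sketch of what an accuracy baseline could look like (the class count and names here are assumptions, not lesson code): random guessing over n balanced classes gives an expected accuracy of 1/n, and always predicting the majority class gives at least that.

```python
# Hedged sketch of two accuracy baselines for an assumed 3-class problem.
import numpy as np

n_classes = 3
rng = np.random.default_rng(42)
y_true = rng.integers(0, n_classes, size=10_000)

# Random-guess baseline: predict a uniformly random class per sample.
y_guess = rng.integers(0, n_classes, size=y_true.size)
random_acc = np.mean(y_guess == y_true)

# Majority-class baseline: always predict the most frequent class.
majority_acc = np.bincount(y_true).max() / y_true.size

print(random_acc)    # close to 1/3
print(majority_acc)  # at least 1/3
```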

What do you think?

@qualiaMachine
Collaborator Author

Thank you for your comment, @svenvanderburg! I agree that a baseline measure in terms of accuracy is probably more intuitive. However, I would argue that it's useful for real-world deep learning practitioners to know the equation for establishing a baseline using cross-entropy loss. It's a very common loss metric that comes up in all kinds of classification problems. Even if they don't understand the math fully, it can be a useful thing to memorize down the line, and it can help detect problems while the model is still training. In contrast, interpreting the confusion matrix / accuracy is intuitive enough that I'm not sure it's worth a callout.

@svenvanderburg
Collaborator

@qualiaMachine Agree that it is useful to know. Although I worked on deep learning for many years with just an intuitive understanding of crossentropy without the mathematics 🙈😂.

Maybe it's an idea that we introduce it in episode 4? There we also use categorical crossentropy, if I am not mistaken. That way we keep episode 2 relatively clean and don't overwhelm students.

Otherwise I suggest putting the addition that you currently have in a callout box, and adding a little bit more context. I think to fully understand why the loss would be log(n) you need some more explanation of the mathematics.
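For reference, the math in question is short; a sketch of why uniform guessing yields log(n), assuming a one-hot label vector y and predicted probabilities p:

```latex
% Categorical cross-entropy for one sample:
L = -\sum_{i=1}^{n} y_i \log p_i
% With uniform predictions p_i = 1/n, only the true-class term survives:
L = -\log\tfrac{1}{n} = \log n
```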

@qualiaMachine
Collaborator Author

I totally get the desire to keep episode 2 light. The only thing that has me wanting to stick to episode 2 is that that's where we introduce categorical cross-entropy. I think my explanation of the baseline math might make more sense if I also write a little paragraph unpacking categorical cross-entropy loss in episode 2. I can probably do that all as a callout box if that seems most appropriate? I could also concede and stick to episode 4 if you'd really like -- I won't die on this hill haha.

@svenvanderburg
Collaborator

OK, let's go for a callout box explaining crossentropy loss and introducing the baseline loss in episode 2. We can always move it to 4 if it doesn't work.
