
Set baseline expectations for categorical cross-entropy #472

Open · wants to merge 1 commit into main

Conversation

qualiaMachine
Collaborator

In the same spirit as the regression episode, I thought it might be useful to establish a baseline expectation in terms of the categorical cross-entropy loss metric. You can calculate the expected loss of a model that guesses uniformly at random as log(n), where n is the number of classes.
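To illustrate the idea, here is a rough sketch (not code from the lesson; the class count and variable names are my own assumptions): a model that predicts uniform probabilities for every sample scores a categorical cross-entropy of exactly log(n).

```python
# Hypothetical sketch: verify that uniform guessing yields a loss of log(n).
import math
import numpy as np

n_classes = 10   # assumed class count, purely for illustration
n_samples = 100

rng = np.random.default_rng(0)
# One-hot true labels and uniform predicted probabilities.
labels = np.eye(n_classes)[rng.integers(0, n_classes, size=n_samples)]
preds = np.full((n_samples, n_classes), 1.0 / n_classes)

# Categorical cross-entropy: mean over samples of -sum(y * log(p)).
loss = -np.mean(np.sum(labels * np.log(preds), axis=1))
baseline = math.log(n_classes)
print(loss, baseline)  # both ≈ 2.3026
```

A trained model should beat this number early in training; hovering near it is a sign something is wrong.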


github-actions bot commented Jun 4, 2024

Thank you!

Thank you for your pull request 😃

🤖 This automated message can help you check the rendered files in your submission for clarity. If you have any questions, please feel free to open an issue in {sandpaper}.

If you have files that automatically render output (e.g. R Markdown), then you should check for the following:

  • 🎯 correct output
  • 🖼️ correct figures
  • ❓ new warnings
  • ‼️ new errors

Rendered Changes

🔍 Inspect the changes: https://github.com/carpentries-incubator/deep-learning-intro/compare/md-outputs..md-outputs-PR-472

The following changes were observed in the rendered markdown documents:

 2-keras.md | 10 +++++++++-
 md5sum.txt |  4 ++--
 2 files changed, 11 insertions(+), 3 deletions(-)
What does this mean?

If you have source files that require output and figures to be generated (e.g. R Markdown), then it is important to make sure the generated figures and output are reproducible.

This output provides a way for you to inspect the output in a diff-friendly manner so that it's easy to see the changes that occur due to new software versions or randomisation.

⏱️ Updated at 2024-06-04 16:57:15 +0000

github-actions bot pushed a commit that referenced this pull request Jun 4, 2024
Collaborator

@svenvanderburg svenvanderburg left a comment


I like the idea @qualiaMachine. IMO, it makes sense to introduce baselines in episode 3. I see episode 2 as quickly going through the deep learning cycle once, without going into details. Episode 3 then greatly expands on the cycle with more advanced concepts.

If you do want to introduce baselines in episode 2, I would suggest we add a callout box and compute the baseline accuracy instead of categorical cross-entropy loss, to make things a little bit more intuitive for people without a mathematical background.
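To make the suggestion concrete, a rough sketch of what an accuracy baseline could look like (the class count and names here are assumptions, not lesson code): random guessing over n balanced classes gives an expected accuracy of 1/n, and always predicting the majority class gives at least that.

```python
# Hedged sketch of two accuracy baselines for an assumed 3-class problem.
import numpy as np

n_classes = 3
rng = np.random.default_rng(42)
y_true = rng.integers(0, n_classes, size=10_000)

# Random-guess baseline: predict a uniformly random class per sample.
y_guess = rng.integers(0, n_classes, size=y_true.size)
random_acc = np.mean(y_guess == y_true)

# Majority-class baseline: always predict the most frequent class.
majority_acc = np.bincount(y_true).max() / y_true.size

print(random_acc)    # close to 1/3
print(majority_acc)  # at least 1/3
```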

What do you think?

@qualiaMachine
Collaborator Author

Thank you for your comment, @svenvanderburg! I agree that a baseline measure in terms of accuracy is probably more intuitive. However, I would argue that it's useful for real-world deep learning practitioners to know the equation for establishing a baseline using cross-entropy loss. It's a very common loss metric that comes up in all kinds of classification problems. Even if they don't understand the math fully, it can be a useful thing to memorize down the line, and it can help detect problems while the model is still training. In contrast, interpreting the confusion matrix / accuracy is intuitive enough that I'm not sure it's worth a callout.

@svenvanderburg
Collaborator

@qualiaMachine Agree that it is useful to know. Although I worked on deep learning for many years with just an intuitive understanding of crossentropy without the mathematics 🙈😂.

Maybe it's an idea that we introduce it in episode 4? There we also use categorical crossentropy, if I am not mistaken. That way we keep episode 2 relatively clean and don't overwhelm students.

Otherwise I suggest putting the addition that you currently have in a callout box, and adding a little bit more context. I think to fully understand why the loss would be log(n) you need some more explanation of the mathematics.
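For reference, the math in question is short; a sketch of why uniform guessing yields log(n), assuming a one-hot label vector y and predicted probabilities p:

```latex
% Categorical cross-entropy for one sample:
L = -\sum_{i=1}^{n} y_i \log p_i
% With uniform predictions p_i = 1/n, only the true-class term survives:
L = -\log\tfrac{1}{n} = \log n
```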

@qualiaMachine
Collaborator Author

I totally get the desire to keep episode 2 light. The only thing that has me wanting to stick to episode 2 is that that's where we introduce categorical cross-entropy. I think my explanation of the baseline math might make more sense if I also write a little paragraph unpacking categorical cross-entropy loss in episode 2. I can probably do that all as a callout box if that seems most appropriate? I could also concede and stick to episode 4 if you'd really like -- I won't die on this hill haha.

@svenvanderburg
Collaborator

OK, let's go for a callout box explaining crossentropy loss and introducing the baseline loss in episode 2. We can always move it to 4 if it doesn't work.
