differences for PR #472
actions-user committed Jun 4, 2024
1 parent 644557d commit 524c525
Showing 4 changed files with 11 additions and 3 deletions.
10 changes: 9 additions & 1 deletion 2-keras.md
@@ -479,7 +479,7 @@ For the one-hot encoding that we selected before a fitting loss function is the
In Keras this is implemented in the `keras.losses.CategoricalCrossentropy` class.
This loss function works well in combination with the `softmax` activation function
we chose earlier.
- The Categorical Crossentropy works by comparing the probabilities that the
+ The *categorical cross-entropy* works by comparing the probabilities that the
neural network predicts with the 'true' probabilities that we generated using the one-hot encoding.
This is a measure of how closely the distribution of the three neural network outputs matches the distribution of the three values in the one-hot encoding.
The loss is lower when the distributions are more similar.
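
To make this concrete, here is a minimal sketch (assuming `keras` is imported from TensorFlow, as elsewhere in this episode; the probability vectors are made up for illustration) comparing the loss for a prediction close to the one-hot target with one far from it:
```python
from tensorflow import keras

cce = keras.losses.CategoricalCrossentropy()
y_true = [[0.0, 1.0, 0.0]]  # one-hot encoded 'true' class

# A prediction close to the target gives a small loss ...
print(cce(y_true, [[0.1, 0.8, 0.1]]).numpy())  # ~0.22
# ... while a prediction far from the target gives a larger loss.
print(cce(y_true, [[0.4, 0.2, 0.4]]).numpy())  # ~1.61
```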
@@ -519,6 +519,14 @@ history = model.fit(X_train, y_train, epochs=100)

The `fit` method returns a history object with a `history` attribute that holds the training loss and
potentially other metrics for each training epoch.
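
For example, the recorded values can be read straight from that attribute once training has finished (a minimal sketch, reusing the `history` object returned by `model.fit` above):
```python
# history.history is a dictionary mapping each metric name to a list with one value per epoch
print(history.history.keys())       # e.g. dict_keys(['loss'])
print(history.history['loss'][:5])  # training loss for the first five epochs
```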

### Setting baseline expectations
What would be a good value for the categorical cross-entropy loss here? In a classification setting, we can establish a baseline by working out what loss we would get by simply guessing each class at random. If the model assigns equal probability to every class, the cross-entropy loss is approximately log(n), where n is the number of classes. Any useful model should outperform this baseline, i.e. achieve a lower loss.
```python
import numpy as np
np.log(3)
```
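
As a sanity check, the same number falls out of the cross-entropy formula when the predicted probabilities are uniform over the three classes (a minimal sketch using only NumPy):
```python
import numpy as np

# A model that guesses at random assigns probability 1/3 to each of the three classes.
uniform_prediction = np.full(3, 1 / 3)
one_hot_truth = np.array([1, 0, 0])  # the true class happens to be the first one

# Categorical cross-entropy: -sum(true * log(predicted)); only the true class
# contributes, giving -log(1/3) = log(3) ≈ 1.0986.
print(-np.sum(one_hot_truth * np.log(uniform_prediction)))
```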

It can be very insightful to plot the training loss to see how the training progresses.
Using seaborn we can do this as follows:
```python
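# The body of this code block is collapsed in the diff; a minimal sketch of
# plotting the per-epoch training loss, assuming the `history` object
# returned by `model.fit` above:
import seaborn as sns

loss_per_epoch = history.history['loss']
sns.lineplot(x=range(len(loss_per_epoch)), y=loss_per_epoch)
```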
Empty file modified fig/03_tensorboard.png
100755 → 100644
Empty file modified fig/04_conv_image.png
100755 → 100644
4 changes: 2 additions & 2 deletions md5sum.txt
@@ -6,7 +6,7 @@
"links.md" "8184cf4149eafbf03ce8da8ff0778c14" "site/built/links.md" "2024-06-04"
"paper.md" "9a65da74572113c228e0c225f62e7b78" "site/built/paper.md" "2024-06-04"
"episodes/1-introduction.Rmd" "70b9b736d80b1f243d82bfae89537b0e" "site/built/1-introduction.md" "2024-06-04"
"episodes/2-keras.Rmd" "88fb676ec63d72886c544e305e66cd4a" "site/built/2-keras.md" "2024-06-04"
"episodes/2-keras.Rmd" "6c31e04a20e52d4a156f629cd2f6c5af" "site/built/2-keras.md" "2024-06-04"
"episodes/3-monitor-the-model.Rmd" "d9a73639b67c9c4a149a28d1df067dd2" "site/built/3-monitor-the-model.md" "2024-06-04"
"episodes/4-advanced-layer-types.Rmd" "a62388b287760267459c61e677d3379d" "site/built/4-advanced-layer-types.md" "2024-06-04"
"episodes/5-transfer-learning.Rmd" "03f95721c1981d0fdc19e3d1f5da35ec" "site/built/5-transfer-learning.md" "2024-06-04"
@@ -19,4 +19,4 @@
"learners/reference.md" "ae95aeca6d28f5f0f994d053dc10d67c" "site/built/reference.md" "2024-06-04"
"learners/setup.md" "8833a7eec970679022fca980623323ff" "site/built/setup.md" "2024-06-04"
"profiles/learner-profiles.md" "698c27136a1a320b0c04303403859bdc" "site/built/learner-profiles.md" "2024-06-04"
"renv/profiles/lesson-requirements/renv.lock" "645b9b8534f309dd234b877a28ce71e8" "site/built/renv.lock" "2024-06-04"
"renv/profiles/lesson-requirements/renv.lock" "38d3eab909262adbaf1bf0ffe6f2fce7" "site/built/renv.lock" "2024-06-04"
