Elian Rafael Dal Prá, Sean Johnson, Leonardo Scabini, Raí Fernando Dal Prá, João Vitor Brentigani Torezan, Daniel Baldin Franceschini, Bruno Pereira Kellm, Marcelo Soccol Gris, Odemir Martinez Bruno
Note: we will soon update this repository with a detailed explanation about our approach and methods.
Our submission consists of two parts. The first, in the first-part directory, is a rougher approach to ink detection, where each 2D 64x64 output patch is classified as ink or no-ink with a single scalar value: binary classification on patches. This was done so that the model does not output shapes during inference. We then created a new training dataset from this model's predictions and used an already verified method, Youssef's First Letters model, to improve the readability of our previous predictions; this is the content of the second-part directory.
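As a rough illustration of how one scalar per patch can become a readable image, here is a minimal sketch, not the code in this repository. It assumes non-overlapping 64x64 patches laid out on a regular grid and scores already scaled to [0, 1]; the function name `paint_patch_scores` is hypothetical.

```python
import numpy as np

PATCH = 64  # side of each 2D output patch, as described above


def paint_patch_scores(scores, patch=PATCH):
    """Expand a (rows, cols) grid of per-patch ink scores in [0, 1]
    into a (rows * patch, cols * patch) uint8 image.

    Assumption: patches do not overlap and tile the segment on a grid.
    """
    scores = np.clip(np.asarray(scores, dtype=np.float32), 0.0, 1.0)
    # Repeat every scalar over a patch x patch block.
    image = np.kron(scores, np.ones((patch, patch), dtype=np.float32))
    # Pixel intensity encodes the model's confidence for that patch.
    return np.round(image * 255).astype(np.uint8)


# Hypothetical 2x2 grid of patch scores.
scores = np.array([[0.0, 1.0], [1.0, 0.2]])
image = paint_patch_scores(scores)
```

Because each patch carries only one value, the model never draws a shape itself; any letter that appears comes from neighbouring patches agreeing with each other.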
Our main objective was to use machine learning to detect which regions of the papyrus segments contain ink, and to let letters form without the model outputting any shape during inference. To achieve this, we performed classification on square patches of size
The rationale behind this comes from the difficulty of creating an accurate training dataset. Casey Handmer's discovery of the crackle pattern opened a new way of looking at segments, and it was the basis for our classification model. We created two
Since we found no significant sign of ink in the segments other than crackle, we did not feel confident predicting any kind of shape with our model: we could only label the crackle signs, without knowing whether anything else was present (their boundaries are also hard to decide on). But classifying small subvolumes of
On the left figure, a
To avoid this at any scale, we preferred a model that produces more noise but a more accurate representation of the papyrus itself. Confidence is given by pixel intensity, and letter shapes emerge easily where the signal is strong. We hope this helps papyrologists identify letters and judge how certain each one is. The "noise" can be attributed to the use of
To confirm our results and improve their readability, we augmented this model's output and used it to train Youssef's model, which had already been verified. Its predictions confirmed our results.
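The exact augmentations are not listed here, so the following is only a sketch of one common choice, assumed for illustration: the eight flip and rotation variants of a training pair, applied identically to the input and to the first-stage prediction used as its label. The function name `dihedral_augment` and the array shapes are hypothetical.

```python
import numpy as np


def dihedral_augment(image, label):
    """Yield the 8 flip/rotation variants of an (image, label) pair.

    Assumption: both arrays share their first two (height, width) axes;
    the image may carry extra trailing axes such as depth.
    """
    for k in range(4):
        # Rotate by k * 90 degrees in the height/width plane.
        rot_img = np.rot90(image, k, axes=(0, 1))
        rot_lab = np.rot90(label, k, axes=(0, 1))
        yield rot_img, rot_lab
        # Mirror the rotated pair horizontally.
        yield np.flip(rot_img, axis=1), np.flip(rot_lab, axis=1)


# Hypothetical pair: a 64x64x16 subvolume and a 64x64 first-stage prediction.
rng = np.random.default_rng(0)
subvolume = rng.random((64, 64, 16), dtype=np.float32)
pseudo_label = rng.random((64, 64), dtype=np.float32)
pairs = list(dihedral_augment(subvolume, pseudo_label))
```

Applying the same transform to both arrays keeps every label pixel aligned with the region of the input it describes.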
In conclusion, we saw many examples showing that ink detection is possible on cleaner segments. Our approach was therefore to build a model that gives high certainty about what can actually be found in the segments, avoiding hallucinations as best we can and providing an accurate way to assert the letters in them.