When converting a specific PDF document to JSON using the docling CLI tool, the process fails with a KeyError: 'prediction' in the table_structure_model.py file.
This error occurs because the prediction key is missing in the table_out["predict_details"] dictionary during the table structure recognition step.
Notably, this issue only occurs with this specific PDF document, as other similar documents (around 10 tested so far) convert successfully.
Steps to reproduce
Run the following command in the terminal:
docling --from pdf --to json --image-export-mode placeholder --output /tmp https://venda-imoveis.caixa.gov.br/editais/EL01030224CPARE.PDF
Observe the error message:
WARNING:docling.pipeline.base_pipeline:Encountered an error during conversion of document 3be6f7171b899a5cd051aefc1d9c3782971ce2a31a8394d1593596e4bf0d0f66:
Traceback (most recent call last):
  File "/home/zvictor/development/martelada/data-lab/.venv/lib/python3.12/site-packages/docling/pipeline/base_pipeline.py", line 150, in _build_document
    for p in pipeline_pages:  # Must exhaust!
             ^^^^^^^^^^^^^^
  File "/home/zvictor/development/martelada/data-lab/.venv/lib/python3.12/site-packages/docling/pipeline/base_pipeline.py", line 116, in _apply_on_pages
    yield from page_batch
  File "/home/zvictor/development/martelada/data-lab/.venv/lib/python3.12/site-packages/docling/models/page_assemble_model.py", line 60, in __call__
    for page in page_batch:
                ^^^^^^^^^^
  File "/home/zvictor/development/martelada/data-lab/.venv/lib/python3.12/site-packages/docling/models/table_structure_model.py", line 215, in __call__
    otsl_seq = table_out["predict_details"]["prediction"][
               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
KeyError: 'prediction'
The issue appears to be specific to the structure or content of this PDF document, as other similar documents process without errors. This suggests that the document may contain unexpected or unsupported table structures that the model cannot handle.
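To illustrate the failure mode described above, here is a minimal, self-contained sketch in plain Python. It does not use docling itself. The dictionaries are hypothetical stand-ins shaped only after the two keys visible in the traceback (`predict_details` and `prediction`), the placeholder values are invented, and the helper `extract_prediction` is an assumed name, not part of docling. It shows how direct indexing raises the same `KeyError` and how a guarded lookup would let a caller skip the affected table instead of aborting the whole conversion. Whether that is the right fix for docling is an open question for the maintainers.

```python
from typing import Any, Optional


def extract_prediction(table_out: dict[str, Any]) -> Optional[Any]:
    """Hypothetical helper: return the prediction entry, or None if it is absent."""
    details = table_out.get("predict_details") or {}
    return details.get("prediction")


# Hypothetical model outputs; the values are placeholders, not real docling data.
good_output = {"predict_details": {"prediction": ["placeholder"]}}
bad_output = {"predict_details": {}}  # the "prediction" key is missing

# Direct indexing, as on the line shown in the traceback, raises KeyError.
try:
    bad_output["predict_details"]["prediction"]
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]

# The guarded lookup returns None instead of raising.
print(missing_key)
print(extract_prediction(good_output))
print(extract_prediction(bad_output))
```

A caller could treat a `None` result as "no table structure recognized" and continue with the remaining pages, at the cost of silently dropping that table.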