Feat/chipper gpu float16 #267

Merged
ajjimeno merged 2 commits into main from feat/chipper-gpu-float16
Oct 26, 2023


ajjimeno
Contributor

We previously added automatic mixed precision for CPU, but it is also needed when running on GPU. This PR enables it for GPU.
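For readers unfamiliar with the technique, here is a minimal sketch of automatic mixed precision using PyTorch's `torch.autocast`. It is not the diff from this PR: the single linear layer, the helper names `autocast_dtype_name` and `run_with_autocast`, and the choice of `bfloat16` on CPU are assumptions made for illustration only.

```python
# Minimal sketch of automatic mixed precision (AMP) with torch.autocast.
# Assumption: illustrative only, not the code changed in this PR.
try:
    import torch
except ImportError:  # the dtype helper below still works without PyTorch
    torch = None


def autocast_dtype_name(device_type: str) -> str:
    """Name of the lower-precision dtype assumed here for each device type."""
    return "float16" if device_type == "cuda" else "bfloat16"


def run_with_autocast():
    """Run one linear layer under autocast and return the output dtype."""
    device_type = "cuda" if torch.cuda.is_available() else "cpu"
    layer = torch.nn.Linear(8, 4).to(device_type)  # hypothetical stand-in model
    x = torch.randn(2, 8, device=device_type)
    dtype = getattr(torch, autocast_dtype_name(device_type))
    # Inside the context, eligible ops such as matmul run in the lower precision.
    with torch.no_grad(), torch.autocast(device_type=device_type, dtype=dtype):
        out = layer(x)
    return out.dtype


if torch is not None:
    print(run_with_autocast())
```

The weights stay in float32; only the operations executed inside the `autocast` context are cast, which is why the approach is called "mixed" precision.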

To test, it is probably enough to compare a run before and after the change on any long page, for example:

from unstructured_inference.inference.layout import DocumentLayout
from unstructured_inference.models.base import get_model

model = get_model("chipper")
doc = DocumentLayout.from_file("sample-docs/layout-parser-paper.pdf",
                               detection_model=model, pdf_image_dpi=300)

@ajjimeno ajjimeno requested review from LaverdeS and mengdih October 24, 2023 01:26
Contributor

@LaverdeS LaverdeS left a comment


LGTM. Tested on Colab (log in to HF to use chipperv2) with a T4 GPU, using this snippet:

%%time
doc = %memit -i3 DocumentLayout.from_file("donut_rl_merged.pdf", detection_model="chipper", pdf_image_dpi=300)

Here are the results for unstructured==0.7.10:

peak memory: 5038.44 MiB, increment: 53.56 MiB
CPU times: user 16min 18s, sys: 34.1 s, total: 16min 52s
Wall time: 16min 36s

And the results for unstructured==0.7.10-dev0 (with the changes in this PR):

peak memory: 4999.92 MiB, increment: 940.64 MiB
CPU times: user 17min 19s, sys: 36.8 s, total: 17min 56s
Wall time: 17min 37s

Both versions ran and successfully parsed a 66-page PDF document (an academic paper).

The memory increment does change a lot, from 53.56 MiB to 940.64 MiB, but GPU RAM usage looks better. In the image, the first part of the GPU RAM region, up to where the red peak ends, corresponds to unstructured==0.7.10; after that there is a runtime restart, and the remaining part, in orange, corresponds to unstructured==0.7.10-dev0.

image
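To make the before/after comparison easier to read, the reported figures can be turned into deltas. The sketch below only redoes the arithmetic on the numbers quoted above; the `runs` dictionary and the `deltas` helper are illustrative names. Note that `%memit` reports host process memory, so GPU RAM is not covered by these numbers.

```python
# Figures copied from the two %memit / %%time reports in this comment.
runs = {
    "0.7.10": {"peak_mib": 5038.44, "increment_mib": 53.56, "wall_s": 16 * 60 + 36},
    "0.7.10-dev0": {"peak_mib": 4999.92, "increment_mib": 940.64, "wall_s": 17 * 60 + 37},
}


def deltas(before, after):
    """Return after minus before for every metric, rounded to 2 decimals."""
    return {key: round(after[key] - before[key], 2) for key in before}


diff = deltas(runs["0.7.10"], runs["0.7.10-dev0"])
# Relative wall-time change of the new version, in percent.
slowdown_pct = round(100 * diff["wall_s"] / runs["0.7.10"]["wall_s"], 1)

print(diff)
print(slowdown_pct)
```

On these figures, peak host memory drops by about 38.5 MiB, the increment grows by about 887 MiB, and wall time increases by 61 s (roughly 6%). This is a single run per version, so small differences may be noise.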

@ajjimeno ajjimeno merged commit 76ac6ab into main Oct 26, 2023
8 checks passed
@ajjimeno ajjimeno deleted the feat/chipper-gpu-float16 branch October 26, 2023 23:19