-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix read write tmpfile on windows #323
base: main
Are you sure you want to change the base?
fix read write tmpfile on windows #323
Conversation
@@ -351,11 +352,14 @@ def process_data_with_model( | |||
) -> DocumentLayout: | |||
"""Processes pdf file in the form of a file handler (supporting a read method) into a | |||
DocumentLayout by using a model identified by model_name.""" | |||
with tempfile.NamedTemporaryFile() as tmp_file: | |||
tmp_file.write(data.read()) | |||
tmp_file.flush() # Make sure the file is written out |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of creating a separate file, could you try with this smaller change:
with tempfile.NamedTemporaryFile() as tmp_file:
tmp_file.write(data.read())
tmp_file.flush()
tmp_file.seek(0)
layout = process_file_with_model(...)
Unfortunately I don't have a windows machine to check myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, probably it's not this issue - I'd try to create tempfile with options:
tempfile.NamedTemporaryFile(delete=True, delete_on_close=False)
According to windows recommendations from here:
https://docs.python.org/3/library/tempfile.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, probably it's not this issue - I'd try to create tempfile with options:
tempfile.NamedTemporaryFile(delete=True, delete_on_close=False)According to windows recommendations from here: https://docs.python.org/3/library/tempfile.html
Note that this only exists for Python 3.12, the docs for even Python 3.11 do not have the delete_on_close
parameter: https://docs.python.org/3.11/library/tempfile.html#tempfile.NamedTemporaryFile
The reason I'm commenting is because I'm also running into this issue on windows
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And wound it defeat the purpose of a temp file to leave it on device after things have been done with it?
I guess you could do something like this if you dont want to use a temp folder
def process_data_with_model(
data: BinaryIO,
model_name: Optional[str],
**kwargs,
) -> DocumentLayout:
"""Processes pdf file in the form of a file handler (supporting a read method) into a
DocumentLayout by using a model identified by model_name."""
file_name = ''
with tempfile.NamedTemporaryFile(delete=False) as tmp_file:
tmp_file.write(data.read())
tmp_file.flush() # Make sure the file is written out
file_name = tmp_file.name
try:
layout = process_file_with_model(
file_name,
model_name,
**kwargs,
)
finally:
os.remove(file_name)
return layout
This is a low impact fix for reading still open temporary files on windows.
#303