Toil complains that a filename with spaces "contains illegal characters" while cwltool works fine #5158

Open
ndonyapour opened this issue Nov 14, 2024 · 5 comments
@ndonyapour

ndonyapour commented Nov 14, 2024

Hello, we have workflows running with Toil that run into issues when datasets contain files and folders with whitespace in their names. We’d like to preserve the original data without running a preprocessing step to rename these files and folders. Is it possible to add support for file and folder names that contain whitespace?

Thank you!

┆Issue is synchronized with this Jira Story
┆Issue Number: TOIL-1669

@ndonyapour changed the title from "Request to allow whitespaces in folder and file Names" to "Request to allow whitespaces in folder and file names" on Nov 14, 2024
@adamnovak
Member

As far as I know, Toil should already support whitespace in path names for input and output files. I don't know of anything in its import/export machinery that would fail on it.

Are you using Toil with the CWL or WDL front-end? Are you using a particular CWL or WDL workflow? It's easy to write a WDL workflow that doesn't quote filename placeholders and won't work properly when filenames contain spaces.

Can you provide a reproducible example of Toil not supporting whitespace in a filename?
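To illustrate the quoting point above, here is a minimal Python sketch of what happens when a path with a space is spliced into a shell command line without quoting. The helper `build_command` and the use of `wc` are hypothetical, chosen only for illustration; this is not how Toil or any WDL runtime builds commands.

```python
import shlex


def build_command(tool: str, path: str, quote: bool) -> str:
    """Render a shell command line that runs `tool` on `path` (hypothetical helper)."""
    arg = shlex.quote(path) if quote else path
    return f"{tool} {arg}"


path = "CD_SOD1_2_E1023884 __1"

unquoted = build_command("wc", path, quote=False)
quoted = build_command("wc", path, quote=True)

# shlex.split tokenizes a command line the way a POSIX shell would.
unquoted_argv = shlex.split(unquoted)  # the path is split into two arguments
quoted_argv = shlex.split(quoted)      # the path survives as one argument

print(unquoted_argv)
print(quoted_argv)
```

Without quoting, the tool receives two arguments, neither of which names the real file, so the failure shows up in the tool rather than in the workflow runner.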

@ndonyapour
Author

Actually, the error is coming from CWL, not Toil. I’m using Toil with CWL, and here’s the error I'm running into:

visit_class(d, cls, op)
     File "/home/donyapourn2/mambaforge-pypy3/envs/wic/lib/python3.10/site-packages/cwltool/utils.py", line 218, in visit_class
      visit_class(d, cls, op)
     File "/home/donyapourn2/mambaforge-pypy3/envs/wic/lib/python3.10/site-packages/cwltool/utils.py", line 213, in visit_class
      op(rec)
     File "/home/donyapourn2/mambaforge-pypy3/envs/wic/lib/python3.10/site-packages/cwltool/command_line_tool.py", line 379, in check_adjust
      raise WorkflowException(
    cwl_utils.errors.WorkflowException: Invalid filename: 'CD_SOD1_2_E1023884 __1' contains illegal characters

I don’t get this error when I run the workflow with cwltool directly.
toil_whitespace_example.zip
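For readers unfamiliar with this kind of check, the following is a rough Python sketch of a strict versus relaxed filename accept-list. The patterns, the `InvalidFilename` exception, and the `check_filename` function are assumptions made up for illustration; the real check lives in cwltool's `command_line_tool.py` and its accepted character set may differ.

```python
import re

# Hypothetical accept-lists for illustration; cwltool's real patterns may differ.
STRICT_RE = re.compile(r"[A-Za-z0-9._+,\-]+")
RELAXED_RE = re.compile(r".*")  # accepts any name that has no newline


class InvalidFilename(Exception):
    """Raised when a basename fails the active accept-list (hypothetical)."""


def check_filename(basename: str, relax_path_checks: bool = False) -> str:
    """Return `basename` unchanged if it passes the active accept-list."""
    pattern = RELAXED_RE if relax_path_checks else STRICT_RE
    if not pattern.fullmatch(basename):
        raise InvalidFilename(
            f"Invalid filename: {basename!r} contains illegal characters"
        )
    return basename


name = "CD_SOD1_2_E1023884 __1"
try:
    check_filename(name)  # strict mode rejects the space
except InvalidFilename as err:
    print(err)

print(check_filename(name, relax_path_checks=True))  # relaxed mode accepts it
```

Under these assumptions, the same basename is rejected or accepted depending only on which accept-list is active, which would explain two runners disagreeing about the same input.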

@adamnovak changed the title from "Request to allow whitespaces in folder and file names" to "Toil complains that a filename with spaces "contains illegal characters" while cwltool works fine" on Nov 14, 2024
@mr-c
Contributor

mr-c commented Nov 15, 2024


Can you try again with --relax-path-checks?

@adamnovak I remember that we had made this the default for toil-cwl-runner

@stxue1
Contributor

stxue1 commented Nov 15, 2024


The --relax-path-checks option currently defaults to False:

default=False,

Maybe it should default to True. We would then also need to either rename the argument or stop using store_true, since a store_true flag that already defaults to True could never be switched off.
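The argparse concern can be sketched as follows. This is a standalone illustration, not Toil's actual option-parsing code; the `--no-relax-path-checks` spelling is simply what `argparse.BooleanOptionalAction` (Python 3.9+) generates and is an assumption about one possible fix, not a flag that exists in toil-cwl-runner.

```python
import argparse


def flag_only_parser() -> argparse.ArgumentParser:
    """store_true with default=True: passing the flag changes nothing."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--relax-path-checks", action="store_true", default=True)
    return parser


def toggle_parser() -> argparse.ArgumentParser:
    """BooleanOptionalAction also accepts a generated --no-... form."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--relax-path-checks",
        action=argparse.BooleanOptionalAction,
        default=True,
    )
    return parser


# With store_true, the value is True whether or not the flag is given.
print(flag_only_parser().parse_args([]).relax_path_checks)
print(flag_only_parser().parse_args(["--relax-path-checks"]).relax_path_checks)

# With BooleanOptionalAction, strict checking can still be requested.
print(toggle_parser().parse_args([]).relax_path_checks)
print(toggle_parser().parse_args(["--no-relax-path-checks"]).relax_path_checks)
```

The other option mentioned, renaming the argument, would amount to a store_false flag under a new name that turns strict checking back on.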

@ndonyapour
Author

Thank you for your help! Using --relax-path-checks fixed the issue.
