Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove fast-float crate? #13035

Closed
jqnatividad opened this issue Dec 13, 2023 · 12 comments · Fixed by #19578
Closed

Remove fast-float crate? #13035

jqnatividad opened this issue Dec 13, 2023 · 12 comments · Fixed by #19578
Assignees
Labels
accepted Ready for implementation enhancement New feature or an improvement of an existing feature

Comments

@jqnatividad
Copy link
Contributor

Description

The Eisel-Lemire algorithm at the heart of the fast-float crate has been merged into std.

rust-lang/rust#86761

Does polars still need fast-float?

See aldanor/fast-float-rust#34

@jqnatividad jqnatividad added the enhancement New feature or an improvement of an existing feature label Dec 13, 2023
@orlp
Copy link
Collaborator

orlp commented Dec 15, 2023

I can run a quick benchmark today to see if there's a significant performance difference if we remove the dependency, and if not we can just remove it.

@orlp orlp self-assigned this Dec 15, 2023
@orlp
Copy link
Collaborator

orlp commented Dec 21, 2023

A bit late, I did do the benchmark and there was essentially no difference between the stdlib parse and fast_float parse.

There is however another problem:

        unsafe {
            // SAFETY: this is technically not ok, but the first line of
            // the stdlib parse implementation calls s = s.as_bytes()...
            std::str::from_utf8_unchecked(bytes).parse().ok()
        }

We directly want to parse bytestrings which fast_float supports but the stdlib doesn't. So either we do this thing which isn't really kosher, or keep using the stdlib. @ritchie46 Thoughts?

@orlp
Copy link
Collaborator

orlp commented Dec 21, 2023

Relevant stdlib proposal: rust-lang/libs-team#287.

@ritchie46
Copy link
Member

I believe the bytes is the reason we use fast-float indeed. I'd say let's wait.

@jeadorf
Copy link

jeadorf commented Mar 25, 2024

To summarize, I understand that polars still depends, and only depends on fast_float because the std library does not yet support (being discussed, but last update four months ago) parsing bytestrings, something fast_float offers, and polars relies on (here and here, for instance, if I read the code correctly)?

Why am I asking? I'm interested because fast_float crate has led to some friction in the environment where I'm contributing to adopting polars, and if polars' dependency on fast_float went away, so would this friction. Knowing that I got the summary right would help me to figure out what to do next.

@ritchie46
Copy link
Member

Yes, if std would accept bytes we could remove fast floats. We parse floats directly from bytes that may have invalid utf8. Having to do utf8 checking first makes it more costly than needed. Why is the fast_float dep a problem?

@jeadorf
Copy link

jeadorf commented Mar 25, 2024

I went through a code review process when importing dependencies of Polars. Some questions/cocnerns popped up about fast_float, around some of the unsafe uses in fast_float. This then prompted me to find out about the state of this issue: since if the fast_float dependency were removed, we would need to look no further. Thanks for helping me understand.

@Manishearth
Copy link
Contributor

Manishearth commented Mar 29, 2024

@ritchie46 would it be acceptable to make fast-float an optional dependency and use checked from_utf8 as the alternative? This way people can have safe-but-slightly-slower code.

Or even a manual impl copied from the stdlib (i think this might be too much code)

@ritchie46
Copy link
Member

To my knowledge fast-float is sound. There are many dependencies that use unsafe and to create feature flags and alternatives for all of them would be a maintenance nightmare. What is it specific on fast-float? Is there something unsound?

@Manishearth
Copy link
Contributor

Someone needs to file an issue about it with detail (sorry, IMO that should have happened first), but basically in the process of an unsafe audit we discovered that fast-float is rather careless with unsafe: it checks safety invariants with debug_assert!(), invariants that might be upheld elsewhere in the codebase but not in a proximate, obvious manner. It's unclear to me at this time if there is any actual UB in the crate, but it is a bit of an outsize burden on a dependency tree to have crates that have nonlocal safety invariants (since you have to check them again every time you update)

@Manishearth
Copy link
Contributor

Filed an issue with some detail: aldanor/fast-float-rust#37

Doing a proper safety audit of that code will take much longer and is not really worth it for me, at least.

@orlp
Copy link
Collaborator

orlp commented Apr 17, 2024

I'm closing this issue because it is not actionable right now. When the Rust standard library provides a method to directly parse bytes into a float we can switch to that.

@orlp orlp closed this as not planned Won't fix, can't repro, duplicate, stale Apr 17, 2024
@c-peters c-peters added the accepted Ready for implementation label Nov 4, 2024
@c-peters c-peters added this to Backlog Nov 4, 2024
@c-peters c-peters moved this to Done in Backlog Nov 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted Ready for implementation enhancement New feature or an improvement of an existing feature
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

6 participants