[C++][Compute] Rank function considers NaNs and nulls equal #45193

Open
pitrou opened this issue Jan 7, 2025 · 3 comments

@pitrou
Member

pitrou commented Jan 7, 2025

Describe the bug, including details regarding any error messages, version, and platform.

Expected (NaNs and nulls are distinct, ordered according to null_placement):

>>> import math
>>> import pyarrow as pa
>>> import pyarrow.compute as pc
>>> a = pa.array([1, None, math.nan, 2, math.nan, None])
>>> pc.rank(a, tiebreaker='min', null_placement='at_end')
<pyarrow.lib.UInt64Array object at 0x7f168295ca00>
[
  1,
  5,
  3,
  2,
  3,
  5
]
>>> pc.rank(a, tiebreaker='min', null_placement='at_start')
<pyarrow.lib.UInt64Array object at 0x7f1682845600>
[
  5,
  1,
  3,
  6,
  3,
  1
]

Actual (NaNs and nulls are considered ties):

>>> pc.rank(a, tiebreaker='min', null_placement='at_end')
<pyarrow.lib.UInt64Array object at 0x7f1682951660>
[
  1,
  3,
  3,
  2,
  3,
  3
]
>>> pc.rank(a, tiebreaker='min', null_placement='at_start')
<pyarrow.lib.UInt64Array object at 0x7f1682845600>
[
  5,
  1,
  1,
  6,
  1,
  1
]
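
For reference, the expected ordering can be sketched in pure Python without pyarrow. This is only an illustration of the semantics described above (nulls and NaNs form two distinct tie groups, with NaNs sitting between the nulls and the ordinary values), not the Arrow implementation; `reference_rank` is a hypothetical helper name and it only covers `tiebreaker='min'`.

```python
import math


def _is_nan(v):
    # Only floats can be NaN; None is handled separately as null.
    return isinstance(v, float) and math.isnan(v)


def reference_rank(values, null_placement="at_end"):
    """Rank with tiebreaker='min', keeping nulls and NaNs in distinct groups."""

    def sort_key(v):
        # at_end:   ordinary values (0) < NaN (1) < null (2)
        # at_start: null (0) < NaN (1) < ordinary values (2)
        if v is None:
            group, payload = 2, 0
        elif _is_nan(v):
            group, payload = 1, 0
        else:
            group, payload = 0, v
        if null_placement == "at_start":
            group = 2 - group
        return (group, payload)

    keys = [sort_key(v) for v in values]
    order = sorted(range(len(values)), key=keys.__getitem__)

    ranks = [0] * len(values)
    for position, index in enumerate(order):
        previous = order[position - 1] if position > 0 else None
        if previous is not None and keys[index] == keys[previous]:
            # Same tie group: reuse the smallest rank of the group.
            ranks[index] = ranks[previous]
        else:
            ranks[index] = position + 1
    return ranks


a = [1, None, math.nan, 2, math.nan, None]
print(reference_rank(a, "at_end"))    # [1, 5, 3, 2, 3, 5]
print(reference_rank(a, "at_start"))  # [5, 1, 3, 6, 3, 1]
```

Both results match the expected outputs listed above, so a sketch like this could serve as an oracle when writing a regression test for the fix.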

Component(s)

C++

@timgrein
Contributor

Hey 👋

I recently started looking more deeply into Arrow, and I would love to get my hands dirty with some smaller issues. Do you think this one is a good starting point? If yes, I would like to give it a shot :)

@pitrou
Member Author

pitrou commented Jan 13, 2025

Hi @timgrein, yes, I think it would be a good issue to start with if you already have some familiarity with the basic Arrow C++ APIs. However, there is a refactor going on in #45217 (probably merged soon), so you should probably base your work on that.

@pitrou
Member Author

pitrou commented Jan 13, 2025

Update: the aforementioned PR was merged, so you can start from git main :)
