Describe the bug
FileWriter and StreamWriter should ensure that the data is written with appropriate alignment such that arrays can be used without copying to a more-aligned buffer.
In particular, as of Rust 1.77.0 and LLVM 18, i128 has a 16-byte alignment requirement even on x86 (ARM always had this requirement), i.e. std::mem::align_of::<i128>() == 16. Decimal128Arrays must therefore be aligned to a 16-byte boundary when serialized into an IPC buffer. The pad_to_8 used everywhere in the IPC code therefore pads insufficiently.
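To make the arithmetic concrete, here is a minimal sketch using only the standard library. The helper `padding_for` is a hypothetical name for illustration, not the crate's actual padding function. It shows that an offset can already sit on an 8-byte boundary (so padding to 8 adds nothing) while still being 8 bytes short of a 16-byte boundary.

```rust
use std::mem::align_of;

/// Hypothetical helper (not part of the arrow crates): number of padding
/// bytes needed to round `len` up to the next multiple of `alignment`.
/// `alignment` must be a power of two.
fn padding_for(len: usize, alignment: usize) -> usize {
    assert!(alignment.is_power_of_two());
    (alignment - (len % alignment)) % alignment
}

fn main() {
    // Reported alignment depends on the toolchain and target; the issue
    // states it is 16 on x86 as of Rust 1.77.0.
    println!("align_of::<i128>() = {}", align_of::<i128>());

    // Suppose a Decimal128 values buffer would start at byte offset 24.
    let offset = 24;
    // Offset 24 is already a multiple of 8, so no padding is added...
    println!("padding to 8:  {}", padding_for(offset, 8));
    // ...but 8 more bytes are needed to reach the next multiple of 16.
    println!("padding to 16: {}", padding_for(offset, 16));
}
```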
This prevents readers of the IPC data generated by this crate from doing true zero-copy reads (e.g. mmapping) since the data may be insufficiently aligned.
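As a rough illustration of what a zero-copy reader has to check, the sketch below tests whether a byte slice starts on a given boundary. `Aligned16` and `starts_aligned` are hypothetical names introduced here. A real reader would perform an equivalent check on the mapped region before reinterpreting the bytes as i128 values, and would have to copy when the check fails.

```rust
/// Backing storage whose start address is guaranteed to be 16-byte aligned,
/// standing in for a memory-mapped IPC file (an assumption for this sketch).
#[repr(align(16))]
struct Aligned16([u8; 64]);

/// Hypothetical check: does `bytes` start on an `alignment`-byte boundary?
fn starts_aligned(bytes: &[u8], alignment: usize) -> bool {
    (bytes.as_ptr() as usize) % alignment == 0
}

fn main() {
    let storage = Aligned16([0u8; 64]);

    // A buffer placed at offset 8 satisfies 8-byte alignment only.
    let at_8 = &storage.0[8..40];
    // A buffer placed at offset 16 also satisfies 16-byte alignment.
    let at_16 = &storage.0[16..48];

    println!("offset 8:  aligned to 8 = {}, aligned to 16 = {}",
             starts_aligned(at_8, 8), starts_aligned(at_8, 16));
    println!("offset 16: aligned to 8 = {}, aligned to 16 = {}",
             starts_aligned(at_16, 8), starts_aligned(at_16, 16));
}
```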
On some platforms, SIMD may also be significantly slower if the beginning of the IPC block isn't aligned to a 16-, 32-, or 64-byte boundary (as discussed in the Arrow spec document).
To Reproduce
See the test test_decimal128_alignment8_is_unaligned in PR #5554 - the fact that this test produces an error shows that the alignment option is not currently respected.
Expected behavior
See the test test_decimal128_alignment16 in PR #5554 - increasing alignment should allow us to do "true" zero-copy reads.
Additional context
IpcWriteOptions already has an "alignment" field but it is not being respected throughout the IPC code.
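One way to picture "respecting" the option is a write path that consults the configured alignment every time it appends a buffer. The sketch below is standard-library only. `SketchOptions` and `write_aligned` are hypothetical stand-ins, not the real `IpcWriteOptions` or the crate's writer internals.

```rust
/// Hypothetical stand-in for a writer option carrying the alignment;
/// this is not the real `IpcWriteOptions` type.
struct SketchOptions {
    /// Required buffer alignment in bytes; must be nonzero.
    alignment: usize,
}

/// Appends `data` to `out`, first zero-padding `out` so that `data` starts
/// at a multiple of `opts.alignment`. Returns the offset where `data` starts.
fn write_aligned(out: &mut Vec<u8>, data: &[u8], opts: &SketchOptions) -> usize {
    let rem = out.len() % opts.alignment;
    if rem != 0 {
        out.resize(out.len() + (opts.alignment - rem), 0);
    }
    let start = out.len();
    out.extend_from_slice(data);
    start
}

fn main() {
    let opts = SketchOptions { alignment: 16 };
    let mut out = Vec::new();

    // A 24-byte buffer followed by a 32-byte buffer (two i128 values).
    let first = write_aligned(&mut out, &[1u8; 24], &opts);
    let second = write_aligned(&mut out, &[2u8; 32], &opts);

    println!("buffers start at offsets {first} and {second}");
}
```

With an alignment of 8 the second buffer would start at offset 24; with 16 it starts at offset 32. These offsets are relative to the start of the output, so the reader's base address must itself be suitably aligned for the guarantee to carry over to memory.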
Related PRs and issues:
RawPtrBox::new #2882