DRM GTT/Shared memory broken in Nvidia Open drivers on Linux #758

Open
1 of 2 tasks
fish4terrisa-MSDSM opened this issue Dec 28, 2024 · 0 comments
Labels
bug Something isn't working

fish4terrisa-MSDSM commented Dec 28, 2024

NVIDIA Open GPU Kernel Modules Version

565.77-1 (but the issue affects every version)

Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.

  • I confirm that this does not happen with the proprietary driver package.

Operating System and Version

Arch Linux

Kernel Release

6.12.6-arch1-1 (but the issue affects every version of the Linux kernel, at least all 6.x.x)

Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.

  • I am running on a stable kernel release.

Hardware: GPU

NVIDIA GeForce RTX 4060 Laptop GPU (AD107-B)

Describe the bug

Just as described in
#663
#618
https://forums.developer.nvidia.com/t/vram-allocation-issues/239678
https://forums.developer.nvidia.com/t/non-existent-shared-vram-on-nvidia-linux-drivers/260304

GTT support, a standard DRM feature, is broken in the nvidia-open modules, which makes it impossible to use shared memory on Linux with NVIDIA GPUs.
This is not a minor missing feature but a major functional bug that strongly affects every Linux user with an NVIDIA GPU.
#663 was closed in error, as described by @martynhare in #663#issuecomment-2487194834, so it's kind of weird for you to ignore it when this causes a lot of games, Xorg, Wayland, PyTorch and many other AI-related programs to crash and complain even though there is more than enough RAM for them.

To Reproduce

Just use the latest version of the nvidia-open module; the bug is present there.
nvidia-uvm won't help at all, and it's hard to find anything that uses UVM in 2024.
Many AI tools don't support UVM at all, or have a UVM branch that has been unmaintained for years.
For games, well, nvidia-uvm is only for CUDA. Some of them can use DXVK, which supports using system RAM, but that is not a general solution and doesn't fix the problem at all.
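
For anyone who wants something more concrete to run, here is a rough sketch of one way to observe the behavior. It is not an official reproducer and was not part of the original report. It assumes PyTorch with CUDA support is installed. The helper names (`allocate_until_failure`, `torch_cuda_alloc`, `main`) and the 256 MiB chunk size are made up for illustration. The idea is to keep allocating device buffers until the first failure and compare the total with the reported VRAM size. If the behavior described above holds, the total should stop at roughly the VRAM size instead of spilling into system RAM.

```python
from typing import Callable, List


def allocate_until_failure(alloc: Callable[[int], object],
                           chunk_bytes: int,
                           max_chunks: int) -> int:
    """Call alloc(chunk_bytes) until it fails or max_chunks is reached.

    Returns the total number of bytes successfully allocated.
    """
    held: List[object] = []  # keep references so buffers are not freed
    for _ in range(max_chunks):
        try:
            held.append(alloc(chunk_bytes))
        except (RuntimeError, MemoryError):
            # PyTorch's CUDA out-of-memory error is a RuntimeError subclass.
            break
    return len(held) * chunk_bytes


def torch_cuda_alloc(nbytes: int):
    """Hypothetical allocator: one uint8 tensor of nbytes on the GPU."""
    import torch  # imported lazily so the helper above works without PyTorch
    return torch.empty(nbytes, dtype=torch.uint8, device="cuda")


def main() -> None:
    import torch
    vram = torch.cuda.get_device_properties(0).total_memory
    chunk = 256 * 1024 * 1024  # 256 MiB per allocation (arbitrary choice)
    # Try to allocate up to twice the reported VRAM size.
    got = allocate_until_failure(torch_cuda_alloc, chunk, (2 * vram) // chunk)
    print(f"VRAM reported: {vram / 2**30:.1f} GiB")
    print(f"Allocated before failure: {got / 2**30:.1f} GiB")
```

Call `main()` on a machine with the nvidia-open module loaded. The allocator is passed in as a callback, so the same loop can be pointed at some other allocation path (for example a Vulkan or OpenGL binding) without changing the counting logic.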

Bug Incidence

Always

nvidia-bug-report.log.gz

nvidia-bug-report.log.gz doesn't help at all. This is a widespread bug that affects every version of nvidia-open in any environment.

Since a bot will close issues without an nvidia-bug-report.log.gz, I'll upload a dummy one.
nvidia-bug-report.log.gz

More Info

Please fix this bug. It has existed for years and causes pain for plenty of Linux users who own an NVIDIA GPU.

@fish4terrisa-MSDSM fish4terrisa-MSDSM added the bug Something isn't working label Dec 28, 2024