Correct the conversion from float/double to half on CUDA #3627

Merged: christophe-murphy merged 1 commit into arrayfire:master on Jan 16, 2025
Conversation
edwinsolisf reviewed on Jan 7, 2025
src/backend/common/half.hpp (Outdated)
uint64_t bits{0};  // = *reinterpret_cast<uint64*>(&value); // violating
                   // strict aliasing!
std::memcpy(&bits, &value, sizeof(double));
union {
Contributor

According to cppreference, this type of conversion is undefined behavior. I think the more standard-compliant way to do it would be to use reinterpret_cast.
Contributor (Author)
All explicit conversions are now updated to reinterpret_cast.
1035c3d to 4156985
edwinsolisf approved these changes on Jan 9, 2025
edwinsolisf (Contributor) left a comment
Works, tested on Windows with an RTX 3070 Ti Mobile and on Ubuntu 22.04 LTS with a Tesla P4
Corrects the conversion from a float/double to half.
This affects the following functions: af::min, af::max, af::dot, af::mean, af::mean_var, af::topk, and af::var on those GPUs where CUDA supports only slow-rate FP16 (the type __half is supported, but compute is performed in float).
Description
In the device function float2half_impl, the float/double is first reinterpreted as an unsigned integer, on which bit operations are performed.
The result is a uint16 containing the bit representation of the same value in native_half_t format.
On return, however, an extra implicit cast was performed, this time converting the uint16 by value to native_half_t.
GPUs that fully support __half (both the type and compute) do not perform this float-to-__half conversion and therefore do not show the error.
My GPU is a GTX 1080 with CC 6.1, which has low-rate FP16 support.
Reproducer, CUDA backend.
Output:
Additional information about the PR, answering the following questions:
Changes to Users
No changes to the API.
Checklist