CL_DEVICE_HALF_FP_CONFIG returns CL_INVALID_VALUE #3068
Merged
umar456 merged 1 commit into arrayfire:master on Feb 18, 2021
9prady9 reviewed Dec 18, 2020
- return (dev.getInfo<CL_DEVICE_DOUBLE_FP_CONFIG>() > 0);
+ // 64bit fp is an optional extension
+ return (dev.getInfo<CL_DEVICE_EXTENSIONS>().find("cl_khr_fp64") !=
+         string::npos);
Member
👍🏾
Although I wonder whether the vendor implementation caches the device's type support internally. If it does, fetching that cached value would be more efficient than doing a string search.
Contributor
Author
I will step through it in the debugger and report what I find.
Contributor
Author
I have the impression that it is a straight copy into a vector.
For NVIDIA
- Total length of vector: 453 chars
- cl_khr_fp16 NA
- cl_khr_fp64 at pos 138
For AMD
- Total length of vector: 661 chars
- cl_khr_fp16 NA
- cl_khr_fp64 at pos 0
The complete string is cached in the driver. The copy happens in cl2.hpp at line 1427.
Although I cannot see inside the function, I come to this conclusion because the exact length is provided up front when the vector is constructed.
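To make the cost of the string search concrete, here is a minimal sketch of a lookup over an extension string of the kind described above. The helper name `hasExtension` is hypothetical and not part of ArrayFire or cl2.hpp. It also assumes the extension list is space-separated and matches whole tokens only, which is slightly stricter than the plain `find` used in the patch.

```cpp
#include <string>

// Hypothetical helper (not part of ArrayFire or the OpenCL headers):
// returns true when `name` appears as a whole space-separated token
// in an OpenCL-style extension string.
bool hasExtension(const std::string& extensions, const std::string& name) {
    if (name.empty()) return false;
    std::string::size_type pos = 0;
    while ((pos = extensions.find(name, pos)) != std::string::npos) {
        const std::string::size_type end = pos + name.size();
        // A match counts only if it is delimited by spaces or string ends.
        const bool startsToken = (pos == 0) || (extensions[pos - 1] == ' ');
        const bool endsToken =
            (end == extensions.size()) || (extensions[end] == ' ');
        if (startsToken && endsToken) return true;
        pos += 1;  // keep scanning after a partial (prefix/suffix) match
    }
    return false;
}
```

For strings of a few hundred characters, as measured above, this is a single linear scan, so the search itself is unlikely to matter next to the copy of the string out of the driver.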
fp16 and fp64 are optional extensions to OpenCL. The CONFIGs only exist when the extension is available. It is therefore better to check the availability of the extension, so that no errors are thrown (and have to be handled). Also includes a cleanup of compiler warnings.
bcff4c6 to
6e59dd7
umar456 approved these changes Feb 18, 2021
CL_DEVICE_HALF_FP_CONFIG only exists when the extension cl_khr_fp16 is available.
Description
Instead of querying the FP_CONFIG device info, it is better to test directly whether the extensions (cl_khr_fp16 and cl_khr_fp64) are available, so that no exceptions are thrown.
While updating the file, I added explicit conversions to eliminate compiler warnings, so that real warnings stand out later.
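The difference between the two checks can be sketched without an OpenCL runtime. `MockDevice` below is a hypothetical stand-in, not the cl2.hpp API: its config query is assumed to throw when the extension is missing, mirroring the CL_INVALID_VALUE failure in the title, while the extension check has no error path at all.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an OpenCL device; not the cl2.hpp API.
struct MockDevice {
    std::string extensions;  // what a CL_DEVICE_EXTENSIONS query would return

    // Assumed behaviour: the config query fails when cl_khr_fp16 is missing.
    unsigned long halfFpConfig() const {
        if (extensions.find("cl_khr_fp16") == std::string::npos) {
            throw std::runtime_error("CL_INVALID_VALUE");
        }
        return 1UL;  // placeholder non-zero capability bitfield
    }
};

// Config-based check: only safe if the caller handles the error.
bool halfSupportedViaConfig(const MockDevice& dev) {
    try {
        return dev.halfFpConfig() > 0;
    } catch (const std::runtime_error&) {
        return false;
    }
}

// Extension-based check: nothing can throw, so no handling is needed.
bool halfSupportedViaExtension(const MockDevice& dev) {
    return dev.extensions.find("cl_khr_fp16") != std::string::npos;
}
```

Both functions give the same answer for a given mock device; the extension-based version just gets there without a try/catch.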
Changes to Users
No changes to end-users.
Checklist
- [ ] Functions added to unified API
- [ ] Functions documented