include/af/algorithm.h
\note NaN values are ignored
*/
AFAPI void max(array &val, array &idx, const array &in, const int dim, const array &ragged_len);
Opinion: I think ragged_len should come right after the in array.
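To make the suggested ordering concrete, here is a minimal standalone sketch with ragged_len placed directly after in. It does not use ArrayFire: `RaggedMax`, `ragged_max`, and the column-major layout are hypothetical names and assumptions chosen for illustration, and the semantics (only the first ragged_len[j] elements of column j are examined, NaN values are skipped) are my reading of the doc comment rather than the PR's actual implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for the (val, idx) output pair of a ragged max.
struct RaggedMax {
    std::vector<float> val;     // maximum found in each column
    std::vector<unsigned> idx;  // row index of that maximum
};

// `in` is a column-major matrix with `rows` rows. For column j only the
// first ragged_len[j] rows are examined. The parameter order follows the
// review suggestion: ragged_len comes directly after in.
RaggedMax ragged_max(const std::vector<float>& in,
                     const std::vector<unsigned>& ragged_len,
                     std::size_t rows) {
    assert(rows > 0 && in.size() % rows == 0);
    const std::size_t cols = in.size() / rows;
    assert(ragged_len.size() == cols);

    const float lowest = -std::numeric_limits<float>::infinity();
    RaggedMax out{std::vector<float>(cols, lowest),
                  std::vector<unsigned>(cols, 0u)};

    for (std::size_t j = 0; j < cols; ++j) {
        // Clamp the ragged length so it never reads past the column.
        const std::size_t limit = std::min<std::size_t>(ragged_len[j], rows);
        for (std::size_t i = 0; i < limit; ++i) {
            const float v = in[j * rows + i];
            if (std::isnan(v)) continue;  // NaN values are ignored
            if (v > out.val[j]) {
                out.val[j] = v;
                out.idx[j] = static_cast<unsigned>(i);
            }
        }
    }
    return out;
}
```

With a 3x2 input whose columns are {1, 5, 7} and {2, 9, 4} and ragged_len = {2, 3}, the first column reports 5 at index 1 because the 7 lies beyond its ragged length.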
}

template<af_op_t op>
static af_err rreduce_common(af_array *val, af_array *idx, const af_array in,
Would an overload of reduce_common not work here?
src/backend/cpu/ireduce.cpp
template<af_op_t op, typename T>
void rreduce(Array<T> &out, Array<uint> &loc, const Array<T> &in,
             const int dim, const Array<uint> &rlen) {
Is this function still needed? It looks like you can combine this with ireduce.
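One way to read this suggestion is a single indexed reduction that takes an optional length, so the non-ragged call is just the case where no length is given. The sketch below is a hypothetical illustration over a single 1-D slice, not the backend code: `ireduce_sketch`, its `better` comparator, and the pointer-based optional `rlen` are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical merged routine: one indexed reduction over a 1-D slice.
// `rlen` is optional; nullptr means "use the whole slice", which is the
// plain (non-ragged) indexed reduce. Returns {best value, its index}.
// A ragged length of 0 falls back to element 0 in this sketch.
template <typename T, typename Compare>
std::pair<T, unsigned> ireduce_sketch(const std::vector<T>& in,
                                      Compare better,
                                      const unsigned* rlen = nullptr) {
    assert(!in.empty());
    std::size_t limit = in.size();
    if (rlen != nullptr && *rlen < limit) limit = *rlen;

    T best = in[0];
    unsigned loc = 0;
    for (std::size_t i = 1; i < limit; ++i) {
        if (better(in[i], best)) {
            best = in[i];
            loc = static_cast<unsigned>(i);
        }
    }
    return {best, loc};
}
```

Calling it with `std::greater<int>()` and no length gives an ordinary indexed max; passing a pointer to a length restricts the same loop to the leading elements, so no separate ragged function is required.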
@syurkevi sounds good, though f16 is not really faster than f32. Could you please just add a minimal benchmark like this one:
@syurkevi could you please at least rebase so I can see what needs to be finished and advise accordingly?
@syurkevi I took care of the rebase onto the latest master. If you are adding more ragged functions and need to touch the ireduce kernel, you can find the kernels in the file [...]. If you face any issues while editing or adding new things to the kernels, please ping me. I can guide you through the nvrtc-related changes.
e502b36 to 6599dd2
aa0a856 to 2307ccf
Addresses #2782.
TODO: