Files | |
file | hqsimd.h |
Macros implementing SIMD operations. | |
Macros | |
#define | SIMD_16x8u_COPY_NONZERO(dest, src) |
#define | SIMD_8x16u_COPY_NONZERO(dest, src) |
Typedefs | |
typedef __m128i | simd_16x8i_t |
typedef __m128i | simd_16x8u_t |
typedef __m128i | simd_8x16i_t |
typedef __m128i | simd_8x16u_t |
typedef __m128i | simd_4x32i_t |
typedef __m128i | simd_4x32u_t |
typedef __m128 | simd_4x32f_t |
This header provides a set of packaged SIMD operations for optimising common operations across multiple compilers and processor architectures, implemented with either compiler intrinsics or inline assembly.
Short sequences of SIMD operations are packaged into macros. The initial section of the header defines a generic version of each macro. Then, for each architecture that has a specialisation, the generic macro is undefined and a specialised version is defined in its place.
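The generic-then-specialised layout described above can be sketched as follows. This is a minimal illustration, not the contents of hqsimd.h: the macro name comes from this page, but both macro bodies, the SSE2 feature test, and the helper `copy_nonzero_16x8u_demo` are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Generic version (hypothetical body): portable C, one element at a time. */
#define SIMD_16x8u_COPY_NONZERO(dest, src)                         \
    do {                                                           \
        uint8_t *d_ = (uint8_t *)(dest);                           \
        const uint8_t *s_ = (const uint8_t *)(src);                \
        int i_;                                                    \
        for (i_ = 0; i_ < 16; ++i_) {                              \
            if (s_[i_] != 0) d_[i_] = s_[i_];                      \
        }                                                          \
    } while (0)

/* x86 specialisation (assumed): undefine the generic macro, then
 * redefine it with SSE2 intrinsics. */
#if defined(__SSE2__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#undef SIMD_16x8u_COPY_NONZERO
#define SIMD_16x8u_COPY_NONZERO(dest, src)                         \
    do {                                                           \
        __m128i s_ = _mm_loadu_si128((const __m128i *)(src));      \
        __m128i d_ = _mm_loadu_si128((const __m128i *)(dest));     \
        /* 0xFF in every byte lane where src is zero */            \
        __m128i z_ = _mm_cmpeq_epi8(s_, _mm_setzero_si128());      \
        /* keep dest where src is zero, take src elsewhere */      \
        d_ = _mm_or_si128(_mm_and_si128(z_, d_), s_);              \
        _mm_storeu_si128((__m128i *)(dest), d_);                   \
    } while (0)
#endif

/* Hypothetical check: returns how many elements were overwritten. */
static int copy_nonzero_16x8u_demo(void)
{
    uint8_t dest[16], src[16];
    int i, overwritten = 0;

    for (i = 0; i < 16; ++i) {
        dest[i] = 0xAA;
        src[i] = (uint8_t)((i % 2) ? i : 0);   /* even indices hold zero */
    }
    SIMD_16x8u_COPY_NONZERO(dest, src);
    for (i = 0; i < 16; ++i) {
        if (i % 2) {
            assert(dest[i] == i);       /* nonzero source value copied */
            ++overwritten;
        } else {
            assert(dest[i] == 0xAA);    /* destination preserved */
        }
    }
    return overwritten;
}
```

Because callers only ever see the macro name, the same call site compiles against whichever body survives preprocessing, which is presumably the point of the undefine-and-redefine arrangement.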
#define SIMD_16x8u_COPY_NONZERO(dest, src)
Copy 16 unsigned 8-bit values from src to dest, element by element. An element is copied only if its value in src is nonzero; where src holds zero, the corresponding element of dest is left unchanged.
[out] | dest | Destination address to copy to. 128-bit alignment gives the best performance but is not required.
[in] | src | Source address to copy from. 128-bit alignment gives the best performance but is not required.
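As a worked example of the per-element behaviour, the scalar reference below merges a fixed pattern into a buffer of ones. It is only a sketch of the semantics as read from the description: `copy_nonzero_16x8u_ref` and `copy_nonzero_16x8u_example` are hypothetical names, not part of hqsimd.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar reference for the documented behaviour. */
static void copy_nonzero_16x8u_ref(uint8_t *dest, const uint8_t *src)
{
    int i;
    for (i = 0; i < 16; ++i) {
        if (src[i] != 0) {
            dest[i] = src[i];   /* zero elements of src leave dest alone */
        }
    }
}

/* Fill out[] with 1s, merge a fixed pattern into it, and return how many
 * elements were overwritten. */
static int copy_nonzero_16x8u_example(uint8_t out[16])
{
    static const uint8_t src[16] =
        {0, 0, 7, 7, 7, 0, 0, 9, 9, 0, 0, 0, 3, 0, 0, 0};
    int i, changed = 0;

    assert(out != NULL);
    memset(out, 1, 16);
    copy_nonzero_16x8u_ref(out, src);
    for (i = 0; i < 16; ++i) {
        changed += (out[i] != 1);
    }
    return changed;
}
```

With these inputs the destination ends up as 1, 1, 7, 7, 7, 1, 1, 9, 9, 1, 1, 1, 3, 1, 1, 1: the six nonzero source values replace the ones, and every other position keeps its original value.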
#define SIMD_8x16u_COPY_NONZERO(dest, src)
Copy 8 unsigned 16-bit values from src to dest, element by element. An element is copied only if its value in src is nonzero; where src holds zero, the corresponding element of dest is left unchanged.
[out] | dest | Destination address to copy to. 128-bit alignment gives the best performance but is not required.
[in] | src | Source address to copy from. 128-bit alignment gives the best performance but is not required.
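The sketch below shows one way the 16-bit variant could tolerate unaligned addresses, as the parameter notes allow. The macro bodies are assumptions, not the header's actual code: the SSE2 path uses unaligned loads and stores, and a portable fallback uses `memcpy`. The helper `copy_nonzero_8x16u_unaligned` is hypothetical and relies on C11 `_Alignas` to force a misaligned address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
/* Hypothetical SSE2 body; unaligned loads/stores accept any address. */
#define SIMD_8x16u_COPY_NONZERO(dest, src)                         \
    do {                                                           \
        __m128i s_ = _mm_loadu_si128((const __m128i *)(src));      \
        __m128i d_ = _mm_loadu_si128((const __m128i *)(dest));     \
        /* 0xFFFF in every 16-bit lane where src is zero */        \
        __m128i z_ = _mm_cmpeq_epi16(s_, _mm_setzero_si128());     \
        d_ = _mm_or_si128(_mm_and_si128(z_, d_), s_);              \
        _mm_storeu_si128((__m128i *)(dest), d_);                   \
    } while (0)
#else
/* Portable fallback; memcpy avoids misaligned uint16_t accesses. */
#define SIMD_8x16u_COPY_NONZERO(dest, src)                         \
    do {                                                           \
        unsigned char *d_ = (unsigned char *)(dest);               \
        const unsigned char *s_ = (const unsigned char *)(src);    \
        int i_;                                                    \
        for (i_ = 0; i_ < 8; ++i_) {                               \
            uint16_t v_;                                           \
            memcpy(&v_, s_ + 2 * i_, sizeof v_);                   \
            if (v_ != 0) memcpy(d_ + 2 * i_, &v_, sizeof v_);      \
        }                                                          \
    } while (0)
#endif

/* Run the macro one byte past a 16-byte boundary and copy the
 * resulting eight values into result[]. */
static void copy_nonzero_8x16u_unaligned(uint16_t result[8])
{
    _Alignas(16) unsigned char dest_buf[32];
    _Alignas(16) unsigned char src_buf[32];
    const uint16_t src[8] = {0, 0x1234, 0, 0x00FF, 0xFF00, 0, 1, 0};
    const uint16_t init[8] = {0xAAAA, 0xAAAA, 0xAAAA, 0xAAAA,
                              0xAAAA, 0xAAAA, 0xAAAA, 0xAAAA};

    /* The working addresses are deliberately not 128-bit aligned. */
    assert(((uintptr_t)(dest_buf + 1) % 16) != 0);

    memcpy(src_buf + 1, src, sizeof src);
    memcpy(dest_buf + 1, init, sizeof init);
    SIMD_8x16u_COPY_NONZERO(dest_buf + 1, src_buf + 1);
    memcpy(result, dest_buf + 1, 16);
}
```

The values 0x00FF and 0xFF00 are included to show that the zero test applies to each 16-bit element as a whole: both are copied intact even though each contains a zero byte.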
typedef __m128i simd_16x8i_t
A SIMD type that supports 16x8-bit signed integer operations.
typedef __m128i simd_16x8u_t
A SIMD type that supports 16x8-bit unsigned integer operations.
typedef __m128 simd_4x32f_t
A SIMD type that supports 4x32-bit floating point operations.
typedef __m128i simd_4x32i_t
A SIMD type that supports 4x32-bit signed integer operations.
typedef __m128i simd_4x32u_t
A SIMD type that supports 4x32-bit unsigned integer operations.
typedef __m128i simd_8x16i_t
A SIMD type that supports 8x16-bit signed integer operations.
typedef __m128i simd_8x16u_t
A SIMD type that supports 8x16-bit unsigned integer operations.