file hqsimd.h
Macros implementing SIMD operations.
We provide a set of packaged SIMD operations that can be used to optimise common operations across multiple compilers and processor architectures, using either compiler intrinsics or inline assembly.
Short sequences of SIMD operations are packaged into macros, with a generic version of each operation defined in the initial section. Then, for each architecture on which specialisation is done, the generic macro is undefined and a specialised version of the macro is defined in its place.
◆ SIMD_16x8u_COPY_NONZERO [1/3]
#define SIMD_16x8u_COPY_NONZERO( dest, src )
Value: MACRO_START \
uint8 *_dest_ = (dest), *_src_ = (src) ; \
for ( unsigned int _i_ = 0 ; _i_ < 16 ; ++_i_ ) { \
if ( _src_[_i_] != 0 ) \
_dest_[_i_] = _src_[_i_] ; \
} \
MACRO_END
Copy 16 byte (8-bit) values from src to dest, but only where the value in src is non-zero.
Parameters:
    [out]  dest  Destination address to copy to. For best performance this should be 128-bit aligned, but it does not need to be.
    [in]   src   Source address to copy from. For best performance this should be 128-bit aligned, but it does not need to be.
◆ SIMD_16x8u_COPY_NONZERO [2/3]
#define SIMD_16x8u_COPY_NONZERO( dest, src )
Copy 16 byte (8-bit) values from src to dest, but only where the value in src is non-zero.
Parameters:
    [out]  dest  Destination address to copy to. For best performance this should be 128-bit aligned, but it does not need to be.
    [in]   src   Source address to copy from. For best performance this should be 128-bit aligned, but it does not need to be.
◆ SIMD_16x8u_COPY_NONZERO [3/3]
#define SIMD_16x8u_COPY_NONZERO( dest, src )
Value: MACRO_START \
simd_16x8u_t src128 = vld1q_u8((uint8_t const *)(src)) ; \
simd_16x8u_t dest128 = vld1q_u8((uint8_t const *)(dest)) ; \
simd_16x8u_t mask128 = vceqzq_u8(src128) ; \
\
simd_16x8u_t out128 = vbslq_u8(mask128, dest128, src128) ; \
vst1q_u8((uint8_t *)(dest), out128) ; \
MACRO_END
Copy 16 byte (8-bit) values from src to dest, but only where the value in src is non-zero.
Parameters:
    [out]  dest  Destination address to copy to. For best performance this should be 128-bit aligned, but it does not need to be.
    [in]   src   Source address to copy from. For best performance this should be 128-bit aligned, but it does not need to be.
◆ SIMD_8x16u_COPY_NONZERO [1/3]
#define SIMD_8x16u_COPY_NONZERO( dest, src )
Value: MACRO_START \
uint16 *_dest_ = (dest), *_src_ = (src) ; \
for ( unsigned int _i_ = 0 ; _i_ < 8 ; ++_i_ ) { \
if ( _src_[_i_] != 0 ) \
_dest_[_i_] = _src_[_i_] ; \
} \
MACRO_END
Copy 8 short (16-bit) values from src to dest, but only where the value in src is non-zero.
Parameters:
    [out]  dest  Destination address to copy to. For best performance this should be 128-bit aligned, but it does not need to be.
    [in]   src   Source address to copy from. For best performance this should be 128-bit aligned, but it does not need to be.
◆ SIMD_8x16u_COPY_NONZERO [2/3]
#define SIMD_8x16u_COPY_NONZERO( dest, src )
Copy 8 short (16-bit) values from src to dest, but only where the value in src is non-zero.
Parameters:
    [out]  dest  Destination address to copy to. For best performance this should be 128-bit aligned, but it does not need to be.
    [in]   src   Source address to copy from. For best performance this should be 128-bit aligned, but it does not need to be.
◆ SIMD_8x16u_COPY_NONZERO [3/3]
#define SIMD_8x16u_COPY_NONZERO( dest, src )
Value: MACRO_START \
simd_8x16u_t src128 = vld1q_u16((uint16_t const *)(src)) ; \
simd_8x16u_t dest128 = vld1q_u16((uint16_t const *)(dest)) ; \
simd_8x16u_t mask128 = vceqzq_u16(src128) ; \
\
simd_8x16u_t out128 = vbslq_u16(mask128, dest128, src128) ; \
vst1q_u16((uint16_t *)(dest), out128) ; \
MACRO_END
Copy 8 short (16-bit) values from src to dest, but only where the value in src is non-zero.
Parameters:
    [out]  dest  Destination address to copy to. For best performance this should be 128-bit aligned, but it does not need to be.
    [in]   src   Source address to copy from. For best performance this should be 128-bit aligned, but it does not need to be.
◆ simd_16x8i_t
A SIMD type that supports 16 x 8-bit signed integer operations.
◆ simd_16x8u_t
A SIMD type that supports 16 x 8-bit unsigned integer operations.
◆ simd_4x32f_t
A SIMD type that supports 4 x 32-bit floating-point operations.
◆ simd_4x32i_t
A SIMD type that supports 4 x 32-bit signed integer operations.
◆ simd_4x32u_t
A SIMD type that supports 4 x 32-bit unsigned integer operations.
◆ simd_8x16i_t
A SIMD type that supports 8 x 16-bit signed integer operations.
◆ simd_8x16u_t
A SIMD type that supports 8 x 16-bit unsigned integer operations.