uid |
---|
neon-intrinsics |
Burst Arm Neon intrinsics reference
This page contains an ordered reference for the APIs in Unity.Burst.Intrinsics.Arm.Neon. For information on how to use these, see the documentation on Processor specific SIMD extensions.
Intrinsics type creation and conversion
Arithmetic
Operation | Description | APIs |
---|---|---|
vadd | Add | |
vaddv | Add across vector | |
vaddl | Add long | |
vaddlv | Add long across vector | vaddlv_s16 vaddlv_s32 vaddlv_s8 vaddlv_u16 vaddlv_u32 vaddlv_u8 vaddlvq_s16 vaddlvq_s32 vaddlvq_s8 vaddlvq_u16 vaddlvq_u32 vaddlvq_u8 |
vaddw | Add wide | |
vhadd | Halving add | vhadd_s16 vhadd_s32 vhadd_s8 vhadd_u16 vhadd_u32 vhadd_u8 vhaddq_s16 vhaddq_s32 vhaddq_s8 vhaddq_u16 vhaddq_u32 vhaddq_u8 |
vrhadd | Rounding halving add | vrhadd_s16 vrhadd_s32 vrhadd_s8 vrhadd_u16 vrhadd_u32 vrhadd_u8 vrhaddq_s16 vrhaddq_s32 vrhaddq_s8 vrhaddq_u16 vrhaddq_u32 vrhaddq_u8 |
vqadd | Saturating add | |
vsqadd | Unsigned saturating accumulate of signed value | vsqadd_u16 vsqadd_u32 vsqadd_u64 vsqadd_u8 vsqaddq_u16 vsqaddq_u32 vsqaddq_u64 vsqaddq_u8 vsqaddb_u8 vsqaddh_u16 vsqadds_u32 vsqaddd_u64 |
vuqadd | Signed saturating accumulate of unsigned value | vuqadd_s16 vuqadd_s32 vuqadd_s64 vuqadd_s8 vuqaddq_s16 vuqaddq_s32 vuqaddq_s64 vuqaddq_s8 vuqaddb_s8 vuqaddh_s16 vuqadds_s32 vuqaddd_s64 |
vaddhn | Add returning high narrow | |
vraddhn | Rounding add returning high narrow | |
vpadd | Add pairwise (vector) | |
vpaddl | Add long pairwise (signed and unsigned) | vpaddl_s16 vpaddl_s32 vpaddl_s8 vpaddl_u16 vpaddl_u32 vpaddl_u8 vpaddlq_s16 vpaddlq_s32 vpaddlq_s8 vpaddlq_u16 vpaddlq_u32 vpaddlq_u8 |
vpadal | Add and accumulate long pairwise (signed and unsigned) | vpadal_s16 vpadal_s32 vpadal_s8 vpadal_u16 vpadal_u32 vpadal_u8 vpadalq_s16 vpadalq_s32 vpadalq_s8 vpadalq_u16 vpadalq_u32 vpadalq_u8 |
vsub | Subtract | |
vsubl | Subtract long | |
vsubw | Subtract wide | |
vhsub | Halving subtract | vhsub_s16 vhsub_s32 vhsub_s8 vhsub_u16 vhsub_u32 vhsub_u8 vhsubq_s16 vhsubq_s32 vhsubq_s8 vhsubq_u16 vhsubq_u32 vhsubq_u8 |
vqsub | Saturating subtract | |
vsubhn | Subtract returning high narrow | |
vrsubhn | Rounding subtract returning high narrow |
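The halving, rounding, saturating, long, and high-narrow variants in the table differ only in how each lane treats the carry and the result width. The sketch below models a single lane in plain scalar C to make those differences concrete. It is not Burst or Neon code, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Scalar models of ONE lane. Helper names are hypothetical, not Burst APIs. */

/* vhadd_u8 lane: (a + b) >> 1, using a wider sum so the carry is not lost. */
static uint8_t hadd_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(((uint16_t)a + (uint16_t)b) >> 1);
}

/* vrhadd_u8 lane: like vhadd, but adds 1 before the shift so halves round up. */
static uint8_t rhadd_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(((uint16_t)a + (uint16_t)b + 1u) >> 1);
}

/* vqadd_u8 lane: the sum clamps at 255 instead of wrapping. */
static uint8_t qadd_u8(uint8_t a, uint8_t b) {
    uint16_t sum = (uint16_t)((uint16_t)a + (uint16_t)b);
    return (uint8_t)(sum > 255u ? 255u : sum);
}

/* vaddl_u8 lane: the result lane is twice as wide, so the sum is exact. */
static uint16_t addl_u8(uint8_t a, uint8_t b) {
    return (uint16_t)((uint16_t)a + (uint16_t)b);
}

/* vaddhn_u16 lane: add with 16-bit wraparound, keep only the high 8 bits. */
static uint8_t addhn_u16(uint16_t a, uint16_t b) {
    uint16_t sum = (uint16_t)(a + b);
    return (uint8_t)(sum >> 8);
}
```

For example, `hadd_u8(1, 2)` yields 1 while `rhadd_u8(1, 2)` yields 2, and `qadd_u8(200, 100)` yields 255 where a plain `vadd` lane would wrap to 44.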
Multiply
Operation | Description | APIs |
---|---|---|
vmul | Multiply (vector) | |
vmul_n | Vector multiply by scalar | vmul_n_f32 vmul_n_f64 vmul_n_s16 vmul_n_s32 vmul_n_u16 vmul_n_u32 vmulq_n_f32 vmulq_n_f64 vmulq_n_s16 vmulq_n_s32 vmulq_n_u16 vmulq_n_u32 |
vmul_lane | Multiply (vector) | vmul_lane_f32 vmul_lane_f64 vmul_lane_s16 vmul_lane_s32 vmul_lane_u16 vmul_lane_u32 vmul_laneq_f32 vmul_laneq_f64 vmul_laneq_s16 vmul_laneq_s32 vmul_laneq_u16 vmul_laneq_u32 vmulq_lane_f32 vmulq_lane_f64 vmulq_lane_s16 vmulq_lane_s32 vmulq_lane_u16 vmulq_lane_u32 vmulq_laneq_f32 vmulq_laneq_f64 vmulq_laneq_s16 vmulq_laneq_s32 vmulq_laneq_u16 vmulq_laneq_u32 vmuls_lane_f32 vmuls_laneq_f32 vmuld_lane_f64 vmuld_laneq_f64 |
vmull | Multiply long (vector) | |
vmull_n | Vector long multiply by scalar | vmull_n_s16 vmull_n_s32 vmull_n_u16 vmull_n_u32 vmull_high_n_s16 vmull_high_n_s32 vmull_high_n_u16 vmull_high_n_u32 |
vmull_lane | Multiply long (vector) | vmull_lane_s16 vmull_lane_s32 vmull_lane_u16 vmull_lane_u32 vmull_laneq_s16 vmull_laneq_s32 vmull_laneq_u16 vmull_laneq_u32 vmull_high_lane_s16 vmull_high_lane_s32 vmull_high_lane_u16 vmull_high_lane_u32 vmull_high_laneq_s16 vmull_high_laneq_s32 vmull_high_laneq_u16 vmull_high_laneq_u32 |
vmulx | Floating-point multiply extended | |
vmla | Multiply-add to accumulator (vector) | |
vmla_lane | Vector multiply accumulate with scalar | vmla_lane_f32 vmla_lane_s16 vmla_lane_s32 vmla_lane_u16 vmla_lane_u32 vmla_laneq_f32 vmla_laneq_s16 vmla_laneq_s32 vmla_laneq_u16 vmla_laneq_u32 vmlaq_lane_f32 vmlaq_lane_s16 vmlaq_lane_s32 vmlaq_lane_u16 vmlaq_lane_u32 vmlaq_laneq_f32 vmlaq_laneq_s16 vmlaq_laneq_s32 vmlaq_laneq_u16 vmlaq_laneq_u32 |
vmla_n | Vector multiply accumulate with scalar | vmla_n_f32 vmla_n_s16 vmla_n_s32 vmla_n_u16 vmla_n_u32 vmlaq_n_f32 vmlaq_n_s16 vmlaq_n_s32 vmlaq_n_u16 vmlaq_n_u32 |
vmlal | Multiply-accumulate long (vector) | |
vmlal_lane | Multiply-accumulate long with scalar | vmlal_lane_s16 vmlal_lane_s32 vmlal_lane_u16 vmlal_lane_u32 vmlal_laneq_s16 vmlal_laneq_s32 vmlal_laneq_u16 vmlal_laneq_u32 vmlal_high_lane_s16 vmlal_high_lane_s32 vmlal_high_lane_u16 vmlal_high_lane_u32 vmlal_high_laneq_s16 vmlal_high_laneq_s32 vmlal_high_laneq_u16 vmlal_high_laneq_u32 |
vmlal_n | Multiply-accumulate long with scalar | vmlal_n_s16 vmlal_n_s32 vmlal_n_u16 vmlal_n_u32 vmlal_high_n_s16 vmlal_high_n_s32 vmlal_high_n_u16 vmlal_high_n_u32 |
vmls | Multiply-subtract from accumulator (vector) | |
vmls_lane | Vector multiply subtract with scalar | vmls_lane_f32 vmls_lane_s16 vmls_lane_s32 vmls_lane_u16 vmls_lane_u32 vmls_laneq_f32 vmls_laneq_s16 vmls_laneq_s32 vmls_laneq_u16 vmls_laneq_u32 vmlsq_lane_f32 vmlsq_lane_s16 vmlsq_lane_s32 vmlsq_lane_u16 vmlsq_lane_u32 vmlsq_laneq_f32 vmlsq_laneq_s16 vmlsq_laneq_s32 vmlsq_laneq_u16 vmlsq_laneq_u32 |
vmls_n | Vector multiply subtract with scalar | vmls_n_f32 vmls_n_s16 vmls_n_s32 vmls_n_u16 vmls_n_u32 vmlsq_n_f32 vmlsq_n_s16 vmlsq_n_s32 vmlsq_n_u16 vmlsq_n_u32 |
vmlsl | Multiply-subtract long (vector) | |
vmlsl_lane | Vector multiply-subtract long with scalar | vmlsl_lane_s16 vmlsl_lane_s32 vmlsl_lane_u16 vmlsl_lane_u32 vmlsl_laneq_s16 vmlsl_laneq_s32 vmlsl_laneq_u16 vmlsl_laneq_u32 vmlsl_high_lane_s16 vmlsl_high_lane_s32 vmlsl_high_lane_u16 vmlsl_high_lane_u32 vmlsl_high_laneq_s16 vmlsl_high_laneq_s32 vmlsl_high_laneq_u16 vmlsl_high_laneq_u32 |
vmlsl_n | Vector multiply-subtract long with scalar | vmlsl_n_s16 vmlsl_n_s32 vmlsl_n_u16 vmlsl_n_u32 vmlsl_high_n_s16 vmlsl_high_n_s32 vmlsl_high_n_u16 vmlsl_high_n_u32 |
vqdmull | Signed saturating doubling multiply long | vqdmull_s16 vqdmull_s32 vqdmullh_s16 vqdmulls_s32 vqdmull_high_s16 vqdmull_high_s32 |
vqdmull_lane | Vector saturating doubling multiply long with scalar | |
vqdmull_n | Vector saturating doubling multiply long with scalar | |
vqdmulh | Saturating doubling multiply returning high half | vqdmulh_s16 vqdmulh_s32 vqdmulhq_s16 vqdmulhq_s32 vqdmulhh_s16 vqdmulhs_s32 |
vqdmulh_lane | Vector saturating doubling multiply high by scalar | |
vqdmulh_n | Vector saturating doubling multiply high by scalar | |
vqrdmulh | Saturating rounding doubling multiply returning high half | vqrdmulh_s16 vqrdmulh_s32 vqrdmulhq_s16 vqrdmulhq_s32 vqrdmulhh_s16 vqrdmulhs_s32 |
vqrdmulh_lane | Vector saturating rounding doubling multiply high with scalar | |
vqrdmulh_n | Vector saturating rounding doubling multiply high with scalar | |
vqdmlal | Saturating doubling multiply-add long | vqdmlal_s16 vqdmlal_s32 vqdmlalh_s16 vqdmlals_s32 vqdmlal_high_s16 vqdmlal_high_s32 |
vqdmlal_lane | Vector saturating doubling multiply-accumulate long with scalar | |
vqdmlal_n | Vector saturating doubling multiply-accumulate long with scalar | |
vqdmlsl | Signed saturating doubling multiply-subtract long | vqdmlsl_s16 vqdmlsl_s32 vqdmlslh_s16 vqdmlsls_s32 vqdmlsl_high_s16 vqdmlsl_high_s32 |
vqdmlsl_lane | Vector saturating doubling multiply-subtract long with scalar | |
vqdmlsl_n | Vector saturating doubling multiply-subtract long with scalar | |
vqrdmlah | Saturating rounding doubling multiply accumulate returning high half (vector) | vqrdmlah_s16 vqrdmlah_s32 vqrdmlahq_s16 vqrdmlahq_s32 vqrdmlahh_s16 vqrdmlahs_s32 |
vqrdmlah_lane | Saturating rounding doubling multiply accumulate returning high half (vector) | |
vqrdmlsh | Saturating rounding doubling multiply subtract returning high half (vector) | vqrdmlsh_s16 vqrdmlsh_s32 vqrdmlshq_s16 vqrdmlshq_s32 vqrdmlshh_s16 vqrdmlshs_s32 |
vqrdmlsh_lane | Saturating rounding doubling multiply subtract returning high half (vector) | |
vfma | Floating-point fused multiply-add to accumulator (vector) | |
vfma_n | Floating-point fused multiply-add to accumulator (vector) | |
vfma_lane | Floating-point fused multiply-add to accumulator (vector) | |
vfms | Floating-point fused multiply-subtract from accumulator (vector) | |
vfms_n | Floating-point fused multiply-subtract from accumulator (vector) | |
vfms_lane | Floating-point fused multiply-subtract from accumulator (vector) | |
vdiv | Floating-point divide (vector) |
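The "saturating doubling multiply returning high half" family is easiest to read as Q15 fixed-point multiplication. The sketch below models one signed 16-bit lane of `vqdmulh` and `vqrdmulh` in scalar C. The helper names are hypothetical, and the code illustrates the arithmetic only, not how Burst implements it.

```c
#include <stdint.h>

/* Scalar models of ONE signed 16-bit lane. Helper names are hypothetical. */

/* floor(v / 65536) without relying on right-shifting a negative value. */
static int64_t floor_div_65536(int64_t v) {
    int64_t q = v / 65536;           /* C division truncates toward zero */
    if (v % 65536 < 0) {
        q -= 1;                      /* step down to the floor for negatives */
    }
    return q;
}

/* Clamp a wide intermediate to the int16_t range. */
static int16_t sat_s16(int64_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* vqdmulh_s16 lane: high half of 2*a*b, saturated. */
static int16_t qdmulh_s16(int16_t a, int16_t b) {
    int64_t doubled = 2 * (int64_t)a * (int64_t)b;
    return sat_s16(floor_div_65536(doubled));
}

/* vqrdmulh_s16 lane: same, with a rounding constant added before the split. */
static int16_t qrdmulh_s16(int16_t a, int16_t b) {
    int64_t doubled = 2 * (int64_t)a * (int64_t)b + 0x8000;
    return sat_s16(floor_div_65536(doubled));
}
```

With Q15 inputs, `qdmulh_s16(16384, 16384)` (0.5 times 0.5) gives 8192 (0.25). Saturation only matters for `-32768 * -32768`, where the doubled product does not fit and the lane clamps to 32767.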
Data processing
Operation | Description | APIs |
---|---|---|
vpmax | Maximum pairwise | |
vpmaxnm | Floating-point maximum number pairwise (vector) | |
vpmin | Minimum pairwise | |
vpminnm | Floating-point minimum number pairwise (vector) | |
vabd | Absolute difference | |
vabdl | Absolute difference long | |
vaba | Absolute difference and accumulate | |
vabal | Absolute difference and accumulate long | |
vmax | Maximum | |
vmaxnm | Floating-point maximum number | vmaxnm_f32 vmaxnm_f64 vmaxnmq_f32 vmaxnmq_f64 vmaxnmv_f32 vmaxnmvq_f32 vmaxnmvq_f64 |
vmaxv | Maximum across vector | |
vmin | Minimum | |
vminnm | Floating-point minimum number | vminnm_f32 vminnm_f64 vminnmq_f32 vminnmq_f64 vminnmv_f32 vminnmvq_f32 vminnmvq_f64 |
vminv | Minimum across vector | |
vabs | Absolute value | |
vqabs | Saturating absolute value | vqabs_s16 vqabs_s32 vqabs_s64 vqabs_s8 vqabsq_s16 vqabsq_s32 vqabsq_s64 vqabsq_s8 vqabsb_s8 vqabsh_s16 vqabss_s32 vqabsd_s64 |
vneg | Negate | |
vqneg | Saturating negate | vqneg_s16 vqneg_s32 vqneg_s64 vqneg_s8 vqnegq_s16 vqnegq_s32 vqnegq_s64 vqnegq_s8 vqnegb_s8 vqnegh_s16 vqnegs_s32 vqnegd_s64 |
vcls | Count leading sign bits | |
vclz | Count leading zero bits | |
vcnt | Population count per byte | |
vrecpe | Reciprocal estimate | vrecpe_f32 vrecpe_f64 vrecpe_u32 vrecpeq_f32 vrecpeq_f64 vrecpeq_u32 vrecpes_f32 vrecped_f64 |
vrecps | Reciprocal step | vrecps_f32 vrecps_f64 vrecpsq_f32 vrecpsq_f64 vrecpss_f32 vrecpsd_f64 |
vrecpx | Floating-point reciprocal exponent | |
vrsqrte | Reciprocal square root estimate | vrsqrte_f32 vrsqrte_f64 vrsqrte_u32 vrsqrteq_f32 vrsqrteq_f64 vrsqrteq_u32 vrsqrtes_f32 vrsqrted_f64 |
vrsqrts | Reciprocal square root step | vrsqrts_f32 vrsqrts_f64 vrsqrtsq_f32 vrsqrtsq_f64 vrsqrtss_f32 vrsqrtsd_f64 |
vmovn | Extract narrow | |
vmovl | Extract long | |
vqmovn | Saturating extract narrow | |
vqmovun | Signed saturating extract unsigned narrow | vqmovun_s16 vqmovun_s32 vqmovun_s64 vqmovun_high_s16 vqmovun_high_s32 vqmovun_high_s64 vqmovunh_s16 vqmovuns_s32 vqmovund_s64 |
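The estimate and step operations (`vrecpe` with `vrecps`, `vrsqrte` with `vrsqrts`) are meant to be combined in a Newton-Raphson loop. The step for a reciprocal computes `2 - a*b`, and the step for a reciprocal square root computes `(3 - a*b) / 2`. The scalar C sketch below shows that pairing. The helper names are hypothetical, and the starting estimate is passed in by the caller because the hardware estimate table is not modelled here.

```c
/* Scalar models of the Newton-Raphson step operations. Names are hypothetical. */

/* vrecps lane: 2 - a*b. */
static float recps(float a, float b) {
    return 2.0f - a * b;
}

/* vrsqrts lane: (3 - a*b) / 2. */
static float rsqrts(float a, float b) {
    return (3.0f - a * b) / 2.0f;
}

/* Refine an estimate x of 1/d: x = x * (2 - d*x), repeated `steps` times. */
static float refine_reciprocal(float d, float estimate, int steps) {
    float x = estimate;
    for (int i = 0; i < steps; i++) {
        x = x * recps(d, x);
    }
    return x;
}

/* Refine an estimate x of 1/sqrt(d): x = x * (3 - (d*x)*x) / 2. */
static float refine_rsqrt(float d, float estimate, int steps) {
    float x = estimate;
    for (int i = 0; i < steps; i++) {
        x = x * rsqrts(d * x, x);
    }
    return x;
}
```

Each step roughly squares the relative error, so a coarse estimate typically needs only a few iterations. In this sketch, starting from 0.3 for `1/3` reaches single-precision accuracy in about three steps.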
Comparison
Operation | Description | APIs |
---|---|---|
vceq | Compare bitwise equal | |
vceqz | Compare bitwise equal to zero | |
vcge | Compare greater than or equal | |
vcgez | Compare greater than or equal to zero | |
vcle | Compare less than or equal | |
vclez | Compare less than or equal to zero | |
vcgt | Compare greater than | |
vcgtz | Compare greater than zero | |
vclt | Compare less than | |
vcltz | Compare less than zero | |
vcage | Floating-point absolute compare greater than or equal | |
vcagt | Floating-point absolute compare greater than | |
vcale | Floating-point absolute compare less than or equal | |
vcalt | Floating-point absolute compare less than |
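Neon comparisons do not return a boolean. Each result lane is all ones when the condition holds and all zeros otherwise, which makes the result directly usable as a mask. The scalar C sketch below models one 32-bit lane of `vcgt` and of the absolute-value compare `vcagt`, plus a small counting idiom built on the mask. The helper names are hypothetical.

```c
#include <stdint.h>

/* Scalar models of ONE 32-bit lane. Helper names are hypothetical. */

/* vcgt_s32 lane: all ones if a > b, otherwise zero. */
static uint32_t cgt_s32(int32_t a, int32_t b) {
    return a > b ? 0xFFFFFFFFu : 0u;
}

/* vcagt_f32 lane: compares magnitudes, |a| > |b|. */
static uint32_t cagt_f32(float a, float b) {
    float abs_a = a < 0.0f ? -a : a;
    float abs_b = b < 0.0f ? -b : b;
    return abs_a > abs_b ? 0xFFFFFFFFu : 0u;
}

/* Count how many values exceed a threshold by summing the top mask bit. */
static int count_greater(const int32_t *values, int count, int32_t threshold) {
    int total = 0;
    for (int i = 0; i < count; i++) {
        total += (int)(cgt_s32(values[i], threshold) >> 31);
    }
    return total;
}
```

A mask produced this way is what the select operation in the Bitwise section consumes to choose between two inputs per lane without branching.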
Bitwise
Operation | Description | APIs |
---|---|---|
vtst | Test bits nonzero | |
vmvn | Bitwise NOT | |
vand | Bitwise AND | |
vorr | Bitwise OR | |
vorn | Bitwise OR NOT | |
veor | Bitwise exclusive OR | |
vbic | Bitwise bit clear | |
vbsl | Bitwise select |
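The less familiar entries here are `vbic`, `vorn`, `vtst`, and `vbsl`. As a rough guide, the scalar C sketch below models one 32-bit lane of each. The helper names are hypothetical, and the argument order follows the Arm convention in which the mask is the first operand of the select.

```c
#include <stdint.h>

/* Scalar models of ONE 32-bit lane. Helper names are hypothetical. */

/* vbic lane: clear in a every bit that is set in b. */
static uint32_t bic_u32(uint32_t a, uint32_t b) {
    return a & ~b;
}

/* vorn lane: OR a with the complement of b. */
static uint32_t orn_u32(uint32_t a, uint32_t b) {
    return a | ~b;
}

/* vtst lane: all ones if a and b share any set bit, otherwise zero. */
static uint32_t tst_u32(uint32_t a, uint32_t b) {
    return (a & b) != 0u ? 0xFFFFFFFFu : 0u;
}

/* vbsl lane: take bits from `if_set` where mask is 1, from `if_clear` where 0. */
static uint32_t bsl_u32(uint32_t mask, uint32_t if_set, uint32_t if_clear) {
    return (if_set & mask) | (if_clear & ~mask);
}
```

Because the select works bit by bit, `bsl_u32(0xFFFF0000, 0x12345678, 0x9ABCDEF0)` gives `0x1234DEF0`. With a full-lane mask from a comparison it behaves like a per-lane conditional.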
Shift
Operation | Description | APIs |
---|---|---|
vshl | Shift left (register) | |
vqshl | Saturating shift left (register) | |
vqshl_n | Saturating shift left (immediate) | vqshl_n_s16 vqshl_n_s32 vqshl_n_s64 vqshl_n_s8 vqshl_n_u16 vqshl_n_u32 vqshl_n_u64 vqshl_n_u8 vqshlq_n_s16 vqshlq_n_s32 vqshlq_n_s64 vqshlq_n_s8 vqshlq_n_u16 vqshlq_n_u32 vqshlq_n_u64 vqshlq_n_u8 vqshlb_n_s8 vqshlb_n_u8 vqshlh_n_s16 vqshlh_n_u16 vqshls_n_s32 vqshls_n_u32 vqshld_n_s64 vqshld_n_u64 |
vqshlu_n | Saturating shift left unsigned (immediate) | |
vrshl | Rounding shift left (register) | |
vqrshl | Saturating rounding shift left (register) | |
vshl_n | Shift left (immediate) | |
vshll_n | Shift left long (immediate) | |
vshr_n | Shift right (immediate) | |
vrshr_n | Rounding shift right (immediate) | |
vshrn_n | Shift right narrow (immediate) | |
vqshrun_n | Signed saturating shift right unsigned narrow (immediate) | vqshrun_n_s16 vqshrun_n_s32 vqshrun_n_s64 vqshrunh_n_s16 vqshruns_n_s32 vqshrund_n_s64 vqshrun_high_n_s16 vqshrun_high_n_s32 vqshrun_high_n_s64 |
vqrshrun_n | Signed saturating rounded shift right unsigned narrow (immediate) | vqrshrun_n_s16 vqrshrun_n_s32 vqrshrun_n_s64 vqrshrunh_n_s16 vqrshruns_n_s32 vqrshrund_n_s64 vqrshrun_high_n_s16 vqrshrun_high_n_s32 vqrshrun_high_n_s64 |
vqshrn_n | Signed saturating shift right narrow (immediate) | |
vrshrn_n | Rounding shift right narrow (immediate) | |
vqrshrn_n | Saturating rounded shift right narrow (immediate, signed and unsigned) | vqrshrn_n_s16 vqrshrn_n_s32 vqrshrn_n_s64 vqrshrn_n_u16 vqrshrn_n_u32 vqrshrn_n_u64 vqrshrnh_n_s16 vqrshrnh_n_u16 vqrshrns_n_s32 vqrshrns_n_u32 vqrshrnd_n_s64 vqrshrnd_n_u64 vqrshrn_high_n_s16 vqrshrn_high_n_s32 vqrshrn_high_n_s64 vqrshrn_high_n_u16 vqrshrn_high_n_u32 vqrshrn_high_n_u64 |
vsra_n | Signed shift right and accumulate (immediate) | |
vrsra_n | Signed rounding shift right and accumulate (immediate) | |
vsri_n | Shift right and insert (immediate) | |
vsli_n | Shift left and insert (immediate) |
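The prefixes in this table compose: `r` adds a rounding constant before shifting, `q` saturates, `n` narrows the result, and a `u` before `n` means a signed input is clamped to an unsigned range. The scalar C sketch below models one lane of a few representative immediate forms. The helper names are hypothetical, and the shift count `n` is assumed to be in 1..8 for the 8-bit results shown (1..7 for the insert, so that some of the destination survives).

```c
#include <stdint.h>

/* Scalar models of ONE lane. Helper names are hypothetical; n must be 1..8. */

/* vrshr_n_u8 lane: add half of the shifted-out range, then shift right. */
static uint8_t rshr_n_u8(uint8_t a, int n) {
    return (uint8_t)(((uint16_t)a + (1u << (n - 1))) >> n);
}

/* vsra_n_u8 lane: shift right, then add into the accumulator (wraps at 8 bits). */
static uint8_t sra_n_u8(uint8_t acc, uint8_t a, int n) {
    return (uint8_t)(acc + (a >> n));
}

/* vqrshrun_n_s16 lane: rounding shift right of a signed 16-bit value,
   saturated into the unsigned 8-bit range 0..255. */
static uint8_t qrshrun_n_s16(int16_t a, int n) {
    int32_t rounded = (int32_t)a + (1 << (n - 1));
    if (rounded < 0) {
        return 0;                    /* any negative result clamps to 0 */
    }
    rounded >>= n;                   /* non-negative, so the shift is well defined */
    return (uint8_t)(rounded > 255 ? 255 : rounded);
}

/* vsli_n_u8 lane: shift b left by n and insert it into a,
   keeping the low n bits of a (n in 1..7 here). */
static uint8_t sli_n_u8(uint8_t a, uint8_t b, int n) {
    uint8_t keep_mask = (uint8_t)((1u << n) - 1u);
    return (uint8_t)((uint8_t)(b << n) | (a & keep_mask));
}
```

For instance, `rshr_n_u8(5, 1)` gives 3 where a plain `vshr_n` lane gives 2, and `qrshrun_n_s16(2000, 2)` clamps to 255.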
Floating-point
Operation | Description | APIs |
---|---|---|
vcvt | Convert to/from another precision or fixed point, rounding towards zero | vcvt_f32_f64 vcvt_f32_s32 vcvt_f32_u32 vcvt_f64_f32 vcvt_f64_s64 vcvt_f64_u64 vcvt_s32_f32 vcvt_s64_f64 vcvt_u32_f32 vcvt_u64_f64 vcvtq_f32_s32 vcvtq_f32_u32 vcvtq_f64_s64 vcvtq_f64_u64 vcvtq_s32_f32 vcvtq_s64_f64 vcvtq_u32_f32 vcvtq_u64_f64 vcvts_f32_s32 vcvts_f32_u32 vcvts_s32_f32 vcvts_u32_f32 vcvtd_f64_s64 vcvtd_f64_u64 vcvtd_s64_f64 vcvtd_u64_f64 vcvt_high_f32_f64 vcvt_high_f64_f32 |
vcvta | Convert to integer, rounding to nearest with ties to away | |
vcvtm | Convert to integer, rounding towards minus infinity | |
vcvtn | Convert to integer, rounding to nearest with ties to even | |
vcvtp | Convert to integer, rounding towards plus infinity | |
vcvtx | Convert to lower precision, rounding to nearest with ties to odd | |
vcvt_n | Convert to/from fixed point, rounding towards zero | vcvt_n_f32_s32 vcvt_n_f32_u32 vcvt_n_f64_s64 vcvt_n_f64_u64 vcvt_n_s32_f32 vcvt_n_s64_f64 vcvt_n_u32_f32 vcvt_n_u64_f64 vcvtq_n_f32_s32 vcvtq_n_f32_u32 vcvtq_n_f64_s64 vcvtq_n_f64_u64 vcvtq_n_s32_f32 vcvtq_n_s64_f64 vcvtq_n_u32_f32 vcvtq_n_u64_f64 vcvts_n_f32_s32 vcvts_n_f32_u32 vcvts_n_s32_f32 vcvts_n_u32_f32 vcvtd_n_f64_s64 vcvtd_n_f64_u64 vcvtd_n_s64_f64 vcvtd_n_u64_f64 |
vrnd | Round to Integral, toward zero | |
vrnda | Round to Integral, with ties to away | |
vrndi | Round to Integral, using current rounding mode | |
vrndm | Round to Integral, towards minus infinity | |
vrndn | Round to Integral, with ties to even | |
vrndp | Round to Integral, towards plus infinity | |
vrndx | Round to Integral exact |
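The `vcvt_n` forms treat the integer side as fixed point with `n` fractional bits: converting to integer multiplies by 2^n and truncates toward zero, and converting back divides by 2^n. The scalar C sketch below models one lane of the 32-bit signed pair. The helper names are hypothetical, and `n` is assumed to be in 1..32.

```c
#include <stdint.h>

/* Scalar models of ONE lane. Helper names are hypothetical; n must be 1..32. */

/* vcvt_n_s32_f32 lane: float to signed fixed point with n fractional bits,
   rounding toward zero and saturating at the int32_t limits. */
static int32_t cvt_n_s32_f32(float a, int n) {
    double scaled = (double)a * (double)(1ULL << n);
    if (scaled != scaled) {
        return 0;                          /* NaN converts to 0 */
    }
    if (scaled >= 2147483648.0) {
        return INT32_MAX;
    }
    if (scaled <= -2147483648.0) {
        return INT32_MIN;
    }
    return (int32_t)scaled;                /* a C cast truncates toward zero */
}

/* vcvt_n_f32_s32 lane: signed fixed point with n fractional bits to float. */
static float cvt_n_f32_s32(int32_t a, int n) {
    return (float)((double)a / (double)(1ULL << n));
}
```

With 8 fractional bits, `cvt_n_s32_f32(1.5f, 8)` gives 384 and `cvt_n_f32_s32(384, 8)` gives back 1.5. Truncation toward zero means `-0.999f` with 1 fractional bit becomes -1, not -2.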
Load and store
Operation | Description | APIs |
---|---|---|
vld1 | Load vector from memory | |
vst1 | Store vector to memory | |
vget_lane | Get vector element | |
vset_lane | Set vector element |
Permutation
Operation | Description | APIs |
---|---|---|
vext | Extract vector from pair of vectors | |
vtbl1 | Table vector lookup | |
vtbx1 | Table vector lookup extension | |
vqtbl1 | Table vector lookup | |
vqtbx1 | Table vector lookup extension | |
vrbit | Reverse bit order | |
vrev16 | Reverse elements in 16-bit halfwords | |
vrev32 | Reverse elements in 32-bit words | vrev32_s16 vrev32_s8 vrev32_u16 vrev32_u8 vrev32q_s16 vrev32q_s8 vrev32q_u16 vrev32q_u8 |
vrev64 | Reverse elements in 64-bit doublewords | |
vtrn1 | Transpose vectors (primary) | |
vtrn2 | Transpose vectors (secondary) | |
vzip1 | Zip vectors (primary) | |
vzip2 | Zip vectors (secondary) | |
vuzp1 | Unzip vectors (primary) | |
vuzp2 | Unzip vectors (secondary) |
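The "primary" and "secondary" labels on the transpose, zip, and unzip rows say which half of the interleaving each operation returns. The scalar C sketch below spells out the lane mapping for vectors of four 32-bit elements, together with `vext`. The helper names are hypothetical, and the arrays stand in for vector registers.

```c
#include <stdint.h>

#define LANES 4

/* Scalar lane-mapping models for 4-lane vectors. Helper names are hypothetical. */

/* vzip1: interleave the low halves:  a0 b0 a1 b1 */
static void zip1_u32x4(const uint32_t *a, const uint32_t *b, uint32_t *out) {
    for (int i = 0; i < LANES / 2; i++) {
        out[2 * i] = a[i];
        out[2 * i + 1] = b[i];
    }
}

/* vzip2: interleave the high halves: a2 b2 a3 b3 */
static void zip2_u32x4(const uint32_t *a, const uint32_t *b, uint32_t *out) {
    for (int i = 0; i < LANES / 2; i++) {
        out[2 * i] = a[LANES / 2 + i];
        out[2 * i + 1] = b[LANES / 2 + i];
    }
}

/* vuzp1: even-indexed lanes of a, then of b: a0 a2 b0 b2 */
static void uzp1_u32x4(const uint32_t *a, const uint32_t *b, uint32_t *out) {
    for (int i = 0; i < LANES / 2; i++) {
        out[i] = a[2 * i];
        out[LANES / 2 + i] = b[2 * i];
    }
}

/* vuzp2: odd-indexed lanes of a, then of b:  a1 a3 b1 b3 */
static void uzp2_u32x4(const uint32_t *a, const uint32_t *b, uint32_t *out) {
    for (int i = 0; i < LANES / 2; i++) {
        out[i] = a[2 * i + 1];
        out[LANES / 2 + i] = b[2 * i + 1];
    }
}

/* vtrn1: even lanes of a and b paired: a0 b0 a2 b2 */
static void trn1_u32x4(const uint32_t *a, const uint32_t *b, uint32_t *out) {
    for (int i = 0; i < LANES / 2; i++) {
        out[2 * i] = a[2 * i];
        out[2 * i + 1] = b[2 * i];
    }
}

/* vtrn2: odd lanes of a and b paired:  a1 b1 a3 b3 */
static void trn2_u32x4(const uint32_t *a, const uint32_t *b, uint32_t *out) {
    for (int i = 0; i < LANES / 2; i++) {
        out[2 * i] = a[2 * i + 1];
        out[2 * i + 1] = b[2 * i + 1];
    }
}

/* vext: lanes n.. of a followed by the first n lanes of b (n in 0..LANES-1). */
static void ext_u32x4(const uint32_t *a, const uint32_t *b, int n, uint32_t *out) {
    for (int i = 0; i < LANES; i++) {
        out[i] = (i + n < LANES) ? a[i + n] : b[i + n - LANES];
    }
}
```

Zip and unzip are inverses of each other: unzipping the pair of results from `zip1` and `zip2` recovers the original inputs.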
Cryptographic
Operation | APIs |
---|---|
CRC32 | |
SHA1 | vsha1cq_u32 vsha1h_u32 vsha1mq_u32 vsha1pq_u32 vsha1su0q_u32 vsha1su1q_u32 |
SHA256 | |
AES |
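The Arm CRC32 instructions fold data into a running checksum using the reflected form of polynomial 0x04C11DB7, and they leave the initial and final inversion to the caller. The scalar C sketch below models the byte-wise update in software and wraps it into the conventional CRC-32. The helper names are hypothetical. The Arm C Language Extensions call the byte-wise form `__crc32b`, but the API list for this row is not visible on this page, so treat any mapping to a Burst name as an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Software model of a byte-wise CRC32 update. Helper names are hypothetical. */

/* Fold one byte into the running CRC (reflected polynomial 0xEDB88320).
   No inversion happens here; that is the caller's job. */
static uint32_t crc32_update_byte(uint32_t crc, uint8_t data) {
    crc ^= data;
    for (int bit = 0; bit < 8; bit++) {
        crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
    return crc;
}

/* Conventional CRC-32 of a buffer: invert on entry, update per byte, invert on exit. */
static uint32_t crc32_buffer(const uint8_t *bytes, size_t length) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < length; i++) {
        crc = crc32_update_byte(crc, bytes[i]);
    }
    return ~crc;
}
```

The standard check value for this CRC is `0xCBF43926` for the nine ASCII bytes `"123456789"`, which is a convenient way to validate a hardware-accelerated version against the software model.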
Miscellaneous
Operation | Description | APIs |
---|---|---|
vsqrt | Square root | |
vdot | Dot product | |
vdot_lane | Dot product | vdot_lane_s32 vdot_lane_u32 vdot_laneq_s32 vdot_laneq_u32 vdotq_lane_s32 vdotq_lane_u32 vdotq_laneq_s32 vdotq_laneq_u32 |
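The dot product here is not a full-vector reduction. Each 32-bit accumulator lane adds the products of its own group of four 8-bit elements. The scalar C sketch below models the 64-bit signed form, with two accumulator lanes and eight 8-bit inputs per operand. The helper name is hypothetical.

```c
#include <stdint.h>

/* Scalar model of a 2-lane signed dot product. The helper name is hypothetical. */

/* Each accumulator lane gains the sum of four 8-bit products from its own group:
   acc[lane] += a[4*lane+0]*b[4*lane+0] + ... + a[4*lane+3]*b[4*lane+3].
   The hardware accumulator wraps on overflow; this sketch assumes inputs
   small enough that no overflow occurs. */
static void dot_s32x2(int32_t acc[2], const int8_t a[8], const int8_t b[8]) {
    for (int lane = 0; lane < 2; lane++) {
        int32_t sum = 0;
        for (int j = 0; j < 4; j++) {
            sum += (int32_t)a[4 * lane + j] * (int32_t)b[4 * lane + j];
        }
        acc[lane] += sum;
    }
}
```

The `_lane` variants listed above follow the usual Neon pattern of taking one operand's group of four bytes from a selected lane of the second vector and reusing it for every accumulator lane.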