The proper check using compiler-defined macros would be
defined(__ARM_NEON) || defined(__aarch64__)
however, the explicit wscript define "ARM_NEON_SUPPORT" is preferable.
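
A minimal sketch of how the two checks could be combined in a header. The HAVE_NEON_SKETCH macro and the neon_enabled() helper are hypothetical names used only for illustration; ARM_NEON_SUPPORT is assumed to be passed by the build system (for example as -DARM_NEON_SUPPORT).

```c
/* Prefer the explicit build-system define; fall back to the macros
 * the compiler predefines when NEON code generation is enabled.
 * HAVE_NEON_SKETCH is a hypothetical name, not part of this commit. */
#if defined(ARM_NEON_SUPPORT)
#  define HAVE_NEON_SKETCH 1
#elif defined(__ARM_NEON) || defined(__aarch64__)
#  define HAVE_NEON_SKETCH 1
#else
#  define HAVE_NEON_SKETCH 0
#endif

/* Hypothetical helper: reports whether the NEON path was compiled in. */
static int neon_enabled(void)
{
    return HAVE_NEON_SKETCH;
}
```

Keeping the explicit define first means the build configuration, not the compiler defaults, decides whether the NEON path is built.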
This commit adds ARM NEON optimized routines for the following procedures:
*_compute_peak
*_find_peaks
*_apply_gain_to_buffer
*_mix_buffers_with_gain
*_mix_buffers_no_gain
*_copy_vector
The NEON-optimized routines are prefixed with: arm_neon_
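
For illustration, a peak computation in this style might look roughly like the sketch below. The signature (source buffer, sample count, current peak), the scalar_compute_peak fallback and the compute_peak dispatcher are assumptions made for the example, not the code added by this commit. The NEON path is compiled only on ARM targets; elsewhere the scalar version is used.

```c
#include <stddef.h>

#if defined(ARM_NEON_SUPPORT) || defined(__ARM_NEON) || defined(__aarch64__)
#  include <arm_neon.h>
#  define SKETCH_USE_NEON 1
#else
#  define SKETCH_USE_NEON 0
#endif

/* Scalar reference (hypothetical): largest absolute sample value,
 * never smaller than the `current` peak passed in. */
static float scalar_compute_peak(const float *src, size_t n, float current)
{
    for (size_t i = 0; i < n; ++i) {
        float a = src[i] < 0.0f ? -src[i] : src[i];
        if (a > current) {
            current = a;
        }
    }
    return current;
}

#if SKETCH_USE_NEON
/* NEON sketch: process four floats per iteration, then reduce. */
static float arm_neon_compute_peak(const float *src, size_t n, float current)
{
    float32x4_t vmax = vdupq_n_f32(current);
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        float32x4_t v = vabsq_f32(vld1q_f32(src + i));
        vmax = vmaxq_f32(vmax, v);
    }

    /* Horizontal maximum of the four lanes. */
    float32x2_t m = vmax_f32(vget_low_f32(vmax), vget_high_f32(vmax));
    m = vpmax_f32(m, m);

    /* Handle the remaining 0..3 samples with the scalar loop. */
    return scalar_compute_peak(src + i, n - i, vget_lane_f32(m, 0));
}
#endif

/* Hypothetical dispatcher: picks the NEON routine when it was built. */
static float compute_peak(const float *src, size_t n, float current)
{
#if SKETCH_USE_NEON
    return arm_neon_compute_peak(src, n, current);
#else
    return scalar_compute_peak(src, n, current);
#endif
}
```

The sketch uses only intrinsics available on both ARMv7 NEON and AArch64, so it avoids the AArch64-only across-vector reductions.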