Previous Beginning of the main_intrinsics.cpp file |
Parent Manual vectorization (by Intrinsic functions) |
Outline | Next The function to evaluate performances |
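```cpp
void hadamard_product(float* tabResult, const float* tabX, const float* tabY, long unsigned int nbElement){
	// Number of floats held by one vector register
	long unsigned int vecSize(VECTOR_ALIGNEMENT/sizeof(float));
	// Number of full vectors in the input tables
	long unsigned int nbVec(nbElement/vecSize);
	for(long unsigned int i(0lu); i < nbVec; ++i){
		// Aligned loads: tabX and tabY must be aligned on the vector size
		__m256 vecX = _mm256_load_ps(tabX + i*vecSize);
		__m256 vecY = _mm256_load_ps(tabY + i*vecSize);
		// Element-wise product of the two vectors
		__m256 vecRes = _mm256_mul_ps(vecX, vecY);
		// Aligned store: tabResult must be aligned too
		_mm256_store_ps(tabResult + i*vecSize, vecRes);
	}
}
```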
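Remember: _mm256_load_ps and _mm256_store_ps require addresses aligned on 32 bytes. If you do NOT provide aligned data to this kernel (tabResult, tabX and tabY), it will crash, typically with a segmentation fault. Note also that the loop only processes nbElement/vecSize full vectors, so nbElement is expected to be a multiple of vecSize; any remaining elements are left untouched.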
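One possible way to obtain suitably aligned tables is sketched below. It is not taken from the tutorial's sources: the names kAlignment, is_aligned and allocate_aligned_floats are hypothetical helpers, and kAlignment is only a stand-in for the tutorial's own VECTOR_ALIGNEMENT macro, assumed here to be 32 bytes (the size of a __m256 register). The sketch also assumes a C++17 toolchain that provides std::aligned_alloc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Assumption: 32 bytes, the alignment expected by the aligned AVX load/store.
// Stand-in for the tutorial's VECTOR_ALIGNEMENT macro, which is defined elsewhere.
constexpr std::size_t kAlignment = 32;

// Hypothetical helper: true if ptr lies on an 'alignment'-byte boundary.
bool is_aligned(const void* ptr, std::size_t alignment = kAlignment){
	return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}

// Hypothetical helper: allocates nbElement floats on a kAlignment-byte boundary.
// The returned pointer must be released with std::free.
float* allocate_aligned_floats(std::size_t nbElement){
	std::size_t nbBytes = nbElement*sizeof(float);
	// std::aligned_alloc expects a size that is a multiple of the alignment, so round up
	std::size_t paddedBytes = ((nbBytes + kAlignment - 1)/kAlignment)*kAlignment;
	float* ptr = static_cast<float*>(std::aligned_alloc(kAlignment, paddedBytes));
	// A null pointer means the allocation failed; otherwise the address must be aligned
	assert(ptr == nullptr || is_aligned(ptr));
	return ptr;
}
```

With such helpers, the three tables passed to hadamard_product would each come from allocate_aligned_floats(nbElement), and is_aligned can be used as a cheap check before calling the kernel. Note that a pointer offset by a single float (for example tabX + 1) is no longer aligned, so the kernel must always be given the start of the tables.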