10.5.6 : Performance

Left panel: total average time of the sgemm function. Right panel: average time to compute a single element.

The drawback of this method is that the number of columns of the matrices must be a multiple of the number of floats that fit in a vector register. As an exercise, try to find a solution that makes the intrinsics version work for any matrix size.
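One possible direction, given here only as a sketch and not as the intended answer, is to pad each row with zeros so that the row length becomes a multiple of the vector width. The helper names `paddedNbCol` and `padMatrix`, the row-major layout and the `vecSize` parameter (4 floats for SSE, 8 for AVX, 16 for AVX-512) are all assumptions made for illustration. Note that `std::vector` does not guarantee the alignment required by aligned loads, so a real version would need an aligned allocator or unaligned load intrinsics.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: smallest multiple of vecSize that is >= nbCol.
// vecSize is the number of floats held by one vector register.
inline std::size_t paddedNbCol(std::size_t nbCol, std::size_t vecSize) {
    assert(vecSize > 0);
    return ((nbCol + vecSize - 1) / vecSize) * vecSize;
}

// Hypothetical helper: copy a row-major nbRow x nbCol matrix into a
// zero-filled buffer whose row stride is a multiple of vecSize.
inline std::vector<float> padMatrix(const float* mat, std::size_t nbRow,
                                    std::size_t nbCol, std::size_t vecSize) {
    const std::size_t stride = paddedNbCol(nbCol, vecSize);
    std::vector<float> padded(nbRow * stride, 0.0f);
    for (std::size_t i = 0; i < nbRow; ++i) {
        for (std::size_t j = 0; j < nbCol; ++j) {
            // Only the first nbCol entries of each row carry data;
            // the remaining stride - nbCol entries stay at zero.
            padded[i * stride + j] = mat[i * nbCol + j];
        }
    }
    return padded;
}
```

For example, a matrix with 10 columns and `vecSize = 8` gets a row stride of 16. The vectorized kernel can then run over full registers, and only the first 10 columns of each result row are copied back. The other common option is to keep the original layout and finish each row with a scalar loop over the leftover columns, which avoids the copy at the cost of a second code path.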