11.3.3 : The compilation

Let's compile :
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
$ make
-- Configuring done
-- Generating done
-- Build files have been written to: ExampleOptimisation/build
[  0%] Built target hadamard_product_O2
[  2%] Built target hadamard_product_O1
[  4%] Built target hadamard_product_vectorize
[  6%] Built target hadamard_product_O0
[  8%] Built target hadamard_product_O3
[  8%] Built target hadamard_product_Ofast
[ 10%] Built target hadamard_product_intrinsics
[ 10%] Built target asterics_hpc
[ 13%] Built target saxpy_O2
[ 15%] Built target saxpy_O0
[ 17%] Built target saxpy_O3
[ 17%] Built target saxpy_O1
[ 17%] Built target saxpy_Ofast
[ 17%] Built target saxpy_vectorize
[ 19%] Built target saxpy_intrinsics
[ 21%] Built target reduction_real_O2
[ 23%] Built target reduction_real_intrinsics_interleave8_O3
[ 26%] Built target reduction_real_O1
[ 28%] Built target reduction_real_Ofast
[ 30%] Built target reduction_O0
[ 30%] Built target reduction_O1
[ 32%] Built target reduction_O2
[ 32%] Built target reduction_O3
[ 34%] Built target reduction_real_intrinsics_interleave4_O3
[ 36%] Built target reduction_real_vectorize_Ofast
[ 39%] Built target reduction_real_intrinsics_interleave2_O3
[ 41%] Built target reduction_real_intrinsics_O3
[ 43%] Built target reduction_real_O3
[ 45%] Built target reduction_real_O0
[ 47%] Built target reduction_real_vectorize_O3
[ 50%] Built target barycentre_intrinsics
[ 52%] Built target barycentre_base_O2
[ 54%] Built target barycentre_base_O1
[ 56%] Built target barycentre_base_O0
[ 58%] Built target barycentre_vectorizeSplit_O3
[ 60%] Built target barycentre_base_Ofast
[ 63%] Built target barycentre_base_O3
[ 65%] Built target barycentre_vectorize_O3
[ 67%] Built target sgemm_intrinsicsPitch_O3
[ 69%] Built target sgemm_vectorize_Ofast
[ 71%] Built target sgemm_base_O1
[ 73%] Built target sgemm_vectorize_O3
[ 76%] Built target sgemm_base_Ofast
[ 78%] Built target sgemm_base_O3
[ 80%] Built target sgemm_base_O0
[ 82%] Built target sgemm_intrinsics_O3
[ 84%] Built target sgemm_swap_Ofast
[ 86%] Built target sgemm_swap_O3
[ 89%] Built target sgemm_base_O2
Scanning dependencies of target branchVectorize_Ofast
[ 89%] Building CXX object 7-BranchingPredicator/CMakeFiles/branchVectorize_Ofast.dir/main_vectorize.cpp.o
[ 89%] Linking CXX executable branchVectorize_Ofast
[ 89%] Built target branchVectorize_Ofast
Scanning dependencies of target branchVectorize_O3
[ 89%] Building CXX object 7-BranchingPredicator/CMakeFiles/branchVectorize_O3.dir/main_vectorize.cpp.o
[ 91%] Linking CXX executable branchVectorize_O3
[ 91%] Built target branchVectorize_O3
[ 91%] Built target branchPrediction_O1
[ 93%] Built target branchPrediction_O2
[ 95%] Built target branchPrediction_O0
[ 95%] Built target branchPrediction_O3
[ 97%] Built target branchPrediction_Ofast
[ 97%] Built target branchOptimise_O3
[100%] Built target branchOptimise_Ofast
Let's get the performances :
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
$ make plot_all
[  0%] Built target asterics_hpc
[  1%] Built target branchOptimise_Ofast
[  1%] Built target branchPrediction_O3
[  2%] Built target branchPrediction_Ofast
[  2%] Built target branchOptimise_O3
[  4%] Built target plot_branchOptimise
[  6%] Built target hadamard_product_intrinsics
[  7%] Built target hadamard_product_vectorize
[  8%] Built target hadamard_product_O3
[  9%] Built target plot_hadamardIntrinsics
[  9%] Built target hadamard_product_Ofast
[  9%] Built target hadamard_product_O2
[ 10%] Built target hadamard_product_O1
[ 12%] Built target hadamard_product_O0
[ 14%] Built target plot_hadamardBase
[ 15%] Built target plot_hadamardVectorize
[ 16%] Built target saxpy_intrinsics
[ 18%] Built target saxpy_O3
[ 18%] Built target saxpy_vectorize
[ 19%] Built target plot_saxpyIntrinsics
[ 20%] Built target plot_saxpyVectorize
[ 20%] Built target saxpy_Ofast
[ 21%] Built target saxpy_O2
[ 22%] Built target saxpy_O0
[ 22%] Built target saxpy_O1
[ 25%] Built target plot_saxpyBase
[ 26%] Built target reduction_real_intrinsics_O3
[ 27%] Built target reduction_real_intrinsics_interleave8_O3
[ 28%] Built target reduction_real_Ofast
[ 30%] Built target reduction_real_intrinsics_interleave4_O3
[ 31%] Built target reduction_real_vectorize_Ofast
[ 32%] Built target reduction_real_intrinsics_interleave2_O3
[ 34%] Built target plot_reductionIntrinsicsInterleave8
[ 36%] Built target reduction_real_vectorize_O3
[ 37%] Built target reduction_real_O3
[ 38%] Built target plot_reductionVectorize
[ 38%] Built target reduction_O3
[ 39%] Built target reduction_O0
[ 39%] Built target reduction_O1
[ 40%] Built target reduction_O2
[ 43%] Built target plot_reductionBase
[ 44%] Built target reduction_real_O0
[ 45%] Built target reduction_real_O2
[ 46%] Built target reduction_real_O1
[ 49%] Built target plot_reductionReal
[ 51%] Built target plot_reductionIntrinsicsInterleave2
[ 54%] Built target plot_reductionIntrinsicsInterleave4
[ 55%] Built target plot_reductionIntrinsics
[ 56%] Built target barycentre_vectorize_O3
[ 57%] Built target barycentre_intrinsics
[ 59%] Built target barycentre_vectorizeSplit_O3
[ 60%] Built target barycentre_base_O3
[ 62%] Built target plot_barycentreIntrinsics
[ 63%] Built target barycentre_base_O2
[ 65%] Built target barycentre_base_O1
[ 66%] Built target barycentre_base_O0
[ 67%] Built target barycentre_base_Ofast
[ 68%] Built target plot_barycentreBase
[ 69%] Built target plot_barycentreVectorize
[ 71%] Built target sgemm_intrinsics_O3
[ 72%] Built target sgemm_intrinsicsPitch_O3
[ 73%] Built target sgemm_vectorize_Ofast
[ 74%] Built target sgemm_vectorize_O3
[ 77%] Built target plot_sgemmIntrinsicsPitch
[ 78%] Built target plot_sgemmIntrinsics
[ 79%] Built target sgemm_base_O2
[ 80%] Built target sgemm_base_O1
[ 81%] Built target sgemm_base_Ofast
[ 83%] Built target sgemm_base_O3
[ 84%] Built target sgemm_base_O0
[ 86%] Built target plot_sgemmBase
[ 87%] Built target sgemm_swap_O3
[ 89%] Built target sgemm_swap_Ofast
[ 91%] Built target plot_sgemmVectorize
[ 92%] Built target plot_sgemmSwap
[ 92%] Built target branchVectorize_Ofast
[ 93%] Built target branchVectorize_O3
Scanning dependencies of target plot_branchVectorize
[ 93%] Run branchVectorize_Ofast program
Branching probability no branching
evaluateDummyCopy : proba = 0.1, nbElement = 10000, cyclePerElement = 0.5556 cy/el, elapsedTime = 5556 cy
evaluateDummyCopy : proba = 0.2, nbElement = 10000, cyclePerElement = 0.4821 cy/el, elapsedTime = 4821 cy
evaluateDummyCopy : proba = 0.3, nbElement = 10000, cyclePerElement = 0.5948 cy/el, elapsedTime = 5948 cy
evaluateDummyCopy : proba = 0.4, nbElement = 10000, cyclePerElement = 0.5074 cy/el, elapsedTime = 5074 cy
evaluateDummyCopy : proba = 0.5, nbElement = 10000, cyclePerElement = 0.5345 cy/el, elapsedTime = 5345 cy
evaluateDummyCopy : proba = 0.6, nbElement = 10000, cyclePerElement = 0.5465 cy/el, elapsedTime = 5465 cy
evaluateDummyCopy : proba = 0.7, nbElement = 10000, cyclePerElement = 0.5545 cy/el, elapsedTime = 5545 cy
evaluateDummyCopy : proba = 0.8, nbElement = 10000, cyclePerElement = 0.5456 cy/el, elapsedTime = 5456 cy
evaluateDummyCopy : proba = 0.9, nbElement = 10000, cyclePerElement = 0.5475 cy/el, elapsedTime = 5475 cy
[ 95%] Run branchVectorize_O3 program
Branching probability no branching
evaluateDummyCopy : proba = 0.1, nbElement = 10000, cyclePerElement = 0.56 cy/el, elapsedTime = 5600 cy
evaluateDummyCopy : proba = 0.2, nbElement = 10000, cyclePerElement = 0.5543 cy/el, elapsedTime = 5543 cy
evaluateDummyCopy : proba = 0.3, nbElement = 10000, cyclePerElement = 0.5184 cy/el, elapsedTime = 5184 cy
evaluateDummyCopy : proba = 0.4, nbElement = 10000, cyclePerElement = 0.5459 cy/el, elapsedTime = 5459 cy
evaluateDummyCopy : proba = 0.5, nbElement = 10000, cyclePerElement = 0.4981 cy/el, elapsedTime = 4981 cy
evaluateDummyCopy : proba = 0.6, nbElement = 10000, cyclePerElement = 0.5346 cy/el, elapsedTime = 5346 cy
evaluateDummyCopy : proba = 0.7, nbElement = 10000, cyclePerElement = 0.4549 cy/el, elapsedTime = 4549 cy
evaluateDummyCopy : proba = 0.8, nbElement = 10000, cyclePerElement = 0.5381 cy/el, elapsedTime = 5381 cy
evaluateDummyCopy : proba = 0.9, nbElement = 10000, cyclePerElement = 0.4712 cy/el, elapsedTime = 4712 cy
[ 95%] Call gnuplot branchVectorize
[ 95%] Built target plot_branchVectorize
[ 95%] Built target branchPrediction_O1
[ 96%] Built target branchPrediction_O2
[ 97%] Built target branchPrediction_O0
[100%] Built target plot_branchPredicator
[100%] Built target plot_all