8.6.4.2 : Reduction with our intrinsics implementation
Now, let's write the reductionIntrinsicsPython.py file to test our implementation.
We first need to import several packages:
- sys: to write the results to stderr in a format compatible with the C++ performance output
- astericshpc: to allocate arrays and do the performance test
- reductionpython: the module that exposes our intrinsics reduction
|
import sys
import astericshpc
import reductionpython
|
The function that allocates and initialises a table:
|
def allocInitTable(nbElement):
    tab = astericshpc.allocTable(nbElement)
    for i in range(0, nbElement):
        tab[i] = float(i*32%17)
    return tab
|
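To see which values this initialisation produces, here is a minimal sketch that uses a plain Python list instead of astericshpc.allocTable. The helper name alloc_init_list is hypothetical and only serves this illustration; the expression i*32%17 is the one from the function above and is evaluated left to right as (i*32) % 17, so the values repeat with a period of 17.

```python
def alloc_init_list(nb_element):
    """Plain-list stand-in for allocInitTable (hypothetical helper, not part of astericshpc)."""
    # '*' and '%' have the same precedence, so i*32%17 means (i*32) % 17
    return [float(i * 32 % 17) for i in range(nb_element)]


tab = alloc_init_list(20)
print(tab[:6])     # [0.0, 15.0, 13.0, 11.0, 9.0, 7.0]
print(tab[17:20])  # [0.0, 15.0, 13.0], the pattern restarts after 17 elements
```

All values stay between 0 and 16, so the sum computed by the reduction remains small and easy to check by hand.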
The function that evaluates the performance is built the same way as the C++ one:
|
def getTimeHadamardSize(nbRepetition, nbElement):
    tabX = allocInitTable(nbElement)
    timeBegin = astericshpc.rdtsc()
    for i in range(0, nbRepetition):
        res = reductionpython.reduction(tabX)
    timeEnd = astericshpc.rdtsc()
    elapsedTime = float(timeEnd - timeBegin)/float(nbRepetition)
    elapsedTimePerElement = elapsedTime/float(nbElement)
    print("nbElement =",nbElement,", elapsedTimePerElement =",elapsedTimePerElement,"cy/el",", elapsedTime =",elapsedTime,"cy")
    print(str(nbElement) + "\t" + str(elapsedTimePerElement) + "\t" + str(elapsedTime),file=sys.stderr)
|
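The same measurement pattern can be tried without the astericshpc and reductionpython modules. The sketch below is only an approximation under stated assumptions: it replaces astericshpc.rdtsc() with time.perf_counter_ns() from the standard library, so it reports nanoseconds rather than CPU cycles, and it uses the built-in sum as a stand-in for reductionpython.reduction. The name time_per_element is hypothetical.

```python
import sys
import time


def time_per_element(func, data, nb_repetition):
    """Return (mean time per call, mean time per element) in nanoseconds."""
    time_begin = time.perf_counter_ns()   # stand-in for astericshpc.rdtsc()
    for _ in range(nb_repetition):
        func(data)
    time_end = time.perf_counter_ns()
    # Average over the repetitions, then normalise by the number of elements
    elapsed = float(time_end - time_begin) / float(nb_repetition)
    return elapsed, elapsed / float(len(data))


data = [float(i * 32 % 17) for i in range(1024)]
elapsed, per_element = time_per_element(sum, data, 1000)   # sum stands in for the reduction
# Human-readable line on stdout, tab-separated line on stderr, as in the function above
print("nbElement =", len(data), ", elapsedTimePerElement =", per_element, "ns/el")
print(str(len(data)) + "\t" + str(per_element) + "\t" + str(elapsed), file=sys.stderr)
```

Averaging over many repetitions smooths out the cost of the timer calls themselves, which is why the measured function is called in a loop between the two time stamps.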
Then, a function runs the measurement for each size in a list, which gives one point per size:
|
def makeElapsedTimeValue(listSize, nbRepetition):
    for val in listSize:
        getTimeHadamardSize(nbRepetition, val)
|
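Each measurement writes one tab-separated line to stderr (number of elements, cycles per element, cycles), so the points can be collected by redirecting stderr to a file and reading it back. The sketch below shows one possible way to parse such lines; parse_perf_lines is a hypothetical helper, and the sample string contains placeholder values chosen only to exercise the parser, not real measurements.

```python
def parse_perf_lines(text):
    """Parse 'nbElement<TAB>timePerElement<TAB>time' lines into a list of tuples."""
    points = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        nb_element, per_element, elapsed = line.split("\t")
        points.append((int(nb_element), float(per_element), float(elapsed)))
    return points


# Placeholder values, NOT measurements: they only show the expected line format
sample = "1024\t0.5\t512.0\n2048\t0.5\t1024.0\n"
points = parse_perf_lines(sample)
print(points)  # [(1024, 0.5, 512.0), (2048, 0.5, 1024.0)]
```

Keeping the readable message on stdout and the raw columns on stderr lets the same run serve both as a log and as input for a plot.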
Finally, we call the performance tests only if this script is executed as the main file, and not if it is imported by another file:
|
if __name__ == "__main__":
    listSize = [1024,
                2048,
                3072,
                4992,
                10048]
    makeElapsedTimeValue(listSize, 10000000)
|
The full reductionIntrinsicsPython.py file:
|
'''
	Author : Pierre Aubert
	Mail : aubertp7@gmail.com
	Licence : CeCILL-C
'''

import sys
import astericshpc
import reductionpython

def allocInitTable(nbElement):
    tab = astericshpc.allocTable(nbElement)
    for i in range(0, nbElement):
        tab[i] = float(i*32%17)
    return tab

def getTimeHadamardSize(nbRepetition, nbElement):
    tabX = allocInitTable(nbElement)
    timeBegin = astericshpc.rdtsc()
    for i in range(0, nbRepetition):
        res = reductionpython.reduction(tabX)
    timeEnd = astericshpc.rdtsc()
    elapsedTime = float(timeEnd - timeBegin)/float(nbRepetition)
    elapsedTimePerElement = elapsedTime/float(nbElement)
    print("nbElement =",nbElement,", elapsedTimePerElement =",elapsedTimePerElement,"cy/el",", elapsedTime =",elapsedTime,"cy")
    print(str(nbElement) + "\t" + str(elapsedTimePerElement) + "\t" + str(elapsedTime),file=sys.stderr)

def makeElapsedTimeValue(listSize, nbRepetition):
    for val in listSize:
        getTimeHadamardSize(nbRepetition, val)

if __name__ == "__main__":
    listSize = [1024,
                2048,
                3072,
                4992,
                10048]
    makeElapsedTimeValue(listSize, 10000000)
|
You can download it here.