Stage 2: Benchmarking

jadach1
Nov 21, 2018

Updated: Nov 26, 2018

Performance Testing

************************************************************

The program executable ./whirlpoolsum takes one parameter, a file, and computes its Whirlpool hash.

Servers I will be testing on: Aarchie, Ccharlie

OS: Fedora 28 (Red Hat)

CPU architecture I will be testing on: AArch64

What I will be testing with (3 files, each a different size):

Big: 22GB

Medium: 11GB

Small: 19MB

Note: each file is filled with the same repeated phrase, "hello world\n".
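The post does not say how the test files were produced. As a minimal sketch, assuming the phrase is simply written back to back until the target size is reached, a file like these could be generated as follows. The function name make_test_file and the chunk_repeats parameter are hypothetical, and the sizes assume decimal units (1 MB = 10**6 bytes).

```python
PHRASE = b"hello world\n"  # the phrase named in the note above


def make_test_file(path, target_bytes, phrase=PHRASE, chunk_repeats=65536):
    """Write `phrase` repeatedly until the file holds at least `target_bytes` bytes.

    Returns the number of bytes written, always a whole number of phrases.
    """
    chunk = phrase * chunk_repeats  # write large chunks to keep the loop fast
    written = 0
    with open(path, "wb") as out:
        # Bulk phase: whole chunks while they still fit under the target.
        while written + len(chunk) <= target_bytes:
            out.write(chunk)
            written += len(chunk)
        # Tail phase: single phrases so the file never ends mid-phrase.
        while written < target_bytes:
            out.write(phrase)
            written += len(phrase)
    return written
```

For example, make_test_file("Small.txt", 19 * 10**6) would produce a file of about 19MB. The 22GB file takes a while to write and needs that much free disk space.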

********** Baseline Results **********

Aarchie:

SYSTEM LOAD ( load average: 0.11, 0.08, 0.09 )

Big.txt file 22GB,

* 3 consecutive runs

Average Time: 9m59.109s

SYSTEM LOAD ( load average: 0.12, 0.07, 0.09 )

Medium.txt file 11GB,

* 3 consecutive runs

Average Time: 5m8.530s

SYSTEM LOAD ( load average: 0.00, 0.00, 0.00 )

Small.txt file 19MB,

* 3 consecutive runs

Average Time: 0m0.540s

Ccharlie:

SYSTEM LOAD ( load average: 0.02, 0.01, 0.04 )

Big.txt file 22GB,

* 3 consecutive runs

Average Time: 14m57.121s

SYSTEM LOAD ( load average: 0.00, 0.02, 0.00 )

Medium.txt file 11GB,

* 3 consecutive runs

Average Time: 7m7.179s

SYSTEM LOAD ( load average: 1.04, 1.01, 1.31 )

Small.txt file 19MB,

* 3 consecutive runs

Average Time: 0m0.728s
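To make the baseline numbers easier to compare across file sizes and servers, the averages can be turned into throughput. The sketch below is an illustration, not part of the original benchmark. The helper names are hypothetical, the sizes assume decimal units (1 GB = 10**9 bytes), and the individual run times are not listed in this post, so only the reported averages are used.

```python
import re


def parse_time(text):
    """Convert a `time`-style duration such as '9m59.109s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def average_seconds(durations):
    """Average several `time`-style durations, e.g. three consecutive runs."""
    seconds = [parse_time(d) for d in durations]
    return sum(seconds) / len(seconds)


def throughput_mb_s(size_bytes, seconds):
    """Megabytes hashed per second (1 MB = 10**6 bytes)."""
    return size_bytes / seconds / 1e6


# (size in bytes, average time) taken from the baseline results above.
RESULTS = {
    "Aarchie": {
        "Big": (22e9, "9m59.109s"),
        "Medium": (11e9, "5m8.530s"),
        "Small": (19e6, "0m0.540s"),
    },
    "Ccharlie": {
        "Big": (22e9, "14m57.121s"),
        "Medium": (11e9, "7m7.179s"),
        "Small": (19e6, "0m0.728s"),
    },
}

for server, files in RESULTS.items():
    for name, (size, avg) in files.items():
        rate = throughput_mb_s(size, parse_time(avg))
        print(f"{server:9s} {name:7s} {rate:6.1f} MB/s")
```

Under these assumptions the rate works out to roughly 35 to 37 MB/s on Aarchie and 24 to 26 MB/s on Ccharlie for all three files, which suggests the run time scales close to linearly with file size.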

******************** HOT FUNCTIONS ********************

Using perf record, we can see the hot functions:

84.91% lt-whirlpoolsum libwhirlpool.so.0.0.1 [.] processBuffer

15.09% lt-whirlpoolsum libwhirlpool.so.0.0.1 [.] whirlpool_add

processBuffer is definitely the hot function here. whirlpool_add is the only other function on the list, and it is the function that calls processBuffer.
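When the same profile is repeated after each optimization attempt, it helps to pull the percentages out of the report automatically. This is a small sketch that assumes the report rows have the whitespace-separated layout shown above (overhead, command, shared object, symbol type, symbol). The parse_perf_rows helper is hypothetical, not part of perf.

```python
import re

# One report row: overhead %, command, shared object, [symbol type], symbol.
ROW = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+(\S+)\s+(\S+)\s+\[(.)\]\s+(\S+)")


def parse_perf_rows(text):
    """Return (overhead_percent, symbol) pairs, hottest first."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:  # skip blank lines and anything that is not a report row
            rows.append((float(match.group(1)), match.group(5)))
    return sorted(rows, reverse=True)


# The two rows reported above.
REPORT = """\
84.91% lt-whirlpoolsum libwhirlpool.so.0.0.1 [.] processBuffer
15.09% lt-whirlpoolsum libwhirlpool.so.0.0.1 [.] whirlpool_add
"""

for overhead, symbol in parse_perf_rows(REPORT):
    print(f"{overhead:6.2f}%  {symbol}")
```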

Inside processBuffer's assembly code, I found that the hottest instruction is an fmov. In the form shown below it is FMOV (general): a floating-point move to or from a general-purpose register, without conversion.

6.15% │ fmov x1, d1
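The fmov x1, d1 form copies the 64 bits held in d1 into x1 unchanged, with no numeric conversion. As a rough analogy only (this is not code from the library, and struct is just a stand-in for the register move), reinterpreting the bytes of a double as an unsigned integer in Python shows what "without conversion" means:

```python
import struct


def double_bits(value):
    """Return the 64-bit IEEE 754 pattern of a double as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def bits_to_double(bits):
    """Inverse operation: reinterpret a 64-bit pattern as a double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


# The bit pattern is preserved exactly; only its interpretation changes.
print(hex(double_bits(1.0)))
print(bits_to_double(double_bits(1.0)))
```

The value 1.0 becomes the integer 0x3ff0000000000000 rather than 1, because the bits are moved as they are instead of being converted.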

To get any performance improvement, I will need to optimize this function.

There was no assembler code, inline or in separate files, so that remains an option, as do vectorization and compiler flags. My plan of action is to attack the compiler options first, then edit the existing code to see if there is anything I can remove or simplify, then try vectorization, and lastly assembler.
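Each step of this plan needs the same measurement repeated against a rebuilt binary. A minimal timing harness, assuming wall-clock time over back-to-back runs is the metric (as in the baseline), could look like the sketch below. The time_command helper is hypothetical, and the trivial Python command is only a stand-in for something like ./whirlpoolsum Small.txt.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run `argv` `runs` times back to back; return each run's wall-clock seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True stops the benchmark if the command fails.
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


# Stand-in command; replace with the real executable and test file.
timings = time_command([sys.executable, "-c", "pass"])
print(f"mean {statistics.mean(timings):.3f}s over {len(timings)} runs")
```

Running it once per build, for instance one per set of compiler flags, gives averages that can be compared directly against the baseline results.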
