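Linpack Benchmark is a benchmark for measuring computer performance. It works by having the CPU perform a large amount of matrix computation: it solves dense systems of linear equations and reports the achieved floating-point rate in GFlops.
Test script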
#!/bin/bash
URL=http://registrationcenter.intel.com/irc_nas/3914/l_lpk_p_11.1.2.005.tgz
# The latest version currently seems to be available from https://software.intel.com/en-us/articles/intel-mkl-benchmarks-suite ; I have not checked whether it is the same package
# download
wget ${URL} -O /tmp/l_lpk.tgz
# extract
tar -xzf /tmp/l_lpk.tgz -C /tmp/
# copy linpack to /usr/share directory
cp -a /tmp/linpack_11.1.2/benchmarks/linpack/ /usr/share/
# create soft links to executables
ln -sf /usr/share/linpack/runme_xeon64 /usr/sbin/
ln -sf /usr/share/linpack/xlinpack_xeon64 /usr/sbin/
# adjust path in runme_xeon64
sed -i 's|./xlinpack_$arch lininput_$arch|/usr/sbin/xlinpack_$arch /usr/share/linpack/lininput_$arch|g' /usr/sbin/runme_xeon64
# get CPU info
CPU=$(grep "model name" /proc/cpuinfo | tail -1)
COUNT=$(grep -c processor /proc/cpuinfo)
echo "CPU : $CPU"
echo "COUNT : $COUNT"
# OPTIONAL: configure parameter
# export MKL_DYNAMIC=false
# export OMP_NUM_THREADS=4
# run
runme_xeon64
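The optional OMP_NUM_THREADS line in the script sets the thread count by hand. As a minimal sketch, the value could instead be derived from the number of physical cores listed in /proc/cpuinfo. The helper name count_physical_cores is hypothetical, and using one thread per physical core is an assumption about a reasonable starting point, not something the original script does.

```shell
#!/bin/bash
# Hypothetical helper: count distinct (physical id, core id) pairs in
# /proc/cpuinfo-formatted text read from stdin.
count_physical_cores() {
    awk -F': *' '
        /^physical id/ { socket = $2 }
        /^core id/     { seen[socket "," $2] = 1 }
        END            { n = 0; for (k in seen) n++; print n }
    '
}

# Only set OMP_NUM_THREADS when the fields were found (some platforms
# and virtual machines do not expose them).
if [ -r /proc/cpuinfo ]; then
    cores=$(count_physical_cores < /proc/cpuinfo)
    if [ "$cores" -gt 0 ]; then
        export OMP_NUM_THREADS="$cores"
        echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
    fi
fi
```

On a hyper-threaded machine the result is lower than the logical processor count that the script prints as COUNT.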
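Partial test results
My laptop is fairly old, so the tests for the largest problem sizes did not finish; they were too slow. The output covers problem sizes up to 30000.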
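Apart from run time, each problem size also needs room in memory for its matrix. A rough sketch of that estimate, assuming the matrix is stored as N x LDA double-precision values of 8 bytes each (an assumption, though it is consistent with the 7200601024 bytes reported for size 30000 in the run output in this section). The helper name matrix_gib is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: matrix_gib N LDA
# Prints the GiB needed for an N x LDA matrix of 8-byte doubles.
matrix_gib() {
    awk -v n="$1" -v lda="$2" 'BEGIN { printf "%.2f\n", n * lda * 8 / (1024 ^ 3) }'
}

# Sizes taken from the tail of the problem-size list; LDA equals N for these.
sizes=(30000 35000 40000 45000)
for n in "${sizes[@]}"; do
    printf '%6d  %s GiB\n' "$n" "$(matrix_gib "$n" "$n")"
done
```

Comparing these figures with the machine's installed RAM gives a quick idea of which sizes are worth attempting.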
This is a SAMPLE run script for SMP LINPACK. Change it to reflect
the correct number of CPUs/threads, problem input files, etc..
Thu Oct  6 21:28:49 CST 2016
Intel(R) Optimized LINPACK Benchmark data
Current date/time: Thu Oct 6 21:28:49 2016
CPU frequency: 3.092 GHz
Number of CPUs: 1
Number of cores: 2
Number of threads: 4
Parameters are set to:
Number of tests: 15
Number of equations to solve (problem size) : 1000 2000 5000 10000 15000 18000 20000 22000 25000 26000 27000 30000 35000 40000 45000
Leading dimension of array : 1000 2000 5008 10000 15000 18008 20016 22008 25000 26000 27000 30000 35000 40000 45000
Number of trials to run : 4 2 2 2 2 2 2 2 2 2 1 1 1 1 1
Data alignment value (in Kbytes) : 4 4 4 4 4 4 4 4 4 4 4 1 1 1 1
Maximum memory requested that can be used=7200601024, at the size=30000
=================== Timing linear equation system solver ===================
Size   LDA    Align. Time(s)    GFlops   Residual     Residual(norm) Check
1000   1000   4      0.039      17.3422  1.029343e-12 3.510325e-02   pass
1000   1000   4      0.033      20.1128  1.029343e-12 3.510325e-02   pass
1000   1000   4      0.037      18.1989  1.029343e-12 3.510325e-02   pass
1000   1000   4      0.035      19.1910  1.029343e-12 3.510325e-02   pass
2000   2000   4      0.259      20.6123  4.298950e-12 3.739560e-02   pass
2000   2000   4      0.278      19.2107  4.298950e-12 3.739560e-02   pass
5000   5008   4      2.310      36.1031  2.581643e-11 3.599893e-02   pass
5000   5008   4      2.542      32.8051  2.581643e-11 3.599893e-02   pass
10000  10000  4      19.637     33.9594  9.603002e-11 3.386116e-02   pass
10000  10000  4      19.528     34.1494  9.603002e-11 3.386116e-02   pass
15000  15000  4      66.034     34.0800  2.042799e-10 3.217442e-02   pass
15000  15000  4      66.219     33.9851  2.042799e-10 3.217442e-02   pass
18000  18008  4      99.569     39.0549  2.894987e-10 3.170367e-02   pass
18000  18008  4      99.255     39.1785  2.894987e-10 3.170367e-02   pass
20000  20016  4      136.455    39.0908  4.097986e-10 3.627616e-02   pass
20000  20016  4      136.309    39.1327  4.097986e-10 3.627616e-02   pass
22000  22008  4      181.347    39.1493  4.548092e-10 3.331299e-02   pass
22000  22008  4      180.878    39.2510  4.548092e-10 3.331299e-02   pass
25000  25000  4      279.344    37.2942  6.089565e-10 3.462917e-02   pass
25000  25000  4      266.264    39.1262  6.089565e-10 3.462917e-02   pass
26000  26000  4      315.179    37.1811  6.669421e-10 3.506981e-02   pass
26000  26000  4      310.565    37.7335  6.669421e-10 3.506981e-02   pass
27000  27000  4      370.754    35.3967  6.672171e-10 3.253690e-02   pass
30000  30000  1      618.849    29.0892  8.421348e-10 3.319704e-02   pass
Performance Summary (GFlops)
Size   LDA    Align.  Average  Maximal
1000   1000   4       18.7112  20.1128
2000   2000   4       19.9115  20.6123
5000   5008   4       34.4541  36.1031
10000  10000  4       34.0544  34.1494
15000  15000  4       34.0325  34.0800
18000  18008  4       39.1167  39.1785
20000  20016  4       39.1118  39.1327
22000  22008  4       39.2002  39.2510
25000  25000  4       38.2102  39.1262
26000  26000  4       37.4573  37.7335
27000  27000  4       35.3967  35.3967
30000  30000  1       29.0892  29.0892
Residual checks PASSED
End of tests
Done: Thu Oct  6 22:36:48 CST 2016
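The GFlops column in the timing table can be cross-checked from the Size and Time(s) columns. The sketch below assumes the conventional LINPACK operation count of 2/3*N^3 + 2*N^2 floating-point operations for an N x N system; the table values are consistent with it, up to the rounding of the printed times. The helper names linpack_gflops and linpack_seconds are hypothetical, and the 35 GFlops rate in the last line is an assumed figure for illustration, not a measurement.

```shell
#!/bin/bash
# Hypothetical helper: linpack_gflops N SECONDS
# GFlops implied by solving an N x N system in SECONDS,
# using the operation count 2/3*N^3 + 2*N^2.
linpack_gflops() {
    awk -v n="$1" -v t="$2" 'BEGIN {
        flops = (2.0 / 3.0) * n ^ 3 + 2.0 * n ^ 2
        printf "%.4f\n", flops / t / 1e9
    }'
}

# Hypothetical helper: linpack_seconds N GFLOPS
# Projected run time in seconds at an assumed sustained rate.
linpack_seconds() {
    awk -v n="$1" -v rate="$2" 'BEGIN {
        flops = (2.0 / 3.0) * n ^ 3 + 2.0 * n ^ 2
        printf "%.0f\n", flops / (rate * 1e9)
    }'
}

linpack_gflops 30000 618.849   # compare with the 29.0892 reported for size 30000
linpack_seconds 45000 35       # size 45000 at an assumed 35 GFlops
```

The second helper turns the same formula around, which gives a rough idea of how long a single trial at one of the larger sizes would take before starting it.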
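Comments
Hi~
First of all, thank you very much for sharing; this post is packed with useful material.
Also, may I ask whether this linpack package gives the same results, or works on the same principle, as the traditional linpack setup that requires building an environment yourself (MPI+BLAS+HPL)?
It has been a while, so I don't remember the details clearly.
The principle is the same, but the results may differ slightly; building the environment yourself may give somewhat better results.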