Benchmarking OpenVMS vs. HP-UX on Itanium hardware

Post by jovan_macosx » Wed Oct 01, 2014 12:54 am

I thought it was interesting that OpenVMS runs my linpack benchmark 40% faster than HP-UX using gcc. Can anyone compile the code to make HP-UX catch up? See my web page for details:

http://www.polarhome.com:983/~jovan/linpack.html

Re: Benchmarking OpenVMS vs. HP-UX on Itanium hardware

Post by zoli » Wed Oct 01, 2014 9:08 am

Thank you Jovan for sharing your results and bringing up this topic.

The difference is remarkable, indeed.

I have redone your tests to verify and exclude the possibility of user rights/limits/nice related differences... and got about the same results on OpenVMS:
Code:
SYSTEM@ia64$ gcc --version
GNV Nov 14 2011 17:57:59
HP C V7.3-020 on OpenVMS IA64 V8.4
SYSTEM@ia64$ mc <>linpack.exe
Enter array size (q to quit) [200]:
Memory required:  315K.


LINPACK benchmark, Double precision.
Machine precision:  15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

    Reps Time(s) DGEFA   DGESL  OVERHEAD    KFLOPS
----------------------------------------------------
     128   0.72  81.94%   9.72%   8.33%  266343.434
     256   1.46  84.93%   2.74%  12.33%  274666.667
     512   2.90  83.10%   1.03%  15.86%  288174.863
    1024   5.82  82.65%   1.89%  15.46%  285831.978
    2048  11.61  82.77%   2.58%  14.64%  283812.984

Enter array size (q to quit) [200]:  q
SYSTEM@ia64$


It is important to note that the gcc on OpenVMS (the one that comes with GNV) is not a real GCC. It is just a wrapper that translates the parameters for the HP C compiler.

On the HP-UX side the results were even more interesting.

Running with different compilers gave different results, which shows how much the choice of compiler matters when dealing with performance issues.
Code:
bash-4.2# which cc
/usr/bin/cc
bash-4.2# what /usr/bin/cc
/usr/bin/cc:
        HP C (bundled) for Integrity Servers B3910B A.06.12 [Oct 11 2006]
bash-4.2# ./linpack-cc
Enter array size (q to quit) [200]:
Memory required:  315K.


LINPACK benchmark, Double precision.
Machine precision:  15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

    Reps Time(s) DGEFA   DGESL  OVERHEAD    KFLOPS
----------------------------------------------------
      64   0.58  79.31%   1.72%  18.97%  187007.092
     128   1.17  80.34%   2.56%  17.09%  181223.368
     256   2.33  80.26%   1.29%  18.45%  185038.596
     512   4.65  79.78%   2.58%  17.63%  183589.208
    1024   9.32  80.04%   2.47%  17.49%  182872.995
    2048  18.62  80.02%   2.31%  17.67%  183469.450

Enter array size (q to quit) [200]:  q
bash-4.2#


GCC can also be run with different optimization levels:
Code:
bash-4.2# gcc --version
gcc (GCC) 4.2.3
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

bash-4.2#   gcc -O3 -o linpack linpack.c -lm
bash-4.2# ./linpack
Enter array size (q to quit) [200]:
Memory required:  315K.


LINPACK benchmark, Double precision.
Machine precision:  15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

    Reps Time(s) DGEFA   DGESL  OVERHEAD    KFLOPS
----------------------------------------------------
     128   0.94  88.30%   2.13%   9.57%  206807.843
     256   1.86  89.25%   3.23%   7.53%  204403.101
     512   3.74  88.77%   2.67%   8.56%  205598.441
    1024   7.48  89.17%   2.54%   8.29%  204999.028
    2048  14.95  89.03%   2.74%   8.23%  204999.028

Enter array size (q to quit) [200]:  q


...and finally, what is breathtaking (gcc without any -O flag):
Code:
bash-4.2# gcc -o linpack linpack.c -lm
bash-4.2# ./linpack
Enter array size (q to quit) [200]:
Memory required:  315K.


LINPACK benchmark, Double precision.
Machine precision:  15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

    Reps Time(s) DGEFA   DGESL  OVERHEAD    KFLOPS
----------------------------------------------------
      32   0.73  83.56%   4.11%  12.33%  68666.667
      64   1.46  84.25%   3.42%  12.33%  68666.667
     128   2.90  85.86%   2.41%  11.72%  68666.667
     256   5.81  85.89%   2.41%  11.70%  68532.814
     512  11.63  85.98%   2.49%  11.52%  68333.009

Enter array size (q to quit) [200]:  q
bash-4.2#


These are just the facts, and I am not competent enough to explain the behaviour.
Regards,
Z
---
Zoltan Arpadffy

Re: Benchmarking OpenVMS vs. HP-UX on Itanium hardware

Post by zoli » Wed Oct 01, 2014 1:26 pm

Jovan,

you made me curious enough to compare the benchmark with different -O parameters.

Here are the results:
Code:
bash-4.2# gcc -O1 -o linpack linpack.c -lm
bash-4.2# mv linpack linpack_O1
bash-4.2# gcc -O2 -o linpack linpack.c -lm
bash-4.2# mv linpack linpack_O2
bash-4.2# gcc -O3 -o linpack linpack.c -lm
bash-4.2# mv linpack linpack_O3
bash-4.2# gcc -Os -o linpack linpack.c -lm
bash-4.2# mv linpack linpack_Os
bash-4.2# gcc -o linpack linpack.c -lm
bash-4.2# ./linpack_O1
Enter array size (q to quit) [200]: 
Memory required:  315K.


LINPACK benchmark, Double precision.
Machine precision:  15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

    Reps Time(s) DGEFA   DGESL  OVERHEAD    KFLOPS
----------------------------------------------------
     128   0.94  84.04%   1.06%  14.89%  219733.333
     256   1.88  84.57%   1.60%  13.83%  217020.576
     512   3.75  87.47%   3.47%   9.07%  206201.369
    1024   7.51  87.75%   2.40%   9.85%  207724.274
    2048  15.01  86.81%   2.80%  10.39%  209114.250

Enter array size (q to quit) [200]:  q
bash-4.2# ./linpack_O2
Enter array size (q to quit) [200]: 
Memory required:  315K.


LINPACK benchmark, Double precision.
Machine precision:  15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

    Reps Time(s) DGEFA   DGESL  OVERHEAD    KFLOPS
----------------------------------------------------
     128   0.95  86.32%   3.16%  10.53%  206807.843
     256   1.89  88.36%   2.12%   9.52%  205598.441
     512   3.79  87.86%   2.90%   9.23%  204403.101
    1024   7.57  87.71%   2.77%   9.51%  205298.297
    2048  15.16  87.93%   2.70%   9.37%  204700.631

Enter array size (q to quit) [200]:  q
bash-4.2# ./linpack_O3
Enter array size (q to quit) [200]: 
Memory required:  315K.


LINPACK benchmark, Double precision.
Machine precision:  15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

    Reps Time(s) DGEFA   DGESL  OVERHEAD    KFLOPS
----------------------------------------------------
     128   0.94  88.30%   5.32%   6.38%  199757.576
     256   1.86  88.71%   3.23%   8.06%  205598.441
     512   3.74  88.77%   2.94%   8.29%  204999.028
    1024   7.48  89.17%   2.27%   8.56%  205598.441
    2048  14.95  88.76%   2.74%   8.49%  205598.441

Enter array size (q to quit) [200]:  q
bash-4.2# ./linpack_Os
Enter array size (q to quit) [200]: 
Memory required:  315K.


LINPACK benchmark, Double precision.
Machine precision:  15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

    Reps Time(s) DGEFA   DGESL  OVERHEAD    KFLOPS
----------------------------------------------------
      64   0.54  85.19%   3.70%  11.11%  183111.111
     128   1.06  89.62%   2.83%   7.55%  179374.150
     256   2.11  88.63%   3.32%   8.06%  181223.368
     512   4.23  89.13%   2.84%   8.04%  180757.498
    1024   8.48  88.80%   2.95%   8.25%  180757.498
    2048  16.94  89.20%   2.48%   8.32%  181106.675

Enter array size (q to quit) [200]:  q
bash-4.2# ./linpack   
Enter array size (q to quit) [200]: 
Memory required:  315K.


LINPACK benchmark, Double precision.
Machine precision:  15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

    Reps Time(s) DGEFA   DGESL  OVERHEAD    KFLOPS
----------------------------------------------------
      32   0.73  83.56%   4.11%  12.33%  68666.667
      64   1.46  84.93%   1.37%  13.70%  69756.614
     128   2.90  84.83%   2.76%  12.41%  69207.349
     256   5.81  85.71%   2.24%  12.05%  68801.044
     512  11.63  85.21%   2.32%  12.47%  69071.382

Enter array size (q to quit) [200]:  q
bash-4.2#
Regards,
Z
---
Zoltan Arpadffy

Re: Benchmarking OpenVMS vs. HP-UX on Itanium hardware

Post by jovan_macosx » Wed Oct 01, 2014 8:58 pm

Yeah, it's too bad the gcc compiler has not been improved for Itanium. I have an HP RX1620 at home running an ancient copy of Debian, and the linpack results there were also in the 200 MFLOPS range, even a few years ago. I think the only way to get HP-UX to catch up would be to try one of HP's compiler packages. The default cc compiler does not have optimization flags, if I remember correctly. Also, I've read that the C++ compiler on OpenVMS is even more optimized for Itanium. There is some documentation on the web about instruction alignment tricks you can do in C++ that you can't do in C. I hope others chime in on this.

Re: Benchmarking OpenVMS vs. HP-UX on Itanium hardware

Post by sjaz » Sun Oct 05, 2014 9:50 pm

Sorry, nothing to add more than a 'Nice!'