Scalasca
(Scalasca 2.6.2, revision 30af1b6e)
Scalable Performance Analysis of Large-Scale Applications
The instrumented executable prepared in the previous step can now be executed under the control of the scalasca -analyze (or scan for short) convenience command to perform an initial summary measurement:
    $ cd bin
    $ scalasca -analyze mpiexec -n 144 ./bt.D.144
    S=C=A=N: Scalasca 2.5 runtime summarization
    S=C=A=N: ./scorep_bt_144_sum experiment archive
    S=C=A=N: Mon Mar 18 13:44:46 2019: Collect start
    mpiexec -n 144 ./bt.D.144

     NAS Parallel Benchmarks 3.3 -- BT Benchmark

     No input file inputbt.data. Using compiled defaults
     Size:  408x 408x 408
     Iterations:  250    dt:   0.0000200
     Number of active processes:   144

     Time step    1
     Time step   20
     Time step   40
     Time step   60
     Time step   80
     Time step  100
     Time step  120
     Time step  140
     Time step  160
     Time step  180
     Time step  200
     Time step  220
     Time step  240
     Time step  250
     Verification being performed for class D
     accuracy setting for epsilon =  0.1000000000000E-07
     Comparison of RMS-norms of residual
               1 0.2533188551738E+05 0.2533188551738E+05 0.1499315900507E-12
               2 0.2346393716980E+04 0.2346393716980E+04 0.8546885387975E-13
               3 0.6294554366904E+04 0.6294554366904E+04 0.2745293523008E-14
               4 0.5352565376030E+04 0.5352565376030E+04 0.8376934357159E-13
               5 0.3905864038618E+05 0.3905864038618E+05 0.6650300273080E-13
     Comparison of RMS-norms of solution error
               1 0.3100009377557E+03 0.3100009377557E+03 0.1373406191445E-12
               2 0.2424086324913E+02 0.2424086324913E+02 0.1600422929406E-12
               3 0.7782212022645E+02 0.7782212022645E+02 0.4090394153928E-13
               4 0.6835623860116E+02 0.6835623860116E+02 0.3596566920650E-13
               5 0.6065737200368E+03 0.6065737200368E+03 0.2605201960010E-13
     Verification Successful

     BT Benchmark Completed.
     Class           = D
     Size            = 408x 408x 408
     Iterations      = 250
     Time in seconds = 413.25
     Total processes = 144
     Compiled procs  = 144
     Mop/s total     = 141162.75
     Mop/s/process   = 980.30
     Operation type  = floating point
     Verification    = SUCCESSFUL
     Version         = 3.3.1
     Compile date    = 18 Mar 2019

     Compile options:
        MPIF77       = scorep mpifort
        FLINK        = $(MPIF77)
        FMPI_LIB     = (none)
        FMPI_INC     = (none)
        FFLAGS       = -O2
        FLINKFLAGS   = -O2
        RAND         = (none)

     Please send feedbacks and/or the results of this run to:

     NPB Development Team
     Internet: npb@nas.nasa.gov

    S=C=A=N: Mon Mar 18 13:51:47 2019: Collect done (status=0) 421s
    S=C=A=N: ./scorep_bt_144_sum complete.

    $ ls scorep_bt_144_sum
    MANIFEST.md  profile.cubex  scorep.cfg  scorep.log
As can be seen, the measurement run successfully produced an experiment directory named scorep_bt_144_sum containing MANIFEST.md (briefly describing the directory contents produced by the Score-P measurement system), profile.cubex, scorep.cfg, and scorep.log.

However, application execution took almost twice as long as the reference run (413.25 vs. 216.00 seconds). That is, instrumentation and the associated measurements introduced a non-negligible amount of run-time overhead. While it is possible to interactively examine the generated summary result file using the Cube report browser, this should only be done with great caution, since the substantial overhead negatively impacts the accuracy of the measurement. Such measurements can therefore easily be misleading.
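To put a number on this overhead, the two run times quoted above can be compared directly. The following is only a minimal sketch and not part of Scalasca or Score-P: the helper name overhead_percent is hypothetical, and a POSIX shell with awk is assumed to be available.

```shell
#!/bin/sh
# Hypothetical helper (not part of Scalasca): relative run-time overhead
# of an instrumented run compared with an uninstrumented reference run,
# in percent, rounded to one decimal place.
# Usage: overhead_percent <instrumented_seconds> <reference_seconds>
overhead_percent() {
    awk -v m="$1" -v r="$2" 'BEGIN { printf "%.1f\n", (m - r) / r * 100 }'
}

# Times taken from the text: 413.25 s instrumented vs. 216.00 s reference.
overhead_percent 413.25 216.00
```

For the run shown here this prints 91.3, that is, the instrumented run took roughly 91% longer than the reference run, which matches the "almost twice as long" observation.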
Copyright © 1998–2025 Forschungszentrum Jülich GmbH,
Jülich Supercomputing Centre
Copyright © 2009–2015 German Research School for Simulation Sciences GmbH, Laboratory for Parallel Programming