Bug or strange results in runtests.pl
Anna Jonna Armannsdottir
annaj at hi.is
Mon Nov 13 10:41:57 EST 2006
Hi, I am new to this list and joined it
because of a discussion on the Torque users list.
As the admin of an IBM blade cluster at the
University of Iceland, I set up mpich2 in
conjunction with Torque.
Then I decided to try mpiexec, because it is
said to integrate better with Torque.
After a few compilations with Torque 2.0.0p11
and then with Torque 2.1.6, I found exactly the
same error when running runtests.pl.
The test runs with the following settings
(among others):
$available_nodes = 4;
$smpsize = 1;
The error appears during the following tests:
mpiexec --comm=pmi hello -sleep -abort 0
mpiexec --comm=pmi hello -sleep -abort 1
mpiexec --comm=pmi hello -sleep -abort 3
Contents of the output files:
cat test*8461.34
hello from 0/4 hostname j104.jotunn.rhi.hi.is pid 5782 with 3 args:
-sleep -abort 0
hello from 3/4 hostname j101.jotunn.rhi.hi.is pid 4518 with 3 args:
-sleep -abort 0
hello from 2/4 hostname j102.jotunn.rhi.hi.is pid 4820 with 3 args:
-sleep -abort 0
hello from 1/4 hostname j103.jotunn.rhi.hi.is pid 5030 with 3 args:
-sleep -abort 0
[cli_0]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 4269) - process 0
mpiexec: Warning: task 0 exited with status 173.
mpiexec: Warning: task 1 exited oddly---report bug: status 0 done 0.
mpiexec: Warning: task 2 exited with status -767359304.
mpiexec: Warning: task 3 exited with status 59.
=>> PBS: job killed: walltime 308 exceeded limit 300
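A side note on the statuses above: the value 173 reported for task 0 is
what remains of the abort code 4269 when only its low 8 bits are kept,
which is all a POSIX exit status can carry. The short Python sketch below
only illustrates that arithmetic. The function name is made up for this
example, and nothing here is taken from the mpiexec source, so it does not
explain the odd statuses of the other tasks.

```python
def low_byte_status(code: int) -> int:
    """Return the low 8 bits of an exit/abort code (hypothetical helper)."""
    return code & 0xFF


# Abort code seen in the MPI_Abort message for process 0.
abort_code = 4269  # 0x10AD

# Keeping only the low byte gives the status reported for task 0.
print(low_byte_status(abort_code))  # prints 173
```

The status -767359304 for task 2 cannot be produced this way, since it
does not fit in 8 bits.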
If you want more information, I would be happy to gather it
from this run. Just tell me how to do it.
--
Kindest Regards, Anna Jonna Ármannsdóttir,
Unix System Administration, Computing Services,
University of Iceland.