Bug or strange results in runtests.pl

Anna Jonna Armannsdottir annaj at hi.is
Mon Nov 13 10:41:57 EST 2006


Hi, I am new to this list and joined it 
because of a discussion on the Torque users list. 

As the admin of an IBM Blade cluster at the 
University of Iceland, I set up MPICH2 in 
conjunction with Torque. 
Then I decided to try mpiexec, because it is 
said to integrate better with Torque. 

After a few compilations, first with Torque 2.0.0p11
and then with Torque 2.1.6, I found exactly the 
same error when running runtests.pl.

The test runs with the following settings
(among others):
$available_nodes = 4;
$smpsize = 1;


The error appears during the following tests: 
mpiexec --comm=pmi hello -sleep -abort 0
mpiexec --comm=pmi hello -sleep -abort 1
mpiexec --comm=pmi hello -sleep -abort 3

Contents of the output files: 
cat test*8461.34
hello from 0/4 hostname j104.jotunn.rhi.hi.is pid 5782 with 3 args:
-sleep -abort 0
hello from 3/4 hostname j101.jotunn.rhi.hi.is pid 4518 with 3 args:
-sleep -abort 0
hello from 2/4 hostname j102.jotunn.rhi.hi.is pid 4820 with 3 args:
-sleep -abort 0
hello from 1/4 hostname j103.jotunn.rhi.hi.is pid 5030 with 3 args:
-sleep -abort 0
[cli_0]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 4269) - process 0
mpiexec: Warning: task 0 exited with status 173.
mpiexec: Warning: task 1 exited oddly---report bug: status 0 done 0.
mpiexec: Warning: task 2 exited with status -767359304.
mpiexec: Warning: task 3 exited with status 59.
=>> PBS: job killed: walltime 308 exceeded limit 300
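
One detail in the output above can be checked with simple arithmetic: the code passed to MPI_Abort is 4269, and task 0 is reported with status 173, which is 4269 truncated to its low 8 bits. The sketch below is only an illustration of that check. The LOG string is copied from the output above, the helper names abort_code and task_statuses are hypothetical, and the idea that the exit status is the abort code modulo 256 is the usual POSIX behaviour, assumed here rather than confirmed for this mpiexec and MPICH2 combination.

```python
import re

# Lines copied from the test output above.
LOG = """\
application called MPI_Abort(MPI_COMM_WORLD, 4269) - process 0
mpiexec: Warning: task 0 exited with status 173.
mpiexec: Warning: task 1 exited oddly---report bug: status 0 done 0.
mpiexec: Warning: task 2 exited with status -767359304.
mpiexec: Warning: task 3 exited with status 59.
"""


def abort_code(log):
    """Return the integer code passed to MPI_Abort, or None if absent."""
    match = re.search(r"MPI_Abort\(MPI_COMM_WORLD, (\d+)\)", log)
    return int(match.group(1)) if match else None


def task_statuses(log):
    """Map task number to status for lines of the form
    'task N exited with status S.'; the 'exited oddly' line is skipped."""
    pattern = re.compile(r"task (\d+) exited with status (-?\d+)\.")
    return {int(task): int(status) for task, status in pattern.findall(log)}


code = abort_code(LOG)
statuses = task_statuses(LOG)

# Assumption: a POSIX exit status keeps only the low 8 bits of the code.
low_byte = code & 0xFF

print("abort code:", code)
print("low 8 bits:", low_byte)
print("task 0 status:", statuses[0])
```

If that assumption holds, the status of task 0 looks as expected, and the lines worth investigating are those for tasks 1, 2 and 3, together with the walltime limit being exceeded.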

If you want more information, I would be happy to gather it
from this run; just tell me how to do it. 

-- 
Kindest Regards, Anna Jonna Ármannsdóttir,
Unix System Administration, Computing Services, 
University of Iceland.


