multiple MPI jobs in a single allocation?

Ron W. Green rwgree at sandia.gov
Thu Mar 25 12:50:09 EST 2004


A user here would like to run multiple MPI jobs within a single PBS 
allocation.  Something like this (csh/tcsh):

qsub -I -l nodes=2:ppn=2
host2$    mpiexec -nostdin -n 2 a.out >& out1.txt & ; mpiexec -nostdin -n 2 a.out >& out2.txt &

The first job starts and runs as expected.  The second job errs with:
$ mpiexec: Error: get_hosts: tm_init: tm: not connected.

I'm a newbie, so my understanding is feeble, but I believe the first 
mpiexec instance opens a fixed port to the pbs_mom on host2.  The second 
mpiexec, I'm assuming, tries to open the same port and finds it occupied.

Is there any way to run multiple mpi jobs within a single allocation?

This user has a program that grabs a chunk of the machine and then runs 
multiple jobs within the allocation, doing its own load balancing and 
scheduling.
The obvious answer is to use a separate qsub for each subtask - this is 
not acceptable to the user, since they'd like the jobs to start 
immediately so that they can be monitored.
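
To make the pattern concrete, here is a minimal sketch of what such a 
launcher does: start several commands at once, send each one's output to 
its own file, and wait for all of them.  The function name 
launch_concurrently is hypothetical, and this only illustrates the 
launching pattern; it does nothing about the tm_init error above, which 
would still appear if the second mpiexec cannot connect to the pbs_mom.

```python
import subprocess


def launch_concurrently(commands, output_paths):
    """Start every command at once and wait for all of them.

    commands     -- list of argument lists, one per job
    output_paths -- list of files; job i writes stdout and stderr to path i
    Returns the exit codes in the same order as the commands.
    (Hypothetical helper for illustration, not part of mpiexec or PBS.)
    """
    procs = []
    files = []
    for cmd, path in zip(commands, output_paths):
        out = open(path, "w")
        files.append(out)
        # stdin is detached, mirroring the -nostdin flag in the csh example
        procs.append(
            subprocess.Popen(
                cmd,
                stdin=subprocess.DEVNULL,
                stdout=out,
                stderr=subprocess.STDOUT,
            )
        )
    # All jobs are already running here; now collect their exit codes.
    codes = [p.wait() for p in procs]
    for out in files:
        out.close()
    return codes


# Inside the allocation, the two jobs from the example would be passed as:
#   launch_concurrently(
#       [["mpiexec", "-nostdin", "-n", "2", "a.out"]] * 2,
#       ["out1.txt", "out2.txt"],
#   )
```

A nonzero entry in the returned list identifies which job failed, which is 
how the second mpiexec's failure would show up here.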

thanks

ron

-- 
Ron W. Green
rwgree at sandia.gov
+1-505-284-1600

Sr. Engineer, ICC Applications Support



_______________________________________________
mpiexec mailing list
mpiexec at osc.edu
http://email.osc.edu/mailman/listinfo/mpiexec


