multiple MPI jobs in a single allocation?
Ron W. Green
rwgree at sandia.gov
Thu Mar 25 12:50:09 EST 2004
A user here would like to run multiple MPI jobs within a single PBS
allocation. Something like this (csh/tcsh):
qsub -I -l nodes=2:ppn=2
host2$ mpiexec -nostdin -n 2 a.out >& out1.txt & ; mpiexec -nostdin -n 2 a.out >& out2.txt &
The first job starts and runs as expected. The second job fails with:
$ mpiexec: Error: get_hosts: tm_init: tm: not connected.
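To make it easier to see which of the two launches fails, here is a minimal sketch that starts two commands concurrently and reports each exit status. It assumes a POSIX sh rather than csh/tcsh, and `run_pair` is a hypothetical helper name, not part of mpiexec or PBS. It only reproduces and reports the behavior; it does not work around the tm_init error.

```sh
#!/bin/sh
# Hypothetical helper (not part of mpiexec or PBS): start two commands
# concurrently, wait for both, and print each exit status.
# Output goes to out1.txt and out2.txt in the current directory.
run_pair() {
    sh -c "$1" > out1.txt 2>&1 &
    pid1=$!
    sh -c "$2" > out2.txt 2>&1 &
    pid2=$!
    # Collect the exit status of each background job separately.
    wait "$pid1"; rc1=$?
    wait "$pid2"; rc2=$?
    echo "job1 exit=$rc1 job2 exit=$rc2"
}

# Example call inside the PBS session (same commands as above):
#   run_pair "mpiexec -nostdin -n 2 a.out" "mpiexec -nostdin -n 2 a.out"
```

A nonzero status for the second job, together with the error text in out2.txt, would confirm that only the second launch is rejected.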
As a newbie, my feeble understanding is that the first mpiexec instance
opens a fixed port to the pbs_mom on host2. The second mpiexec, I'm
assuming, tries to open the same port and finds it occupied.
Is there any way to run multiple mpi jobs within a single allocation?
This user has a program that grabs a chunk of the machine and then
runs multiple jobs within the allocation, doing its own load balancing
and scheduling.
The obvious answer is to use a separate qsub for each subtask, but this
is not acceptable to the user, since they'd like the jobs to start
immediately so that they can be monitored.
thanks
ron
--
Ron W. Green
rwgree at sandia.gov
+1-505-284-1600
Sr. Engineer, ICC Applications Support
_______________________________________________
mpiexec mailing list
mpiexec at osc.edu
http://email.osc.edu/mailman/listinfo/mpiexec