Creating a new comm

Pete Wyckoff pw at osc.edu
Thu Oct 11 16:30:53 EDT 2007


jbernstein at penguincomputing.com wrote on Thu, 11 Oct 2007 12:12 -0700:
> Pete Wyckoff wrote:
> >I'm not familiar with how mpich/bproc works.  You should take a
> >look at the mpirun that comes with it, and at the MPID_Init function
> >in mpid_bproc (or whatever).  If you have web pointers to these
> >things, others can double check that you're headed in the right
> >direction.
> 
> This is a helpful direction. Though how do I know what startup method my 
> MPICH distribution is using? I know when MPICH is built it's using 
> --comm=bproc. Is this the startup method?

Read the source.  Or compile with debugging and step down from
MPI_Init until you figure out where it ends up.  My local mpich1
source doesn't have anything in it that looks like bproc.  You have
something special, apparently.

> Otherwise, if I'm starting up just over Ethernet on Linux, am I just 
> using ch_p4?

For mpich/p4, yup.  Not sure if bproc relies on that or rolls its
own.  There are other ways to start up on Ethernet.

> When I try starting up an MPI job with mpiexec using --comm=p4, it 
> seems to start the processes, but they just sit there. Likely waiting 
> for a signal to tell them to start.
> 
> How can I figure out what MPICH is using for the startup method?
> 
> Another hint is that --comm=bproc changes RSHCOMMAND and RCP commands to 
> Scyld specifics (bpsh and bpcp). Is mpiexec using these commands at all?
> 
> In the end the problem I'm having is that when using mpiexec, I'm 
> starting more processes than I need. For example consider:
> 
> qsub -l nodes=2:ppn=2
> mpiexec ./myjob
> ^D
> 
> mpiexec actually starts up four 4-process tasks, rather than just one 
> 4-process task. What's interesting is that if I do:
> 
> mpiexec -npernode 1 ./cpi
> or
> mpiexec -pernode ./cpi
> 
> I only get two 4-process jobs.

Sounds like, under the hood, each of these tasks that mpiexec starts
thinks it should go start up N copies of itself.  Hopefully you can
find some sort of magic environment variable that tells it that it
doesn't need to spawn any more.
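
One way to hunt for such a variable is to run a tiny non-MPI script in
place of the real binary and have every started process report what it
was handed.  The sketch below is only that, a sketch: the file name
dumpenv.py and the prefix list are assumptions, not names taken from
mpiexec or from the bproc device, and the real variable (if there is
one) may not match any of them.  It prints argv as well, since a device
may pass startup information on the command line rather than in the
environment.

```python
#!/usr/bin/env python
"""Print the launcher-related environment that each started process sees."""
import os
import socket
import sys

# Guessed prefixes (assumption); adjust after reading the device source.
PREFIXES = ("PBS_", "MPI", "P4", "BPROC", "BEOWULF")


def launcher_env(environ, prefixes=PREFIXES):
    """Return sorted (name, value) pairs whose name starts with a prefix."""
    return sorted(
        (name, value)
        for name, value in environ.items()
        if name.startswith(tuple(prefixes))
    )


def report(environ, argv, out=sys.stdout):
    """Write one tagged line for argv and one per matching variable."""
    # Tag every line with host and pid so interleaved output from
    # several processes can still be told apart.
    tag = "%s:%d" % (socket.gethostname(), os.getpid())
    out.write("%s argv=%r\n" % (tag, list(argv)))
    for name, value in launcher_env(environ):
        out.write("%s %s=%s\n" % (tag, name, value))


if __name__ == "__main__":
    report(os.environ, sys.argv)
```

Run it inside the same qsub job as "mpiexec ./dumpenv.py" and count the
distinct host:pid tags.  Since it never calls MPI_Init, mpiexec may sit
waiting for it to check in under some --comm settings; the lines printed
before that are the interesting part.  Comparing this output with what
the same script prints when started by the mpirun that ships with the
bproc build should show which variables or arguments differ.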

		-- Pete

