mpich2 - will the real mpiexec please stand up
Pete Wyckoff
pw at osc.edu
Mon Jan 31 10:50:06 EST 2005
bill at Princeton.EDU wrote on Fri, 28 Jan 2005 13:30 -0500:
> With the release of mpich2-v1.0 from ANL, it looks like there have been
> some major changes in job creation. They are using a replacement
> mpirun, now called mpiexec, which requires a -n <CPU #> option which is
> sure to wreak havoc amongst my users!
>
> More importantly, there is now a need to launch a process manager daemon
> on all the nodes before jobs can be started. It would have been so much
> nicer to have used the "real" mpiexec and have this all happen
> underneath the covers, especially when used with PBS/Torque.
>
> Pete, do you have plans in the works for the "real" mpiexec to do this
> job for the users or should I be starting to think of a wrapper script
> to do this instead? Maybe I should just modify my sample script
> documentation and just let the users revamp their scripts themselves.
>
> Change IS a good thing right? :-)
There appear to be, in fact, 6 different codes that call themselves
"mpiexec" in the mpich2-1.0 distribution. I pointed out the namespace
collision issue to Bill Gropp a while back, and he did know about the
popular PBS-specific version we all know and love, but I can see his
desire to stick with the name, since it was mentioned long ago in the
MPI-2 spec published in 1997.
That spec says that "mpiexec -n <np> <code>" should do as one expects,
and recommends a few other arguments, most of which do not apply in the
PBS world.
For over a year now, though, the PBS mpiexec has had code to support
mpich2, so you don't have to bother with any of the mpiexecs included
with mpich2. And running the mpd daemon on every node before starting
jobs is only recommended, not required. We already have
a pbs_mom ready to launch jobs for us, so there's no real need for mpd.
You might try

    mpiexec --comm=pmi mycode args

and see if it works with this v1.0 release (or --comm=mpich2 if you
prefer: PMI is the name of their process startup interface). It worked
a year ago, albeit with some changes to mpich2 that I pushed up to the
developers.
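
For what it's worth, a PBS/Torque job script built around that command
might look roughly like the sketch below. Only --comm=pmi comes from the
text above. The resource request, the MPIEXEC and PROGRAM variables, and
the launch helper are illustrative assumptions, so adjust them to your
site. It also assumes the PBS mpiexec picks up the processor count from
the job's allocation, which is why no -n appears.

```shell
#!/bin/sh
#PBS -l nodes=2:ppn=2

# Sketch only: PROGRAM and MPIEXEC are hypothetical knobs, not part of
# the mpiexec distribution. Point MPIEXEC at the PBS mpiexec explicitly
# if one of the mpiexecs shipped inside mpich2 comes first in your PATH.
MPIEXEC=${MPIEXEC:-mpiexec}
PROGRAM=${PROGRAM:-./mycode}

launch() {
    # No mpd ring is started here; pbs_mom is expected to start the
    # processes, and mpiexec talks PMI to them.
    "$MPIEXEC" --comm=pmi "$PROGRAM" "$@"
}

# PBS starts the script in $HOME; return to the submission directory.
cd "${PBS_O_WORKDIR:-.}" || exit 1

# Only launch inside a PBS job, where PBS_JOBID is set, so the script
# does nothing if it is run by hand outside the batch system.
if [ -n "${PBS_JOBID:-}" ]; then
    launch args
fi
```

Submitted with qsub, this would leave existing user scripts almost
unchanged: the only new piece compared with an mpich1 job is the --comm
flag.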
If it doesn't work, try adding "--enable-pmiport" to your mpich2 build.
I recommended that they make this the default, but alas, that suggestion
may not yet have gotten into their source tree.
Send bug fixes and enhancements if you find time to fix the PMI startup
interface in mpiexec. Send problem reports too, so everybody knows where
things stand between mpiexec and this new mpich2 release.
-- Pete