mpiexec-0.82's possible bug ?
Jimmy Tang
jtang at tchpc.tcd.ie
Mon Mar 5 17:26:23 EST 2007
Hi All,
On Mon, Mar 05, 2007 at 04:20:02PM -0500, Pete Wyckoff wrote:
> chan at mcs.anl.gov wrote on Mon, 05 Mar 2007 13:21 -0600:
> > While I am reading the mpiexec source code to understand how mpiexec makes
> > use of mpich2's pmi interface, I notice there may be a bug.
> >
> > mpiexec.c's main() calls distribute_executable() which then calls
> > start_tasks() {without any argument}. But start_tasks is defined:
> >
> > int start_tasks(int spawn)
> >
> > A grep of "start_tasks" in the source code shows:
> >
> > > grep -n start_tasks *.c
> > exedist.c:151: start_tasks();
>
> Indeed, thanks for noticing. The bits in there have truly rotted.
> It is to support a method for distributing the executable in advance
> of running the parallel program. Most sites use a shared file
> system like NFS instead. I'm sure if anyone were to try enabling
> --with-fast-dist during configure they would not even get a
> successful compilation.
>
Funny someone should mail that in. I was playing with fastdist in
recent releases of mpiexec and noticed it too. I think doing this:
diff --git a/mpiexec-0.82/exedist.c b/mpiexec-0.82/exedist.c
index 297ba64..0dc95b7 100644
--- a/mpiexec-0.82/exedist.c
+++ b/mpiexec-0.82/exedist.c
@@ -148,7 +148,7 @@ distribute_executable(void)
     }
     /* spawn tasks */
-    start_tasks();
+    start_tasks(numtasks - 1);
     debug(1, "%s: tasks started", __func__);
     /* wait for them to exit */
makes it compile, and it appears to run. I never got around to checking
that it was actually right before mailing in about it, though.
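
For anyone following along, here is a minimal standalone sketch of why
the old call stops compiling once the prototype is in scope. Only the
name start_tasks, its "int spawn" parameter and the "numtasks - 1"
argument come from the thread; last_spawn and distribute_sketch are
hypothetical stand-ins, and whether "numtasks - 1" is the right value
is exactly the part that has not been verified.

```c
/* Hypothetical stand-in for mpiexec's start_tasks(). The real function
 * does far more; this one only records its argument. */
static int last_spawn = -1;

/* With this prototype visible, every call must pass exactly one int. */
int start_tasks(int spawn);

int start_tasks(int spawn)
{
    last_spawn = spawn;   /* remember what the caller passed */
    return 0;
}

/* Hypothetical stand-in for the call site in exedist.c. */
int distribute_sketch(int numtasks)
{
    /* start_tasks();   <- rejected by the compiler: too few arguments */
    return start_tasks(numtasks - 1);   /* assumed argument, unverified */
}
```

Calling distribute_sketch(4) here passes 3 to start_tasks. Uncommenting
the zero-argument call should give a "too few arguments" error, which
is presumably what a --with-fast-dist build runs into.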
> If this is something you are interested in and want to resurrect,
> I'll certainly help you if you run into issues in the code.
>
It'd be neat to be able to use fastdist to distribute "data" files over
the IB fabric rather than for actual job startups. I'd certainly be
interested in seeing it resurrected for that purpose, but I lack the
experience to get it going.
Jimmy.
--
Jimmy Tang
Trinity Centre for High Performance Computing,
Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
http://www.tchpc.tcd.ie/ | http://www.tchpc.tcd.ie/~jtang