Why does mpiexec run incorrectly on SMP?
Brooks Davis
brooks at aero.org
Tue Jun 11 18:33:55 EDT 2002
On Tue, Jun 11, 2002 at 05:06:38AM -0700, Ben Webb wrote:
> On Mon, Jun 10, 2002 at 05:03:03PM -0700, Brooks Davis wrote:
> > I really wish mpiexec could support a stupid mode that used sockets for
> > all communication.
>
> If you want to use LAM/MPI, I'm not stopping you! As far as I
> know, this is an MPICH limitation, not an mpiexec one; mpiexec just
> starts up the MPICH job.
I'm pretty sure it's not an MPICH limitation, because mpirun works
fine on one machine without comm=shared.
> > The shared mode is rather flaky on FreeBSD
>
> On Linux, too, if by "flaky" you mean it leaves shared memory
> segments lying around after a crash... and the default P4_GLOBMEMSIZE is
> set rather too low for our typical calculations (so MPICH jobs keep
> running out of shared memory) but that's easily remedied.
Yah, that's it. The diagnostics are totally useless when something goes
wrong.
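For anyone hitting the same two problems, here is a rough sketch of
working around them from a wrapper script. It is not something mpiexec
or MPICH ships. It assumes P4_GLOBMEMSIZE is read as a byte count from
the environment and that "ipcs -m" prints the Linux-style column layout
(key, shmid, owner, perms, bytes, nattch); FreeBSD's ipcs output is laid
out differently. The helper names are made up for illustration.

```python
import os


def env_with_globmemsize(megabytes, base_env=None):
    """Return a copy of the environment with P4_GLOBMEMSIZE set.

    Assumption: the p4 device reads P4_GLOBMEMSIZE as a size in bytes.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["P4_GLOBMEMSIZE"] = str(megabytes * 1024 * 1024)
    return env


def stale_segments(ipcs_output, owner):
    """List shmids owned by `owner` that have no attached processes.

    Assumption: `ipcs_output` is the text of Linux `ipcs -m`, whose data
    rows look like: key shmid owner perms bytes nattch [status]
    """
    stale = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Skip banner, header, and blank lines; data rows start with a hex key.
        if len(fields) < 6 or not fields[0].startswith("0x"):
            continue
        shmid, seg_owner, nattch = fields[1], fields[2], fields[5]
        if seg_owner == owner and nattch == "0":
            stale.append(int(shmid))
    return stale
```

The dictionary from env_with_globmemsize(64) could be passed as env= to
subprocess.run() when launching the job, and each id returned by
stale_segments() could be handed to "ipcrm -m <shmid>" after a crash.
Check the list by hand first, since other programs also create segments
with zero attachments.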
> > but I can't get mpich with p4 and comm=shared working reliably enough to
> > inflict it on users.
>
> That's understandable. We had a lot of teething problems with
> MPICH+mpiexec on our cluster too, but I haven't seen a job crash with
> bizarre MPICH errors for a month or more now.
That's good to hear. I'll try again next week. I think part of the issue
is that, since I'm just getting started, there are too many
variables when testing.
-- Brooks