mpiexec scalability improved!

Pete Wyckoff pw at osc.edu
Tue Apr 18 14:48:52 EDT 2006


garrick at usc.edu wrote on Wed, 12 Apr 2006 19:42 -0700:
> On Wed, Apr 12, 2006 at 05:38:45PM -0700, garrick alleged:
> > On Wed, Apr 12, 2006 at 03:24:44PM -0700, garrick alleged:
[..]
> > This is definitely failing regularly.  With -v -v, about all I get is
> > this:
> > mpiexec: Error: do_child: input on unexpected fd 10.
> 
> fd 10 is the pipe from the parent and the handler isn't clearing it
> from the read fd list.
> 
> One line patch to do_child() and it is behaving very very nicely:
>     /* message from parent */
>     if (poll_isset(pipe_with_stdio, &rfsd)) {
>         poll_clr(pipe_with_stdio, &rfsd);      /* <-- this line was missing */
>         --n;
>         stdio_msg_listener_read();
>     }
> 
> I've run dozens of tests on 1800 nodes without any failures.
> 
> Will we see this in a release or svn soon?

You are absolutely right.  I'm surprised anything worked without that
line.  Thanks for fixing it and testing.  I'm running tests on a
bunch of machines and will likely check in these GM changes soon.

Hopefully I'll roll a release here too.
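
For anyone following the thread, here is a minimal sketch of why the
missing clear matters.  It is not the mpiexec code: it uses plain
select() with FD_ISSET/FD_CLR in place of the poll_isset/poll_clr
wrappers, and dispatch_ready() and demo() are hypothetical names made
up for illustration.  It also assumes the event loop ends with a
catch-all scan that complains about any descriptor still marked ready,
which is roughly what the "input on unexpected fd" message suggests.

```c
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical stand-in for the dispatch step of an event loop.
 * Handles the parent pipe, then scans for descriptors that are still
 * marked ready.  Returns the number of such leftover descriptors. */
int dispatch_ready(int parent_fd, fd_set *rfds, int n, int clear_bit)
{
    int unexpected = 0;

    /* message from parent */
    if (FD_ISSET(parent_fd, rfds)) {
        if (clear_bit)
            FD_CLR(parent_fd, rfds);   /* the step the patch adds */
        --n;
        /* a real handler would read the message here */
    }
    (void)n;  /* the count is tracked but not otherwise used in this sketch */

    /* catch-all (assumed): anything still set was not claimed above */
    for (int fd = 0; fd < FD_SETSIZE; fd++) {
        if (FD_ISSET(fd, rfds)) {
            fprintf(stderr, "input on unexpected fd %d\n", fd);
            ++unexpected;
        }
    }
    return unexpected;
}

/* Makes one pipe readable, waits on it with select(), and dispatches.
 * Returns the leftover count, or -1 if setup or select() failed. */
int demo(int clear_bit)
{
    int p[2];
    int result = -1;

    if (pipe(p) != 0)
        return -1;

    if (write(p[1], "x", 1) == 1) {
        fd_set rfds;
        struct timeval tv = { 1, 0 };   /* 1 second timeout */

        FD_ZERO(&rfds);
        FD_SET(p[0], &rfds);

        int n = select(p[0] + 1, &rfds, NULL, NULL, &tv);
        if (n > 0)
            result = dispatch_ready(p[0], &rfds, n, clear_bit);
    }

    close(p[0]);
    close(p[1]);
    return result;
}
```

Calling demo(1) clears the bit and leaves nothing for the catch-all
scan, while demo(0) skips the clear and the parent pipe is reported as
an unexpected descriptor even though it was already handled.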

		-- Pete

