p4_error, could not write to fd=5
Pete Wyckoff
pw at osc.edu
Tue Dec 14 15:17:58 EST 2004
tornesi at BATTELLE.ORG wrote on Mon, 13 Dec 2004 14:04 -0500:
[..]
> If I try to run this code on more than 16 processors I get the
> following set of messages
>
> p18_7862: p4_error: : 10188
> p5_8935: (16.022127) net_send: could not write to fd=5, errno = 32
> p5_8935: p4_error: net_send write: -1
> p4_error: latest msg from perror: Broken pipe
[..]
> mpiexec: Warning: tasks 3,5,7,9-13,15,17-19 exited with status 1.
That doesn't really sound like an mpiexec error. Your batch of p4
errors looks like what usually happens when one of the tasks dies and
the others give up trying to contact it. Since these messages don't
start until 16 seconds of execution time have passed, it looks
like everything was started up just fine.
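
For illustration, here is a minimal sketch of how a "Broken pipe" arises
when the other end of a connection has gone away. It uses a plain pipe
rather than the sockets p4 uses, and the helper name write_to_dead_peer
is hypothetical; on Linux the errno value it reports is 32 (EPIPE), the
same number shown in the net_send message quoted above.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not part of p4 or mpiexec.
 * Writes to a pipe whose reading end is already closed and returns the
 * errno of the failed write (0 if the write unexpectedly succeeded,
 * -1 if the pipe could not be created). */
int write_to_dead_peer(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    /* Ignore SIGPIPE so write() fails with EPIPE instead of the
     * default action of terminating the process. */
    void (*old)(int) = signal(SIGPIPE, SIG_IGN);

    close(fds[0]);              /* the "peer" goes away */

    int err = 0;
    if (write(fds[1], "x", 1) < 0) {
        err = errno;
        fprintf(stderr, "could not write to fd=%d, errno = %d (%s)\n",
                fds[1], err, strerror(err));
    }

    close(fds[1]);
    if (old != SIG_ERR)
        signal(SIGPIPE, old);   /* restore the previous disposition */
    return err;
}
```

Calling write_to_dead_peer() from main() should report EPIPE. The point
is that the process printing this error is usually the survivor, not the
one that failed first.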
You might run the code in a debugger or add printf()s to see what is
happening to the tasks that didn't print out p4 error messages and exit
with status 1. Where did process #0 go?
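
A minimal sketch of the printf() approach, assuming you pass in the rank
yourself (in an MPI program it would come from MPI_Comm_rank). The names
format_trace and trace are hypothetical. It writes straight to stderr
with write() so the line is not lost in a stdio buffer if the task dies
right afterwards.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: format "[rank R pid P] where\n" into buf.
 * Returns what snprintf() returns. */
int format_trace(char *buf, size_t len, int rank, const char *where)
{
    return snprintf(buf, len, "[rank %d pid %ld] %s\n",
                    rank, (long)getpid(), where);
}

/* Hypothetical helper: emit one trace line on stderr, unbuffered. */
void trace(int rank, const char *where)
{
    char line[256];
    int n = format_trace(line, sizeof line, rank, where);

    if (n < 0)
        return;
    if ((size_t)n >= sizeof line)
        n = (int)sizeof line - 1;       /* output was truncated */

    /* write() bypasses stdio buffering; a failure here is ignored
     * because there is nowhere better to report it. */
    if (write(STDERR_FILENO, line, (size_t)n) < 0)
        return;
}
```

Sprinkling calls such as trace(rank, "before exchange") around the
communication steps shows the last point each task reached before it
went quiet.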
-- Pete