mpiexec patch for very large jobs
Alex
korobka at nankai.edu.cn
Mon May 3 07:53:00 EDT 2004
Hi,
I ran into two cases where mpiexec would not work properly on very large jobs:
1. The number of open file descriptors exceeded FD_SETSIZE, the limit on
what select() can monitor.
2. write_full() in scatter_gm_startup_ports() returned -1 with errno set to
EAGAIN after a write to the connected nonblocking socket.
The first problem could be fixed either by recompiling the kernel and
reinstalling it on all nodes, or by replacing select() with poll() in the
mpiexec source code. The second problem clearly needed better error handling
in the xxx_full() routines.
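
To illustrate the select() to poll() replacement, here is a minimal sketch. It
is not taken from the attached patch or from the mpiexec source; the function
name wait_readable() and its parameters are made up for this example. poll()
takes an array of descriptors sized at run time, so it is not bound by
FD_SETSIZE.

```c
#include <poll.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of mpiexec.
 * Wait up to timeout_ms for any of the nfds descriptors in fds[] to become
 * readable. Sets ready[i] to 1 if fds[i] is readable, hung up, or in error,
 * and to 0 otherwise. Returns the number of entries flagged in ready[],
 * 0 on timeout, or -1 on error (errno is set by poll() or calloc()).
 */
int wait_readable(const int *fds, int *ready, int nfds, int timeout_ms)
{
    struct pollfd *pfds;
    int i, n, count = 0;

    if (nfds <= 0)
        return 0;

    /* The array is sized at run time, unlike a fixed-size fd_set. */
    pfds = calloc((size_t)nfds, sizeof *pfds);
    if (pfds == NULL)
        return -1;

    for (i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
    }

    n = poll(pfds, (nfds_t)nfds, timeout_ms);

    for (i = 0; i < nfds; i++) {
        ready[i] = n > 0 &&
                   (pfds[i].revents & (POLLIN | POLLHUP | POLLERR)) != 0;
        count += ready[i];
    }

    free(pfds);
    return n < 0 ? -1 : count;
}
```

A caller would retry when this returns -1 with errno set to EINTR, as it would
with select().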
Here is a patch for both problems. It worked here but it may need a bit more
polishing.
Thanks for mpiexec,
Alex Korobka
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mpiexec-0.76.poll.patch.gz
Type: application/x-gzip-compressed
Size: 4916 bytes
Desc: not available
Url : http://email.osc.edu/pipermail/mpiexec/attachments/20040503/4b9cf42a/mpiexec-0.76.poll.patch.bin