mpiexec patch for very large jobs
Maestas, Christopher Daniel
cdmaest at sandia.gov
Fri Sep 17 11:26:07 EDT 2004
This is the patch I hacked together to allow job launching on Voltaire
InfiniBand fabrics:
Basically, it just sets the version to 1 until Voltaire updates the MVAPICH
release in their software package to support versions in the protocol.
Regards,
-----Original Message-----
From: Alex [mailto:korobka at nankai.edu.cn]
Sent: Thursday, September 16, 2004 9:20 PM
To: cdmaest at sandia.gov; pw at osc.edu
Cc: mpiexec at osc.edu
Subject: RE: mpiexec patch for very large jobs
I have an update to this that fixes a corner case. I will send it next week
after I get back to the office.
On the other hand, is there anything in the works to support the OSU MVAPICH
stack? If not, then I'll have a go at it next week.
Alex
In your message you wrote:
>From: "Maestas, Christopher Daniel" <cdmaest at sandia.gov>
>Reply-To:
>To: "'Pete Wyckoff'" <pw at osc.edu>, Alex <korobka at nankai.edu.cn>
>Subject: RE: mpiexec patch for very large jobs
>Date: Thu, 16 Sep 2004 18:30:10 -0600
>
>Hello,
>
> What is the current status of integrating this patch?
>
> Regards,
> - Chris
>
>
>korobka at nankai.edu.cn wrote on Mon, 03 May 2004 19:53 +0800:
>> I encountered a problem where mpiexec would not work properly when
>>
>> 1. The number of file descriptors exceeded FD_SETSIZE.
>> 2. write_full() in scatter_gm_startup_ports() returned -1 with errno
>> of EAGAIN after a write to the connected nonblocking socket.
>>
>> The first problem could be fixed either by recompiling the kernel and
>> reinstalling it on all nodes or by replacing select() with poll() in
>> the mpiexec source code. The second problem clearly needed better
>> error handling in the xxx_full() routines. Here is a patch for both
>> problems. It worked here but it may need a bit more polishing.
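
For readers without the original patch at hand, here is a minimal sketch of
the two ideas described above: retrying a write that fails with EAGAIN on a
nonblocking socket, and waiting with poll() rather than select() so that
descriptor values above FD_SETSIZE are not a problem. This is not the patch
from this thread. The name write_full_poll() and its timeout_ms parameter
are hypothetical and only stand in for mpiexec's write_full().

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for mpiexec's write_full(): write all len bytes
 * to fd, which may be nonblocking.  Returns 0 on success, -1 on a write
 * error or if fd does not become writable within timeout_ms.
 */
static int write_full_poll(int fd, const void *buf, size_t len, int timeout_ms)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n > 0) {
            p += n;
            len -= (size_t) n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;                   /* interrupted, just retry */
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                  /* a real error */

        /*
         * The socket buffer is full.  Wait for room with poll(), which
         * takes an array of descriptors and so has no FD_SETSIZE limit.
         */
        struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };
        int rc = poll(&pfd, 1, timeout_ms);

        if (rc < 0 && errno == EINTR)
            continue;
        if (rc <= 0)
            return -1;                  /* poll() error or timeout */
        if (pfd.revents & (POLLERR | POLLHUP | POLLNVAL))
            return -1;                  /* peer went away or bad fd */
    }
    return 0;
}
```

The timeout is applied per wait, not to the whole transfer, so a slow but
live peer is not treated as dead.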
>
>Thanks much for this patch. I'll definitely include something like it
>in the next release. A few questions for you, though, if you'll help
>me to understand some of it.
>
>Was it really necessary to grow the listen() backlog? System defaults
>tend to be around 128, so unless you had to change this systemwide
>(e.g. via /proc/sys/net/core/somaxconn on linux), 4096 should give the
>same behavior as 1024. I can make that the default with a note about
>the system limit if you think it makes sense.
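
The point about the backlog can be sketched as follows. On Linux the kernel
silently caps the value passed to listen() at the somaxconn setting, so a
larger request changes nothing unless that limit was raised. The helper
names read_somaxconn() and effective_backlog() are hypothetical and are not
part of mpiexec.

```c
#include <stdio.h>

/* Linux-specific path mentioned in the message above. */
#define SOMAXCONN_PATH "/proc/sys/net/core/somaxconn"

/* Read the system-wide listen() backlog cap; returns -1 if unavailable. */
static int read_somaxconn(const char *path)
{
    FILE *f = fopen(path, "r");
    int value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Backlog the kernel would actually honor: min(requested, cap). */
static int effective_backlog(int requested, int cap)
{
    if (cap < 0)
        return requested;       /* cap unknown: leave the request alone */
    return requested < cap ? requested : cap;
}
```

With a cap of 128, effective_backlog(4096, 128) and
effective_backlog(1024, 128) both come out as 128, which is the "same
behavior" referred to above. A caller would pass
read_somaxconn(SOMAXCONN_PATH) as the cap.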
>
>I need to make sure poll() exists on most machines, then I will gut any
>remaining select() use.
>
>The second part of your patch is obviously the right thing to do.
>Sorry I didn't deal with this correctly in the first place. It doesn't
>look necessary to check EAGAIN in read_full(), though, since we only
>ever read blocking sockets. And I'm tempted just to switch the fd to
>blocking before the call to write_full(), maybe wrapped with an alarm()
>to avoid the hang-on-dead-node scenario instead of the EAGAIN checking
>code you did.
>
>Then I should do this to all the devices that need it, for
>completeness, maybe abstracted out with some helper function for the
>asynchronous connect() part.
>
>Thanks again,
>
> -- Pete
>_______________________________________________
>mpiexec mailing list
>mpiexec at osc.edu
>http://email.osc.edu/mailman/listinfo/mpiexec
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: voltaire_ib_patch_0.76
Type: application/octet-stream
Size: 363 bytes
Desc: not available
Url : http://email.osc.edu/pipermail/mpiexec/attachments/20040917/e10cac75/voltaire_ib_patch_0.obj