mpiexec and # of processors

David Golden dgolden at cp.dias.ie
Sat Apr 1 10:23:33 EST 2006


On 2006-03-31 09:31:52 -0500, Bryan Putnam wrote:
> Dear mpiexec team,
> 
> I have a quick question regarding mpiexec-8.0 which we're using to run 
> some benchmarks (HPL).
> 
> We've found that mpiexec works fine until we get to about 512+ nodes, and 
> then the parallel job fails for various reasons. Is there an adjustable 
> parameter in the mpiexec code that limits parallel jobs to 512 processors, 
> or do you think the problem is likely not mpiexec related?
>

Well, I dunno if this is the problem, but are you getting
"need X sockets, only Y available" errors by any chance?
See mpiexec-0.80/stdio.c, which points out that unless you
raise your per-user limit on open file descriptors, you
might be limited to ~510 processes.  On Linux, the limit
is sometimes still only 1024, which can be used up rapidly
(IIRC each open socket counts as an fd).
You might want to add e.g.

* -     nofile  16383

to /etc/security/limits.conf
on every node to raise the per-user limit a bit.
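
To see which limit a process actually inherits on a given node, something like the following Python sketch may help. It reads the soft and hard RLIMIT_NOFILE values and prints a rough estimate of how many processes would fit. The helper name max_processes and its defaults (two descriptors per process, four reserved) are my own assumptions, picked only so that a limit of 1024 works out to the ~510 figure mentioned above; they are not taken from stdio.c.

```python
import resource


def max_processes(fd_limit, fds_per_process=2, reserved=4):
    """Rough estimate of how many processes fit under an fd limit.

    fds_per_process and reserved are illustrative assumptions,
    not values taken from the mpiexec source.
    """
    return max(0, (fd_limit - reserved) // fds_per_process)


if __name__ == "__main__":
    # The soft limit is what applies right now; the hard limit is the
    # ceiling an unprivileged process may raise the soft limit to.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft nofile limit:", soft)
    print("hard nofile limit:", hard)
    if soft != resource.RLIM_INFINITY:
        print("estimated max processes:", max_processes(soft))
```

It is probably worth running this the same way mpiexec itself gets started (e.g. from inside a batch job), since limits.conf is applied by pam_limits at login and a daemon started at boot may not see the new value.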

Also maybe raise the system-wide setting (might not be necessary
anymore, I think modern Linux autoadjusts it) by adding

fs.file-max = 65535

to /etc/sysctl.conf
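
To check whether the system-wide limit is anywhere near being exhausted, the kernel's current counters can be read from /proc/sys/fs/file-nr on Linux. A minimal Python sketch, where the function name parse_file_nr is just an illustrative choice:

```python
def parse_file_nr(text):
    """Parse /proc/sys/fs/file-nr: allocated, unused, maximum."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return {"allocated": allocated, "unused": unused, "max": maximum}


if __name__ == "__main__":
    try:
        with open("/proc/sys/fs/file-nr") as f:
            info = parse_file_nr(f.read())
    except OSError:
        print("/proc/sys/fs/file-nr not available (not Linux?)")
    else:
        # Handles in use = allocated minus allocated-but-unused.
        in_use = info["allocated"] - info["unused"]
        print("file handles in use:", in_use, "of", info["max"])
```

The third field is the same value as fs.file-max, so this also shows whether a change in /etc/sysctl.conf has taken effect.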



