mpiexec and # of processors

Bryan Putnam bfp at purdue.edu
Mon Apr 3 14:20:39 EDT 2006


Thanks David and others for the information.

Yes, increasing the "# open files" limit seems to have fixed the problem.

Bryan

On Sat, 1 Apr 2006, David Golden wrote:

> On 2006-03-31 09:31:52 -0500, Bryan Putnam wrote:
> > Dear mpiexec team,
> > 
> > I have a quick question regarding mpiexec-0.80 which we're using to run 
> > some benchmarks (HPL).
> > 
> > We've found that mpiexec works fine until we get to about 512+ nodes, and 
> > then the parallel job fails for various reasons. Is there an adjustable 
> > parameter in the mpiexec code that limits parallel jobs to 512 processors, 
> > or do you think the problem is likely not mpiexec related?
> >
> 
> Well, I dunno if it's the problem. Are you getting
> "need X sockets, only Y available" errors by any chance?
> See mpiexec-0.80/stdio.c, which points out that you
> might be limited to ~510 processes unless you raise
> your per-user limit on open file descriptors. On Linux
> it's sometimes still only 1024, which can be rapidly
> used up (IIRC each open socket counts as an fd).
> You might want to add e.g.
> 
> * -     nofile  16383
> to /etc/security/limits.conf
> on every node to up the per-user limit a bit.
> 
> Also maybe up the system-wide setting (mightn't be necessary
> anymore, I think modern Linux auto-adjusts) by adding
> fs.file-max = 65535
> to
> /etc/sysctl.conf
> 
> 
> 
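To check the per-process descriptor limit on a given node, something like the following Python sketch might help. It reads the current "open files" limit with the standard resource module and makes a rough estimate of how many processes would fit under it. The helper name max_processes, the figure of two descriptors per process, and the reserve of four descriptors are all assumptions chosen to reproduce the ~510 figure quoted above from a limit of 1024; they are not values taken from the mpiexec source.

```python
import resource


def max_processes(soft_limit, fds_per_process=2, reserved_fds=4):
    """Roughly estimate how many processes fit under a descriptor limit.

    fds_per_process and reserved_fds are illustrative assumptions,
    not values read from mpiexec.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return None  # unlimited: there is no cap to estimate against
    return max(0, (soft_limit - reserved_fds) // fds_per_process)


if __name__ == "__main__":
    # Soft limit is what applies now; hard limit is the ceiling it can be raised to.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("open files: soft =", soft, "hard =", hard)
    print("estimated max processes:", max_processes(soft))
```

Under these assumptions a limit of 1024 gives 510 processes, and the suggested nofile value of 16383 gives 8189. The limit is per process and is inherited at login, so it would need to be checked in the environment that actually launches mpiexec.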



