mpiexec and # of processors
Bryan Putnam
bfp at purdue.edu
Mon Apr 3 14:20:39 EDT 2006
Thanks David and others for the information.
Yes, increasing the "# open files" limit seems to have fixed the problem.
Bryan
On Sat, 1 Apr 2006, David Golden wrote:
> On 2006-03-31 09:31:52 -0500, Bryan Putnam wrote:
> > Dear mpiexec team,
> >
> > I have a quick question regarding mpiexec-0.80 which we're using to run
> > some benchmarks (HPL).
> >
> > We've found that mpiexec works fine until we get to about 512+ nodes, and
> > then the parallel job fails for various reasons. Is there an adjustable
> > parameter in the mpiexec code that limits parallel jobs to 512 processors,
> > or do you think the problem is likely not mpiexec related?
> >
>
> Well, I dunno if it's the problem - are you getting
> "need X sockets, only Y available" errors by any chance?
> - see mpiexec-0.80/stdio.c, where it points out that
> without tweaking simultaneous open file limits, you
> might be limited to ~510 processes, unless you adjust
> your max file descriptors per user limit. On Linux,
> sometimes it's still only 1024, which can be rapidly
> used up (IIRC each open socket counts as an fd).
> Might want to add e.g.
>
> * - nofile 16383
> to /etc/security/limits.conf
> on every node to up the per-user limit a bit.
>
> Also maybe up the system-wide setting (mightn't be necessary
> anymore, think modern Linux autoadjusts) by adding
> fs.file-max = 65535
> to
> /etc/sysctl.conf
>
>
>
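For anyone wanting to check their own nodes before a big run, here is a
minimal Python sketch (not taken from mpiexec) that reports the per-process
open-file limit and the system-wide fs.file-max value, plus a rough estimate
of how many processes would fit under the limit. The helper names, the figure
of two descriptors per process, and the four reserved descriptors are
assumptions picked only to reproduce the ~510 figure quoted above; stdio.c
has the real accounting.

```python
import resource


def estimate_max_processes(soft_limit, fds_per_process=2, reserved=4):
    """Rough upper bound on processes servable under an fd limit.

    ASSUMPTION: each process costs `fds_per_process` descriptors and the
    launcher keeps `reserved` descriptors for itself. These numbers are
    illustrative and are not read from the mpiexec source.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return None  # no limit, so no meaningful bound
    return max(0, (soft_limit - reserved) // fds_per_process)


def read_system_file_max(path="/proc/sys/fs/file-max"):
    """Return the system-wide open-file ceiling on Linux, or None."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None  # not Linux, or the file is unreadable


if __name__ == "__main__":
    # Soft limit is what the process gets now; hard limit is the ceiling
    # an unprivileged user may raise the soft limit to.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("per-process nofile: soft=%s hard=%s" % (soft, hard))
    print("estimated max processes:", estimate_max_processes(soft))
    print("fs.file-max:", read_system_file_max())
```

Run it on each node after editing limits.conf and logging in again. The
per-user limit is applied at login, so an existing shell will still show the
old value.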