Running multiple processes on the same host

Ken Borrelli kwb1 at cec.wustl.edu
Mon Mar 6 13:41:21 EST 2006


Actually I ended up hacking the code last week to do exactly what you just
said you didn't want to do (if numproc > # physical processors, start
over-allocating), but there probably is a way to adjust this and make it a
compile-time option.  We have some dual-CPU nodes whose processors can get
assigned independently, so I know what you're talking about.  The user base
on our cluster is small enough that we can prevent such behavior, but for a
big system it might be hard to police.  If you're really worried, there
might be a way to make all over-allocated processes run at a lower priority,
but I haven't looked into this.
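
One way to picture the policy under discussion is the sketch below.  It is
Python for illustration only, not the actual mpiexec code (which is C), and
it assumes simple round-robin wrap-around over the allocated processors.
The names place_ranks and allow_oversubscribe are made up; the opt-in
argument stands in for whatever configure option or extra user flag would
enable over-allocation.

```python
def place_ranks(processors, numproc, allow_oversubscribe=False):
    """Map each rank to one of the allocated processors.

    Hypothetical helper, not part of mpiexec.  Ranks wrap around the
    processor list only when the caller explicitly opts in.
    """
    if numproc < 1:
        raise ValueError("numproc must be at least 1")
    if not processors:
        raise ValueError("no processors allocated")
    if numproc > len(processors) and not allow_oversubscribe:
        raise ValueError(
            f"{numproc} processes requested but only "
            f"{len(processors)} processors allocated"
        )
    # Round-robin placement: rank r lands on processor r mod N.
    return [processors[r % len(processors)] for r in range(numproc)]


if __name__ == "__main__":
    # Made-up processor labels for two dual-CPU nodes.
    procs = ["node1/cpu0", "node1/cpu1", "node2/cpu0", "node2/cpu1"]
    for rank, proc in enumerate(place_ranks(procs, 5, allow_oversubscribe=True)):
        print(rank, proc)
```

With four processors and five processes, rank 4 wraps around and shares
node1/cpu0 with rank 0; without the opt-in the same request is refused.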

The reason we want to over-allocate is that we have a series of
semi-independent threads running, with a single head node just doing the
book-keeping and updating some parameters in each of the semi-independent
threads.  When we are running on > 20 processors such an architecture makes
sense: the loss of compute time is small compared to the savings we get on
latency (non-blocking sends are not practical for a couple of reasons).
When we are running on only a few processors we are wasting a large amount
of processor time, so the idea was that we could simply run a
semi-independent thread as well as the head thread (mostly book-keeping) on
the same physical processor without having to adjust the code at all.  In
this way we can efficiently run the same job on 5 or so processors.
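
As a rough illustration of the trade-off, here is a small Python sketch.
It assumes the head thread uses essentially no compute time, so giving it a
processor of its own idles one processor out of N; idle_fraction is a
made-up name and the processor counts are only examples.

```python
def idle_fraction(num_processors):
    """Fraction of the allocation left idle when the head thread gets a
    processor to itself (assumes the head does negligible compute)."""
    if num_processors < 1:
        raise ValueError("need at least one processor")
    return 1.0 / num_processors


if __name__ == "__main__":
    for n in (5, 20, 40):
        print(f"{n:3d} processors: {idle_fraction(n):.1%} idle")
```

Under that assumption a dedicated head costs a fifth of a 5-processor
allocation but only a twentieth of a 20-processor one, which is why sharing
a processor between the head and one worker matters mainly for small jobs.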

Thanks,
            Ken

2006/3/3, Pete Wyckoff <pw at osc.edu>:
>
> kwb1 at cec.wustl.edu wrote on Tue, 28 Feb 2006 14:33 -0600:
> > I am trying to use mpiexec to start an mpi job with two processes
> > running on some of the physical processors.  If I was using mpirun,
> > I would just repeat some of the processors on the hostlist or just
> > allow it to wrap around, but I'm having some issues doing this with
> > mpiexec.  I took a look at hacking into the code, but I wanted to
> > see if anyone has a less drastic solution.
>
> Mpiexec actually goes out of its way to make sure you can't do this.
> You could certainly hack the code to allow oversubscription of
> processor resources, and I've toyed with adding an option to let
> people like you say "I know what I'm doing".  The problem is, some
> sites use multi-CPU nodes and will schedule individual processors
> from those nodes.  They don't want a user running more than one
> thread on his assigned processor as that would interfere with other
> jobs on the same node.  Yes, there are many other ways jobs fight
> each other in such an allocation scheme, so this is far from
> foolproof.
>
> I'm curious what your code does that it makes sense to overallocate
> processors.  If you do hack the code, a ./configure time option to
> enable this would let admins somewhat set policy.  I'd want to force
> users to give another flag (beyond "-numproc 27") to indicate they
> know what they're doing too.
>
>                 -- Pete
>