Running multiple processes on the same host
Ken Borrelli
kwb1 at cec.wustl.edu
Mon Mar 6 13:41:21 EST 2006
Actually, I ended up hacking the code last week to do exactly what you just
said you didn't want to do (if numproc > number of physical processors, start
over-allocating), but there is probably a way to adjust this and make it a
compile-time option. We have some dual-CPU nodes whose processors can be
assigned independently, so I know what you're talking about. The user base on
our cluster is small enough that we can prevent such behavior, but on a big
system it might be hard to police. If you're really worried, there might be a
way to give all over-allocated processes a lower priority, but I haven't
looked into this.
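
To make the wrap-around idea concrete, here is a minimal sketch in Python. It is not the mpiexec source or the actual hack; the function name assign_ranks and the host names are made up for illustration. It assumes the host list has one entry per allocated physical processor and simply starts over at the beginning once the list is exhausted, flagging the extra ranks so that a launcher could, in principle, treat them differently (for example, start them at a lower priority).

```python
def assign_ranks(hosts, numproc):
    """Map ranks 0..numproc-1 onto allocated processors, wrapping around.

    hosts   -- one entry per allocated physical processor (hypothetical names)
    numproc -- number of MPI processes requested
    Returns a list of (rank, host, overallocated) tuples.
    """
    if not hosts:
        raise ValueError("host list must not be empty")
    placement = []
    for rank in range(numproc):
        host = hosts[rank % len(hosts)]        # wrap around the host list
        overallocated = rank >= len(hosts)     # True once every slot is used
        placement.append((rank, host, overallocated))
    return placement


if __name__ == "__main__":
    # Three processes on two allocated processors: rank 2 wraps onto node01.
    for rank, host, extra in assign_ranks(["node01", "node02"], 3):
        print(rank, host, "over-allocated" if extra else "")
```

Whether the flagged ranks should be refused, allowed, or run at reduced priority is exactly the policy question discussed in this thread; the sketch only shows where that decision would hook in.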
The reason we want to over-allocate is that we have a series of
semi-independent threads running, with a single head node that just does the
book-keeping and updates some parameters in each of the semi-independent
threads. When we are running on more than 20 processors such an architecture
makes sense: the loss of compute time is small compared to the saving we get
on latency (non-blocking sends are not practical for a couple of reasons).
When we are running on only a few, we waste a large fraction of processor
time, so the idea was that we could simply run a semi-independent thread
alongside the head thread (mostly book-keeping) on the same physical processor
without having to adjust the code at all. In this way we can efficiently
run the same job on 5 or so processors.
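
A rough back-of-the-envelope version of that trade-off, as a Python sketch. It assumes the head process uses a negligible share of CPU, which is a simplification rather than a measurement of our code, and the function names are only illustrative.

```python
def worker_count(num_processors, share_head):
    """Number of semi-independent workers that fit on the allocation.

    With a dedicated head, one processor is given up to book-keeping;
    with a shared head, the head runs alongside a worker on one processor.
    """
    return num_processors if share_head else num_processors - 1


def idle_fraction(num_processors):
    """Fraction of the allocation lost to a dedicated, mostly idle head
    (assumes the head itself does negligible compute)."""
    return 1.0 / num_processors


if __name__ == "__main__":
    for n in (20, 5):
        print(n, "processors:",
              worker_count(n, share_head=False), "workers with a dedicated head,",
              worker_count(n, share_head=True), "with a shared head,",
              "idle fraction", idle_fraction(n))
```

Under that assumption a dedicated head costs about 5% of a 20-processor allocation but about 20% of a 5-processor one, which is why sharing only matters for the small runs.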
Thanks,
Ken
2006/3/3, Pete Wyckoff <pw at osc.edu>:
>
> kwb1 at cec.wustl.edu wrote on Tue, 28 Feb 2006 14:33 -0600:
> > I am trying to use mpiexec to start an mpi job with two processes running
> > on some of the physical processors. If I was using mpirun, I would just
> > repeat some of the processors on the hostlist or just allow it to wrap
> > around, but I'm having some issues doing this with mpiexec. I took a look
> > at hacking into the code, but I wanted to see if anyone has a less drastic
> > solution.
>
> Mpiexec actually goes out of its way to make sure you can't do this.
> You could certainly hack the code to allow oversubscription of
> processor resources, and I've toyed with adding an option to let
> people like you say "I know what I'm doing". The problem is, some
> sites use multi-CPU nodes and will schedule individual processors
> from those nodes. They don't want a user running more than one
> thread on his assigned processor as that would interfere with other
> jobs on the same node. Yes, there are many other ways jobs fight
> each other in such an allocation scheme, so this is far from
> foolproof.
>
> I'm curious what your code does that it makes sense to overallocate
> processors. If you do hack the code, a ./configure time option to
> enable this would let admins somewhat set policy. I'd want to force
> users to give another flag (beyond "-numproc 27") to indicate they
> know what they're doing too.
>
> -- Pete
>