two mpiexec questions on compatibility

Pete Wyckoff pw at osc.edu
Tue May 14 22:58:08 EDT 2002


cpenney at ford.com said:
> I have been using mpiexec under OpenPBS for quite a while.  It's an 
> excellent tool, nice work.  I have two questions:

Some problems with the OpenPBS compile were found recently by Brooks
Davis <brooks at aero.org>; check the mpiexec archives.  There's no release
yet containing the fixes, but it'll be announced sometime soon.  Not
that any of this will solve your problem below.

> We are converting to PBS Pro 5.2.  When I compile mpiexec it compiles 
> and links just fine, but when I try and use it (with -v) I get the error:
> 
> resolve_exe: using absolute exe "/apps/radioss/v4/v41n/ENGNV41N_SPMD"
> node  0: name = node16r1, cpu = 0
> node  1: name = node15r1, cpu = 0
> node  2: name = node14r1, cpu = 0
> node  3: name = node13r1, cpu = 0
> node  4: name = node12r1, cpu = 0
> node  5: name = node11r1, cpu = 0
> node  6: name = node6r1, cpu = 0
> node  7: name = node2r1, cpu = 0
> gmpi conf file = /.../cae.ford.com/fs/u/cpenney/.gmpiconf.161
> mpiexec: Error: wait_one_task_start: tm_poll remote: tm: system error.
> Command exited with non-zero status 1

This "system error" is PBS's way of saying something bad happened.
Mpiexec went to start a task and while waiting to make sure it actually
got started, was told by PBS that something went awry.  I've found no
way to get more information out of PBS without reading through the
source.  You might attach an strace to the pbs_mom on the node on which
task#0 will be run, with the "-f" flag to follow the forks all the way
down to the failing spawned task.

> execve("/apps/mpiexec/bin/mpiexec", ["/apps/mpiexec/bin/mpiexec", "-kill", "-nostdout", "-verbose", "/apps/radioss/v4/v41n/ENGNV41N_SPMD"], [/* 42 vars */]) = 0

I will speculate that the standard input handling is causing the
problem.  PBS Pro, like OpenPBS, requires a patch to redirect the input
stream from the job script into process #0 of the application.  Without
the patch, things work just fine with OpenPBS except that you get no
stdio handling: mpiexec just ignores stdin and never sees any stdout to
redirect.  I was sort of hoping that an unpatched PBS Pro would also
work just fine.  All mpiexec does is set some extra weird-named
environment variables in the call to tm_spawn().  Perhaps PBS Pro just
broke TM support.  Can't tell until you play around with it a bit more.

Did you apply the OpenPBS patch that is included with mpiexec to your
PBS Pro distribution?  Do things work if you add "-nostdin" to the
command line too?

> My other question is that I'm curious how you would handle SMP boxes 
> with a Myrinet card.  Each process has to be assigned a port number. 
> Does mpiexec handle that so it picks one not in use?

Reconfigure with "--with-smp-size=2" (or however many processors you
have).  PBS has this concept of processor number that you see in node
assignments via qstat -n such as "node01/0+node01/1+..." and it
explicitly assigns a particular processor to a job.  So we use a static
mapping of processor# to GM port# and trust that PBS only assigns one
task per virtual processor.  This presumably won't work if you decide to
timeshare nodes.  In that case pbs+scheduler should be taught to track
GM ports as a separate resource, and hand them out along with
processors, memory, etc.
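
To make the "one task per virtual processor" assumption concrete, here
is a minimal Python sketch that splits a node assignment string of the
form shown above into (node, processor) pairs and reports any slot that
shows up twice.  The helper names are made up for this example, and the
input is assumed to be exactly "name/cpu" items joined by "+"; real
qstat -n output may wrap across lines or differ between PBS versions.

```python
def parse_exec_host(spec):
    """Split 'node01/0+node01/1' into [('node01', 0), ('node01', 1)].

    Hypothetical helper: assumes every item is 'name/cpu' joined by '+'.
    """
    slots = []
    for item in spec.split("+"):
        node, sep, cpu = item.partition("/")
        if not node or not sep:
            raise ValueError("expected 'name/cpu', got %r" % item)
        slots.append((node, int(cpu)))
    return slots


def find_shared_slots(slots):
    """Return every (node, cpu) pair that was assigned more than once."""
    seen = set()
    shared = []
    for slot in slots:
        if slot in seen:
            shared.append(slot)
        seen.add(slot)
    return shared


if __name__ == "__main__":
    slots = parse_exec_host("node01/0+node01/1+node02/0")
    print(slots)
    print(find_shared_slots(slots))
```

An empty list from find_shared_slots() means each virtual processor
appears only once in the assignment, which is the situation the static
port mapping relies on.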

The code in build_gm_mapping() makes the wild assumption that good GM
ports to use are, in order:  2, 4, 5, 6, 7.  You can change this if it
turns out not to be right beyond the tested SMP size of 2.

If you're crazy enough to have more than one Myrinet card per node, that
has been known to work too.  Use "--with-myri-cards=3", for example, to
claim that there are 3 Myrinet cards in each SMP.  In this case the port
assignments are still static, but are assigned first round-robin across
the cards, then in order of increasing port number.
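
As a rough illustration of that static mapping, here is a minimal
Python sketch.  It is not the actual build_gm_mapping() code from
mpiexec: the function name gm_slot and its parameters are invented for
this example, and only the port order and the round-robin rule come
from the description above.

```python
# Illustrative sketch only; not the real build_gm_mapping() from mpiexec.
GM_PORTS = (2, 4, 5, 6, 7)  # port order quoted in the message


def gm_slot(cpu, num_cards=1, ports=GM_PORTS):
    """Map a PBS processor number to a hypothetical (card, port) pair.

    Processors are spread round-robin across the cards first; only once
    every card has been used does the mapping move to the next port.
    """
    if num_cards < 1:
        raise ValueError("need at least one card")
    if cpu < 0 or cpu >= num_cards * len(ports):
        raise ValueError("no static slot for processor %d" % cpu)
    card = cpu % num_cards
    port = ports[cpu // num_cards]
    return card, port


if __name__ == "__main__":
    # Single card, two processors: the tested SMP size of 2.
    print([gm_slot(cpu) for cpu in range(2)])
    # Three cards, as in the "--with-myri-cards=3" example.
    print([gm_slot(cpu, num_cards=3) for cpu in range(6)])
```

With one card, processors 0 and 1 land on ports 2 and 4.  With three
cards, processors 0 to 2 each take port 2 on a different card, and
processor 3 wraps back to card 0 on port 4.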

		-- Pete

P.S.  Okay if I bounce your original message (and this one) to the
mpiexec mailing list?


