Problem with Torque/MPICH2/Mpiexec
David Golden
dgolden at cp.dias.ie
Thu Jun 29 10:14:17 EDT 2006
On 2006-06-29 15:38:37 +0200, Richard de Jong wrote:
> Hi list,
>
> I just subscribed to this list because I have a problem.
>
> I am trying to start an MPI application compiled with mpich2-1.0.3 using
> mpiexec, to use the TM interface of Torque. It gives a strange error.
> Please see my config below.
>
> For mpiexec I have tried versions 0.80 and 0.81
> For Torque I have tried versions 1.0.1p6 1.2.0p3 and 2.0.0p4
>
> Any idea what goes wrong?
>
Not really, but for the record, here are some compile flags we use
for mpich2 1.0.3 (with mpiexec 0.81, torque 2.1.0p0, gcc 3, RHEL4).
The pm / pmi settings are the relevant part: osc-mpiexec emulates a
particular mpich2 mpd+pmi setup. If you're not using the
pm + pmi settings as below, I expect you'd see funny problems, though
I have no idea whether _your_ problems correspond.
/usr/local/src/mpich2-1.0.3/configure --enable-fast --with-thread-package \
--enable-f77 --disable-f90 --enable-cxx --enable-romio --enable-pmiport \
--with-mpe --with-pm=mpd --with-pmi=simple
Then disable mpich2's own mpiexec and use osc-mpiexec instead,
obviously. If you installed as root, you probably also want to
un-suid-root or disable their mpdroot binary: it's unnecessary, and
the less suid root stuff hanging about the better!
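
A rough sketch of what "disable" could look like in practice. The
install prefix, the ".mpich2" rename suffix and the function name are
assumptions for illustration, not something taken from our actual
setup; adjust them to wherever your mpich2 ended up. It only renames
mpich2's mpiexec so it stops shadowing osc-mpiexec on PATH, and clears
the setuid bit on mpdroot if that binary exists.

```shell
#!/bin/sh
# Sketch only: the prefix passed in and the ".mpich2" suffix are
# assumptions, not part of any mpich2 or osc-mpiexec convention.

# disable_mpich2_launcher PREFIX
#   Renames PREFIX/bin/mpiexec (file or symlink) out of the way and
#   clears the setuid bit on PREFIX/bin/mpdroot if it is present.
disable_mpich2_launcher() {
    prefix=$1

    # mpich2's launcher may be a plain file or a symlink.
    if [ -e "$prefix/bin/mpiexec" ] || [ -L "$prefix/bin/mpiexec" ]; then
        mv "$prefix/bin/mpiexec" "$prefix/bin/mpiexec.mpich2"
    fi

    # Drop suid-root from mpdroot; the binary itself is left in place.
    if [ -e "$prefix/bin/mpdroot" ]; then
        chmod u-s "$prefix/bin/mpdroot"
    fi
}

# Example call (hypothetical prefix):
# disable_mpich2_launcher /usr/local/mpich2-1.0.3
```

Renaming rather than deleting keeps the original launcher around in
case you need to compare behaviour later.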
Other points: try torque 2.0.0p8 if 2.1.x still seems a bit bleeding
edge; the early 2.0.0 releases were a bit dodgy IIRC. 2.0.0p8 worked
fine for us for a while, though 2.1.x's new build system and
rationalised pbs-config / libtorque library are much nicer.