mpiexec GMPI_SLAVE env.t problem
Bryan Hellyer
brh at unimelb.edu.au
Tue Aug 5 22:02:21 EDT 2003
Hi,
I'm a newcomer to mpiexec, and have been trying to install it on an IBM e-1350
Intel cluster running RH7.3. I ran configure with
./configure \
--with-pbs=/usr/pbs \
--with-pbssrc=/usr/local/src/OpenPBS_2_3_16 \
--prefix=/usr/local/src/mpiexec_0.74/mpiexec-0.74 \
--with-default-comm=mpich-gm
We have mpich-gm 1.2.5..10 in /usr/local/src/mpich-gm/mpich-1.2.5..10,
but the tests fail with:
<MPICH-GM> Error: Need to obtain the slave's hostname in GMPI_SLAVE !
[0] Error: write to socket failed !
I've tracked this down to the gmpi_getenv routine in the mpich-gm source file
gmpi_conf.c, and have added printf's to see what's happening.
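The tracing amounts to something like the sketch below. It is not the actual mpich-gm code: the helper names format_trace and traced_getenv are hypothetical, and the only thing taken from the real output is the "BH:" line format. Substituting "(null)" by hand is deliberate, since passing a NULL pointer to %s is undefined in standard C even though glibc happens to print "(null)".

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build one trace line in the "BH:" format.
 * A NULL value is rendered as "(null)" explicitly, because passing
 * NULL to %s is undefined behaviour in standard C. */
static int format_trace(char *buf, size_t size,
                        const char *name, const char *value)
{
    return snprintf(buf, size, "BH: gmpi_getenv var : %s , result %s",
                    name, value != NULL ? value : "(null)");
}

/* Hypothetical wrapper: look up an environment variable, log the
 * result to stderr, and hand the value back unchanged. */
static const char *traced_getenv(const char *name)
{
    char line[256];
    const char *value;

    assert(name != NULL);
    value = getenv(name);
    format_trace(line, sizeof line, name, value);
    fprintf(stderr, "%s\n", line);
    return value;
}
```

Calling traced_getenv("GMPI_SLAVE") in place of a plain getenv lookup would produce one such line per variable without changing what the caller sees.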
Under mpirun, the GMPI_... environment variables get returned OK, e.g.
cat hello_mpigm_ppn.e209
BH: gmpi_conf.c : gethostbyname returned node004
BH: gmpi_getenv var : GMPI_MAGIC , result 7374385
BH: gmpi_getenv var : GMPI_MASTER , result node040
BH: gmpi_getenv var : GMPI_PORT , result 8000
BH: gmpi_getenv var : GMPI_SLAVE , result 172.20.3.4
BH: gmpi_getenv var : GMPI_ID , result 7
BH: gmpi_getenv var : GMPI_NP , result 8
BH: gmpi_getenv var : GMPI_BOARD , result -1
BH: gmpi_getenv var : GMPI_NUMA_NODE , result (null)
BH: gmpi_getenv var : GMPI_EAGER , result (null)
BH: gmpi_getenv var : GMPI_SHMEM , result 1
BH: gmpi_getenv var : GMPI_RECV , result (null)
BH: gmpi_conf.c : gethostbyname returned node004
BH: gmpi_getenv var : GMPI_MAGIC , result 7374385
BH: gmpi_getenv var : GMPI_MASTER , result node040
BH: gmpi_getenv var : GMPI_PORT , result 8000
BH: gmpi_getenv var : GMPI_SLAVE , result 172.20.3.4
BH: gmpi_getenv var : GMPI_ID , result 6
BH: gmpi_getenv var : GMPI_NP , result 8
BH: gmpi_getenv var : GMPI_BOARD , result -1
BH: gmpi_getenv var : GMPI_NUMA_NODE , result (null)
BH: gmpi_getenv var : GMPI_ID , result 7
BH: gmpi_getenv var : GMPI_NP , result 8
BH: gmpi_getenv var : GMPI_BOARD , result -1
BH: gmpi_getenv var : GMPI_NUMA_NODE , result (null)
BH: gmpi_getenv var : GMPI_EAGER , result (null)
BH: gmpi_getenv var : GMPI_SHMEM , result 1
BH: gmpi_getenv var : GMPI_RECV , result (null)
BH: gmpi_conf.c : gethostbyname returned node004
BH: gmpi_getenv var : GMPI_MAGIC , result 7374385
BH: gmpi_getenv var : GMPI_MASTER , result node040
BH: gmpi_getenv var : GMPI_PORT , result 8000
BH: gmpi_getenv var : GMPI_SLAVE , result 172.20.3.4
BH: gmpi_getenv var : GMPI_ID , result 6
BH: gmpi_getenv var : GMPI_NP , result 8
BH: gmpi_getenv var : GMPI_BOARD , result -1
BH: gmpi_getenv var : GMPI_NUMA_NODE , result (null)
BH: gmpi_getenv var : GMPI_EAGER , result (null)
BH: gmpi_getenv var : GMPI_SHMEM , result 1
BH: gmpi_getenv var : GMPI_RECV , result (null)
BH: gmpi_conf.c : gethostbyname returned node001
etc.
But under mpiexec, it fails on GMPI_SLAVE:
BH: gmpi_conf.c : gethostbyname returned node040
BH: gmpi_getenv var : GMPI_MAGIC , result 210
BH: gmpi_getenv var : GMPI_MASTER , result node040
BH: gmpi_getenv var : GMPI_PORT , result 36678
BH: gmpi_getenv var : GMPI_SLAVE , result (null)
<MPICH-GM> Error: Need to obtain the slave's hostname in GMPI_SLAVE !
[0] Error: write to socket failed !
BH: gmpi_conf.c : gethostbyname returned node038
BH: gmpi_getenv var : GMPI_MAGIC , result 210
BH: gmpi_getenv var : GMPI_MASTER , result node040
BH: gmpi_getenv var : GMPI_PORT , result 36678
BH: gmpi_getenv var : GMPI_SLAVE , result (null)
<MPICH-GM> Error: Need to obtain the slave's hostname in GMPI_SLAVE !
[0] Error: write to socket failed !
BH: gmpi_conf.c : gethostbyname returned node040
BH: gmpi_conf.c : gethostbyname returned node039
BH: gmpi_getenv var : GMPI_MAGIC , result 210
BH: gmpi_getenv var : GMPI_MASTER , result node040
BH: gmpi_getenv var : GMPI_PORT , result 36678
BH: gmpi_getenv var : GMPI_SLAVE , result (null)
<MPICH-GM> Error: Need to obtain the slave's hostname in GMPI_SLAVE !
[0] Error: write to socket failed !
.
.
.
I also note that under mpirun GMPI_PORT=8000, whereas, as seen above,
under mpiexec it's getting GMPI_PORT , result 36678.
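One way to compare the two launchers without patching mpich-gm would be to print every GMPI_ variable a process inherits, and run that under both mpirun and mpiexec. A minimal sketch, assuming a POSIX system that exposes the environ array; the function names has_prefix and dump_env_with_prefix are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* POSIX: NULL-terminated array of "NAME=value" strings. */
extern char **environ;

/* Return nonzero if the environment entry starts with the prefix. */
static int has_prefix(const char *entry, const char *prefix)
{
    assert(entry != NULL && prefix != NULL);
    return strncmp(entry, prefix, strlen(prefix)) == 0;
}

/* Print every environment entry whose name starts with the prefix,
 * one per line, and return how many entries were printed. */
static int dump_env_with_prefix(FILE *out, const char *prefix)
{
    int count = 0;
    char **e;

    for (e = environ; e != NULL && *e != NULL; ++e) {
        if (has_prefix(*e, prefix)) {
            fprintf(out, "%s\n", *e);
            ++count;
        }
    }
    return count;
}
```

A main that calls dump_env_with_prefix(stderr, "GMPI_") and is started by each launcher in turn would show directly which variables one launcher sets and the other does not.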
Any ideas what's happening here?
Thanx
Bryan
---------------------------------------
Bryan Hellyer
HPC Systems Programmer
ITS Systems & Infrastructure
University of Melbourne