mpiexec in 2 nodes
Marchand Aurélia
aurelia.marchand at obspm.fr
Wed Oct 10 09:53:49 EDT 2007
Thank you for your reply.
The problem is the private network:
more /etc/hosts.local
# hosts.local
# Definitions specific to this machine
145.238.2.10 quadri1.obspm.fr quadri1
# NFS share on the private network
192.168.0.1 siolino-s siolino
192.168.0.2 quadri1-s
192.168.0.3 quadri2-s quadri2
192.168.0.4 quadri3-s quadri3
192.168.0.5 quadri4-s quadri4
192.168.0.6 quadri5-s quadri5
192.168.0.7 quadri6-s quadri6
I think it uses quadri3-s and not quadri3.
When I use MPI, I have to add .obspm.fr to the node names in the machinefile.
For the other machines, quadri[7-9], I have no problem.
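
To make the mapping above easier to see, here is a minimal Python sketch that parses a few of the hosts.local entries and reports which address each short name points to. The helper names (parse_hosts, is_private) are only illustrative, and the sketch assumes these entries are what the resolver on the node actually consults.

```python
import ipaddress

# Sample entries copied from the hosts.local listing above.
HOSTS_TEXT = """\
145.238.2.10 quadri1.obspm.fr quadri1
192.168.0.1 siolino-s siolino
192.168.0.2 quadri1-s
192.168.0.3 quadri2-s quadri2
192.168.0.4 quadri3-s quadri3
"""


def parse_hosts(text):
    """Map every hostname or alias to the IP address on its line."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name, ip)  # keep the first entry seen for a name
    return table


def is_private(ip):
    """True if the address lies in a private range such as 192.168.0.0/16."""
    return ipaddress.ip_address(ip).is_private


table = parse_hosts(HOSTS_TEXT)
for name in ("quadri1", "quadri3"):
    ip = table[name]
    print(name, ip, "private" if is_private(ip) else "public")
```

With these entries, quadri1 maps to the public address 145.238.2.10 while quadri3 maps to the private address 192.168.0.4, so the two short names do not sit on the same network.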
Aurelia
Pete Wyckoff wrote:
>aurelia.marchand at obspm.fr wrote on Tue, 09 Oct 2007 15:02 +0200:
>
>
>>I have a problem using mpiexec on more than one node.
>>
>>When I have:
>>#PBS -l nodes=1:ppn=2
>>it works well.
>>
>>And when I have:
>>
>>#PBS -l nodes=quadri3:ppn=1+quadri1:ppn=1
>>mpiexec --comm=mpich2 /home/marchand/PBS/test/nomProc2.mpich
>>
>>I get the error:
>>mpiexec: resolve_exe: using absolute path "/home/marchand/PBS/test/nomProc2.mpich".
>>mpiexec: accept_pmi_conn: cmd=initack pmiid=0.
>>mpiexec: accept_pmi_conn: rank 0 (spawn 0) checks in.
>>mpiexec: accept_pmi_conn: cmd=init pmi_version=1 pmi_subversion=1.
>>[unset]: connect failed with connection refused
>>[unset]: Unable to connect to quadri3 on 39045
>>[unset]: aborting job:
>>
>>
>
>These [unset] messages are in the mpich2 library in the task on
>quadri1. During MPI_Init() it tries to connect to mpiexec on host
>quadri3 port 39045, but gets a "connection refused" error. But from
>your nodes=1:ppn=2 test and that there is no error from the task on
>quadri3 which connects to itself locally, we know mpiexec is
>listening okay.
>
>You might check if you have a firewall running and disable it. The
>other aspect to look at is name resolution: maybe quadri1 has the
>wrong IP address for quadri3 in its /etc/hosts. Less likely.
>
> -- Pete
>
>
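
The two checks suggested above (name resolution and whether the port is reachable) can be sketched in Python as follows. This is only a rough illustration: resolve and can_connect are hypothetical helper names, and since the port mpiexec listens on (39045 in the error above) presumably changes from run to run, the connection test only makes sense while a job is running, using the port from that job's error message.

```python
import socket


def resolve(name):
    """Return the IPv4 address the local resolver gives for name, or None."""
    try:
        return socket.gethostbyname(name)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def can_connect(host, port, timeout=3.0):
    """Try a TCP connection; return (True, "") or (False, reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except OSError as exc:  # includes "connection refused" and timeouts
        return False, str(exc)
```

Run on quadri1, resolve("quadri3") would show which address the task is trying to reach, and can_connect("quadri3", 39045) would show whether that address accepts connections on the port. A refusal on one address of quadri3 but not on the other would point to name resolution rather than a firewall.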
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Aurélia Marchand
Service Informatique de l'Observatoire
5 place Jules Janssen Tel : 01 45 07 76 24
92195 Meudon Fax : 01 45 07 76 13
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~