pbs_connect error when running mpiexec jobs with PBS
Pete Wyckoff
pw at osc.edu
Mon Feb 7 10:07:17 EST 2005
psiwczak at man.poznan.pl wrote on Mon, 07 Feb 2005 09:20 +0100:
> Recently I've been experiencing a strange behaviour from my 'pbs-enabled'
> mpiexec. All mpi jobs quit with the following information:
>
> mpiexec: Error: get_hosts: pbs_connect: Access from host not allowed, or
> unknown host
>
> However, in logs I can see that the pbs scheduler accepts submitted job
> and sends it to a mom at one of my cluster nodes. Having been processed on
> pbs_mom, job exits with error status=1.
The compute node that is trying to run mpiexec cannot talk to the PBS
server. Most likely the server's name (the name in the server_name file
in the PBS /var/... directory) does not resolve on the compute node. You
can either fix the server_name file or add an entry for the server to
/etc/hosts. To confirm, run "qstat" inside your batch job on the compute
node and see whether it fails the same way mpiexec does.
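
If you would rather test the name resolution directly, here is a rough
sketch of a check you could run from inside a batch job. It is only an
illustration, not part of mpiexec or PBS: the function names are made up,
the path to the server_name file has to be supplied by you, and the
handling of an optional ":port" suffix is an assumption about how the
file may be written.

```python
import socket
import sys


def read_server_name(path):
    """Return the host name stored on the first line of a server_name file."""
    with open(path) as f:
        name = f.readline().strip()
    # Assumption: the entry may carry an optional ":port" suffix; drop it.
    return name.split(":", 1)[0]


def check_server(path, resolve=socket.gethostbyname):
    """Return (name, address); address is None if the name does not resolve."""
    name = read_server_name(path)
    if not name:
        return name, None
    try:
        return name, resolve(name)
    except OSError:  # socket.gaierror and socket.herror derive from OSError
        return name, None


# Hypothetical usage: python check_server.py <path to your server_name file>
if __name__ == "__main__" and len(sys.argv) == 2:
    name, addr = check_server(sys.argv[1])
    if addr is None:
        print(f"{name!r} does not resolve on this node")
    else:
        print(f"{name} resolves to {addr}")
```

If the name fails to resolve on the compute node but resolves on the
node where you submit jobs, that points at /etc/hosts or DNS on the
compute node rather than at mpiexec itself. A name that does resolve
does not rule out other access problems on the server side.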
> On the other hand, jobs submitted with mpirun (mpich2) outside pbs work
> perfectly.
One major difference: mpirun doesn't talk to PBS from the compute node.
-- Pete