get_hosts problem on mpiexec-0.80 with torque-1.2.0p6
Pete Wyckoff
pw at osc.edu
Wed Oct 5 17:48:48 EDT 2005
taminami at mac.com wrote on Wed, 05 Oct 2005 16:51 -0400:
> The job exits immediately and gives me the following error+output:
>
> mpiexec: resolve_exe: prefixing dot to executable: "./s3d_dms_fftw".
> mpiexec: Error: get_hosts: pbs_connect: no error.
>
> From the archive of this mailing list, I have learned that the second line indicates MPI could not resolve hostnames used in PBS. However, I couldn't find any more information about this.
>
> Is there any way for me to know on which machine (server or mom) MPI tried to resolve a hostname and failed, and which hostname it was?
I know very little about torque, but a scan of its code turns up
an environment variable you may be able to use to get more debugging
information. Try (in bash-speak):
    PBSDEBUG=yup mpiexec s3d....
But your guess is likely correct. In that case, the traditional
rule for the PBS server name is that the client looks first at the
environment variable PBS_DEFAULT. Most systems, though, use an
installed file to hold the default server name. In torque this
appears to be <serverhome>/server_name, where <serverhome> defaults
to /usr/spool/PBS. There are ./configure options at build time to
change these things too.
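If it helps, here is a rough sketch of that lookup order as a shell
function, so you can see which server name a client on a given node
would probably end up with. It assumes the order described above
(PBS_DEFAULT first, then <serverhome>/server_name). The function name
pbs_server_name and the PBS_SERVER_HOME override are made up for this
script; torque itself does not read PBS_SERVER_HOME. I have not
checked this against the torque source, so treat it as a guess.

```shell
#!/bin/sh
# Sketch only: guess which PBS server name a client would use.
# Assumed lookup order: $PBS_DEFAULT, then <serverhome>/server_name.
# PBS_SERVER_HOME is a hypothetical knob for this script alone; set it
# if your build used a <serverhome> other than /usr/spool/PBS.

pbs_server_name() {
    serverhome="${PBS_SERVER_HOME:-/usr/spool/PBS}"
    if [ -n "${PBS_DEFAULT:-}" ]; then
        # The environment variable takes precedence when set.
        printf '%s\n' "$PBS_DEFAULT"
    elif [ -r "$serverhome/server_name" ]; then
        # Otherwise use the first line of the installed file.
        head -n 1 "$serverhome/server_name"
    else
        return 1
    fi
}

if name=$(pbs_server_name); then
    echo "PBS server name: $name"
    # Drop a trailing :port, if any, before asking the resolver.
    host=${name%%:*}
    getent hosts "$host" || echo "warning: cannot resolve $host"
else
    echo "no PBS server name found" >&2
fi
```

Running it on both the server and the mom nodes, from the same
environment your job gets, should show whether they agree on the name
and whether that name resolves on each machine.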
Let us know what fixes it for you.
-- Pete