get_hosts problem on mpiexec-0.80 with torque-1.2.0p6

Pete Wyckoff pw at osc.edu
Wed Oct 5 17:48:48 EDT 2005


taminami at mac.com wrote on Wed, 05 Oct 2005 16:51 -0400:
> The job exits immediately and gives me the following error+output:
> 
> mpiexec: resolve_exe: prefixing dot to executable: "./s3d_dms_fftw".
> mpiexec: Error: get_hosts: pbs_connect: no error.
> 
> From the archive of this mailing list I have learned that the second line indicates MPI could not resolve hostnames used in PBS. However, I couldn't find any more information about this.
> 
> Is there any way for me to find out on which machine (the server or a mom) MPI tried to resolve a hostname and failed, and which hostname it was?

I know very little about torque, but a scan of their code turns up
an environment variable you may be able to use to get more debugging
information.  Try (in bash-speak):

    PBSDEBUG=yup mpiexec s3d....

But your guess is likely correct.  In that case, the traditional
rule for the PBS server name is that the client looks first at the
environment variable PBS_DEFAULT.  Most systems, though, use an
installed file to hold the default server name.  In torque this
appears to be <serverhome>/server_name, and <serverhome> defaults to
/usr/spool/PBS.  There are ./configure options at build time to
change these things too.
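
A rough sketch of that lookup order in sh, for checking which server
name a client would probably pick up.  The function name
pbs_server_name and the PBS_SERVER_HOME override are made up for this
sketch and are not torque names, and /usr/spool/PBS is only the
default mentioned above, so your build may use something else.

```shell
# Print the PBS server name a client would likely use, following the
# order described above: PBS_DEFAULT first, then <serverhome>/server_name.
# Returns non-zero if neither source yields a name.
pbs_server_name() {
    # PBS_SERVER_HOME is a hypothetical override for this sketch only;
    # the fallback is the default server home mentioned in the text.
    home="${PBS_SERVER_HOME:-/usr/spool/PBS}"

    if [ -n "${PBS_DEFAULT:-}" ]; then
        # The environment variable wins if it is set and non-empty.
        printf '%s\n' "$PBS_DEFAULT"
    elif [ -r "$home/server_name" ]; then
        # Otherwise use the first line of the installed file.
        head -n 1 "$home/server_name"
    else
        return 1
    fi
}
```

Running pbs_server_name on both the submitting machine and a compute
node, and then feeding the result to something like "getent hosts",
should show whether the name in use is one the machine can resolve.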

Let us know what fixes it for you.

		-- Pete

