working directory incorrect with tcsh

Pete Wyckoff pw at osc.edu
Wed Jul 21 18:31:19 EDT 2004


djhale at sandia.gov wrote on Wed, 21 Jul 2004 14:40 -0700:
> The first example is using tcsh, the PWD env variable seems to be incorrect 
> from the mpiexec output.  The second example is using bash which works as I 
> would expect.  I'd like tcsh to act the same as bash, is there a fix for 
> this?

It's a bit complex.  The command that is actually invoked on the
nodes looks like:

    argv  0 /bin/sh
    argv  1 -c
    argv  2 if test -d "/b/pw/src"; then cd "/b/pw/src"; fi; exec /bin/bash -c 'exec env'

with /bin/tcsh in place of /bin/bash if you invoked qsub with
"-S /bin/tcsh" or have tcsh as your login shell in the passwd file.

This magic works around the case where the file system on the other
nodes differs from that on the invoking node, so the working directory
may not exist there.  If it simply did "cd /tmp/foo" without the test
and the directory were missing, tcsh would exit immediately, while bash
and others just issue a warning and carry on.

That said, can you run mpiexec with "-v -v -v" and look at your
equivalent three argv lines and see if anything is suspicious?  Maybe
your /bin/sh isn't really good old /bin/sh?  I can't guess what else
just yet.
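
One way to test the /bin/sh theory without mpiexec in the picture is to
run the same kind of wrapper by hand on a node.  A rough sketch, where
run_in_dir is a made-up helper name (not part of mpiexec) and the
directories are placeholders:

```shell
#!/bin/sh
# run_in_dir is a hypothetical helper that mimics the wrapper shown above:
# cd into the directory only if it exists, then exec the command.
run_in_dir() {
    dir=$1
    shift
    /bin/sh -c 'if test -d "$1"; then cd "$1"; fi; shift; exec "$@"' sh "$dir" "$@"
}

# Directory exists: the command runs inside it.
run_in_dir / pwd

# Directory missing: the cd is skipped and the command still runs.
run_in_dir /no/such/dir echo "still ran"
```

To compare shells, something like

    run_in_dir "$HOME" /bin/tcsh -c 'exec env' | grep PWD

against the same line with /bin/bash should show whether the difference
comes from the wrapper or from the shell it execs.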

If this investigation turns up nothing, you can always use the big
hammer.  Pick another node in the job (not the one running mpiexec)
and do something like

    strace -vFf -s 2000 -o /tmp/o -p $(pgrep pbs_mom)

and grep around for /bin/sh and lstat and chdir on "bin/test" to see
what's going wrong.  And/or send it to me.
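
For the grepping, a small sketch along these lines may save some
typing.  scan_trace is a hypothetical helper name, and it assumes the
trace landed in /tmp/o as in the strace command above:

```shell
#!/bin/sh
# scan_trace is a hypothetical helper: print, with line numbers, the trace
# lines that mention /bin/sh, lstat, or chdir in the given file(s).
scan_trace() {
    grep -n -E '/bin/sh|lstat|chdir' "$@"
}

# Example call, assuming the strace output went to /tmp/o; the glob also
# picks up per-process files such as /tmp/o.1234 if strace wrote those:
#   scan_trace /tmp/o*
```

The pattern is deliberately loose, so expect some noise from unrelated
lstat calls; the lines right around the exec of /bin/sh are the
interesting ones.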

		-- Pete


