torque+mpiexec+mvapich = strange behavior

Alex korobka at nankai.edu.cn
Tue Dec 7 21:08:50 EST 2004


I saw the same problem about a month ago. I commented out that close(0)
and it has worked well since. As far as I recall, the strace log showed
that in non-interactive jobs fd 0 was being used by read_ib_startup_ports
after that close(0) call.
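
To illustrate why this can happen, here is a minimal sketch. It is not
mpiexec code, and both function names are made up for this example. It
relies only on the POSIX rule that open() returns the lowest free
descriptor: if fd 0 is not open when a descriptor is created, the new one
can land on 0, and a later unconditional close(0) then closes it. The
second function is one hypothetical way to guard the close, along the
lines Pete suggests below.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Illustration only: close fd 0, open something new, and return the
 * descriptor that comes back. POSIX open() hands out the lowest free
 * descriptor, so the new descriptor takes the slot stdin just left. */
int fd_after_closing_stdin(void)
{
    close(0);
    return open("/dev/null", O_RDONLY);
}

/* Hypothetical guard, not the mpiexec fix: close stdin only when none
 * of the descriptors we opened ourselves ended up on fd 0.
 * Returns 1 if close(0) was called, 0 if it was skipped. */
int close_stdin_unless_ours(const int *fds, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (fds[i] == 0)
            return 0;
    close(0);
    return 1;
}
```

In stdio.c the array passed in would presumably hold values such as
aggregate[0] and abort_fd_in[0], but I have not checked that against the
source.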

Cheers,
Alex

In your message you wrote:
>From: Pete Wyckoff <pw at osc.edu>
>A.Starikov at utwente.nl wrote on Mon, 06 Dec 2004 02:57 +0100:
> > I'm using torque-1.1.0p4 + mvapich-0.9.4-103 mpiexec-cvs
> > And observe something strange.
> > When I submit interactive job, I can start mpi job without any problem 
> > in interactive session.
> > But when I submit non-interactive MPI job, I see:
> > "mpiexec: Error: read_ib_startup_ports: accept iter 0: Invalid argument"
.. 
> Because of your observation that changing your shell changes the
> behavior and your feelings about the fork :), I'm worried about a
> certain close in stdio.c.  The parent does close(0) unconditionally, but
> perhaps this is not correct.  Can you add, around stdio.c:331, the two
> debugging printfs below (untested):
> 
>     if (pid > 0) {
>         /* parent: do not listen to stdin but
>          * leave 1,2 open for debugging/error output (to pbs batch output files
>          * or to tty for interactive)
>          */
>         printf("%s: pre-close-0 aggregate = %d %d %d\n", __func__,
>           aggregate[0], aggregate[1], aggregate[2]);
>         printf("%s: abort_fd_in = %d %d\n", __func__,
>           abort_fd_in[0], abort_fd_in[1]);
>         close(0);
> 
> If we see that abort_fd_in[0] == 0, and aggregate[0] == -1, maybe we
> should pay attention to those instead of running straight to close(0).
> Or it could be a completely different problem.
> 
> What is your default shell, by the way, if not /bin/bash?  Do you
> specify any "-S" lines in your PBS script or on the command-line to
> qsub?
> 
> 		-- Pete




