mpich-shmem problems

Pete Wyckoff pw at osc.edu
Wed Aug 20 11:55:40 EDT 2003


ruda at ics.muni.cz said on Wed, 20 Aug 2003 17:02 +0200:
>   I have problems with mpiexec and mpich-shmem. I'm using mpiexec 0.74,
> mpich-1.2.5 and PBSPro 5.2.2 on a dual-processor Linux cluster. When using
> comm=mpich-p4 and mpich-gm, MPI jobs are started as expected. However, with
> comm=shmem (and the job compiled with mpich configured to use only shmem),
> the job is started twice (when it is submitted to PBS using -l nodes=1:ppn=2):
> mpiexec spawns two tasks, while mpirun from mpich starts just one process
> (which later forks the second process itself).
> 
> As a temporary fix, I have modified mpiexec to start only one process when
> comm=shmem is used. 
> 
> diff mpiexec.c.orig mpiexec.c
> 420a421,422
> >     /* hack, start only one process when SHMEM is used */
> >     if (cl_args->comm == COMM_SHMEM) cl_args->pernode = 1;

That certainly fixes it, but the code that is supposed to do this lives
in a big section in get_hosts.c surrounded by "if (cl_args->comm ==
COMM_SHMEM)".  It tries to handle two cases:

    1. Time-shared hosts seem to use "ncpus".  A snippet of the server_priv
    nodes file is:
	coe3:ts np=24
	coe4:ts np=24
    Running a job:
	coe3$ qsub -I -l ncpus=2 -l walltime=2:00:00
	qsub: waiting for job 4122.coe3 to start
	qsub: job 4122.coe3 ready
	coe4$ qstat -f $PBS_JOBID | fgrep Resource_List
	    Resource_List.cput = 01:00:00
	    Resource_List.ncpus = 2
	    Resource_List.vmem = 1gb
	    Resource_List.walltime = 02:00:00

    2.  Non-time-shared hosts seem to use "nodect".  The nodes file might
    look like:
	mck026 np=2
	mck027 np=2
    Running a job:
	mck-login1$ qsub -I -l nodes=1:ppn=2
	qsub: waiting for job 31963.nfs1.osc.edu to start
	qsub: job 31963.nfs1.osc.edu ready
	mck027$ qstat -f $PBS_JOBID | fgrep Resource_List
	    Resource_List.neednodes = mck027:ppn=2
	    Resource_List.nodect = 1
	    Resource_List.nodes = 1:ppn=2
	    Resource_List.walltime = 01:00:00

You seem to fall into case (2), as do most cluster-type installations.
Does your nodes file look the same, and do you see similar output for
the Resource_List variables?  There could be some PBSPro differences that
I do not know about.
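
To make the two cases concrete, here is a minimal C sketch of the
decision they imply for comm=shmem.  It is not the code in get_hosts.c:
the names job_resources, shmem_plan and plan_shmem are hypothetical, and
checking nodect before ncpus is an assumption.  The only things taken
from this thread are that a shmem job should be spawned as one task and
that this task forks the remaining processes itself.

```c
#include <stddef.h>

/* Hypothetical summary of a job's Resource_List; not mpiexec's real types. */
struct job_resources {
    int ncpus;   /* Resource_List.ncpus, 0 if absent (time-shared case 1) */
    int nodect;  /* Resource_List.nodect, 0 if absent (cluster case 2) */
    int ppn;     /* ppn from Resource_List.nodes, 0 if absent */
};

/* What a shmem launch needs to know. */
struct shmem_plan {
    int tasks;   /* tasks to spawn through PBS, always 1 for shmem */
    int procs;   /* MPI processes that the single task forks itself */
};

/*
 * Fill *plan for a comm=shmem job.  Returns 0 on success, or -1 if the
 * job spans more than one node or carries neither nodect nor ncpus.
 */
int plan_shmem(const struct job_resources *r, struct shmem_plan *plan)
{
    int procs;

    if (r == NULL || plan == NULL)
        return -1;

    if (r->nodect > 0) {
        /* Case 2: shared memory cannot span nodes, so require nodect == 1. */
        if (r->nodect != 1)
            return -1;
        procs = (r->ppn > 0) ? r->ppn : 1;
    } else if (r->ncpus > 0) {
        /* Case 1: a time-shared host, sized by ncpus. */
        procs = r->ncpus;
    } else {
        return -1;
    }

    plan->tasks = 1;     /* one spawned task, whatever the process count */
    plan->procs = procs;
    return 0;
}
```

With nodes=1:ppn=2 this gives tasks = 1 and procs = 2, which is the
behavior one would expect from a shmem launch; spawning two tasks
suggests the resource values did not match either case.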

Also, can you add "-v" to the mpiexec command line?  Do you get output
similar to this?

    mck027$ mpiexec -v --comm=shmem hello
    resolve_exe: prefixing dot to executable: "./hello"
    node  0: name = mck027, mpname = mck027, cpu = 1
    wait_one_task_start: evt = 2, task 0 host mck027
    All 1 task started.
    hello from 0/2 hostname mck027 pid 30794 with 0 args:
    hello from 1/2 hostname mck027 pid 30795 with 0 args:
    wait_tasks: numspawned = 1, got evt 3 for tid 11 host mck027 status 0

I'm confused about why it tries to start two tasks in your case.

		-- Pete


