mpich-shmem problems
Miroslav Ruda
ruda at ics.muni.cz
Thu Aug 21 11:23:51 EDT 2003
On Thu, 2003-08-21 at 17:02, Pete Wyckoff wrote:
> Okay, I see. This patch calls cull_nodes() for the ncpus case as well
> as for the nodect case. I suspect that "cat $PBS_NODEFILE" will show
> two lines in your batch job, although openpbs with ncpus (and no nodect)
> just has one line.
Yes, the node is listed twice in the nodefile.
>      if (have_ncpus)
>          tasks[0].num_copies = have_ncpus; /* trust this one first */
> +        /* note pbspro will set both ncpus and nodect, thus cull below
> +         * is necessary to prune out extra nodes */
>      else {
>          if (have_nodect != 1)
>              error("%s: pbs_statjob says nodect = %d,"
>                " but shmem only handles nodect = 1", __func__, have_nodect);
>          tasks[0].num_copies = numtask; /* ppn value for the single node */
> -        cull_nodes(matching_node); /* discard other cpu tasks */
>      }
> +    cull_nodes(matching_node); /* discard other cpu tasks */
Your patch works, but I would suggest changing the last line to
    if (have_nodect) cull_nodes(matching_node);
If only have_ncpus is set, you probably don't want to call cull_nodes().
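
To make the suggestion concrete, here is a minimal, self-contained sketch of the selection logic with the guarded call. It is not the mpiexec source: struct shmem_plan and plan_shmem() are hypothetical names invented for illustration, the real code assigns tasks[0].num_copies and calls cull_nodes(matching_node) directly, and the error flag stands in for the error() call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical record of what the shmem setup would do. */
struct shmem_plan {
    int  num_copies; /* value that would go into tasks[0].num_copies */
    bool culled;     /* true if cull_nodes() would be called */
    bool error;      /* true if the "nodect != 1" error would fire */
};

/* Hypothetical helper mirroring the patched logic plus the
 * "if (have_nodect)" guard suggested above. */
static struct shmem_plan plan_shmem(int have_ncpus, int have_nodect,
                                    int numtask)
{
    struct shmem_plan p = { 0, false, false };

    /* Counts reported by the batch system cannot be negative. */
    assert(have_ncpus >= 0 && have_nodect >= 0);

    if (have_ncpus) {
        p.num_copies = have_ncpus;  /* trust ncpus first */
    } else {
        if (have_nodect != 1) {
            p.error = true;         /* shmem only handles nodect = 1 */
            return p;
        }
        p.num_copies = numtask;     /* ppn value for the single node */
    }

    /* Cull only when nodect was reported; an ncpus-only job
     * leaves the node list untouched. */
    if (have_nodect)
        p.culled = true;

    return p;
}
```

With this sketch, plan_shmem(2, 0, 0) (ncpus only) yields two copies and no cull, while plan_shmem(2, 1, 2) (both set, as described for pbspro) yields two copies and a cull.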
Best regards.
Mirek Ruda