mpiexec 0.82 bugs?

Pete Wyckoff pw at osc.edu
Thu Mar 8 12:20:07 EST 2007


Thomas.Svedberg at chalmers.se wrote on Thu, 08 Mar 2007 13:11 +0100:
> Hello all, I am trying to solve a problem with starting MPI jobs on a 
> cluster with dual gigabit networks and started looking into the code.
> I have so far found 2 things I do not really understand:
> 
> mpiexec.c line 677:
>     if (cl_args->verbose)
>     for (i=0; i<numtasks; i++)
>         printf("node %2d: name %s, cpu avail %u\n", i,
>           nodes[i].name, nodes[i].availcpu);
> 
> Here numtasks==0 (always?) and nothing is ever printed; should it not be 
> numcpus?

Should be numnodes, not numtasks.  Nice bug, that.

> config.c line 288:
>    for (i=0; i<numtasks; i++) {
>         if (!nodes[i].name) continue;  /* deleted */
>         for (j=i+1; j<numtasks; j++) {
>         if (tasks[j].node == -1) continue;  /* deleted */
> 
> Here numtasks is the total number of cpus from earlier processing, but as 
> far as I can see nodes is only numnodes in size?
> I.e. on a 4-way SMP cluster, say I ask for nodes=2:ppn=4 from PBS, at 
> this point I have numtasks=8 and nodes only has 2 entries.
> So how is the second line supposed to work?
> I can see from testing that if I remove it, all nodes start 4 tasks in one 
> go; leaving it in makes some nodes start 4 times 1 task etc.

Another bug.  That second line should test tasks[i].node == -1, the
same way the fourth line tests tasks[j].

Both of these were part of a sloppy checkin almost two years ago
when the nodes and tasks arrays were separated.  The first bug never
was a problem due to numtasks being 0 there.  The second bug never
shows up on 2-processor systems, which is all I ever test on, and
only affects mpich/p4 (or shmem).
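
To make the second bug concrete, here is a stripped-down sketch of the
merge loop with the corrected "deleted" test.  It is not the real
mpiexec code: struct task, reduce_tasks and demo_two_nodes_ppn4 are
hypothetical names, and the cpu_index bookkeeping is left out.  The
point is only that the outer loop runs over tasks, so the deleted
check has to look at tasks[i], never at nodes[i], which has just
numnodes entries.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for an mpiexec task entry. */
struct task {
    int node;        /* index into the nodes array, or -1 once merged away */
    int num_copies;  /* how many processes this entry now represents */
};

/*
 * Fold every task that shares a node with an earlier task into that
 * earlier entry and mark the later one deleted.  Returns the number
 * of entries that survive.
 */
int reduce_tasks(struct task *tasks, int ntasks)
{
    int i, j, live = 0;

    assert(ntasks >= 0);
    for (i = 0; i < ntasks; i++) {
        /* Index tasks[] here: i ranges over tasks, not nodes. */
        if (tasks[i].node == -1) continue;  /* deleted */
        for (j = i + 1; j < ntasks; j++) {
            if (tasks[j].node == -1) continue;  /* deleted */
            if (tasks[i].node == tasks[j].node) {
                tasks[i].num_copies += tasks[j].num_copies;
                tasks[j].node = -1;  /* mark deleted */
            }
        }
        live++;
    }
    return live;
}

/* Build the nodes=2:ppn=4 layout from the report (8 tasks) and reduce it. */
int demo_two_nodes_ppn4(struct task tasks[8])
{
    int t;

    for (t = 0; t < 8; t++) {
        tasks[t].node = t / 4;   /* tasks 0-3 on node 0, 4-7 on node 1 */
        tasks[t].num_copies = 1;
    }
    return reduce_tasks(tasks, 8);
}
```

In this sketch the eight tasks collapse to two surviving entries,
tasks[0] and tasks[4], each with num_copies of 4, which matches the
"all nodes start 4 tasks in one go" behavior described above.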

Thanks for finding these!  You get an award for reporting two real
bugs in one mail.  Please apply the patch below and let me know if
it works on your :ppn=4 machine.  I'm assuming you'll be trying to
get --transform-hostname to work next.  It all still appears to
function here.

		-- Pete

Index: config.c
===================================================================
--- config.c	(revision 400)
+++ config.c	(working copy)
@@ -286,21 +286,21 @@ tasks_shmem_reduce(void)
 	 * possibly different cpu# and squeeze them together.
 	 */
 	for (i=0; i<numtasks; i++) {
-	    if (!nodes[i].name) continue;  /* deleted */
+	    if (tasks[i].node == -1) continue;  /* deleted */
 	    for (j=i+1; j<numtasks; j++) {
 		if (tasks[j].node == -1) continue;  /* deleted */
 		if (tasks[i].node == tasks[j].node
-		  && tasks[i].conf == tasks[j].conf) {
-		    /* grow array */
+		 && tasks[i].conf == tasks[j].conf) {
 		    if (tasks[i].num_copies == 1) {
+			/* initialize array */
 			tasks[i].cpu_index = Malloc(num_copies_chunk
 			                   * sizeof(*tasks[i].cpu_index));
 			tasks[i].cpu_index[0] = tasks[i].cpu_index_one;
 		    }
 		    if (((tasks[i].num_copies + 1) % num_copies_chunk) == 0) {
-			int *x;
+			/* if would overflow, grow array */
 			int len = tasks[i].num_copies + 1 + num_copies_chunk;
-			x = Malloc(len * sizeof(*tasks[i].cpu_index));
+			int *x = Malloc(len * sizeof(*tasks[i].cpu_index));
 			memcpy(x, tasks[i].cpu_index, tasks[i].num_copies
 			       * sizeof(*tasks[i].cpu_index));
 			free(tasks[i].cpu_index);
@@ -310,7 +310,7 @@ tasks_shmem_reduce(void)
 		    tasks[i].cpu_index[tasks[i].num_copies]
 			= tasks[j].cpu_index[0];
 		    ++tasks[i].num_copies;
-		    tasks[j].node = -1;
+		    tasks[j].node = -1;  /* mark deleted */
 		}
 	    }
 	}
Index: mpiexec.c
===================================================================
--- mpiexec.c	(revision 400)
+++ mpiexec.c	(working copy)
@@ -675,7 +675,7 @@ main(int argc, const char *argv[])
 	concurrent_get_nodes();
 
     if (cl_args->verbose)
-	for (i=0; i<numtasks; i++)
+	for (i=0; i<numnodes; i++)
 	    printf("node %2d: name %s, cpu avail %u\n", i,
 	      nodes[i].name, nodes[i].availcpu);
 

