Error: read_ib_one: rank 383 out of bounds [0..16).

Pete Wyckoff pw at osc.edu
Mon Nov 7 11:31:03 EST 2005


christopher.walker at gmail.com wrote on Mon, 07 Nov 2005 11:14 -0500:
> I've just installed Rocks 4.0 on an InfiniBand (Topspin) based Linux cluster
> running torque-2.0.0p0. When I run a job with mpiexec, I get the following:
> 
> mpiexec: Warning: read_ib_one: protocol version 0 not known, but might still
> work.
> mpiexec: Error: read_ib_one: rank 383 out of bounds [0..16).
> 
> 
> I'm using the CVS version of mpiexec, although version 0.80 produced the
> same error with read_ib_startup_ports in place of read_ib_one. I've looked
> through the list archives, but didn't find anything that seemed applicable
> to my case.

I'm guessing it's similar to this:

    http://email.osc.edu/pipermail/mpiexec/2004/000289.html

You can hack out the read_full(.., &version, ..) call in either 0.80 or
the CVS version, manually set "version = 1" (or maybe 2 or 3 for
a few early broken releases), and cross your fingers.
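
As a rough sketch of what that hack amounts to (this is not the actual
mpiexec code: read_full_sketch, read_hello, struct hello, and the
"version then rank" wire layout are all made up for illustration), a
reader that expects a version word will misparse a stream from a client
that never sends one, and forcing the version skips that read:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read exactly len bytes from fd, retrying on
 * short reads and EINTR.  Returns 0 on success, -1 on error or EOF. */
int read_full_sketch(int fd, void *buf, size_t len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return -1;  /* peer closed before sending everything */
        p += n;
        len -= (size_t) n;
    }
    return 0;
}

/* Hypothetical startup header, ints in host byte order (assumption). */
struct hello {
    int version;
    int rank;
};

/* If forced_version > 0, do not read a version word from the wire and
 * assume that value instead.  Returns 0 on success, -1 on I/O error,
 * -2 if the rank is outside [0..numranks). */
int read_hello(int fd, int forced_version, int numranks, struct hello *h)
{
    assert(h != NULL);

    if (forced_version > 0)
        h->version = forced_version;
    else if (read_full_sketch(fd, &h->version, sizeof h->version) < 0)
        return -1;

    if (read_full_sketch(fd, &h->rank, sizeof h->rank) < 0)
        return -1;

    if (h->rank < 0 || h->rank >= numranks)
        return -2;
    return 0;
}
```

With a client that sends only its rank, read_hello(fd, 1, 16, &h) parses
the rank correctly, while read_hello(fd, 0, 16, &h) takes the rank for the
version and whatever follows for the rank, which is the kind of
misalignment that could produce a bogus out-of-bounds rank.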

If you can tell me a bit more about which mpich version you're using, I
might remember which protocol version it uses.  Or you can wander through
the source in

    mpich/mpid/vapi/process/pmgr_client_mpirun_rsh.c

and compare that to mpiexec/ib.c to figure out what should be
happening.  If you do find that you've got a versionless mpich,
complain to the Rocks people that they should update their distro
to a modern mpich release.  Let us know too for the archives.

		-- Pete

(P.S.  Can you turn off html in gmail?  It annoys mailman.)