Error: read_ib_one: rank 383 out of bounds [0..16).
Pete Wyckoff
pw at osc.edu
Mon Nov 7 11:31:03 EST 2005
christopher.walker at gmail.com wrote on Mon, 07 Nov 2005 11:14 -0500:
> I've just installed Rocks 4.0 on an Infiniband (Topspin) based Linux cluster
> running torque-2.0.0p0. When I run a job with mpiexec, I get the following:
>
> mpiexec: Warning: read_ib_one: protocol version 0 not known, but might still work.
> mpiexec: Error: read_ib_one: rank 383 out of bounds [0..16).
>
>
> I'm using the CVS version of mpiexec, although version 0.80 produced the
> same error with read_ib_startup_ports in place of read_ib_one. I've looked
> through the list archives, but didn't find anything that seemed applicable
> to my case.
I'm guessing it's similar to this:
http://email.osc.edu/pipermail/mpiexec/2004/000289.html
You can remove the read_full(.., &version, ..) call in either 0.80 or
the CVS version, set "version = 1" by hand (or maybe 2 or 3 for
a few early broken releases), and cross your fingers.
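To make the idea concrete, here is a rough sketch of that hack. It is not the actual code from ib.c: read_full_sketch and startup_version_sketch are made-up names, and forced_version stands for whatever constant you would hardcode. The assumption is that a versionless mpich never sends the leading version int, so when the version is forced, nothing is consumed from the socket and the next read sees the first real field.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for mpiexec's read_full(): keep calling read()
 * until len bytes have arrived.  Returns 0 on success, -1 on error or
 * if the peer closes the connection early. */
static int read_full_sketch(int fd, void *buf, size_t len)
{
    char *p = buf;

    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, just retry */
            return -1;
        }
        if (n == 0)
            return -1;          /* EOF before all bytes arrived */
        p += n;
        len -= (size_t) n;
    }
    return 0;
}

/* Hypothetical helper: if forced_version > 0, return it without reading
 * anything from fd; otherwise read the version as the first int on the
 * wire.  Returns -1 if that read fails. */
static int startup_version_sketch(int fd, int forced_version)
{
    int version;

    if (forced_version > 0)
        return forced_version;
    if (read_full_sketch(fd, &version, sizeof(version)) < 0)
        return -1;
    return version;
}
```

Calling startup_version_sketch(fd, 1) corresponds to the "version = 1" hack; passing 0 keeps the normal behaviour of reading the version from the client.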
If you can tell me which mpich version you're using, I might
remember which protocol version it speaks. Or you can wander through the source in
mpich/mpid/vapi/process/pmgr_client_mpirun_rsh.c
and compare that to mpiexec/ib.c to figure out what should be
happening. If you do find that you've got a versionless mpich,
complain to the Rocks people that they should update their distro
to a modern mpich release. Let us know too for the archives.
-- Pete
(P.S. Can you turn off html in gmail? It annoys mailman.)