Results: released vs. cvs mpiexec - was Re: Odd behavior with
mpiexec for voltaire infiniband package
Christopher D. Maestas
cdmaest at sandia.gov
Wed Feb 4 19:12:28 EST 2004
OK, I'll set the version value to 1 manually and wait for Voltaire to
merge in the changes from OSU MVAPICH.
Here are some timings comparing the CVS and the released mpiexec, each
running on 128 procs/64 nodes over 30 iterations:
-- mpiexec release
foreach n ( `seq 1 30` )
foreach? time /projects/mpiexec/bin/mpiexec -comm=ib -np 128 cpi_infiniband >& /dev/null
foreach? end
0.010u 0.060s 0:07.16 0.9% 0+0k 0+0io 521pf+0w
0.000u 0.120s 0:08.57 1.4% 0+0k 0+0io 521pf+0w
0.000u 0.270s 0:08.37 3.2% 0+0k 0+0io 521pf+0w
0.020u 0.050s 0:07.82 0.8% 0+0k 0+0io 521pf+0w
0.010u 0.060s 0:08.67 0.8% 0+0k 0+0io 521pf+0w
0.000u 0.110s 0:06.96 1.5% 0+0k 0+0io 521pf+0w
0.000u 0.090s 0:08.56 1.0% 0+0k 0+0io 521pf+0w
0.010u 0.110s 0:08.36 1.4% 0+0k 0+0io 521pf+0w
0.020u 0.080s 0:08.76 1.1% 0+0k 0+0io 521pf+0w
0.010u 0.060s 0:08.34 0.8% 0+0k 0+0io 521pf+0w
0.000u 0.110s 0:08.46 1.3% 0+0k 0+0io 521pf+0w
0.030u 0.130s 0:08.71 1.8% 0+0k 0+0io 521pf+0w
0.010u 0.400s 0:08.43 4.8% 0+0k 0+0io 521pf+0w
0.020u 0.090s 0:08.39 1.3% 0+0k 0+0io 521pf+0w
0.000u 0.160s 0:08.48 1.8% 0+0k 0+0io 521pf+0w
0.020u 0.100s 0:08.94 1.3% 0+0k 0+0io 521pf+0w
0.000u 0.110s 0:08.46 1.3% 0+0k 0+0io 521pf+0w
0.010u 0.070s 0:08.62 0.9% 0+0k 0+0io 521pf+0w
0.010u 0.090s 0:08.54 1.1% 0+0k 0+0io 521pf+0w
0.000u 0.070s 0:08.54 0.8% 0+0k 0+0io 521pf+0w
0.010u 0.070s 0:08.68 0.9% 0+0k 0+0io 521pf+0w
0.000u 0.170s 0:08.67 1.9% 0+0k 0+0io 521pf+0w
0.000u 0.090s 0:08.50 1.0% 0+0k 0+0io 521pf+0w
0.000u 0.100s 0:08.47 1.1% 0+0k 0+0io 521pf+0w
0.010u 0.060s 0:08.43 0.8% 0+0k 0+0io 521pf+0w
0.010u 0.140s 0:08.45 1.7% 0+0k 0+0io 521pf+0w
0.000u 0.060s 0:08.39 0.7% 0+0k 0+0io 521pf+0w
0.010u 0.380s 0:08.58 4.5% 0+0k 0+0io 521pf+0w
0.030u 0.070s 0:08.42 1.1% 0+0k 0+0io 521pf+0w
0.010u 0.130s 0:08.64 1.6% 0+0k 0+0io 521pf+0w
--
-- mpiexec cvs
foreach n ( `seq 1 30` )
foreach? time /projects/mpiexec.new/bin/mpiexec -comm=ib -np 128 cpi_infiniband >& /dev/null
foreach? end
0.000u 0.110s 0:07.44 1.4% 0+0k 0+0io 524pf+0w
0.000u 0.120s 0:07.33 1.6% 0+0k 0+0io 524pf+0w
0.020u 0.040s 0:08.42 0.7% 0+0k 0+0io 524pf+0w
0.010u 0.110s 0:08.52 1.4% 0+0k 0+0io 524pf+0w
0.000u 0.150s 0:08.72 1.7% 0+0k 0+0io 524pf+0w
0.000u 0.390s 0:08.42 4.6% 0+0k 0+0io 524pf+0w
0.000u 0.090s 0:08.66 1.0% 0+0k 0+0io 524pf+0w
0.010u 0.100s 0:08.30 1.3% 0+0k 0+0io 524pf+0w
0.010u 0.160s 0:08.33 2.0% 0+0k 0+0io 524pf+0w
0.010u 0.060s 0:08.44 0.8% 0+0k 0+0io 524pf+0w
0.040u 0.090s 0:08.53 1.5% 0+0k 0+0io 524pf+0w
0.000u 0.090s 0:08.45 1.0% 0+0k 0+0io 524pf+0w
0.010u 0.080s 0:08.45 1.0% 0+0k 0+0io 524pf+0w
0.020u 0.050s 0:08.64 0.8% 0+0k 0+0io 524pf+0w
0.000u 0.100s 0:08.73 1.1% 0+0k 0+0io 524pf+0w
0.010u 0.060s 0:08.63 0.8% 0+0k 0+0io 524pf+0w
0.000u 0.080s 0:08.47 0.9% 0+0k 0+0io 524pf+0w
0.010u 0.240s 0:08.68 2.8% 0+0k 0+0io 524pf+0w
0.010u 0.090s 0:08.78 1.1% 0+0k 0+0io 524pf+0w
0.000u 0.120s 0:08.45 1.4% 0+0k 0+0io 524pf+0w
0.020u 0.360s 0:08.39 4.5% 0+0k 0+0io 524pf+0w
0.010u 0.110s 0:08.40 1.4% 0+0k 0+0io 524pf+0w
0.030u 0.130s 0:08.75 1.8% 0+0k 0+0io 524pf+0w
0.000u 0.110s 0:08.69 1.2% 0+0k 0+0io 524pf+0w
0.010u 0.100s 0:08.50 1.2% 0+0k 0+0io 524pf+0w
0.010u 0.100s 0:08.85 1.2% 0+0k 0+0io 524pf+0w
0.010u 0.090s 0:08.53 1.1% 0+0k 0+0io 524pf+0w
0.010u 0.140s 0:08.37 1.7% 0+0k 0+0io 524pf+0w
0.020u 0.100s 0:08.46 1.4% 0+0k 0+0io 524pf+0w
0.030u 0.130s 0:08.46 1.8% 0+0k 0+0io 524pf+0w
--
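One way to compare the two runs is to pull the elapsed (wall-clock)
field out of each line and summarize it. Below is a minimal Python
sketch, assuming the csh `time` format shown above, where the third
field is elapsed time as m:ss.cc. The function names are illustrative
only and are not part of mpiexec; the sample lists hold just the first
three lines of each run.

```python
import statistics


def parse_elapsed(line):
    """Return wall-clock seconds from one line of csh `time` output.

    Assumes the third whitespace-separated field is the elapsed time,
    written as m:ss.cc (or h:mm:ss for long runs).
    """
    elapsed = line.split()[2]
    seconds = 0.0
    for part in elapsed.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def summarize(lines):
    """Return count, mean, min, and max of the elapsed times in `lines`."""
    times = [parse_elapsed(l) for l in lines if l.strip()]
    return {
        "n": len(times),
        "mean": statistics.mean(times),
        "min": min(times),
        "max": max(times),
    }


# First three lines of each run, copied from the output above.
RELEASE_SAMPLE = [
    "0.010u 0.060s 0:07.16 0.9% 0+0k 0+0io 521pf+0w",
    "0.000u 0.120s 0:08.57 1.4% 0+0k 0+0io 521pf+0w",
    "0.000u 0.270s 0:08.37 3.2% 0+0k 0+0io 521pf+0w",
]
CVS_SAMPLE = [
    "0.000u 0.110s 0:07.44 1.4% 0+0k 0+0io 524pf+0w",
    "0.000u 0.120s 0:07.33 1.6% 0+0k 0+0io 524pf+0w",
    "0.020u 0.040s 0:08.42 0.7% 0+0k 0+0io 524pf+0w",
]

if __name__ == "__main__":
    print("release:", summarize(RELEASE_SAMPLE))
    print("cvs:    ", summarize(CVS_SAMPLE))
```

Feeding all 30 lines of each run to summarize() instead of the
three-line samples gives the full comparison.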
I don't know what else I can measure offhand this late in the day. :-)
Regards,
On Wed, 2004-02-04 at 11:25, Pete Wyckoff wrote:
> cdmaest at sandia.gov said on Wed, 04 Feb 2004 10:45 -0700:
> > The infiniband environment distributed with Voltaire's software
> > integration package (ibhost-hpc-2.0.0_10-1rh90.k) errors out when run
> > here:
> > ---
> > [cdmaest at ca894 mpi]$ /projects/mpiexec/bin/mpiexec -np 2 -pernode -comm=ib cpi_infiniband
> > mpiexec: Warning: read_ib_startup_ports: protocol version 0 not known, but might still work.
> > mpiexec: Error: read_ib_startup_ports: rank 48 out of bounds [0..2).
> > read: Connection reset by peer
> > ---
> >
> > If you comment out the first read_full in start_tasks.c, then things
> > work for the voltaire stuff.
>
> Yes, the version number is something I'm pushing into the OSU MVAPICH
> code, but it has not yet made it into Voltaire's release. Your fix of
> skipping the version check in mpiexec is just fine (but you may want to
> set the variable to 1 or not check it).
>
> The problem with _not_ having a version number is that in the future it
> will become difficult for mpiexec to figure out how to talk to the
> application if code changes require startup modifications. We suffered
> through this with MPICH/GM and I'd rather not have to deal with that
> again. I'm hoping instead that some short-term pain (sorry) will be
> bearable before IB becomes wildly popular. Perhaps the next MVAPICH and
> Voltaire releases will have incorporated versioning.
>
> Please grab the mpiexec from CVS if you plan to use it with IB. There
> are a couple of fixes there, one a performance improvement for startup
> and the other allows --with-default-comm=ib to work. Some other
> enhancements there aren't too important, but won't hurt.
>
> -- Pete
>
> _______________________________________________
> mpiexec mailing list
> mpiexec at osc.edu
> http://email.osc.edu/mailman/listinfo/mpiexec
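To illustrate the versioning point in the quoted message, here is a
small Python sketch of how a missing version field can shift every
later read. The record layout (an optional version integer followed by
a rank and a port), the values, and the function name are all made up
for illustration; this is not mpiexec's actual wire format or the code
in start_tasks.c.

```python
import struct

# Assumption: 1 is the only version the reader knows about.
KNOWN_VERSIONS = {1}


def read_first_record(stream, nprocs, expect_version):
    """Parse the first record of a hypothetical startup stream.

    Hypothetical layout: [version,] rank, port as little-endian
    32-bit ints. Returns (rank, port, warnings).
    """
    offset = 0
    warnings = []
    if expect_version:
        (version,) = struct.unpack_from("<i", stream, offset)
        offset += 4
        if version not in KNOWN_VERSIONS:
            warnings.append(f"protocol version {version} not known")
    rank, port = struct.unpack_from("<ii", stream, offset)
    if not 0 <= rank < nprocs:
        raise ValueError(f"rank {rank} out of bounds [0..{nprocs})")
    return rank, port, warnings


# A sender that knows nothing about versions: two (rank, port) records.
unversioned = struct.pack("<iiii", 0, 48, 1, 49)
# A sender that prepends version 1 to its record.
versioned = struct.pack("<iii", 1, 0, 48)

if __name__ == "__main__":
    print(read_first_record(versioned, 2, expect_version=True))
    print(read_first_record(unversioned, 2, expect_version=False))
    try:
        # The reader consumes rank 0 as the "version", then reads the
        # port value as the rank, so the bounds check fails.
        read_first_record(unversioned, 2, expect_version=True)
    except ValueError as err:
        print("error:", err)
```

In this toy layout, a reader that expects a version but talks to an
unversioned sender takes the first rank for the version and the next
field for the rank. Skipping the version read lines the fields up
again.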