What does this error mean?
Kevin Van Workum
vanw at tticluster.com
Tue Jan 29 20:04:40 EST 2008
On 1/29/08, Thomas Zeiser <thomas.zeiser at rrze.uni-erlangen.de> wrote:
> Hi Kevin,
>
> what is the exact build id of your Intel MPI?
>
> You might also experiment with these Intel MPI environment
> variables (assuming you have a recent Intel MPI 3.1). Export them
> before calling mpiexec:
> I_MPI_USE_DYNAMIC_CONNECTIONS=on
> I_MPI_PMI_EXTENSIONS=off
> I_MPI_PMI_FAST_STARTUP=off
> All of them take on/off. The settings given above are what we are
> using.
>
> Regards,
>
> thomas
Intel(R) MPI Library 3.1 for Linux*
Package ID: l_mpi-rt_p_3.1.026
I tried setting all of the above to on, all to off, and as shown
above, but I still get the same problem.
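
As a sketch, the "as shown above" combination would be exported like this in a Bourne-style shell before launching. The variable names and values are the ones Thomas lists. The mpiexec line is left as a comment and simply reuses the command from the original report.

```shell
# Settings suggested by Thomas; each variable takes on/off.
export I_MPI_USE_DYNAMIC_CONNECTIONS=on
export I_MPI_PMI_EXTENSIONS=off
export I_MPI_PMI_FAST_STARTUP=off

# Then launch as before, for example:
#   mpiexec -allstdin -comm=mpich2-pmi -verbose a.out blah blah blah
```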
--
Kevin
> On Tue, Jan 29, 2008 at 03:28:40PM -0500, Kevin Van Workum wrote:
> > On 1/29/08, Pete Wyckoff <pw at osc.edu> wrote:
> > > vanw at tticluster.com wrote on Tue, 29 Jan 2008 14:31 -0500:
> > > > On 1/29/08, Pete Wyckoff <pw at osc.edu> wrote:
> > > > > vanw at tticluster.com wrote on Tue, 29 Jan 2008 08:19 -0500:
> > > > > > Can you please interpret this error for me:
> > > > > >
> > > > > > mpiexec: Error: read_keyvals: no '=' found in keyval 4 of line: cmd.
> > > > > >
> > > > > > I get this error if my command line is either of the following:
> > > > > >
> > > > > > mpiexec -allstdin -comm=mpich2-pmi -verbose a.out blah blah blah
> > > > > > mpiexec -allstdin -comm mpich2-pmi -verbose a.out blah blah blah
> > > > > >
> > > > > > I should note that a.out was built with Intel MPI.
> > > > >
> > > > > That's coming from part of the PMI message exchange between starting
> > > > > processes and mpiexec. The Intel MPI is somewhat of a rare find in
> > > > > the wild. If you run with lots of "-v -v -v", it will print out all
> > > > > the details and we may be able to guess why it's unhappy.
> > > >
> > > > Here's the error message and some surrounding messages. Let me know if
> > > > you need more of the earlier debug output.
> > > >
> > > > mpiexec: handle_pmi: rank 7 spawn 0 kvsname 53001.jman-spawn-0.
> > > > mpiexec: read_keyvals: keyval 0 key cmd val get_my_kvsname.
> > > > mpiexec: handle_pmi: cmd=get_my_kvsname.
> > > > mpiexec: do_child: poll got 1.
> > > > mpiexec: handle_pmi: rank 0 spawn 0 kvsname 53001.jman-spawn-0.
> > > > mpiexec: read_keyvals: keyval 0 key cmd val put.
> > > > mpiexec: read_keyvals: keyval 1 key kvsname val 53001.jman-spawn-0.
> > > > mpiexec: read_keyvals: keyval 2 key key val DAPL_PROVIDER.
> > > > mpiexec: read_keyvals: keyval 3 key value val <NULL.
> > > > mpiexec: Error: read_keyvals: no '=' found in keyval 4 of line: cmd.
> > > > mpiexec: process_obit_event: evt 10 task 0 on n143104 stat 13.
> > > > mpiexec: stdio_msg_parent_read: pipe closed.
> > > > mpiexec: kill_stdio: sent SIGTERM, waiting on 25806.
> > > > mpiexec: Warning: tasks 0-7 exited with status 13.
> > >
> > > (Adding some Intel MPI folks to CC.)
> > >
> > > I should have asked for one more "-v" so we could see the actual PMI
> > > line sent by the Intel MPI library. But we can mostly reconstruct
> > > it from the above.
> > >
> > > Rank 0 tries to do a PMI put, to store a value into its dictionary.
> > > The line from task 0 probably looks like this:
> > >
> > > cmd=put kvsname=53001.jman-spawn-0 key=DAPL_PROVIDER value=<NULL cmd
> > >
> > > Not sure if anything follows.
> > >
> > > Is DAPL_PROVIDER a key anyone recognizes? Any ideas how it could
> > > work out to be "<NULL"? Maybe some misconfiguration on Kevin's
> > > system?
> > >
> > > -- Pete
> > >
> >
> > With one more '-v' I got this for PMI line:
> >
> > mpiexec: read_keyvals: read 73 chars: cmd=put
> > kvsname=53002.jman-spawn-0 key=DAPL_PROVIDER value=<NULL string>
> >
> > --
> > Kevin
>
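
The full PMI line quoted above suggests why the parser complains. Below is a minimal sketch, assuming read_keyvals splits the line on whitespace and expects every token to be key=value. That behavior is inferred from the debug output, not taken from the mpiexec source, and the function name parse_keyvals is hypothetical. Because the value "<NULL string>" contains a space, "string>" becomes a token of its own with no '=' in it.

```shell
# Hypothetical stand-in for mpiexec's read_keyvals, inferred from the
# debug output: split on whitespace, require key=value in every token.
parse_keyvals() {
    i=0
    set -f                          # no filename expansion while splitting
    for tok in $1; do
        case "$tok" in
            *=*) echo "keyval $i key ${tok%%=*} val ${tok#*=}" ;;
            *)   echo "Error: no '=' found in keyval $i"; break ;;
        esac
        i=$((i + 1))
    done
    set +f
}

# The line reported with the extra -v:
parse_keyvals 'cmd=put kvsname=53002.jman-spawn-0 key=DAPL_PROVIDER value=<NULL string>'
```

Under that assumption the sketch reports keyval 3 as "<NULL" and fails on keyval 4, which matches the numbering in the log. It does not explain where the "<NULL string>" value for DAPL_PROVIDER comes from.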
--
Kevin Van Workum, Ph.D.
Tsunamic Technologies Inc.
Vice President
www.clusterondemand.com
ONLINE COMPUTER CLUSTERS