Mpiexec with SGE and/or command line?

Rayson Ho raysonho at eseenet.com
Tue Jul 22 17:44:11 EDT 2003


>On Mon, 21 Jul 2003, Bill Broadley wrote:
>> I tried to install mpiexec on a system as a mpirun replacement.
>> Especially the -kill feature to allow a partially dead parallel run to
>> clean up after itself.
>> 
>> Is it possible to install mpiexec for use with the sun grid engine or
>> command line without installing *PBS*?

If you want to control parallel jobs under SGE, you need to use the tight 
PE integration.

>
>The whole point of mpiexec is to integrate MPI with a batch system,
>so outside of a batch environment it's not going to work.  (The -kill
>functionality relies on the batch system for signal delivery, for
>instance.)
>
>I don't know if SGE has a publicly documented API equivalent to the PBS
>tm interface.  I suspect such an API exists inside SGE, but my cursory
>search of the SGE docs didn't turn up anything.

I think SGE only provides "qrsh -inherit" for tight parallel integration;
I don't think there is an API.
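
To make the "qrsh -inherit" route concrete, here is a minimal Python sketch of how a launcher might build the command line that starts one task on a host already granted to the job. The function name, the host name "node01" and the program "./a.out" are hypothetical and only for illustration; the sketch builds the argument list and does not run anything.

```python
import shlex


def qrsh_inherit_argv(host, command):
    """Build the argv that starts `command` on `host` inside the current SGE job.

    `host` must be one of the hosts SGE granted to the parallel job;
    `command` is the program and its arguments, as a list of strings.
    """
    if not host:
        raise ValueError("host must be non-empty")
    if not command:
        raise ValueError("command must contain at least the program name")
    return ["qrsh", "-inherit", host] + list(command)


# Hypothetical host and program names, for illustration only.
argv = qrsh_inherit_argv("node01", ["./a.out", "-n", "4"])
print(shlex.join(argv))
```

A real launcher would pass such a list to something like subprocess.Popen once per task, from inside a job that runs under a tightly integrated PE.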

However, I was thinking about writing a tm interface for SGE so that, for
example, mpiexec would work with SGE. Tight PE integration would then work
for any parallel system that supports PBS.

I found that mpiexec only uses a subset of the tm library:

tm interface       equivalent SGE call/command
===============================================
tm_init()          N/A
tm_spawn()         exec "qrsh -inherit"
tm_nodeinfo()      read from the PE node file
tm_obit()          N/A
tm_kill()          exec qdel <job id>
tm_finalize()      N/A

As you can see, most of what mpiexec needs is already there. The "N/A"
entries are just internal data structure initialization or lookup.
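
As a rough illustration of the tm_nodeinfo() and tm_kill() rows, here is a small Python sketch (not my actual code). It assumes the PE node file has one line per host with the host name first and the slot count second, which is how the file named by $PE_HOSTFILE usually looks; the function names, host names and job id are made up for the example.

```python
def parse_pe_hostfile(text):
    """Return a list of (host, slots) pairs from PE node file contents.

    Assumed line format: "<host> <slots> <queue> <processor range>".
    Only the first two fields are used; short or blank lines are skipped.
    """
    nodes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        nodes.append((fields[0], int(fields[1])))
    return nodes


def qdel_argv(job_id):
    """Build the argv for deleting a whole SGE job, as in the tm_kill() row."""
    return ["qdel", str(job_id)]


# Hypothetical node file contents and job id, for illustration only.
sample = (
    "node01 2 all.q@node01 UNDEFINED\n"
    "node02 4 all.q@node02 UNDEFINED\n"
)
print(parse_pe_hostfile(sample))
print(qdel_argv(12345))
```

Note that qdel removes the whole job, so this mapping is coarser than a per-task kill.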

My code (300 lines) already provides the above functions. The remaining
work is debugging and testing, and we also need to modify a place in
mpiexec that gets information from PBS so that it uses the equivalent SGE
call.

(I haven't touched this code for more than 3 months... I wrote it to prove
that it is doable)

Rayson


>  (SGE is pretty bad in this
>regard, IMHO.  The Maui devs had an extremely hard time adding an SGE driver
>to Maui because SGE also didn't have a publicly documented scheduler API
>until very recently.)
>
>	--Troy
>-- 
>Troy Baer                       email:  troy at osc.edu
>Science & Technology Support    phone:  614-292-9701
>Ohio Supercomputer Center       web:  http://oscinfo.osc.edu
>
>_______________________________________________
>mpiexec mailing list
>mpiexec at osc.edu
>http://email.osc.edu/mailman/listinfo/mpiexec
>
>