Mpiexec with SGE and/or command line?
Rayson Ho
raysonho at eseenet.com
Tue Jul 22 17:44:11 EDT 2003
>On Mon, 21 Jul 2003, Bill Broadley wrote:
>> I tried to install mpiexec on a system as a mpirun replacement.
>> Especially the -kill feature to allow a partially dead parallel run to
>> clean up after itself.
>>
>> Is it possible to install mpiexec for use with the sun grid engine or
>> command line without installing *PBS*?
If you want to control parallel jobs under SGE, you need to use the tight
PE integration.
>
>The whole point of mpiexec is to integrate MPI with a batch system,
>so outside of a batch environment it's not going to work. (The -kill
>functionality relies on the batch system for signal delivery, for
>instance.)
>
>I don't know if SGE has a publicly documented API equivalent to the PBS
>tm interface. I suspect such an API exists inside SGE, but my cursory
>search of the SGE docs didn't turn up anything.
I think SGE only provides "qrsh -inherit" for tight parallel integration;
I don't think there is an API.
HOWEVER, I was thinking about writing a tm interface for SGE, so that, for
example, mpiexec would work with SGE. That means tight PE integration would
work for any parallel system that already supports PBS.
I found that mpiexec only uses a subset of the tm library:
tm interface      equiv. SGE call/command
=======================================================================
tm_init()         N/A
tm_spawn()        exec "qrsh -inherit"
tm_nodeinfo()     need from the PE node file
tm_obit()         N/A
tm_kill()         exec qdel <job id>
tm_finalize()     N/A
As you can see, most of what mpiexec needs is already there. The "N/A"s are
just some internal data structure init or lookup.
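
To make the mapping concrete, here is a rough sketch (in Python, not the
actual tm shim code) of the three entries that are not "N/A". It only builds
the command lines and parses the node file; it does not run anything. The
function names are made up for illustration, and the node file layout
(host name, then slot count, then other fields per line) is an assumption
about what the PE node file contains.

```python
# Illustrative sketch only: hypothetical helper names, not part of SGE,
# PBS, or mpiexec. Nothing here executes qrsh or qdel.

def parse_pe_nodefile(text):
    """tm_nodeinfo() stand-in: return one host entry per slot.

    Assumes each line looks like "<host> <slots> ..." (assumed layout).
    """
    nodes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        host, slots = fields[0], int(fields[1])
        nodes.extend([host] * slots)
    return nodes


def spawn_command(host, argv):
    """tm_spawn() stand-in: command line that starts argv on host."""
    return ["qrsh", "-inherit", host] + list(argv)


def kill_command(job_id):
    """tm_kill() stand-in: command line that deletes the job."""
    return ["qdel", str(job_id)]


if __name__ == "__main__":
    sample = "node01 2 all.q@node01 UNDEFINED\nnode02 1 all.q@node02 UNDEFINED\n"
    for host in parse_pe_nodefile(sample):
        print(" ".join(spawn_command(host, ["./a.out"])))
    print(" ".join(kill_command(1234)))
```

A real shim would have to exec these commands and track the resulting
processes, which is where the remaining debugging/testing would go.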
My code (300 lines) already provides the above functions. The remaining work
is debugging/testing, and we also need to modify the place in mpiexec that
gets information from PBS so that it uses the equivalent SGE call.
(I haven't touched this code for more than 3 months... I wrote it to prove
that it is doable)
Rayson
> (SGE is pretty bad in this
>regard, IMHO. The Maui devs had an extremely hard time adding an SGE
>driver to Maui because SGE also didn't have a publicly documented scheduler API
>until very recently.)
>
> --Troy
>--
>Troy Baer email: troy at osc.edu
>Science & Technology Support phone: 614-292-9701
>Ohio Supercomputer Center web: http://oscinfo.osc.edu
>
>_______________________________________________
>mpiexec mailing list
>mpiexec at osc.edu
>http://email.osc.edu/mailman/listinfo/mpiexec
>
>