Query about concurrency

Eoin McHugh eoin.mchugh at ichec.ie
Tue Mar 21 11:24:11 EST 2006


On Tue, Mar 21, 2006 at 10:59:52AM -0500, Pete Wyckoff wrote:
> David is right on with his explanation, but the limitation you're
> seeing is only because you're using mpich-p4.  Communication between
> task#0 and mpiexec as implemented in mpich-p4 restricts both
> processes to be on the same node.  So your first case works fine
> because task#0 of each of the two parallel jobs ends up on node#0
> because "-pernode" only uses one of the two CPUs on node#0, and the
> second CPU of node#0 is assigned to task#0 of the second job.  In
> your second case, task#0 of the second test job ends up on a
> different node because both CPUs of node#0 are taken by the first
> job.  You can run with "-v" to see the allocations chosen.

Pete,

Thanks for the reply. That is what I thought was happening; I just
didn't realise that this was a hard limitation of mpich-p4. I suppose
this confusion could be avoided if the mpiexec man page mentioned
this limitation in its concurrency section.
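
For anyone finding this in the archive, the behaviour described in the
quoted text can be modelled with a rough toy sketch. This is not mpiexec
or mpich code. It assumes two nodes with two CPUs each, that mpiexec runs
on node 0, that CPUs are handed out in node order, and that each job asks
for two tasks; the function names are made up for illustration.

```python
# Toy model of the allocation described above; not mpiexec's real logic.
MPIEXEC_NODE = 0  # assumption: both mpiexec processes run on node 0


def allocate(free, ntasks, pernode=False):
    """Place ntasks on nodes in order, consuming CPUs from `free`.

    `free` maps node number -> free CPU count and is modified in place.
    With pernode=True at most one CPU is taken from each node.
    Returns a list whose index is the task number and value is the node.
    """
    placement = []
    for node in sorted(free):
        limit = 1 if pernode else free[node]
        while limit > 0 and free[node] > 0 and len(placement) < ntasks:
            placement.append(node)
            free[node] -= 1
            limit -= 1
    return placement


def run_two_jobs(pernode):
    """Start two concurrent 2-task jobs on a 2-node, 2-CPU-per-node allocation."""
    free = {0: 2, 1: 2}
    first = allocate(free, 2, pernode)
    second = allocate(free, 2, pernode)
    return first, second


def p4_can_start(placement):
    """mpich-p4 needs task#0 on the same node as its mpiexec."""
    return placement[0] == MPIEXEC_NODE


if __name__ == "__main__":
    for pernode in (True, False):
        first, second = run_two_jobs(pernode)
        print("pernode" if pernode else "default", first, second,
              p4_can_start(first), p4_can_start(second))
```

In this model both jobs get task#0 on node 0 when one task per node is
used, while without it the second job's task#0 lands on node 1, which is
the case mpich-p4 cannot handle. Running the real mpiexec with "-v" shows
the actual allocations.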

> Other MPI implementations do not have this restriction.  It
> shouldn't be too difficult to fix mpich to support your usage.  If
> you want to do this, I can point you to the place in the mpich code
> to help you get started.

I don't think I'll be making those changes at this point, as the
potential risks outweigh the gain for a few users.

Regards,

-- 
Eoin McHugh
ICHEC Systems

