tm-problems, smp-prob?

Stefan Friedel stefan.friedel at iwr.uni-heidelberg.de
Tue Feb 25 10:34:40 EST 2003


Hi *,
I have two separate questions. (Setup: Linux cluster running OpenPBS 2.3.16, mpich-gm, Debian woody, 256x2 nodes.)

- What does the following mean?

#################
Error received by batch job output Feb 24 23:31 trailing.queue.e1279

mpiexec: Error: wait_tasks: tm_poll remote: tm: system error.

Asynchron communication in mpicall never finished or died

DDD [000] ERROR 04200: receive-timeout for IF 14 in DDD_IFAExchange
DDD [000] ERROR 04201:   waiting for message (from proc 1, size 2080)
#################

Any hints?

- With one of our applications we have the problem that, under mpiexec, it runs only in a one-node/one-processor configuration.
The same PBS job on the same system runs fine with mpirun.ch_gm (the mpirun from Myricom), e.g. with nodes=16:ppn=2 or similar.
With mpiexec we then get errors like:

##################
[3] Error: Unable to get GM local node id !
[3] Error: write to socket failed !
[2] Error: Unable to get GM local node id !
[2] Error: write to socket failed !
##################

Any hints here?

Regards, Stefan Friedel
-- 
Stefan Friedel  IWR - Zentrale Dienste
Universitaet Heidelberg
stefan.friedel at iwr.uni-heidelberg.de