stdin question

Clifton Kirby ckirby3 at colsa.com
Mon Aug 22 13:34:16 EDT 2005


I am running Torque 1.2.0p4 with mpiexec 0.79 both compiled using gcc 3.3.
We are trying to run on 512 nodes using Myrinet with the gm 2.0.21 driver.
I've applied the OS X patch for gm.c mentioned in the mailing list.

What is needed to run a program as follows,

#mpiexec -n 10 myprogram < inputdata >& output

How can I redirect the "inputdata" file as input to "myprogram" and also
redirect stdout to the "output" file?  I've played around with the
-nostdin, -nostdout and -allstdin arguments, but nothing seems to work.
The program does random reads on "inputdata" and only reads portions of
it at a time.  All MPI ranks need to read "inputdata" as stdin.
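
For illustration, one possible workaround (an assumption, not something I have
confirmed works with mpiexec 0.79) would be to let each rank open the file
itself through a small wrapper script, so that stdin is the regular, seekable
file rather than a stream forwarded by mpiexec. The script name
"stdin_wrapper.sh" and its argument order are made up for this sketch, and it
assumes "inputdata" is on a filesystem visible from every node:

```shell
# Create a hypothetical wrapper script: the first argument is the input file,
# the remaining arguments are the program to run and its own arguments.
cat > stdin_wrapper.sh <<'EOF'
#!/bin/sh
input_file=$1
shift
# Replace the shell with the real program, with stdin opened on the file.
exec "$@" < "$input_file"
EOF
chmod +x stdin_wrapper.sh
```

It would then be launched along the lines of
"mpiexec -n 10 ./stdin_wrapper.sh inputdata myprogram >& output", so that
mpiexec itself does not have to forward stdin at all.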

However, we are able to successfully run this,

mpiexec -n 10 myprogram >& output

Another problem shows up in the Torque mom logs: the following messages
are repeated several times and the job fails.  I can get one job to run
successfully after rebooting, but subsequent jobs fail.

08/20/2005 14:22:29;0001;   pbs_mom;Svr;pbs_mom;im_eof, Premature end of message from addr xxx.xxx.xxx.xxx:15003
08/20/2005 14:22:29;0001;   pbs_mom;Svr;pbs_mom;task_check, cannot tm_reply to 17606.mach5c.mach5.roc task 1
08/20/2005 14:22:29;0001;   pbs_mom;Svr;pbs_mom;task_check, cannot tm_reply to 17606.mach5c.mach5.roc task 1

and the stderr file from torque shows this,

[196] Error: write to socket failed !
[200] Error: write to socket failed !
mpiexec: Error: read_gm_startup_ports: eof in gmpi_port#1 iter 195.

TIA!

- Trip





