Mpiexec runs 1 process only

Danny Sternkopf dsternkopf at hpce.nec.com
Mon Aug 30 07:15:24 EDT 2004


Hi,

Thanks for your answer.

> The shmem device in mpich expects that mpiexec will spawn one process,
> then within the mpich library it forks itself into as many as requested
> on the SMP, during MPI_Init().  mpiexec tells the mpich library using
> the environment MPICH_NP which should be 2 in your case.

Yes, it is 2.
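
To make the mechanism in the quoted paragraph concrete, here is a minimal sketch of a library reading its process count from the environment before forking. It is not MPICH source code. The function name requested_procs and the fallback of 1 when MPICH_NP is unset are assumptions for illustration; only the variable name MPICH_NP comes from the quoted text.

```python
import os


def requested_procs(env=None, default=1):
    """Return the process count requested through MPICH_NP.

    Hypothetical helper: `default` (used when the variable is unset)
    is an assumption, not documented MPICH behaviour.
    """
    env = os.environ if env is None else env
    raw = env.get("MPICH_NP")
    if raw is None:
        return default
    count = int(raw)  # raises ValueError if the value is not an integer
    if count < 1:
        raise ValueError("MPICH_NP must be a positive integer")
    return count
```

With MPICH_NP=2 in the environment, as in my session, this would report 2; a ch_shmem library would then fork that many processes itself, whereas other devices rely on mpiexec to start each task.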

> 
> You're pretty sure you compiled mpich using --with-device=ch_shmem and
> not, say, ch_p4 or something else?  Other devices expect that mpiexec
> will spawn all the tasks, one at a time.

MPICH was compiled with ch_p4.

If I specify the ch_p4 device for mpiexec (-comm mpich-p4), I get the following error:

asama.ess.nec.de:~/mpiexec-0.76 2915> /home/danny/mpiexec-0.76/mpiexec -v -v -v -nostdout -nostdin -comm mpich-p4 /home/danny/mpiexec-0.76/hello
stat_exe: testing "/home/danny/mpiexec-0.76/hello"
resolve_exe: using absolute exe "/home/danny/mpiexec-0.76/hello"
get_hosts: numtask=1 ncpus=2 nodect=0
node  0: name = asama.ess.nec.de, mpname = asama.ess.nec.de, cpu = 0
node  1: name = asama.ess.nec.de, mpname = asama.ess.nec.de, cpu = 0
command to 0/1
argv  0 /bin/sh
argv  1 -c
argv  2 if test -d "/home/danny/mpiexec-0.76"; then cd "/home/danny/mpiexec-0.76"; fi; exec /bin/bash -c 'exec /home/danny/mpiexec-0.76/hello -p4wd /home/danny/mpiexec-0.76 -execer_id mpiexec -master_host asama.ess.nec.de -my_hostname asama.ess.nec.de -my_nodenum 0 -my_numprocs 2 -total_numnodes 1 -master_port 0'
environment to 0/1
env  0 PWD=/home/danny/mpiexec-0.76
env  1 TMPDIR=/tmp/pbs.230.asama
env  2 PBS_JOBNAME=STDIN
env  3 PBS_ENVIRONMENT=PBS_INTERACTIVE
env  4 HOSTNAME=asama.ess.nec.de
env  5 HISTFILESIZE=3000
env  6 PVM_RSH=/usr/bin/rsh
env  7 history_control=ignoredups
env  8 QTDIR=/usr/lib/qt-2.3.1
env  9 LESSOPEN=|/usr/bin/lesspipe.sh %s
env 10 PS1=asama.ess.nec.de:\w \!>
env 11 PBS_O_QUEUE=workq
env 12 PS2=continue:>
env 13 PROMPT_COMMAND=date +%X
env 14 PBS_QUEUE=workq
env 15 PBS_MOMPORT=15003
env 16 XPVM_ROOT=/usr/share/pvm3/xpvm
env 17 PS3=blue:
env 18 KDEDIR=/usr
env 19 PBS_O_SYSTEM=Linux
env 20 USER=danny
env 21 LS_COLORS=no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;32:*.cmd=01;32:*.exe=01;32:*.com=01;32:*.btm=01;32:*.bat=01;32:*.sh=01;32:*.csh=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tz=01;31:*.rpm=01;31:*.cpio=01;31:*.jpg=01;35:*.gif=01;35:*.bmp=01;35:*.xbm=01;35:*.xpm=01;35:*.png=01;35:*.tif=01;35:
env 22 MACHTYPE=ia64-redhat-linux-gnu
env 23 PBS_JOBID=230.asama
env 24 PBS_O_LOGNAME=danny
env 25 PBS_O_MAIL=/var/spool/mail/danny
env 26 OLDPWD=/home/danny
env 27 MAIL=/var/spool/mail/danny
env 28 INPUTRC=/etc/inputrc
env 29 PBS_O_LANG=en_US
env 30 PBS_TASKNUM=1
env 31 LANG=en_US
env 32 PBS_O_HOST=asama.ess.nec.de
env 33 PBS_JOBCOOKIE=442903FB
env 34 PBS_NODENUM=0
env 35 LOGNAME=danny
env 36 SHLVL=1
env 37 NOMETAMAIL=blub
env 38 PBS_NODEFILE=/usr/spool/PBS/aux/230.asama
env 39 LC_CTYPE=iso_8859_1
env 40 PGPPATH=/home/danny/.pgp
env 41 PBS_O_SHELL=/bin/bash
env 42 NCPUS=2
env 43 SHELL=/bin/bash
env 44 RNINIT=-g4 -m -S -hMessage -hSender -hKeyword
env 45 HOSTTYPE=ia64
env 46 CDPATH=.:~:/
env 47 mail=/var/spool/mail/danny
env 48 PBS_O_HOME=/home/danny
env 49 OMP_NUM_THREADS=2
env 50 OSTYPE=linux-gnu
env 51 HISTSIZE=350
env 52 LAMHELPFILE=/etc/lam/lam-helpfile
env 53 PVM_ROOT=/usr/share/pvm3
env 54 PBS_O_PATH=/opt/mpich/gnu/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/home/danny/bin
env 55 HOME=/home/danny
env 56 TERM=xterm
env 57 SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
env 58 PATH=/opt/mpich-1.2.5.2-ch_p4-gcc/bin:/usr/kerberos/bin:/bin:/usr/bin:/usr/X11R6/bin:/home/danny/bin
env 59 SAVEDDIR=/home/danny/.secure
env 60 PBS_O_WORKDIR=/home/danny/proj/submit
env 61 _=/home/danny/mpiexec-0.76/mpiexec
p0_12062: (-0.000131) send_message: to=1; invalid conn type=5
p0_12062:  p4_error: subtree_broadcast_p4 failed, type=: 1010101010
wait_one_task_start: evt = 2, task 0 host asama.ess.nec.de
All 1 task started.
wait_tasks: waiting for asama.ess.nec.de/0
wait_tasks: numspawned = 1, got evt 3 for tid 4 host asama.ess.nec.de status 1
mpiexec: Warning: task 0 exited with status 1.


How does mpiexec know where a certain MPICH version is installed?

I built mpiexec as follows:
./configure --with-pbs=/usr/local --disable-mpich-gm --disable-lam --disable-emp
make


> 
> If you send me the output of "qstat -f $PBS_JOBID" within the batch job,
> and run "mpiexec -v -v -v ..." to show lots of its debugging messages
> too, maybe we'll be able to see a problem.

(It's an interactive batch session.)
asama.ess.nec.de:~/mpiexec-0.76 2938> qstat -f 230
Job Id: 230.asama
    Job_Name = STDIN
    Job_Owner = danny at asama.ess.nec.de
    resources_used.cpupercent = 0
    resources_used.cput = 00:00:00
    resources_used.mem = 21072kb
    resources_used.ncpus = 2
    resources_used.vmem = 60960kb
    resources_used.walltime = 00:04:46
    job_state = R
    queue = workq
    server = asama
    Checkpoint = u
    ctime = Mon Aug 30 13:06:31 2004
    Error_Path = /dev/ttyp0
    exec_host = asama.ess.nec.de/0*2
    Hold_Types = n
    interactive = True
    Join_Path = n
    Keep_Files = n
    Mail_Points = a
    mtime = Mon Aug 30 13:06:31 2004
    Output_Path = /dev/ttyp0
    Priority = 0
    qtime = Mon Aug 30 13:06:31 2004
    Rerunable = True
    Resource_List.ncpus = 2
    session_id = 11886
    Variable_List = PBS_O_HOME=/home/danny,PBS_O_LANG=en_US,PBS_O_LOGNAME=danny,
        PBS_O_PATH=/opt/mpich/gnu/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/u
        sr/bin:/usr/X11R6/bin:/home/danny/bin,PBS_O_MAIL=/var/spool/mail/danny,
        PBS_O_SHELL=/bin/bash,PBS_O_HOST=asama.ess.nec.de,
        PBS_O_WORKDIR=/home/danny/proj/submit,PBS_O_SYSTEM=Linux,
        PBS_O_QUEUE=workq
    comment = Job run on node asama.ess.nec.de - at Mon Aug 30 at 13:06
    etime = Mon Aug 30 13:06:31 2004
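
For reading the exec_host line in the qstat output above, here is a minimal sketch of how such a value could be broken down. It assumes the value is a "+"-separated list of host/index entries, each with an optional *count suffix giving the number of CPUs. That format is my reading of this output, and the function names are hypothetical; this is not how mpiexec itself is known to parse it.

```python
def parse_exec_host(value):
    """Split an exec_host string into (host, index, cpus) tuples.

    Assumed format: entries joined by "+", each "host/index" with an
    optional "*cpus" multiplier; cpus defaults to 1 when absent.
    """
    entries = []
    for item in value.split("+"):
        host, _, rest = item.partition("/")
        index, _, count = rest.partition("*")
        entries.append((host, int(index), int(count) if count else 1))
    return entries


def total_cpus(value):
    """Sum the CPU counts over all exec_host entries."""
    return sum(cpus for _, _, cpus in parse_exec_host(value))


if __name__ == "__main__":
    print(parse_exec_host("asama.ess.nec.de/0*2"))
    print(total_cpus("asama.ess.nec.de/0*2"))
```

Under that assumption, "asama.ess.nec.de/0*2" is a single entry carrying two CPUs, which would be consistent with the "numtask=1 ncpus=2" line in the mpiexec debug output.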


Best regards,

Danny


