cleaning up SHM segments

Brent M. Clements bclem at rice.edu
Thu Apr 1 18:13:28 EST 2004


Most MPICH distributions include a utility called cleanipcs.

We use an epilogue script to clean up after each job, even those that
don't use shared memory segments (running cleanipcs is harmless either
way).
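
For sites that don't have cleanipcs installed, here is a rough sketch of
the kind of cleanup involved. It is not the cleanipcs source. It assumes
the Linux `ipcs -m` column layout (key, shmid, owner, perms, bytes,
nattch), and the helper name stale_shm_ids is made up for illustration.
Unlike a blanket cleanup, it only picks segments with no attached
processes, which may help with the SMP concern quoted below.

```shell
#!/bin/sh
# Hypothetical helper (not part of MPICH): read `ipcs -m` output on stdin
# and print the shmid of every shared memory segment that is owned by the
# user given as $1 and has zero attached processes (nattch == 0).
# Assumes the Linux column order: key shmid owner perms bytes nattch [status]
stale_shm_ids() {
    awk -v u="$1" '$3 == u && $2 ~ /^[0-9]+$/ && $6 == 0 { print $2 }'
}

# Usage sketch, left commented out because it removes real segments:
# ipcs -m | stale_shm_ids "$USER" | while read -r id; do
#     ipcrm -m "$id"
# done
```

Segments that a running job of the same user still has attached are
skipped, so this is more conservative than removing everything the user
owns. Semaphores would need a similar pass over `ipcs -s`.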

Attached is our epilogue script. It's a variation of one that OSC uses.

-Brent

Brent Clements
Linux Technology Specialist
Information Technology
Rice University


On Fri, 2 Apr 2004, Chris Samuel wrote:

> On Thu, 1 Apr 2004 07:09 pm, Victoria Pennington wrote:
>
> > but I'm assuming that only one such job would be running on each node
> > at any one time.
>
> Unfortunately this won't scale to SMP systems where a user could have multiple
> single CPU jobs on the same node.
>
> It's a similar problem we're looking at with trying to clean up temporary
> files on nodes where you want to be certain you're not about to clobber a
> file that's legitimate.
>
> --
>  Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
>  Victorian Partnership for Advanced Computing http://www.vpac.org/
>  Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia
>
-------------- next part --------------
#!/bin/sh

# Clean up MPICH slave processes as well as ipcs stuff, after an MPICH/PBS job terminates

JOBID=$1
USER=$2

# add separator
#echo "--------------------------------------" >> /shared.scratch/$JOBID.epilogue
#echo "     Running PBS epilogue script" >> /shared.scratch/$JOBID.epilogue
#echo "--------------------------------------" >> /shared.scratch/$JOBID.epilogue

# print user job and user info 
#echo "JOBID=$JOBID" >> /shared.scratch/$JOBID.ep
#echo "USER=$USER" >> /shared.scratch/$JOBID.epilogue

# Do we have a nodefile? (Can't use PBS_NODEFILE, as epilogue scripts get
# a blank environment!)

if test -r "/var/spool/torque/aux/$JOBID"; then

   PBS_NODEFILE="/var/spool/torque/aux/$JOBID"
   export PBS_NODEFILE
   # One entry per node; sort -u also drops non-adjacent duplicates
   nodes=$(sort -u "$PBS_NODEFILE")

#  Run the cleanipcs script via ssh. Take on the user's identity; run
#  a login shell so that /etc/profile.d is parsed (to set up MPI environment
#  variables); force the use of a known shell so that we can pass through the
#  PBS_NODEFILE environment variable (su by default blanks the environment)

   for i in $nodes
   do
      #echo "Doing node: $i" >> /shared.scratch/$JOBID.epilogue
      # echo -n "IPCS cleaning: " >> /shared.scratch/$JOBID.epilogue
      /usr/bin/ssh "$i" "/bin/su - $USER -c '/usr/bin/cleanipcs'"
      ret=$?
      #test $ret -eq 0 && echo "	OK" >> /shared.scratch/$JOBID.epilogue
      #test $ret -ne 0 && echo " FAILED" >> /shared.scratch/$JOBID.epilogue
   done
fi

