core dump with gm-shared memory

Glen Beane beaneg at umcs.maine.edu
Tue Jul 1 15:52:49 EDT 2003


I no longer have this problem.
I simply recompiled with gcc.

Previously I had compiled with pgcc and there were a couple of warnings.
When I used gcc the warnings went away, and so did the seg fault
problem...

glen


On Tue, 2003-07-01 at 15:47, Pete Wyckoff wrote:
> beaneg at umcs.maine.edu said on Thu, 26 Jun 2003 10:06 -0400:
> > If I build mpiexec to use gm-shmem on SMP nodes, mpiexec causes a
> > segmentation fault, but it is always after my MPI program has finished
> > properly, so it seems to be when mpiexec is cleaning up.
> > 
> > If I build mpiexec without gm-shmem there are no problems.
> > 
> > gm-shmem has been changed slightly on my system.  After discussing some
> > problems with myricom we decided to change the default location of the
> > shared memory file on our system (done by editing gmpi_smppriv.c and
> > mpirun.ch_gm.pl).  Since /tmp was NFS mounted, we were having problems
> > with a large number of nodes writing shared memory files to /tmp.  The
> > shared memory file is now located in ramdisk (location of the shared
> > memory file will likely be a configurable option in the next MPICH/GM
> > release).
> > 
> > This setup works fine with mpirun.ch_gm,  but has been causing
> > segmentation faults with mpiexec which don't seem to affect the actual
> > MPI program.
> > 
> > Since mpirun.ch_gm.pl references the temp file, I was wondering if
> > mpiexec did anywhere, but looking quickly through the source code I
> > didn't find any reference to it.
> > 
> > 
> > Does anyone know what might be causing the problem?  Other than the
> > inability to use gm-shmem, we really like mpiexec so far.
> 
> I'm a bit confused by this.  Release 0.72 of mpiexec and earlier did
> have a configure option "--disable-gm-shmem" which could be used to
> control the ability to use a command-line setting "-no-shmem" which
> only changed the environment to contain "GMPI_SHMEM=0".
> 
> This was removed since it is just as easy to do something like:
> 
>     export GMPI_SHMEM=0
>     mpiexec a.out
> 
> in your batch script and have the same effect.  There are plenty of
> other GMPI_ variables that can be set this way too.
> 
> Back before Aug 2002, it was necessary for mpiexec to think about the
> path to the mpich/gm shared memory file, but that too is currently
> handled only by the mpich library.  Mpiexec does not choose a location
> for the shared memory file or get involved in the process at all.  In
> fact, I don't think that mpiexec ever messes with /tmp unless you told
> it your executable is there.
> 
> I can't guess at what would cause mpiexec itself to SEGV, then, since
> all it talks to is PBS through the TM interface.  It is not linked with
> any MPICH or GM code.  If you can run mpiexec under gdb and get it to
> segv, I'd definitely like to see what caused it to die.
> 
> 		-- Pete
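
As a rough illustration of Pete's point that the old "-no-shmem" option
only put "GMPI_SHMEM=0" into the environment, the sketch below shows how
such a setting reaches a child process. It uses a small "sh -c" command
as a stand-in for the MPI program; that stand-in and the name "child"
are assumptions made for illustration and are not part of mpiexec or
MPICH/GM.

```shell
# Start from a known state so the output does not depend on the caller.
unset GMPI_SHMEM

# Hypothetical stand-in for a launched program: it just reports the
# GMPI_SHMEM value it inherited from its parent.
child='echo "GMPI_SHMEM=${GMPI_SHMEM:-unset}"'

# Per-command form: the setting applies to this one invocation only.
GMPI_SHMEM=0 sh -c "$child"      # prints GMPI_SHMEM=0

# The parent shell itself was not changed by the line above.
sh -c "$child"                   # prints GMPI_SHMEM=unset

# Exported form, as in the batch-script example: every later command
# started from this shell inherits the setting.
export GMPI_SHMEM=0
sh -c "$child"                   # prints GMPI_SHMEM=0
```

In a real batch script the stand-in would be replaced by the
"mpiexec a.out" line from the quoted message.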


