core dump with gm-shared memory
Glen Beane
beaneg at umcs.maine.edu
Tue Jul 1 15:52:49 EDT 2003
I no longer have this problem.
I simply recompiled with gcc.
Previously I had compiled with pgcc and there were a couple warnings.
When I used gcc the warnings went away, and so did the seg fault
problem...
glen
On Tue, 2003-07-01 at 15:47, Pete Wyckoff wrote:
> beaneg at umcs.maine.edu said on Thu, 26 Jun 2003 10:06 -0400:
> > If I build mpiexec to use gm-shmem on SMP nodes, mpiexec causes a
> > segmentation fault, but it is always after my MPI program has finished
> > properly, so it seems to be when mpiexec is cleaning up.
> >
> > If I build mpiexec without gm-shmem there are no problems.
> >
> > gm-shmem has been changed slightly on my system. After discussing some
> > problems with Myricom we decided to change the default location of the
> > shared memory file on our system (done by editing gmpi_smppriv.c and
> > mpirun.ch_gm.pl). Since /tmp was NFS mounted, we were having problems
> > with a large number of nodes writing shared memory files to /tmp. The
> > shared memory file is now located in ramdisk (the location of the shared
> > memory file will likely be a configurable option in the next MPICH/GM
> > release).
> >
> > This setup works fine with mpirun.ch_gm, but has been causing
> > segmentation faults with mpiexec which don't seem to affect the actual
> > MPI program.
> >
> > Since mpirun.ch_gm.pl references the temp file, I was wondering if
> > mpiexec did anywhere, but looking quickly through the source code I
> > didn't find any reference to it.
> >
> >
> > Does anyone know what might be causing the problem? Other than the
> > inability to use gm-shmem, we really like mpiexec so far.
>
> I'm a bit confused by this. Release 0.72 of mpiexec and earlier did
> have a configure option "--disable-gm-shmem" which could be used to
> control the ability to use a command-line setting "-no-shmem" which
> only changed the environment to contain "GMPI_SHMEM=0".
>
> This was removed since it is just as easy to do something like:
>
> export GMPI_SHMEM=0
> mpiexec a.out
>
> in your batch script and have the same effect. There are plenty of
> other GMPI_ variables that can be set this way too.
>
> Back before Aug 2002, it was necessary for mpiexec to think about the
> path to the mpich/gm shared memory file, but that too is currently
> handled only by the mpich library. Mpiexec does not choose a location
> for the shared memory file or get involved in the process at all. In
> fact, I don't think that mpiexec ever messes with /tmp unless you told
> it your executable is there.
>
> I can't guess at what would cause mpiexec itself to SEGV, then, since
> all it talks to is PBS through the TM interface. It is not linked with
> any MPICH or GM code. If you can run mpiexec under gdb and get it to
> segv, I'd definitely like to see what caused it to die.
>
> -- Pete
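
For reference, here is a minimal sketch of why exporting the variable in the batch script is enough, as Pete describes above. The child shell is only a stand-in for "mpiexec a.out" (an assumption for illustration, so the snippet runs without PBS or MPICH/GM installed); it shows that an exported variable is visible to whatever the script launches next.

```shell
# Export the variable so that child processes inherit it,
# just as a batch script would do before calling mpiexec.
export GMPI_SHMEM=0

# Stand-in for "mpiexec a.out": a child shell that reports the
# value it inherited from the parent environment.
child_view=$(sh -c 'echo "GMPI_SHMEM=${GMPI_SHMEM}"')
echo "$child_view"
```

This only demonstrates inheritance by a local child process. That the setting then reaches the MPI processes themselves is taken from Pete's statement above, not something this sketch verifies.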