Comm_spawn is now fully functional. I'll send out a separate message about some of the problems encountered and the resulting action items.
This commit was SVN r3770.
w/o threads runs correctly at this stage. Fixed errors caused by using
addresses valid in the remote memory scope rather than the local memory scope.
This commit was SVN r3767.
* Remove ignore on the bjs discovery / mapping code. It seems to work
correctly.
* Fix up svn:ignore properties in the dynamic-mca directory
This commit was SVN r3761.
node, since the default hostfile shipped with Open MPI contains only
"localhost". This is performed after all the other checks, so it should
only get activated as a last-ditch effort.
This commit was SVN r3758.
as loop ending condition
* Add resolver code for a couple of different node formats for BProc. Given
hostfiles can now be a combination of notations:
0
n1
2
master
self
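
A minimal sketch of how such a mixed list could be resolved, assuming that bare digits and "n"-prefixed digits both name a numeric node, and that "master" and "self" map to special values. The names resolve_node, resolve_hostfile, NODE_MASTER and NODE_SELF are hypothetical, as are the sentinel values; the actual BProc constants are not given in this log.

```python
# Hypothetical sentinels; the real BProc node constants are not stated in the log.
NODE_MASTER = "MASTER"
NODE_SELF = "SELF"


def resolve_node(entry):
    """Map one hostfile entry to a node number or a sentinel."""
    token = entry.strip()
    if token == "master":
        return NODE_MASTER
    if token == "self":
        return NODE_SELF
    # "n1" style: strip the prefix and keep the number.
    if token.startswith("n") and token[1:].isdecimal():
        return int(token[1:])
    # Bare number style: "0", "2".
    if token.isdecimal():
        return int(token)
    raise ValueError("unrecognized node notation: %r" % entry)


def resolve_hostfile(lines):
    """Resolve every non-blank line of a hostfile."""
    return [resolve_node(line) for line in lines if line.strip()]
```

With the example list above, resolve_hostfile(["0", "n1", "2", "master", "self"]) yields three node numbers followed by the two sentinels.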
This commit was SVN r3757.
PLEASE NOTE: there are some diagnostic messages in oob_xcast that will print out. Please don't have a cow about them - they won't hurt or injure anyone, and they're only there for a little while to help Tim and me debug a problem. Just didn't want to create yet another MCA parameter to debug 10 lines of code. :-)
This commit was SVN r3756.
for hints, and an ompi_rb_tree_t for all other search operations;
registration occurs only when a different memory range or different
access privileges are needed; deregistration is lazy, and only done when
registration fails with VAPI_EAGAIN (no resources...)
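
The caching policy described above can be sketched roughly as follows. This is only an illustration: a plain list stands in for the ompi_rb_tree_t, FakeHca and ResourceExhausted are hypothetical stand-ins for the adapter and for a VAPI_EAGAIN failure, and oldest-first eviction is an assumption (the log does not say which registration gets dropped). A real cache would also have to avoid evicting regions that are still in use.

```python
class ResourceExhausted(Exception):
    """Stand-in for a registration call failing with VAPI_EAGAIN."""


class FakeHca:
    """Hypothetical adapter that can hold only `capacity` registrations."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.live = set()
        self.next_handle = 0

    def register(self, start, length, access):
        if len(self.live) >= self.capacity:
            raise ResourceExhausted()
        handle = self.next_handle
        self.next_handle += 1
        self.live.add(handle)
        return handle

    def deregister(self, handle):
        self.live.remove(handle)


class RegistrationCache:
    def __init__(self, hca):
        self.hca = hca
        self.entries = []  # (start, end, access, handle), oldest first

    def lookup(self, start, length, access):
        """Return a cached handle covering the range with enough privileges."""
        end, access = start + length, frozenset(access)
        for s, e, acc, handle in self.entries:
            if s <= start and end <= e and access <= acc:
                return handle
        return None

    def acquire(self, start, length, access):
        handle = self.lookup(start, length, access)
        if handle is not None:
            return handle  # same range and privileges: no new registration
        while True:
            try:
                handle = self.hca.register(start, length, access)
                break
            except ResourceExhausted:
                if not self.entries:
                    raise
                # Lazy deregistration: only free something once we run out.
                victim = self.entries.pop(0)[3]
                self.hca.deregister(victim)
        self.entries.append((start, start + length, frozenset(access), handle))
        return handle
```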
This commit was SVN r3755.
Most of this checkin consists of more debugging stuff. Hopefully, you won't see any printfs that aren't protected by debug flags - if you do, let me know and I'll take them out with my apologies.
Outside of debugging, the biggest change was a revamp of the shutdown process. For several reasons, we had chosen to have all processes "wait" for a shutdown message before exiting. This message is typically generated by mpirun, but in the case of comm_spawn we needed to do something else. We have decided that the best way to solve this problem is to:
(a) replace the shutdown message (which needed to be generated by somebody - usually mpirun) with an oob_barrier call. This still requires that the rank 0 process be alive. However, we terminate all processes if one abnormally terminates anyway, so this isn't a problem (with the standard or our implementation); and
(b) have the state-of-health monitoring subsystem issue the call to clean up the job from the registry. Since the state-of-health subsystem isn't available yet, we have temporarily assigned that responsibility to the rank 0 process. Once the state-of-health subsystem is available, we will have it monitor the job for all-processes-complete and then it can tell the registry to clean up the job (i.e., remove all data relating to this job).
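
As a rough illustration of (a) and (b), here is a toy model in which threads stand in for processes and Python's threading.Barrier stands in for oob_barrier. The registry dictionary, the job name "job-1" and the run_job function are all hypothetical; this is not the actual runtime code, only the shape of the shutdown sequence.

```python
import threading


def run_job(num_procs):
    """Simulate a job whose processes synchronize on a barrier before exiting."""
    registry = {"job-1": {"procs": list(range(num_procs))}}  # hypothetical job data
    barrier = threading.Barrier(num_procs)  # stands in for oob_barrier
    exited = []
    lock = threading.Lock()

    def proc(rank):
        # ... application work would happen here ...
        barrier.wait()  # replaces waiting for a shutdown message
        if rank == 0:
            # Temporary duty of rank 0 until state-of-health monitoring exists.
            registry.pop("job-1", None)
        with lock:
            exited.append(rank)

    threads = [threading.Thread(target=proc, args=(r,)) for r in range(num_procs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return registry, sorted(exited)
```

No process leaves until every process has reached the barrier, and the job's registry data is gone once rank 0 has passed it.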
Hope that helps a little. I'll put all this into the design docs soon.
This commit was SVN r3754.
.ompi_ignore file.
If you have an empty .ompi_unignore file (presumably alongside an
.ompi_ignore file), we'll build the component.
If you have a non-empty .ompi_unignore file (presumably alongside an
.ompi_ignore file), and your username can be found in that file, we'll
build the component. If your username is *not* in the .ompi_unignore
file, we *won't* build the component.
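
The rules above could be expressed roughly as below. This is a sketch, not the actual build script: should_build and should_build_dir are hypothetical names, the two cases without an .ompi_unignore file are assumptions (no .ompi_ignore means build, .ompi_ignore alone means skip), and matching usernames as whole whitespace-separated words is also an assumption.

```python
from pathlib import Path


def should_build(has_ignore, unignore_text, username):
    """Decide whether to build; unignore_text is None if .ompi_unignore is absent."""
    if not has_ignore:
        return True   # assumption: nothing asks for the component to be skipped
    if unignore_text is None:
        return False  # assumption: .ompi_ignore alone means skip
    if unignore_text == "":
        return True   # empty .ompi_unignore: build for everyone
    # Non-empty .ompi_unignore: build only for the users listed in it.
    return username in unignore_text.split()


def should_build_dir(component_dir, username):
    """Apply the same decision to a component directory on disk."""
    d = Path(component_dir)
    unignore = d / ".ompi_unignore"
    text = unignore.read_text() if unignore.is_file() else None
    return should_build((d / ".ompi_ignore").exists(), text, username)
```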
This commit was SVN r3747.
MPI_Offset
- Make the ROMIO IO component use MPI_Offset for the back-end type for
ADIO_Offset
- Removed some extra verbiage from configure warnings
- Add some logic to configure to deduce an MPI datatype that
corresponds to MPI_Offset (because ROMIO needs it). This is a bit
of an abuse (i.e., ROMIO's configure should figure this out), but
it's not too gratuitous because a) the ROMIO component is included
in Open MPI, and b) other io components to be defined in the future
could also use this information
- Rename MCA: MPI Component Architecture -> Modular Component
Architecture
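
For the MPI_Offset item above, one plausible way to deduce a matching datatype is to compare sizes against a few integer types. The sketch below assumes exactly that; the candidate list, its order, and the function name datatype_for_size are hypothetical, and the real configure logic may work differently.

```python
import ctypes

# Assumed candidates, tried in order; not taken from the configure script.
CANDIDATES = [
    ("MPI_INT", ctypes.c_int),
    ("MPI_LONG", ctypes.c_long),
    ("MPI_LONG_LONG", ctypes.c_longlong),
]


def datatype_for_size(nbytes, candidates=CANDIDATES):
    """Return the first MPI datatype name whose C type is nbytes wide, or None."""
    for name, ctype in candidates:
        if ctypes.sizeof(ctype) == nbytes:
            return name
    return None
```

Given the size of MPI_Offset on the build machine, the first candidate of the same width is taken as its MPI datatype.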
This commit was SVN r3742.
being used. I added some comments about why it's not being used
(because one would naively think that increasing / decreasing the
refcount would be a Good Thing for the group constructor /
destructor).
This commit was SVN r3739.
- stdio
- death notification messages
- killing
and I think there are some places we leak memory way too much. And a
ton of printfs. But we're making progress.
This commit was SVN r3721.