opal_shmem_segment_create by testing whether or not the target mount has enough
space to accommodate the shared-memory backing store. Fixes trac:2827. Will work
with Shiqing to add Windows support (if required).
This commit was SVN r27433.
The following Trac tickets were found above:
Ticket 2827 --> https://svn.open-mpi.org/trac/ompi/ticket/2827
* Use the hwloc logical index, not the os_index. Fixes problems with
opal_hwloc_base_cset2str() output (e.g., --report-bindings output)
on machines where the os_index is not tightly packed in the range
![0, n-1]
This commit was SVN r27394.
ompi/mca/sbgp/basesmsocket
orte/mca/rmaps/lama
Remove stale configure.params files from the sbgp framework as the OMPI build system no longer looks at those files.
This commit was SVN r27377.
Cannot start the data clearing at the root object level as the root object has a different struct attached to userdata.
This commit was SVN r27357.
The following Trac tickets were found above:
Ticket 3322 --> https://svn.open-mpi.org/trac/ompi/ticket/3322
This now results in the procs being bound within their assigned location. It also causes us to use only the 0th HT on a core unless --use-hwthread-cpus has been specified (in which case, we use all the HTs in a core). Bind to core binds you to all HTs regardless - the --use-hwthread-cpus only impacts the oversubscribed determination and when binding to HT.
cmr:v1.7
This commit was SVN r27342.
We ran into a case where the OMPI SVN trunk grew a new acceptable MCA
parameter value, but this new value was not accepted on the v1.6
branch (hwloc_base_mem_bind_failure_action -- on the trunk it accepts
the value "silent", but on the older v1.6 branch, it doesn't). If you
set "hwloc_base_mem_bind_failure_action=silent" in the default MCA
params file and then accidentally ran with the v1.6 branch, every OMPI
executable (including ompi_info) just failed because hwloc_base_open()
would say "hey, 'silent' is not a valid value for
hwloc_base_mem_bind_failure_action!". Kaboom.
The only problem is that it didn't give you any indication of where
this value was being set. Quite maddening, from a user perspective.
So we changed the ompi_info handles this case. If any framework open
function return OMPI_ERR_BAD_PARAM (either because its base MCA params
got a bad value or because one of its component register/open
functions return OMPI_ERR_BAD_PARAM), ompi_info will stop, print out
a warning that it received and error, and then dump out the parameters
that it has received so far in the framework that had a problem.
At a minimum, this will show the user the MCA param that had an error
(it's usually the last one), and ''where it was set from'' (so that
they can go fix it).
We updated ompi_info to check for O???_ERR_BAD_PARAM from each from
the framework opens. Also updated the doxygen docs in mca.h for this
O???_BAD_PARAM behavior. And we noticed that mca.h had MCA_SUCCESS
and MCA_ERR_??? codes. Why? I think we used them in exactly one
place in the code base (mca_base_components_open.c). So we deleted
those and just used the normal OPAL_* codes instead.
While we were doing this, we also cleaned up a little memory
management during ompi_info/orte-info/opal-info finalization.
Valgrind still reports a truckload of memory still in use at ompi_info
termination, but they mostly look to be components not freeing
memory/resources properly (and outside the scope of this fix).
This commit was SVN r27306.
The following Trac tickets were found above:
Ticket 3275 --> https://svn.open-mpi.org/trac/ompi/ticket/3275
following:
* Provides a fixed number of resource slots (i.e., "hotel rooms").
* Allows one thing to occupy a resource slot at a time (i.e., each
hotel room can have an occupant check in to that room).
* Resource slots can be vacated at any time (i.e., occupants can
voluntarily check out of their hotel room).
* Resource slots can be occupied for a specific maximum amount of
time. If that time expires, the occupant is forcibly evicted and
the upper layer is notified via (libevent) callback (i.e., the maid
will kick an occupant of out of their room when their reservation
is over).
This class can be to be used for things like retransmission schemes
for unreliable transports. For example, a message sent on an
unreliable transport can be checked in to a hotel room. If an ACK for
that message is received, the message can be checked out. But if the
ACK is never received, the message will eventually be evicted from its
room and the upper layer will be notified that the message failed to
check out in time (i.e., that an ACK for that message was not received
in time).
Code using this class is currently being developed off-trunk, but will
be coming to SVN soon.
This commit was SVN r27067.
"num_app_ctx" - the number of app_contexts in the job
"first_rank" - the MPI rank of the first process in each app_context
"np" - the number of procs in each app_context
Still need clarification on the MPI_Init portion of the ticket. Specifically, does the ticket call for returning an error is someone calls MPI_Init more than once in a program? We set a flag to tell us that we have been initialized, but currently never check it.
This commit was SVN r27005.
distinguish between unknown ''options'' (i.e., command line options
that are registered and have some meaning) and unknown ''tokens''
(i.e., strings that do not begin with "-"). Hence, if you did:
mpirun --fo my_mpi_program
(when perhaps you meant to type "--foo", mpirun would complain that no
such executable "--fo" existed. That is ''correct,'' but perhaps not
completely useful. It is more accurate for mpirun to report that
there is no such "--fo" option.
This change to cmd_line.c makes it so that we will ''always'' report
errors regarding tokens that begin with "-".
This commit was SVN r26953.
* NULL's out the hwloc_obj_t->userdata in
hwloc_base_util.c:free_object() and
hwloc_base_util.c:opal_hwloc_base_free_topology() after it has been
OBJ_RELEASE'd.
* Adds a userdata field to opal_hwloc_topo_data_t. This field will
be used in an upcoming rmaps component ("lama") to cache some
associated data during hardware tree traversals.
This commit was SVN r26938.
particularly with respect to threading flags.
Before this change, the following scenario would fail (e.g., on Linux
with pthreads):
{{{
$ ./configure --disable-shared --enable-static ...
$ make clean install
$ cd examples
$ make clean all
}}}
Linking the Fortran examples would fail with missing pthread symbols.
This commit was SVN r26927.