This commit fixes edge cases of `r = 38` and `r = 308`.
As defined in the MPI standard, `TYPE_CREATE_F90_REAL` and
`TYPE_CREATE_F90_COMPLEX` must be consistent with the Fortran
`SELECTED_REAL_KIND` function. The `SELECTED_REAL_KIND` function is
defined based on the `RANGE` function. The `RANGE` function returns
`INT(MIN(LOG10(HUGE(X)), -LOG10(TINY(X))))` for a real value `X`.
The old code considers only `INT(LOG10(HUGE(X)))` using `*_MAX_10_EXP`.
This commit adds `INT(-LOG10(TINY(X)))` part using `*_MIN_10_EXP`.
This bug affected the following `p`-`r` combinations.
| p | r | expected | returned | expected | returned |
| :------------ | --: | :-------- | :-------- | :------- | :-------- |
| MPI_UNDEFINED | 38 | REAL8 | REAL4 | COMPLEX16 | COMPLEX8 |
| 0 <= p <= 6 | 38 | REAL8 | REAL4 | COMPLEX16 | COMPLEX8 |
| MPI_UNDEFINED | 308 | REAL16 | REAL8 | COMPLEX32 | COMPLEX16 |
| 0 <= p <= 15 | 308 | REAL16 | REAL8 | COMPLEX32 | COMPLEX16 |
MPICH returns the same result as Open MPI with this fix.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
(cherry picked from commit 6fb01f64fe2bcdb4668e520eb458ffd3477e5e6f)
check for providing a data representation that is actually supported
by ompio.
Add also one check for a non-NULL pointer in mpi/c/file_set_view
for the data representation.
Also fixes parts of issue #5643
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
so latest ROM-IO can be used with Open MPI.
Note this first and naive implementation does not use the wait_fn callback.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
This code is the implementation of Software-base Performance Counters as described in the paper 'Using Software-Base Performance Counters to Expose Low-Level Open MPI Performance Information' in EuroMPI/USA '17 (http://icl.cs.utk.edu/news_pub/submissions/software-performance-counters.pdf). More practical usage information can be found here: https://github.com/davideberius/ompi/wiki/How-to-Use-Software-Based-Performance-Counters-(SPCs)-in-Open-MPI.
All software events functions are put in macros that become no-ops when SOFTWARE_EVENTS_ENABLE is not defined. The internal timer units have been changed to cycles to avoid division operations which was a large source of overhead as discussed in the paper. Added a --with-spc configure option to enable SPCs in the Open MPI build. This defines SOFTWARE_EVENTS_ENABLE. Added an MCA parameter, mpi_spc_enable, for turning on specific counters. Added an MCA parameter, mpi_spc_dump_enabled, for turning on and off dumping SPC counters in MPI_Finalize. Added an SPC test and example.
Signed-off-by: David Eberius <deberius@vols.utk.edu>
the interface if file_get_type_extent did not check
whether the input datatype is valid or not.
Makes the e_get_type_extend_2 test from the ibm testsuite pass.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
There was a race condition in 35438ae9b5: if multiple threads invoked
ompi_mpi_init() simultaneously (which could happen from both MPI and
OSHMEM), the code did not catch this condition -- Bad Things would
happen.
Now use an atomic cmp/set to ensure that only one thread is able to
advance ompi_mpi_init from NOT_INITIALIZED to INIT_STARTED.
Additionally, change the prototype of ompi_mpi_init() so that
oshmem_init() can safely invoke ompi_mpi_init() multiple times (as
long as MPI_FINALIZE has not started) without displaying an error. If
multiple threads invoke oshmem_init() simultaneously, one of them will
actually do the initialization, and the rest will loop waiting for it
to complete.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
This checkin mainly concerns our internal info keys that are registering
for callbacks via opal_infosubscribe_subscribe(). Those keys need to have
an extra __IN_<key>/val stored to preserve their pre-callback value. So
that means our internal keys are limited to 5 chars shorter than the usual
key length limit.
The code previously would have been silently inactive if a large key happened
to come in, now it warns and also uses snprintf() to avoid compiler warnings.
I'm also making the top-level MPI_Info_set warn if the user uses our reserved
"__IN_" prefix. I had wanted the feature to be more invisible than that, but
it would require a more sophisticated approach to change that.
Signed-off-by: Mark Allen <markalle@us.ibm.com>
Per MPI-3.1:8.7.1 p361:11-13, it's valid for MPI_FINALIZED to be
invoked during an attribute destruction callback (e.g., during the
destruction of keyvals on MPI_COMM_SELF during the very beginning of
MPI_FINALIZE). In such cases, MPI_FINALIZED must return "false".
Prior to this commit, we hung in FINALIZED if it were invoked during
a COMM_SELF attribute destruction callback in FINALIZE. See
https://github.com/open-mpi/ompi/issues/5084.
This commit converts the MPI_INITIALIZED / MPI_FINALIZED
infrastructure to use a single enum (ompi_mpi_state, set atomically)
to represent the state of MPI:
- not initialized
- init started
- init completed
- finalize started
- finalize past COMM_SELF destruction
- finalize completed
The "finalize past COMM_SELF destruction" state is what allows us to
return "false" from MPI_FINALIZED before COMM_SELF has been fully
destroyed / all attribute callbacks have been invoked.
Since this state is checked at nearly every MPI API call (to see if
we're outside of the INIT/FINALIZE epoch), care was taken to use
atomics to *set* the ompi_mpi_state value in ompi_mpi_init() and
ompi_mpi_finalize(), but performance-critical code paths can simply
read the variable without needing to use a slow call to an
opal_atomic_*() function.
Thanks to @AndrewGaspar for reporting the issue.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
This commit adds a new configure option: --enable-mpi1-compat. Without
this option we will no longer provide APIs, typedefs, and defines that
were removed from the standard in MPI-3.0. This option will exist for
one major release (Open MPI v4.x.x) and then the option and associated
code will be removed in Open MPI v5.x.x. Open MPI has already
internally prepared for this change. Please prepare your codes
accordingly.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Allowing MPI_PROC_NULL as a neighbor in any topology allows us to add
gaps on the send and recv buffers. This does make the traditional
neighbor collective have a similar behavior as the V version, but in
same time it allows the users to skip the step where they prepare the
counts and the displacement array.
For more info please take a look at issue #4675.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
As discussed on https://github.com/mpi-forum/mpi-issues/issues/77#issuecomment-369663119
the conversion to double in the MPI_Wtime decrease the range
and accuracy of the resulting timer. By setting the timer to
0 at the first usage we basically maintain the accuracy for
194 days even for gettimeofday.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Per MPI 3.1 chapter 13.3 :
"Derived etypes can be constructed by using any of the MPI
datatype constructor routines, provided all resulting typemap
displacements are non-negative and monotonically nondecreasing."
Same restriction applies to ftypes.
add the OMPI_DATATYPE_CHECK_FOR_VIEW() macro that is
check the underlying opal_datatype_t is monotonic, on top
of all checks performed in OMPI_DATATYPE_CHECK_FOR_RECV().
Since checking monotoniciy is expensive, check is only performed
when needed, but the result is cached by ompi_datatype_is_monotonic().
Thanks Wei-keng Liao for the valuable feedback.
Thanks George for the guidance.
Refs. open-mpi/ompi#4682
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
It should have always been #define'd in order to correctly handle the
multi-threaded case.
Also fix indentation in ompi/mpi/c/comm_get_errhandler.c
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
The current versions of these functions have a fatal flaw. If a
errhandler set and free call is made by another thread while the
thread calling get is between the cmpset and retain then we will
retain an invalid object. Fixing this by just using locking. This is
not a critical path so this should be ok.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit renames the atomic compare-and-swap functions to indicate
the return value. This is in preperation for adding support for a
compare-and-swap that returns the old value. At the same time the
return type has been changed to bool.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
MPI_IN_PLACE is not a valid send buffer for neighborhood collectives, so do not
invoke memchecker in this case.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Per MPI-3.1, ensure to raise an MPI exception with value
MPI_ERR_INFO_NOKEY if we try to MPI_INFO_DELETE a key that does not
exist. Thanks to @dalcinl (Lisando Dalcin) for raising the issue.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
The `ompi_comm_set` function never sets `NULL` to its first argument
`ncomm`. So `NULL` check is unnecessary in its callers. Furthermore,
`NULL` check may obscure a real return code when an error occurs
if the variable is initialized to a `NULL` value.
Also, `NULL` check is added in the `ompi_comm_set` function to
avoid segmentation fault in an out-of-memory condition.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
This commit adds a helper function to get the inbound and outbound
neighbor count and updates the neighbor_allgatherv bindings to use the
correct count when checking the input parameters.
Fixes#2324
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
we now have 12 cases to deal (4 writers and 3 readers) :
1. C `void*` is written into the attribute value, and the value is read into a C `void*` (unity)
2. C `void*` is written, Fortran `INTEGER` is read
3. C `void*` is written, Fortran `INTEGER(KIND=MPI_ADDRESS_KIND)` is read
4. Fortran `INTEGER` is written, C `void*` is read
5. Fortran `INTEGER` is written, Fortran `INTEGER` is read (unity)
6. Fortran `INTEGER` is written, Fortran `INTEGER(KIND=MPI_ADDRESS_KIND)` is read
7. Fortran `INTEGER(KIND=MPI_ADDRESS_KIND)` is written, C `void*` is read
8. Fortran `INTEGER(KIND=MPI_ADDRESS_KIND)` is written, Fortran `INTEGER` is read
9. Fortran `INTEGER(KIND=MPI_ADDRESS_KIND)` is written, Fortran `INTEGER(KIND=MPI_ADDRESS_KIND)` is read (unity)
10. Intrinsic is written, C `void*` is read
11. Intrinsic is written, Fortran `INTEGER` is read
12. Intrinsic is written, Fortran `INTEGER(KIND=MPI_ADDRESS_KIND)` is read
MPI-2 Fortran "integer representation" has type `INTEGER(KIND=MPI_ADDRESS_KIND)` as clarified
at https://github.com/mpiwg-rma/rma-issues/issues/1
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
* Protects us from segv when ROMIO 314 is selected and one of the
following operations is called:
- MPI_File_iread_at_all
- MPI_File_iwrite_at_all
- MPI_File_iread_all
- MPI_File_iwrite_all
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
This commit adds the `req_start` member to the `ompi_request_t` struct.
The `MPI_START` and `MPI_STARTALL` routines call this callback function
instead of `MCA_PML_CALL(start(...))`. So components that return
persistent request must set this member to their request objects.
`mca_pml_base_module_t::pml_start` is not deleted because
`MCA_PML_CALL(start(...))` is still used elsewhere across OMPI.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
Finally Merging this in. MPI_*_get_info/set_info().
Targeting v3.1 release. @hjelmn were you interested in switching some internal pieces to begin using this? Should we target v3.1 (or whatever we call the Oct 15th release?)
The expected sequence of events for processing info during object creation
is that if there's an incoming info arg, it is opal_info_dup()ed into the obj
at obj->s_info first. Then interested components register callbacks for
keys they want to know about using opal_infosubscribe_infosubscribe().
Inside info_subscribe_subscribe() the specified callback() is called with
whatever matching k/v is in the object's info, or with the default. The
return string from the callback goes into the new k/v stored in info, and
the input k/v is saved as __IN_<key>/<val>. It's saved the same way
whether the input came from info or whether it was a default. A null return
from the callback indicates an ignored key/val, and no k/v is stored for
it, but an __IN_<key>/<val> is still kept so we still have access to the
original.
At MPI_*_set_info() time, opal_infosubscribe_change_info() is used. That
function calls the registered callbacks for each item in the provided info.
If the callback returns non-null, the info is updated with that k/v, or if
the callback returns null, that key is deleted from info. An __IN_<key>/<val>
is saved either way, and overwrites any previously saved value.
When MPI_*_get_info() is called, opal_info_dup_mpistandard() is used, which
allows relatively easy changes in interpretation of the standard, by looking
at both the <key>/<val> and __IN_<key>/<val> in info. Right now it does
1. includes system extras, eg k/v defaults not expliclty set by the user
2. omits ignored keys
3. shows input values, not callback modifications, eg not the internal values
Currently the callbacks are doing things like
return some_condition ? "true" : "false"
that is, returning static strings that are not to be freed. If the return
strings start becoming more dynamic in the future I don't see how unallocated
strings could support that, so I'd propose a change for the future that
the callback()s registered with info_subscribe_subscribe() do a strdup on
their return, and we change the callers of callback() to free the strings
it returns (there are only two callers).
Rough outline of the smaller changes spread over the less central files:
comm.c
initialize comm->super.s_info to NULL
copy into comm->super.s_info in comm creation calls that provide info
OBJ_RELEASE comm->super.s_info at free time
comm_init.c
initialize comm->super.s_info to NULL
file.c
copy into file->super.s_info if file creation provides info
OBJ_RELEASE file->super.s_info at free time
win.c
copy into win->super.s_info if win creation provides info
OBJ_RELEASE win->super.s_info at free time
comm_get_info.c
file_get_info.c
win_get_info.c
change_info() if there's no info attached (shouldn't happen if callbacks
are registered)
copy the info for the user
The other category of change is generally addressing compiler warnings where
ompi_info_t and opal_info_t were being used a little too interchangably. An
ompi_info_t* contains an opal_info_t*, at &(ompi_info->super)
Also this commit updates the copyrights.
Signed-off-by: Mark Allen <markalle@us.ibm.com>
ompi_communicator_t, ompi_win_t, ompi_file_t all have a super class of type opal_infosubscriber_t instead of a base/super type of opal_object_t (in previous code comm used c_base, but file used super). It may be a bit bold to say that being a subscriber of MPI_Info is the foundational piece that ties these three things together, but if you object, then I would prefer to turn infosubscriber into a more general name that encompasses other common features rather than create a different super class. The key here is that we want to be able to pass comm, win and file objects as if they were opal_infosubscriber_t, so that one routine can heandle all 3 types of objects being passed to it.
MPI_INFO_NULL is still an ompi_predefined_info_t type since an MPI_Info is part of ompi but the internal details of the underlying information concept is part of opal.
An ompi_info_t type still exists for exposure to the user, but it is simply a wrapper for the opal object.
Routines such as ompi_info_dup, etc have all been moved to opal_info_dup and related to the opal directory.
Fortran to C translation tables are only used for MPI_Info that is exposed to the application and are therefore part of the ompi_info_t and not the opal_info_t
The data structure changes are primarily in the following files:
communicator/communicator.h
ompi/info/info.h
ompi/win/win.h
ompi/file/file.h
The following new files were created:
opal/util/info.h
opal/util/info.c
opal/util/info_subscriber.h
opal/util/info_subscriber.c
This infosubscriber concept is that communicators, files and windows can have subscribers that subscribe to any changes in the info associated with the comm/file/window. When xxx_set_info is called, the new info is presented to each subscriber who can modify the info in any way they want. The new value is presented to the next subscriber and so on until all subscribers have had a chance to modify the value. Therefore, the order of subscribers can make a difference but we hope that there is generally only one subscriber that cares or modifies any given key/value pair. The final info is then stored and returned by a call to xxx_get_info.
The new model can be seen in the following files:
ompi/mpi/c/comm_get_info.c
ompi/mpi/c/comm_set_info.c
ompi/mpi/c/file_get_info.c
ompi/mpi/c/file_set_info.c
ompi/mpi/c/win_get_info.c
ompi/mpi/c/win_set_info.c
The current subscribers where changed as follows:
mca/io/ompio/io_ompio_file_open.c
mca/io/ompio/io_ompio_module.c
mca/osc/rmda/osc_rdma_component.c (This one actually subscribes to "no_locks")
mca/osc/sm/osc_sm_component.c (This one actually subscribes to "blocking_fence" and "alloc_shared_contig")
Signed-off-by: Mark Allen <markalle@us.ibm.com>
Conflicts:
AUTHORS
ompi/communicator/comm.c
ompi/debuggers/ompi_mpihandles_dll.c
ompi/file/file.c
ompi/file/file.h
ompi/info/info.c
ompi/mca/io/ompio/io_ompio.h
ompi/mca/io/ompio/io_ompio_file_open.c
ompi/mca/io/ompio/io_ompio_file_set_view.c
ompi/mca/osc/pt2pt/osc_pt2pt.h
ompi/mca/sharedfp/addproc/sharedfp_addproc.h
ompi/mca/sharedfp/addproc/sharedfp_addproc_file_open.c
ompi/mca/topo/treematch/topo_treematch_dist_graph_create.c
ompi/mpi/c/lookup_name.c
ompi/mpi/c/publish_name.c
ompi/mpi/c/unpublish_name.c
opal/mca/mpool/base/mpool_base_alloc.c
opal/util/Makefile.am