1
1
Граф коммитов

9586 Коммитов

Автор SHA1 Сообщение Дата
Piotr Lesnicki
99453e6b10 mtl/portals4: get retransmission REPLY code
Signed-off-by: Todd Kordenbrock <thkgcode@gmail.com>
2017-07-09 22:12:25 -05:00
Piotr Lesnicki
06b15cebbf mtl/portals4: add timeout to get retransmit
Signed-off-by: Todd Kordenbrock <thkgcode@gmail.com>
2017-07-09 22:12:08 -05:00
Chris Ward
5de3d5dde6 Fix MPI_SIZEOF for gfortran 4.8
Add copyrights.

Revise the README to take out the 'most notably' statement about GNU Fortran 4.8

Signed-off-by: Chris Ward <tjcw@uk.ibm.com>
2017-07-07 13:47:35 +01:00
Gilles Gouaillardet
fc11c37223 Merge pull request #3646 from ggouaillardet/spacc-fix-coverity-warnings
coll/spacc: misc fixes
2017-07-06 11:39:14 +09:00
Mikhail Kurnosov
44acc92104 Fix buffer overflow
Add check for bounds of sindex[] and rindex[].

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2017-07-06 10:49:08 +09:00
Gilles Gouaillardet
5fceca235b coll/spacc: silence more coverity warnings in mca_coll_spacc_allreduce_intra_redscat_allgather()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-07-06 10:49:08 +09:00
Mikhail Kurnosov
2f0f476642 Silence spacc coverity warnings
1. Add assert for opal_hibit return value: comm_size is always > 1.
2. Modified verbose output (dead-code warning).

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2017-07-06 10:49:08 +09:00
Ralph Castain
31130a4bee Replace syntax with something less strictly C99
Fixes #3809

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-05 16:54:36 -07:00
Gilles Gouaillardet
d1c5955b73 coll/base: optimize handling of zero-byte datatypes in mca_coll_base_alltoallv_intra_basic_inplace()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-06-30 09:47:08 +09:00
Ralph Castain
bd4a6fee22 Attempt to detect when we are direct-launched without the necessary PMI support, and thus are incorrectly identified as being "singleton". Advise the user on the required PMI(x) support and error out.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-29 15:26:53 -07:00
Gilles Gouaillardet
7e5e5fe887 Merge pull request #3719 from ggouaillardet/topic/libnbc_revamp
coll/libnbc: revisit NBC_Handle usage
2017-06-29 11:13:58 +09:00
Nathan Hjelm
022c658bbf osc/rdma: rework locking code to improve behavior of unlock
This commit changes the locking code to allow the lock release to be
non-blocking. This helps with releasing the accumulate lock which may
occur in a BTL callback.

Fixes #3616

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-06-27 15:29:51 -06:00
George Bosilca
f8ffec926e
Protect the monitoring infrastructure initialization. 2017-06-27 18:35:24 +02:00
Clément FOYER
c885ee3f3c Fix Coverity warning CID 1413323 (#3764)
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>
2017-06-27 12:39:31 +02:00
bosilca
d55b666834 Topic/monitoring (#3109)
Add a monitoring PML, OSC and IO. They track all data exchanges between processes,
with capability to include or exclude collective traffic. The monitoring infrastructure is
driven using MPI_T, and can be tuned of and on any time o any communicators/files/windows.
Documentations and examples have been added, as well as a shared library that can be
used with LD_PRELOAD and that allows the monitoring of any application.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>


* add ability to querry pml monitorinting results with MPI Tools interface
using performance variables "pml_monitoring_messages_count" and
"pml_monitoring_messages_size"

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>

* Fix a convertion problem and add a comment about the lack of component
retain in the new component infrastructure.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>

* Allow the pvar to be written by invoking the associated callback.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>

* Various fixes for the monitoring.
Allocate all counting arrays in a single allocation
Don't delay the initialization (do it at the first add_proc as we
know the number of processes in MPI_COMM_WORLD)

Add a choice: with or without MPI_T (default).

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>

* Cleanup for the monitoring module.
Fixed few bugs, and reshape the operations to prepare for
global or communicator-based monitoring. Start integrating
support for MPI_T as well as MCA monitoring.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>

* Adding documentation about how to use pml_monitoring component.

Document present the use with and without MPI_T.
May not reflect exactly how it works right now, but should reflects
how it should work in the end.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Change rank into MPI_COMM_WORLD and size(MPI_COMM_WORLD) to global variables in pml_monitoring.c.
Change mca_pml_monitoring_flush() signature so we don't need the size and rank parameters.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>

* Improve monitoring support (including integration with MPI_T)

Use mca_pml_monitoring_enable to check status state. Set mca_pml_monitoring_current_filename iif parameter is set
Allow 3 modes for pml_monitoring_enable_output: - 1 : stdout; - 2 : stderr; - 3 : filename
Fix test : 1 for differenciated messages, >1 for not differenciated. Fix output.
Add documentation for pml_monitoring_enable_output parameter. Remove useless parameter in example
Set filename only if using mpi tools
Adding missing parameters for fprintf in monitoring_flush (for output in std's cases)
Fix expected output/results for example header
Fix exemple when using MPI_Tools : a null-pointer can't be passed directly. It needs to be a pointer to a null-pointer
Base whether to output or not on message count, in order to print something if only empty messages are exchanged
Add a new example on how to access performance variables from within the code
Allocate arrays regarding value returned by binding

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add overhead benchmark, with script to use data and create graphs out of the results
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix segfault error at end when not loading pml
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Start create common monitoring module. Factorise version numbering
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix microbenchmarks script
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Improve readability of code

NULL can't be passed as a PVAR parameter value. It must be a pointer to NULL or an empty string.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add osc monitoring component

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add error checking if running out of memory in osc_monitoring

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Resolve brutal segfault when double freeing filename
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Moving to ompi/mca/common the proper parts of the monitoring system
Using common functions instead of pml specific one. Removing pml ones.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add calls to record monitored data from osc. Use common function to translate ranks.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix test_overhead benchmark script distribution

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix linking library with mca/common

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add passive operations in monitoring_test

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix from rank calculation. Add more detailed error messages

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix alignments. Fix common_monitoring_get_world_rank function. Remove useless trailing new lines

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix osc_monitoring mget_message_count function call

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Change common_monitoring function names to respect the naming convention. Move to common_finalize the common parts of finalization. Add some comments.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add monitoring common output system

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add error message when trying to flush to a file, and open fails. Remove erroneous info message when flushing wereas the monitoring is already disabled.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Consistent output file name (with and without MPI_T).

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Always output to a file when flushing at pvar_stop(flush).

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Update the monitoring documentation.
Complete informations from HowTo. Fix a few mistake and typos.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Use the world_rank for printf's.
Fix name generation for output files when using MPI_T. Minor changes in benchmarks starting script

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Clean potential previous runs, but keep the results at the end in order to potentially reprocess the data. Add comments.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add security check for unique initialization for osc monitoring

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Clean the amout of symbols available outside mca/common/monitoring

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Remove use of __sync_* built-ins. Use opal_atomic_* instead.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Allocate the hashtable on common/monitoring component initialization. Define symbols to set the values for error/warning/info verbose output. Use opal_atomic instead of built-in function in osc/monitoring template initialization.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Deleting now useless file : moved to common/monitoring

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add histogram ditribution of message sizes

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add histogram array of 2-based log of message sizes. Use simple call to reset/allocate arrays in common_monitoring.c

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add informations in dumping file. Separate per category (pt2pt/osc/coll (to come)) monitored data

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add coll component for collectives communications monitoring

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix warning messages : use c_name as the magic id is not always defined. Moreover, there was a % missing. Add call to release underlying modules. Add debug info messages. Add warning which may lead to further analysis.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix log10_2 constant initialization. Fix index calculation for histogram array.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add debug info messages to follow more easily initialization steps.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Group all the var/pvar definitions to common_monitoring. Separate initial filename from the current on, to ease its lifetime management. Add verifications to ensure common is initialized once only. Move state variable management to common_monitoring.
monitoring_filter only indicates if filtering is activated.
Fix out of range access in histogram.
List is not used with the struct mca_monitoring_coll_data_t, so heritate only from opal_object_t.
Remove useless dead code.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix invalid memory allocation. Initialize initial_filename to empty string to avoid invalid read in mca_base_var_register.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Don't install the test scripts.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix missing procs in hashtable. Cache coll monitoring data.
    * Add MCA_PML_BASE_FLAG_REQUIRE_WORLD flag to the PML layer.
    * Cache monitoring data relative to collectives operations on creation.
    * Remove double caching.
    * Use same proc name definition for hash table when inserting and
      when retrieving.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Use intermediate variable to avoid invalid write while retrieving ranks in hashtable.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add missing release of the last element in flush_all. Add release of the hashtable in finalize.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Use a linked list instead of a hashtable to keep tracks of communicator data. Add release of the structure at finalize time.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Set world_rank from hashtable only if found

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Use predefined symbol from opal system to print int

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Move collective monitoring data to a hashtable. Add pvar to access the monitoring_coll_data. Move functions header to a private file only to be used in ompi/mca/common/monitoring

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix pvar registration. Use OMPI_ERROR isntead of -1 as returned error value. Fix releasing of coll_data_t objects. Affect value only if data is found in the hashtable.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add automated check (with MPI_Tools) of monitoring.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix procs list caching in common_monitoring_coll_data_t

    * Fix monitoring_coll_data type definition.
    * Use size(COMM_WORLD)-1 to determine max number of digits.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add linking to Fortran applications for LD_PRELOAD usage of monitoring_prof

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add PVAR's handles. Clean up code (visibility, add comments...). Start updating the documentation

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix coll operations monitoring. Update check_monitoring accordingly to the added pvar. Fix monitoring array allocation.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Documentation update.
Update and then move the latex and README documentation to a more logical place

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Aggregate monitoring COLL data to the generated matrix. Update documentation accordingly.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix monitoring_prof (bad variable.vector used, and wrong array in PMPI_Gather).

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add reduce_scatter and reduce_scatter_block monitoring. Reduce memory footprint of monitoring_prof. Unify OSC related outputs.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add the use of a machine file for overhead benchmark

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Check for out-of-bound write in histogram

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix common_monitoring_cache object init for MPI_COMM_WORLD

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add RDMA benchmarks to test_overhead
Add error file output. Add MPI_Put and MPI_Get results analysis. Add overhead computation for complete sending (pingpong / 2).

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add computation of average and median of overheads. Add comments and copyrigths to the test_overhead script

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add technical documentation

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Adapt to the new definition of communicators

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Update expected output in test/monitoring/monitoring_test.c

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add dumping histogram in edge case

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Adding a reduce(pml_monitoring_messages_count, MPI_MAX) example

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add consistency in header inclusion.
Include ompi/mpi/fortran/mpif-h/bindings.h only if needed.
Add sanity check before emptying hashtable.
Fix typos in documentation.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* misc monitoring fixes

* test/monitoring: fix test when weak symbols are not available
* monitoring: fix a typo and add a missing file in Makefile.am
and have monitoring_common.h and monitoring_common_coll.h included in the distro
* test/monitoring: cleanup all tests and make distclean a happy panda
* test/monitoring: use gettimeofday() if clock_gettime() is unavailable
* monitoring: silence misc warnings (#3)

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

* Cleanups.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>

* Changing int64_t to size_t.
Keep the size_t used accross all monitoring components.
Adapt the documentation.
Remove useless MPI_Request and MPI_Status from monitoring_test.c.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add parameter for RMA test case

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Clean the maximum bound computation for proc list dump.
Use ptrdiff_t instead of OPAL_PTRDIFF_TYPE to reflect the changes from commit fa5cd0dbe5.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add communicator-specific monitored collective data reset

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add monitoring scripts to the 'make dist'
Also install them in the build and the install directories.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-06-26 18:21:39 +02:00
Nathan Hjelm
31ab83362a osc/rdma: cleanup local peer setup and fix a bug
The data endpoint was not being set correctly for local peers in some
cases. This commit fixes the bug and cleans the associated code to
simplify the logic.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-06-22 13:28:45 -06:00
Ralph Castain
814e858082 Merge pull request #3696 from rhc54/topic/pmix3
Update PMIx and integration glue
2017-06-20 10:38:29 -07:00
Ralph Castain
952726c121 Update to latest PMIx master - equivalent to 2.0rc2. Update the thread support in the opal/pmix framework to protect the framework-level structures.
This now passes the loop test, and so we believe it resolves the random hangs in finalize.

Changes in PMIx master that are included here:

* Fixed a bug in the PMIx_Get logic
* Fixed self-notification procedure
* Made pmix_output functions thread safe
* Fixed a number of thread safety issues
* Updated configury to use 'uname -n' when hostname is unavailable

Work on cleaning up the event handler thread safety problem
Rarely used functions, but protect them anyway
Fix the last part of the intercomm problem
Ensure we don't cover any PMIx calls with the framework-level lock.
Protect against NULL argv comm_spawn

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-20 09:02:15 -07:00
George Bosilca
1f291c8728
Add the fragment to the unexpected frags only after extracting the
pml_proc.
2017-06-20 16:03:52 +02:00
Gilles Gouaillardet
9ba85b85e1 coll/libnbc: revisit NBC_Handle usage
make NBC_Handle (almost) an internal structure created
by NBC_Schedule_request()
use a local variable instead of what was previously handle->tmpbuf

Refs open-mpi/ompi#3487

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-06-20 17:24:16 +09:00
Gilles Gouaillardet
68ac95003f coll/base: fix zero size datatype handling in mca_coll_base_alltoallv_intra_basic_inplace()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-06-20 14:36:35 +09:00
Nathan Hjelm
0c8c7e50d0 Merge pull request #3682 from hjelmn/comm_assertions
ompi: add support for new communicator info assertions
2017-06-19 09:49:59 -06:00
Edgar Gabriel
70107b3e52 Merge pull request #3703 from edgargabriel/pr/cart-comm-file-open-fix
Pr/cart comm file open fix
2017-06-15 15:03:38 -05:00
Edgar Gabriel
3b0b8fa12c io/ompio: update cartesian based grouping strategy
update the cartesian communicator based grouping strategy to match the other
algorithms used in the aggregator selection process.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-06-15 14:05:54 -05:00
Edgar Gabriel
bd6b430798 common/ompio: remove function call to cart_based_grouping
the cart_based_grouping aggregator strategy was not correctly updated
during the last major rewrite of the aggregator selection algorithm.
It is also not supposed to be called from file_open (but from
file_set_view).

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-06-15 14:04:03 -05:00
George Bosilca
e9d533e62e
Fix warnings from non-debug mode.
Thanks Ralph for the report.
2017-06-13 16:57:42 -04:00
Gilles Gouaillardet
72c7329462 configury: use 'uname -n' when 'hostname' is not available
the 'hostname' command might not be available on some platforms
such as Fedora Core 26, so mimick config/libtool.m4 and fallback
to 'uname -n' if needed

Refs. #3680

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-06-12 15:04:32 +09:00
KAWASHIMA Takahiro
b5b6b22848 Merge pull request #3678 from kawashima-fj/pr/signal-abort-delay
Apply `opal_abort_delay` to the OPAL signal handler
2017-06-12 10:35:11 +09:00
Joshua Hursey
80a91dc244 io/romio314: Add work around support for missing MPI_File ops
* Add work around support for the following missing ops in ROMIO 3.1.4
    - `MPI_File_iread_at_all`
    - `MPI_File_iwrite_at_all`
    - `MPI_File_iread_all`
    - `MPI_File_iwrite_all`

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-06-09 14:42:59 -05:00
Joshua Hursey
29609631a2 mpi/c: Protect some IO functions not widely implemented
* Protects us from segv when ROMIO 314 is selected and one of the
   following operations is called:
   - MPI_File_iread_at_all
   - MPI_File_iwrite_at_all
   - MPI_File_iread_all
   - MPI_File_iwrite_all

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-06-09 11:42:26 -05:00
Nathan Hjelm
db2204f2f3 ompi: add support for new communicator info assertions
This commit adds code to allow support for the info assertions added
by mpi-forum/mpi-issues#11. The assertions added are:
mpi_assert_no_any_tag, mpi_assert_no_any_source,
mpi_assert_exact_length, and mpi_assert_allow_overtaking.

This commit also adds support for the mpi_assert_no_any_source and
mpi_assert_allow_overtaking info keys to the ob1 pml.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-06-08 15:52:12 -06:00
KAWASHIMA Takahiro
362445d486 Use same prefix format for [host:pid]
Hostname and PID are output as a message prefix in many places in
our code. Their printf-formats were either `[%s:%d]` or `[%s:%05d]`.
This commit changes `[%s:%d]` to `[%s:%05d]`. The latter was more
widely used in our code (including OPAL output system and the signal
handler).

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-06-08 19:35:03 +09:00
KAWASHIMA Takahiro
6b91eddc8b Apply opal_abort_delay to the signal handler
This commit expands the effect of the MCA parameter `opal_abort_delay`
to the OPAL signal handler. This allows attaching of a debugger on
segmentation fault etc. before quitting the job.

The sleep code is moved to the `opal_delay_abort` function from the
`ompi_mpi_abort` and `oshmem_shmem_abort` functions for code cleanup.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-06-08 19:34:48 +09:00
Mark Allen
aeb2c02d2f Type_create_darray with mix of BLOCK/CYCLIC
Example (using MPI_ORDER_C so the below has 6 rows of 4 ints to parcel out)
    size = 4;
    rank = 0;
    ndims=2;
    gsizes[0] = 6;
    gsizes[1] = 4;
    distribs[0] = MPI_DISTRIBUTE_CYCLIC;
    distribs[1] = MPI_DISTRIBUTE_BLOCK;
    dargs[0] = 2;
    dargs[1] = 2;
    psizes[0] = 2;
    psizes[1] = 2;
    MPI_Type_create_darray(size, rank, ndims,
        gsizes, distribs, dargs, psizes,
        MPI_ORDER_C, MPI_INT, &mydt);

Expectation for the layout:
   inner dimension (1) is
       4 items (ints) distributed block over 2 ranks with 2 items each
       eg for rank 0: [ x x . . ]
   outer dimension (0) is:
       6 items (the above [ x x . .]) cyclic over 2 ranks with 2 items each
       eg for rank 0:
           [ x x . . ]    :  offset=0 bytes=8
           [ x x . . ]    :  ofset=16 bytes=8
           [ . . . . ]
           [ . . . . ]
           [ x x . . ]    :  offset=64 bytes=8
           [ x x . . ]    :  offset=80 bytes=8

Or more specifically a stream of ints 0,1,2,3,4,5,6,7 sent into that
type should be
    [ 0 1 . . ]
    [ 2 3 . . ]
    [ . . . . ]
    [ . . . . ]
    [ 4 5 . . ]
    [ 6 7 . . ]
The data was laying out though as
    [ 0 1 2 3 ]
    [ . . . . ]
    [ . . . . ]
    [ . . . . ]
    [ 4 5 6 7 ]
    [ . . . . ]
because the recursive construction inside the block() function (which
creates the smaller row datatype [ x x . . ]) wasn't setting the extent
of that type.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2017-06-07 16:53:03 -04:00
Jeff Squyres
44aef39b24 Merge pull request #3641 from ggouaillardet/topic/fortran_strings
fortran/base: rename strings.h into fortran_base_strings.h
2017-06-05 15:31:08 -04:00
Ralph Castain
dea9ef2020 Merge pull request #3637 from hjelmn/osc_sm_info_fix
osc/sm: fix SEGV in new info usage
2017-06-05 05:45:21 -07:00
KAWASHIMA Takahiro
c8d38d31c6 Merge pull request #3618 from kawashima-fj/pr/java-doc-man
java: Detect `javadoc` path and improve `mpijavac` man page
2017-06-02 10:24:05 +09:00
Gilles Gouaillardet
08526e8adc fortran/base: rename strings.h into fortran_base_strings.h
rename ompi/mpi/fortran/base/strings.h so it does not get pulled
when /usr/include/strings.h is expected.

Refs open-mpi/ompi#3639

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-06-02 09:46:20 +09:00
Josh Hursey
1665d771a6 Merge pull request #3635 from wlepera/fix/ibm/155305
MPI_Sendreceive_replace data error with > 2k msg
2017-06-01 14:38:01 -05:00
Jeff Squyres
d520c24f3a predefined MPI object padding: set to fixed number of bytes (#3634)
Convert the predefined MPI object padding to a fixed number of bytes
(vs. a multiple of sizeof(void*)) so that the padding is the same size
between 32 and 64 bit builds.  I.e., we won't have a situation where
we've run out of padding in 32 bit builds but still have more space
available in 64 bit builds.

Fixes #3610

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-06-01 15:28:23 -04:00
Nathan Hjelm
d10e6455a0 osc/sm: fix SEGV in new info usage
This commit moves the info subscribe for the blocking_fence to after
the global_state is allocated and moves setting win->w_osc_module to
before the info subscribe for alloc_shared_contig. This fixes a SEGV
caught by MTT.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-06-01 12:32:30 -06:00
William LePera
a7c9c4aef3 MPI_Sendreceive_replace data error with > 2k msg (RTC 155305)
Signed-off-by: William LePera <lepera@us.ibm.com>
2017-06-01 13:08:58 -04:00
Gilles Gouaillardet
5e9be7667b Merge pull request #3600 from ggouaillardet/topic/osc_rdma_get_segment
osc/rdma: fix osc_rdma_get_remote_segment() length parameter
2017-06-01 13:09:14 +09:00
Nathan Hjelm
e1a997c0cb Merge pull request #3593 from hjelmn/bug_3575
osc/rdma: fix typo in ompi_osc_rdma_lock_acquire_exclusive
2017-05-31 08:54:40 -06:00
KAWASHIMA Takahiro
76b1f80664 java: Use correct date/version in mpijava man page
`mpijavac.1` should be generated at `make`-time...

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-05-31 17:24:49 +09:00
KAWASHIMA Takahiro
63f0945dcc java: Detect the path of javadoc in configure
Without this change, the directory of `javadoc` command must be
included in the `PATH` environment variable at `make`-time.
Paths of `javac`, `javah`, and `jar` commands are detected in
`configure`. So the path of `javadoc` also should be detected.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-05-31 14:26:14 +09:00
Ralph Castain
ed4078e2dd Protect against the condition where the port string is actually NULL
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-05-28 20:51:09 -07:00
Gilles Gouaillardet
e622ca8c1c osc/rdma: fix osc_rdma_get_remote_segment() length parameter
a buffer defined by (buf, count, dt)
will have data starting at buf+offset and ending len bytes later with
len = opal_datatype_span(&dt.super, count, &offset);

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-05-29 11:08:03 +09:00
Ralph Castain
9f60cd0fe7 Update the connect/accept support so we check to see if we have the proper infrastructure and RTE support, including whether we have ompi-server available if the connect/accept spans multiple applications. Print pretty help messages in all cases where we do not have support
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-05-27 10:47:08 -07:00
Nathan Hjelm
b83c5dbee5 osc/rdma: fix typo in ompi_osc_rdma_lock_acquire_exclusive
Fixes #3575

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-05-26 14:21:08 -06:00