Nathan Hjelm
e564c69769
Merge pull request #1330 from hjelmn/osc_rdma_fix
...
osc/rdma: fix typo in ompi_osc_rdma_complete_atomic
2016-01-26 19:26:59 -07:00
Nathan Hjelm
a19c265ab5
osc/rdma: fix typo in ompi_osc_rdma_complete_atomic
...
The typo caused SEGVs on systems with only fetching atomic
support.
Fixes open-mpi/ompi#1329
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-01-26 15:44:07 -07:00
Edgar Gabriel
b4a725c26a
need to check for the parent dir as well, since the file might not exist yet.
2016-01-26 13:49:21 -06:00
Edgar Gabriel
722aab92e6
- extend opal_path_nfs to retrieve the file system type
...
- use opal_path_nfs in the fs_base function to avoid code duplication.
2016-01-26 13:36:21 -06:00
Gilles Gouaillardet
6d149554a7
hwloc: have opal_hwloc_base_get_pu search for HWLOC_OBJ_PU when mpirun is invoked with --use-hwthread-cpus
...
Fixes open-mpi/ompi#1247
2016-01-26 18:10:33 +09:00
Gilles Gouaillardet
704f14f91e
f08: do not BIND(C) to subroutines with LOGICAL parameters
...
Thanks Paul Romano for reporting this issue.
2016-01-26 13:56:24 +09:00
rhc54
53185e7621
Merge pull request #1326 from ggouaillardet/topic/pmix_check_icc_varargs
...
pmix configury: add missing PMIX_CHECK_ICC_VARARGS function
2016-01-25 19:38:12 -08:00
Gilles Gouaillardet
15e26da1e1
pmix configury: add missing PMIX_CHECK_ICC_VARARGS function
...
Thanks Paul Hargrove for the report
(back-ported from pmix/master@7b16e914bf )
2016-01-26 10:57:16 +09:00
Joshua Ladd
69e3c6f289
Merge pull request #1321 from jladd-mlnx/topic/add-allgatherv-reduce
...
Adding entry points for Allgatherv, iAllgatherv, Reduce, and iReduce.
2016-01-25 20:46:52 -05:00
Edgar Gabriel
86765d796a
Merge pull request #1325 from edgargabriel/pvfs2-configure-logic-3
...
revamp the pvfs2 configure logic
2016-01-25 14:05:04 -06:00
Edgar Gabriel
9c93df5a47
revamp the pvfs2 configure logic
2016-01-25 12:03:01 -06:00
Nathan Hjelm
aec3060109
Merge pull request #1313 from igor-ivanov/pr/issue-1301
...
orte/oob: Fix issue #1301
2016-01-25 10:24:26 -07:00
Nathan Hjelm
500e90422d
Merge pull request #1320 from hjelmn/osc_rdma_fix
...
osc/rdma: fix hang when performing large unaligned gets
2016-01-25 09:36:13 -07:00
Mike Dubman
ad3aa3879a
Merge pull request #1322 from jladd-mlnx/topic/BufFix-for-coll-hcoll-coll_request
...
BugFix for coll/hcoll: coll_request must be set to ACTIVE when allocated
2016-01-25 08:36:26 +02:00
Nathan Hjelm
45da311473
osc/rdma: fix hang when performing large unaligned gets
...
This commit adds code to handle large unaligned gets. There are two
possible code paths for these transactions:
1) The remote region and local region have the same alignment. In
this case the get will be broken down into at most three get
transactions: 1 transaction to get the unaligned start of the region
(buffered), 1 transaction to get the aligned portion of the region,
and 1 transaction to get the end of the region.
2) The remote and local regions do not have the same alignment. This
should be an uncommon case and is not optimized. In this case a
buffer is allocated and registered locally to hold the aligned data
from the remote region. There may be cases where this fails (low
memory, can't register memory). Those conditions are unlikely and
will be handled later.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-01-22 21:06:46 -07:00
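The first code path described in the commit above (same alignment on both sides) can be illustrated with a small sketch. This is not the osc/rdma implementation: the type `get_split_t`, the function `split_get`, and the choice of alignment are all hypothetical, and the alignment is assumed to be a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of how one get is split; not an osc/rdma type. */
typedef struct {
    size_t head_len; /* unaligned bytes before the first aligned address (buffered) */
    size_t body_len; /* aligned middle portion of the region */
    size_t tail_len; /* leftover bytes after the last aligned boundary */
} get_split_t;

/* Split a get of len bytes starting at remote_addr into at most three parts.
 * Assumes align is a power of two (illustrative assumption). */
static get_split_t split_get(uint64_t remote_addr, size_t len, size_t align)
{
    get_split_t s = {0, 0, 0};

    /* Bytes needed to reach the next aligned address (0 if already aligned). */
    size_t to_boundary =
        (size_t)((align - (remote_addr & (align - 1))) & (align - 1));

    s.head_len = to_boundary < len ? to_boundary : len;

    size_t rest = len - s.head_len;
    s.body_len = rest & ~(align - 1); /* largest multiple of align that fits */
    s.tail_len = rest - s.body_len;
    return s;
}
```

For example, a 100-byte get starting at address 0x1003 with 8-byte alignment splits into a 5-byte head, an 88-byte aligned body, and a 7-byte tail. A part with length zero would simply not be issued.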
Valentin Petrov
5e2a2c0755
BugFix for coll/hcoll: coll_request must be set to ACTIVE when allocated
...
If the state of the request is not set to OMPI_REQUEST_ACTIVE,
MPI_Test immediately reports the request as completed
while hcoll may still be working on it.
Signed-off-by: Joshua Ladd <jladd.mlnx@gmail.com>
2016-01-23 03:23:59 +02:00
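The failure mode described in the commit above can be sketched with a toy request type. The names `req_state_t`, `request_t`, and `request_test` are hypothetical and only mimic the behavior the commit message describes (a request that is not active is treated as already complete); they are not Open MPI's request API.

```c
#include <stdbool.h>

/* Hypothetical request states, loosely modeled on the commit message. */
typedef enum { REQ_INACTIVE, REQ_ACTIVE } req_state_t;

typedef struct {
    req_state_t state; /* must be REQ_ACTIVE while work is in flight */
    bool complete;     /* set by the collective library when it finishes */
} request_t;

/* Returns true when the caller may treat the request as finished. */
static bool request_test(const request_t *req)
{
    /* A request that is not active is reported as complete immediately,
     * regardless of whether work is still outstanding. */
    if (REQ_ACTIVE != req->state) {
        return true;
    }
    return req->complete;
}
```

In this sketch, a freshly allocated request left in `REQ_INACTIVE` would test as complete even though `complete` is still false, which is the premature completion the fix avoids by marking the request active at allocation time.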
Joshua Ladd
e398bf6f3a
Adding entry points for Allgatherv, iAllgatherv, Reduce, and iReduce.
...
Signed-off-by: Joshua Ladd <jladd.mlnx@gmail.com>
2016-01-23 03:09:29 +02:00
Nathan Hjelm
70787d1574
Merge pull request #1319 from hjelmn/osc_rdma_fix
...
osc/rdma: use correct endpoint for local state
2016-01-22 11:53:06 -07:00
Mike Dubman
89c7fea492
Merge pull request #1315 from alex-mikheev/topic/oshmem_ucx_atomic
...
OSHMEM/UCX: implements atomic support
2016-01-22 20:46:44 +02:00
Nathan Hjelm
49d2f44b97
osc/rdma: use correct endpoint for local state
...
If atomics are not globally visible (cpu and nic atomics do not mix)
then a btl endpoint must be used to access local ranks. To avoid
issues that are caused by having the same region registered with
multiple handles osc/rdma was updated to always use the handle for
rank 0. There was a bug in the update that caused osc/rdma to continue
using the local endpoint for accessing the state even though the
pointer/handle are not valid for that endpoint. This commit fixes the
bug.
Fixes open-mpi/ompi#1241 .
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-01-22 10:41:27 -07:00
Nathan Hjelm
243d973cfe
Merge pull request #1316 from hjelmn/datatype_pack_threads
...
ompi/datatype: make datatype pack thread safe
2016-01-21 20:14:10 -07:00
Nathan Hjelm
0fe4818454
Merge pull request #1318 from hjelmn/osc_rdma_fixes
...
osc/rdma: disable put aggregation when using threads
2016-01-21 17:54:52 -07:00
Nathan Hjelm
b921831f2b
ompi/datatype: make datatype pack thread safe
...
This commit makes ompi_datatype_get_pack_description thread safe. The
call is used by osc/pt2pt to send the packed description to remote
peers. Before this commit if MPI_THREAD_MULTIPLE is enabled and the
user uses MPI_Put, MPI_Get, etc we could hit a race where multiple
threads attempt to store the packed description on the datatype. Since
the code in question is not performance-critical the threading fix
uses opal_atomic_* calls instead of bothering with OPAL_THREAD_*.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-01-21 17:53:28 -07:00
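One common way to close a race like the one described in the commit above is to publish the lazily built value with an atomic compare-and-swap. The sketch below uses C11 `<stdatomic.h>` as a stand-in for the opal_atomic_* calls named in the commit message, and it is not necessarily how the commit itself does it. The names `sketch_datatype_t`, `build_description`, and `get_pack_description` are hypothetical.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical datatype with a lazily computed packed description. */
typedef struct {
    _Atomic(char *) packed_description; /* NULL until first computed */
} sketch_datatype_t;

/* Stand-in for the real (expensive) packing step. */
static char *build_description(void)
{
    char *buf = malloc(7);
    if (NULL != buf) {
        memcpy(buf, "packed", 7); /* copies the terminating NUL as well */
    }
    return buf;
}

/* Every caller gets the same pointer, even if several threads race here. */
static const char *get_pack_description(sketch_datatype_t *dt)
{
    char *desc = atomic_load(&dt->packed_description);
    if (NULL != desc) {
        return desc; /* already published by some thread */
    }

    char *fresh = build_description();
    if (NULL == fresh) {
        return NULL;
    }

    char *expected = NULL;
    if (atomic_compare_exchange_strong(&dt->packed_description,
                                       &expected, fresh)) {
        return fresh; /* this thread won the race and published its copy */
    }

    /* Another thread published first: discard our copy and use theirs. */
    free(fresh);
    return expected;
}
```

The losing thread does redundant work once, which is acceptable when, as the commit message notes, the code is not performance-critical.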
Nathan Hjelm
6180386bea
osc/rdma: disable put aggregation when using threads
...
Optimizing put aggregation in the presence of threads will require a
redesign of the code. For now just ensure that put aggregation is
turned off when MPI_THREAD_MULTIPLE is enabled.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-01-21 15:50:35 -07:00
Ralph Castain
ae3df2968a
Add the 1.10.2 NEWS items
2016-01-21 10:00:41 -08:00
Edgar Gabriel
b253d4e887
fix CID 1349739, CID 1349738, CID 1349736 and (probably) CID 1349740 (not entirely sure about the last one, since I don't understand why block[i] is a problem but max_len[i] allocated and treated exactly the same way 1 line later is not).
2016-01-21 08:32:23 -06:00
Alex Mikheev
f627608e42
OSHMEM/UCX: implements atomic support
...
The UCX atomic component now has a real implementation.
Fixes a bug in spml ucx arr_procs.
Removes redundant parameter checks from atomic components.
2016-01-21 16:02:28 +02:00
Jeff Squyres
655b4be97c
find_common_syms: update for OS X symbol naming
...
OS X tends to prefix its symbols with "_".
2016-01-20 16:18:43 -05:00
Edgar Gabriel
9b8d769e41
will revisit the addproc component later in spring; right now it is constantly in the way of doing my tests.
2016-01-20 15:05:51 -06:00
Edgar Gabriel
1671604dbc
Merge pull request #1307 from edgargabriel/fcoll-dynamic_gen2
...
Fcoll dynamic gen2
2016-01-20 10:19:56 -06:00
Igor Ivanov
34d861dfe9
orte/oob: Fix issue #1301
...
Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com>
2016-01-20 12:08:00 +02:00
Gilles Gouaillardet
2adbe273d6
mpi: have MPI_Wtick() return the period (and not the frequency) if OPAL_TIMER_CYCLE_NATIVE
2016-01-20 14:14:47 +09:00
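The distinction in the commit above is between a timer's frequency (ticks per second) and its period (seconds per tick); MPI_Wtick is specified to return the latter. A minimal sketch of the conversion follows. The function name `wtick_from_frequency` is hypothetical, and returning 0.0 for an unknown frequency is an assumption made only for this example.

```c
/* Convert a timer frequency in Hz to the tick period in seconds.
 * Returning 0.0 when the frequency is unknown is an illustrative
 * assumption, not Open MPI behavior. */
static double wtick_from_frequency(unsigned long long freq_hz)
{
    if (0 == freq_hz) {
        return 0.0;
    }
    return 1.0 / (double)freq_hz;
}
```

A timer running at 4 Hz therefore has a tick of 0.25 seconds, not 4.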
Jeff Squyres
bd04192087
Merge pull request #1234 from ggouaillardet/poc/travis_gcc5
...
Poc/travis gcc5
2016-01-18 09:26:56 -05:00
Gilles Gouaillardet
c0f8f2ce32
ompi/dpm: correctly handle sentinels in construct peers
...
This fix is similar to open-mpi/ompi@4c1ea4a171
and open-mpi/ompi@213b2abde4
2016-01-18 09:57:38 +09:00
Gilles Gouaillardet
7d6b75f3b2
orte_util_snprintf_jobid: return ORTE_SUCCESS or ORTE_ERROR
2016-01-18 09:44:33 +09:00
Edgar Gabriel
a9ca37059a
improve the communication abstraction. This commit also allows all aggregators to work simultaneously, instead of the slightly staggered way of the previous version.
2016-01-17 09:48:49 -06:00
Edgar Gabriel
56e11bfc97
initialize the stripe_size variable as well.
2016-01-17 09:48:49 -06:00
Edgar Gabriel
26c57ef374
separate the size of the buffer used for the shuffle step and the size of the buffer used for a pwritev operation.
2016-01-17 09:48:49 -06:00
Edgar Gabriel
39d5c8c281
further bug fixes silencing a compiler warning and fixing a memory overrun
2016-01-17 09:48:49 -06:00
Edgar Gabriel
2bcae84e11
further debugging
2016-01-17 09:48:49 -06:00
Edgar Gabriel
2bdd6ba17a
correctly free some buffers, and ensure that lustre_stripe_size and stripe_count are always read from the file system.
2016-01-17 09:48:49 -06:00
Edgar Gabriel
4bbb22bd0b
add a new field to the ompio data structure (stripe_count) and set it correctly on pvfs2 and lustre.
2016-01-17 09:48:49 -06:00
Edgar Gabriel
d282e94b67
add the new dynamic_gen2 component, designed to coexist for now with the original dynamic component
2016-01-17 09:48:49 -06:00
rhc54
b172b8599b
Merge pull request #1285 from ggouaillardet/topic/pmix_dist_fix
...
pmix: do not include automatically generated include/private/autogen/…
2016-01-16 20:49:41 -08:00
Ralph Castain
fc6b260146
Protect against PMIx-based requests that don't come thru the MPI comm_spawn interface
2016-01-16 13:36:06 -08:00
Ralph Castain
4dad5de8ff
Silence a couple of warnings - strncpy returns a char*, not an int
2016-01-16 09:44:52 -08:00
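For context on the warning mentioned above: `strncpy` returns its destination pointer, so there is no integer status to check, and it does not NUL-terminate the destination when the source is too long. A small sketch of a bounded copy follows; the helper name `copy_name` is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dst (capacity dst_len bytes), always NUL-terminating.
 * strncpy returns dst, not a status code, so its result is not checked. */
static void copy_name(char *dst, size_t dst_len, const char *src)
{
    if (0 == dst_len) {
        return;
    }
    strncpy(dst, src, dst_len - 1);
    dst[dst_len - 1] = '\0'; /* strncpy does not terminate on truncation */
}
```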
Jeff Squyres
348ac507c2
usnic: explain why we still have OPAL_HAVE_HWLOC
...
Put in a comment explaining why btl_usnic_compat.h still defines
OPAL_HAVE_HWLOC, even though master/v2.x no longer does.
2016-01-16 04:11:05 -08:00
Jeff Squyres
0f5fcf9029
usnic: fix common symbol
2016-01-16 03:55:27 -08:00
Jeff Squyres
6c96cb1ad0
find_common_syms: arrgh -- re-add the x bit
...
Previous commit accidentally removed the x bit from this script. This
commit puts it back.
2016-01-16 03:53:43 -08:00
Jeff Squyres
60ffe713b8
common syms: whitelist bison-generated common symbols
...
Bison generates some common symbols that we can't do anything about,
so whitelist them.
2016-01-16 03:53:14 -08:00