Commit Graph

24452 Commits

Author SHA1 Message Date
Nathan Hjelm
a19c265ab5 osc/rdma: fix typo in ompi_osc_rdma_complete_atomic
The typo caused SEGVs on systems with only fetching atomic
support.

Fixes open-mpi/ompi#1329

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-01-26 15:44:07 -07:00
Edgar Gabriel
b4a725c26a need to check for the parent dir as well, since the file might not exist yet. 2016-01-26 13:49:21 -06:00
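The parent-directory fallback mentioned in commit b4a725c26a can be sketched roughly as follows. This is not the Open MPI code: `path_to_probe` is a hypothetical helper, and the sketch assumes a POSIX system where `access()` and `dirname()` are available. It only picks which path a later file-system query should inspect.

```c
#include <assert.h>
#include <libgen.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: choose the path a file-system check should inspect.
 * If the file exists, use it; otherwise fall back to its parent directory,
 * since a file that is about to be created has nothing to query yet.
 * Returns 0 on success, -1 if a buffer is too small. */
static int path_to_probe(const char *path, char *out, size_t out_size)
{
    char tmp[4096];

    assert(path != NULL && out != NULL && out_size > 0);

    if (access(path, F_OK) == 0) {
        /* The file exists, so it can be queried directly. */
        return snprintf(out, out_size, "%s", path) < (int)out_size ? 0 : -1;
    }

    /* dirname() may modify its argument, so work on a private copy. */
    if (strlen(path) >= sizeof(tmp)) {
        return -1;
    }
    strcpy(tmp, path);

    return snprintf(out, out_size, "%s", dirname(tmp)) < (int)out_size ? 0 : -1;
}
```

For a path such as `/definitely/not/there.txt` that does not exist, the helper yields `/definitely/not`; for an existing path it returns the path unchanged.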
Edgar Gabriel
722aab92e6 - extend opal_path_nfs to retrieve the file system type
- use opal_path_nfs in the fs_base function to avoid code duplication.
2016-01-26 13:36:21 -06:00
Gilles Gouaillardet
6d149554a7 hwloc: have opal_hwloc_base_get_pu search for HWLOC_OBJ_PU when mpirun is invoked with --use-hwthread-cpus
Fixes open-mpi/ompi#1247
2016-01-26 18:10:33 +09:00
Gilles Gouaillardet
704f14f91e f08: do not BIND(C) to subroutines with LOGICAL parameters
Thanks Paul Romano for reporting this issue.
2016-01-26 13:56:24 +09:00
rhc54
53185e7621 Merge pull request #1326 from ggouaillardet/topic/pmix_check_icc_varargs
pmix configury: add missing PMIX_CHECK_ICC_VARARGS function
2016-01-25 19:38:12 -08:00
Gilles Gouaillardet
15e26da1e1 pmix configury: add missing PMIX_CHECK_ICC_VARARGS function
Thanks Paul Hargrove for the report

(back-ported from pmix/master@7b16e914bf)
2016-01-26 10:57:16 +09:00
Joshua Ladd
69e3c6f289 Merge pull request #1321 from jladd-mlnx/topic/add-allgatherv-reduce
Adding entry points for Allgatherv, iAllgatherv, Reduce, and iReduce.
2016-01-25 20:46:52 -05:00
Edgar Gabriel
86765d796a Merge pull request #1325 from edgargabriel/pvfs2-configure-logic-3
revamp the pvfs2 configure logic
2016-01-25 14:05:04 -06:00
Edgar Gabriel
9c93df5a47 revamp the pvfs2 configure logic 2016-01-25 12:03:01 -06:00
Nathan Hjelm
aec3060109 Merge pull request #1313 from igor-ivanov/pr/issue-1301
orte/oob: Fix issue #1301
2016-01-25 10:24:26 -07:00
Nathan Hjelm
500e90422d Merge pull request #1320 from hjelmn/osc_rdma_fix
osc/rdma: fix hang when performing large unaligned gets
2016-01-25 09:36:13 -07:00
Mike Dubman
ad3aa3879a Merge pull request #1322 from jladd-mlnx/topic/BufFix-for-coll-hcoll-coll_request
BugFix for coll/hcoll: coll_request must be set to ACTIVE when allocated
2016-01-25 08:36:26 +02:00
Nathan Hjelm
45da311473 osc/rdma: fix hang when performing large unaligned gets
This commit adds code to handle large unaligned gets. There are two
possible code paths for these transactions:

 1) The remote region and local region have the same alignment. In
 this case the get will be broken down into at most three get
 transactions: 1 transaction to get the unaligned start of the region
 (buffered), 1 transaction to get the aligned portion of the region,
 and 1 transaction to get the end of the region.

 2) The remote and local regions do not have the same alignment. This
 should be an uncommon case and is not optimized. In this case a
 buffer is allocated and registered locally to hold the aligned data
 from the remote region. There may be cases where this fails (low
 memory, can't register memory). Those conditions are unlikely and
 will be handled later.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-01-22 21:06:46 -07:00
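The first code path in commit 45da311473 (same alignment on both sides, at most three transactions) can be illustrated with a small sketch. The names `get_split_t`, `split_get`, and `transaction_count` are hypothetical, and the alignment is passed in as a power of two; the actual osc/rdma code and its alignment requirement are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one get split into up to three pieces. */
typedef struct {
    size_t head_len; /* unaligned bytes before the first aligned address */
    size_t body_len; /* aligned middle, a multiple of the alignment */
    size_t tail_len; /* leftover bytes after the aligned middle */
} get_split_t;

/* Split the region [addr, addr + len) around `align` (a power of two). */
static get_split_t split_get(uint64_t addr, size_t len, size_t align)
{
    get_split_t s = {0, 0, 0};
    size_t misalign, head, rest;

    assert(align != 0 && (align & (align - 1)) == 0);

    misalign = (size_t)(addr & (align - 1));
    head = misalign ? align - misalign : 0;
    if (head > len) {
        head = len; /* the whole region ends before the next boundary */
    }

    rest = len - head;
    s.head_len = head;
    s.body_len = rest & ~(align - 1);
    s.tail_len = rest - s.body_len;
    return s;
}

/* Number of get transactions needed: one per non-empty piece. */
static int transaction_count(get_split_t s)
{
    return (s.head_len != 0) + (s.body_len != 0) + (s.tail_len != 0);
}
```

With an 8-byte alignment, a 100-byte get starting at address 0x1003 splits into a 5-byte head, an 88-byte aligned body, and a 7-byte tail, which matches the three transactions the commit message describes.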
Valentin Petrov
5e2a2c0755 BugFix for coll/hcoll: coll_request must be set to ACTIVE when allocated
If the state of the request is not set to OMPI_REQUEST_ACTIVE,
MPI_Test would immediately report the request as completed
while hcoll may still be working on it.

Signed-off-by: Joshua Ladd <jladd.mlnx@gmail.com>
2016-01-23 03:23:59 +02:00
Joshua Ladd
e398bf6f3a Adding entry points for Allgatherv, iAllgatherv, Reduce, and iReduce.
Signed-off-by: Joshua Ladd <jladd.mlnx@gmail.com>
2016-01-23 03:09:29 +02:00
Nathan Hjelm
70787d1574 Merge pull request #1319 from hjelmn/osc_rdma_fix
osc/rdma: use correct endpoint for local state
2016-01-22 11:53:06 -07:00
Mike Dubman
89c7fea492 Merge pull request #1315 from alex-mikheev/topic/oshmem_ucx_atomic
OSHMEM/UCX: implements atomic support
2016-01-22 20:46:44 +02:00
Nathan Hjelm
49d2f44b97 osc/rdma: use correct endpoint for local state
If atomics are not globally visible (cpu and nic atomics do not mix)
then a btl endpoint must be used to access local ranks. To avoid
issues that are caused by having the same region registered with
multiple handles osc/rdma was updated to always use the handle for
rank 0. There was a bug in the update that caused osc/rdma to continue
using the local endpoint for accessing the state even though the
pointer/handle are not valid for that endpoint. This commit fixes the
bug.

Fixes open-mpi/ompi#1241.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-01-22 10:41:27 -07:00
Nathan Hjelm
243d973cfe Merge pull request #1316 from hjelmn/datatype_pack_threads
ompi/datatype: make datatype pack thread safe
2016-01-21 20:14:10 -07:00
Nathan Hjelm
0fe4818454 Merge pull request #1318 from hjelmn/osc_rdma_fixes
osc/rdma: disable put aggregation when using threads
2016-01-21 17:54:52 -07:00
Nathan Hjelm
b921831f2b ompi/datatype: make datatype pack thread safe
This commit makes ompi_datatype_get_pack_description thread safe. The
call is used by osc/pt2pt to send the packed description to remote
peers. Before this commit if MPI_THREAD_MULTIPLE is enabled and the
user uses MPI_Put, MPI_Get, etc we could hit a race where multiple
threads attempt to store the packed description on the datatype. Since
the code in question is not performance-critical the threading fix
uses opal_atomic_* calls instead of bothering with OPAL_THREAD_*.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-01-21 17:53:28 -07:00
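The race described in commit b921831f2b, several threads trying to store the packed description on the same datatype, is commonly closed with a compare-and-swap publish. The sketch below uses C11 `<stdatomic.h>` as a stand-in for the opal_atomic_* calls the commit mentions; `datatype_t`, `get_pack_description`, and the placeholder string are hypothetical, and the real function may be structured differently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a datatype that caches its packed description. */
typedef struct {
    _Atomic(char *) packed_desc; /* NULL until one thread publishes it */
} datatype_t;

/* Return the cached description, building and publishing it on first use.
 * If two threads race, exactly one pointer wins; the loser frees its copy. */
static const char *get_pack_description(datatype_t *dt)
{
    char *expected = NULL;
    char *fresh;
    char *desc;

    assert(dt != NULL);

    desc = atomic_load(&dt->packed_desc);
    if (desc != NULL) {
        return desc; /* already published by this or another thread */
    }

    fresh = malloc(16);
    if (fresh == NULL) {
        return NULL;
    }
    strcpy(fresh, "packed"); /* placeholder for the real packing work */

    if (atomic_compare_exchange_strong(&dt->packed_desc, &expected, fresh)) {
        return fresh; /* this thread published its copy */
    }

    free(fresh);     /* another thread won the race */
    return expected; /* the failed exchange stored the winner's pointer here */
}
```

The design trades a possibly wasted allocation for the absence of a lock, which fits the commit's remark that the path is not performance-critical.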
Nathan Hjelm
6180386bea osc/rdma: disable put aggregation when using threads
Optimizing put aggregation in the presence of threads will require a
redesign of the code. For now just ensure that put aggregation is
turned off when MPI_THREAD_MULTIPLE is enabled.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-01-21 15:50:35 -07:00
Ralph Castain
ae3df2968a Add the 1.10.2 NEWS items 2016-01-21 10:00:41 -08:00
Edgar Gabriel
b253d4e887 fix CID 1349739, CID 1349738, CID 1349736 and (probably) CID 1349740 (not entirely sure about the last one, since I don't understand why block[i] is a problem but max_len[i] allocated and treated exactly the same way 1 line later is not). 2016-01-21 08:32:23 -06:00
Alex Mikheev
f627608e42 OSHMEM/UCX: implements atomic support
The ucx atomic component now has real code.
Fixes a bug in spml ucx arr_procs.
Removes redundant parameter checks from atomic components.
2016-01-21 16:02:28 +02:00
Jeff Squyres
655b4be97c find_common_syms: update for OS X symbol naming
OS X tends to prefix its symbols with "_".
2016-01-20 16:18:43 -05:00
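The OS X naming issue in commit 655b4be97c (symbols prefixed with "_") can be handled by normalizing names before comparing them against a whitelist. This is only an illustrative sketch: `normalize_symbol` and its `is_osx` flag are hypothetical, and the find_common_syms script itself is not shown in this log and may take a different approach.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: drop the single leading underscore that OS X
 * toolchains tend to prepend, so names compare equal across platforms.
 * On other platforms the name is returned untouched, because a leading
 * underscore there is part of the real symbol name. */
static const char *normalize_symbol(const char *sym, int is_osx)
{
    assert(sym != NULL);

    if (is_osx && sym[0] == '_') {
        return sym + 1;
    }
    return sym;
}
```

With `is_osx` set, `_ompi_foo` compares equal to `ompi_foo`; without it, both names are left as they are.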
Edgar Gabriel
9b8d769e41 will revisit the addproc component later in spring, right now it is constantly in the way of doing my tests. 2016-01-20 15:05:51 -06:00
Edgar Gabriel
1671604dbc Merge pull request #1307 from edgargabriel/fcoll-dynamic_gen2
Fcoll dynamic gen2
2016-01-20 10:19:56 -06:00
Igor Ivanov
34d861dfe9 orte/oob: Fix issue #1301
Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com>
2016-01-20 12:08:00 +02:00
Gilles Gouaillardet
2adbe273d6 mpi: have MPI_Wtick() return the period (and not the frequency) if OPAL_TIMER_CYCLE_NATIVE 2016-01-20 14:14:47 +09:00
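The distinction behind commit 2adbe273d6 is that MPI_Wtick() reports the timer resolution in seconds, which is the period, the reciprocal of the tick frequency. A minimal sketch of that conversion, with a hypothetical helper name and the frequency given in Hz:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a timer frequency in Hz into the tick
 * period in seconds, the quantity MPI_Wtick() is expected to return. */
static double wtick_from_frequency(uint64_t freq_hz)
{
    assert(freq_hz > 0);
    return 1.0 / (double)freq_hz;
}
```

A 1 MHz timer therefore yields a tick of about one microsecond, whereas returning the frequency itself would report a resolution of a million seconds.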
Jeff Squyres
bd04192087 Merge pull request #1234 from ggouaillardet/poc/travis_gcc5
Poc/travis gcc5
2016-01-18 09:26:56 -05:00
Gilles Gouaillardet
c0f8f2ce32 ompi/dpm: correctly handle sentinels in construct peers
This fix is similar to open-mpi/ompi@4c1ea4a171
and open-mpi/ompi@213b2abde4
2016-01-18 09:57:38 +09:00
Gilles Gouaillardet
7d6b75f3b2 orte_util_snprintf_jobid: return ORTE_SUCCESS or ORTE_ERROR 2016-01-18 09:44:33 +09:00
Edgar Gabriel
a9ca37059a improve the communication abstraction. This commit also allows all aggregators to work simultaneously, instead of the slightly staggered way of the previous version. 2016-01-17 09:48:49 -06:00
Edgar Gabriel
56e11bfc97 initialize the stripe_size variable as well. 2016-01-17 09:48:49 -06:00
Edgar Gabriel
26c57ef374 separate the size of the buffer used for the shuffle step and the size of the buffer used for a pwritev operation. 2016-01-17 09:48:49 -06:00
Edgar Gabriel
39d5c8c281 further bug fixes silencing a compiler warning and fixing a memory overrun 2016-01-17 09:48:49 -06:00
Edgar Gabriel
2bcae84e11 further debugging 2016-01-17 09:48:49 -06:00
Edgar Gabriel
2bdd6ba17a correctly free some buffers, and ensure that lustre_stripe_size and stripe_count are always read from the file system. 2016-01-17 09:48:49 -06:00
Edgar Gabriel
4bbb22bd0b add a new field to the ompio data structure (stripe_count) and set it correctly on pvfs2 and lustre. 2016-01-17 09:48:49 -06:00
Edgar Gabriel
d282e94b67 add the new dynamic_gen2 component, designed to coexist for now with the original dynamic component 2016-01-17 09:48:49 -06:00
rhc54
b172b8599b Merge pull request #1285 from ggouaillardet/topic/pmix_dist_fix
pmix: do not include automatically generated include/private/autogen/…
2016-01-16 20:49:41 -08:00
Ralph Castain
fc6b260146 Protect against PMIx-based requests that don't come thru the MPI comm_spawn interface 2016-01-16 13:36:06 -08:00
Ralph Castain
4dad5de8ff Silence a couple of warnings - strncpy returns a char*, not an int 2016-01-16 09:44:52 -08:00
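The warning behind commit 4dad5de8ff comes from the fact that `strncpy` returns its destination pointer, not a status code, so the result cannot be tested like an int. A short sketch with a hypothetical wrapper, `copy_name`, that keeps the return type straight and also guarantees termination:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical wrapper: copy src into dst, truncating if necessary.
 * strncpy returns dst (a char *), so the result is kept in a char *
 * rather than being compared against an integer status. */
static char *copy_name(char *dst, size_t dst_size, const char *src)
{
    char *ret;

    assert(dst != NULL && src != NULL && dst_size > 0);

    ret = strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0'; /* strncpy does not terminate on truncation */
    return ret;
}
```

Copying `"openmpi-master"` into an 8-byte buffer leaves `"openmpi"` in the buffer and returns the buffer's own address.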
Jeff Squyres
348ac507c2 usnic: explain why we still have OPAL_HAVE_HWLOC
Put in a comment explaining why btl_usnic_compat.h still defines
OPAL_HAVE_HWLOC, even though master/v2.x no longer does.
2016-01-16 04:11:05 -08:00
Jeff Squyres
0f5fcf9029 usnic: fix common symbol 2016-01-16 03:55:27 -08:00
Jeff Squyres
6c96cb1ad0 find_common_syms: arrgh -- re-add the x bit
Previous commit accidentally removed the x bit from this script.  This
commit puts it back.
2016-01-16 03:53:43 -08:00
Jeff Squyres
60ffe713b8 common syms: whitelist bison-generated common symbols
Bison generates some common symbols that we can't do anything about,
so whitelist them.
2016-01-16 03:53:14 -08:00
Jeff Squyres
96f94f8228 fortran: whitelist deliberate common symbols
The Fortran library has a number of common symbols that are
deliberate, so whitelist them.
2016-01-16 03:53:14 -08:00