24495 Commits

Author SHA1 Message Date
Joshua Ladd
69e3c6f289 Merge pull request #1321 from jladd-mlnx/topic/add-allgatherv-reduce
Adding entry points for Allgatherv, iAllgatherv, Reduce, and iReduce.
2016-01-25 20:46:52 -05:00
Edgar Gabriel
86765d796a Merge pull request #1325 from edgargabriel/pvfs2-configure-logic-3
revamp the pvfs2 configure logic
2016-01-25 14:05:04 -06:00
Edgar Gabriel
9c93df5a47 revamp the pvfs2 configure logic 2016-01-25 12:03:01 -06:00
Nathan Hjelm
aec3060109 Merge pull request #1313 from igor-ivanov/pr/issue-1301
orte/oob: Fix issue #1301
2016-01-25 10:24:26 -07:00
Nathan Hjelm
500e90422d Merge pull request #1320 from hjelmn/osc_rdma_fix
osc/rdma: fix hang when performing large unaligned gets
2016-01-25 09:36:13 -07:00
Mike Dubman
ad3aa3879a Merge pull request #1322 from jladd-mlnx/topic/BufFix-for-coll-hcoll-coll_request
BugFix for coll/hcoll: coll_request must be set to ACTIVE when allocated
2016-01-25 08:36:26 +02:00
Nathan Hjelm
45da311473 osc/rdma: fix hang when performing large unaligned gets
This commit adds code to handle large unaligned gets. There are two
possible code paths for these transactions:

 1) The remote region and local region have the same alignment. In
 this case the get will be broken down into at most three get
 transactions: 1 transaction to get the unaligned start of the region
 (buffered), 1 transaction to get the aligned portion of the region,
 and 1 transaction to get the end of the region.

 2) The remote and local regions do not have the same alignment. This
 should be an uncommon case and is not optimized. In this case a
 buffer is allocated and registered locally to hold the aligned data
 from the remote region. There may be cases where this fails (low
 memory, can't register memory). Those conditions are unlikely and
 will be handled later.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-01-22 21:06:46 -07:00
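
The same-alignment case described in the commit above (an unaligned head, an aligned body, and a tail) can be illustrated with a minimal C sketch. This is not the osc/rdma code: the `get_piece_t` type, the `split_get` function, and the power-of-two `align` parameter are hypothetical names chosen for illustration, and the sketch only computes the byte ranges, it does not issue any transfers or do any buffering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one piece of a split get. */
typedef struct {
    uint64_t offset;  /* start address of the piece */
    size_t   length;  /* number of bytes in the piece */
} get_piece_t;

/* Split [addr, addr + len) into an unaligned head, an aligned body and a
 * tail. `align` must be a power of two. Returns the number of non-empty
 * pieces written to `out` (at most three). Overflow of addr + len is not
 * handled in this sketch. */
static int split_get(uint64_t addr, size_t len, uint64_t align,
                     get_piece_t out[3])
{
    assert(align != 0 && (align & (align - 1)) == 0);

    int n = 0;
    uint64_t end = addr + len;
    uint64_t body_start = (addr + align - 1) & ~(align - 1); /* round up */
    uint64_t body_end = end & ~(align - 1);                  /* round down */

    if (body_start >= body_end) {
        /* No aligned portion: treat the whole region as one piece. */
        if (len > 0) {
            out[n++] = (get_piece_t){ addr, len };
        }
        return n;
    }
    if (body_start > addr) {
        out[n++] = (get_piece_t){ addr, (size_t)(body_start - addr) };
    }
    out[n++] = (get_piece_t){ body_start, (size_t)(body_end - body_start) };
    if (end > body_end) {
        out[n++] = (get_piece_t){ body_end, (size_t)(end - body_end) };
    }
    return n;
}
```

With an assumed alignment of 8 bytes, a 30-byte region starting at address 5 splits into a 3-byte head, a 24-byte aligned body, and a 3-byte tail.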
Valentin Petrov
5e2a2c0755 BugFix for coll/hcoll: coll_request must be set to ACTIVE when allocated
If the state of the request is not set to OMPI_REQUEST_ACTIVE,
MPI_Test would immediately report the request as completed
while hcoll may still be working on it.

Signed-off-by: Joshua Ladd <jladd.mlnx@gmail.com>
2016-01-23 03:23:59 +02:00
Joshua Ladd
e398bf6f3a Adding entry points for Allgatherv, iAllgatherv, Reduce, and iReduce.
Signed-off-by: Joshua Ladd <jladd.mlnx@gmail.com>
2016-01-23 03:09:29 +02:00
Nathan Hjelm
70787d1574 Merge pull request #1319 from hjelmn/osc_rdma_fix
osc/rdma: use correct endpoint for local state
2016-01-22 11:53:06 -07:00
Mike Dubman
89c7fea492 Merge pull request #1315 from alex-mikheev/topic/oshmem_ucx_atomic
OSHMEM/UCX: implements atomic support
2016-01-22 20:46:44 +02:00
Nathan Hjelm
49d2f44b97 osc/rdma: use correct endpoint for local state
If atomics are not globally visible (cpu and nic atomics do not mix)
then a btl endpoint must be used to access local ranks. To avoid
issues that are caused by having the same region registered with
multiple handles osc/rdma was updated to always use the handle for
rank 0. There was a bug in the update that caused osc/rdma to continue
using the local endpoint for accessing the state even though the
pointer/handle are not valid for that endpoint. This commit fixes the
bug.

Fixes open-mpi/ompi#1241.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-01-22 10:41:27 -07:00
Nathan Hjelm
243d973cfe Merge pull request #1316 from hjelmn/datatype_pack_threads
ompi/datatype: make datatype pack thread safe
2016-01-21 20:14:10 -07:00
Nathan Hjelm
0fe4818454 Merge pull request #1318 from hjelmn/osc_rdma_fixes
osc/rdma: disable put aggregation when using threads
2016-01-21 17:54:52 -07:00
Nathan Hjelm
b921831f2b ompi/datatype: make datatype pack thread safe
This commit makes ompi_datatype_get_pack_description thread safe. The
call is used by osc/pt2pt to send the packed description to remote
peers. Before this commit, if MPI_THREAD_MULTIPLE is enabled and the
user calls MPI_Put, MPI_Get, etc., we could hit a race where multiple
threads attempt to store the packed description on the datatype. Since
the code in question is not performance-critical, the threading fix
uses opal_atomic_* calls instead of bothering with OPAL_THREAD_*.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-01-21 17:53:28 -07:00
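
The race described in the commit above (several threads trying to store a lazily built description on the same object) is commonly closed with an atomic compare-and-swap. The sketch below uses C11 `<stdatomic.h>` as a stand-in for the `opal_atomic_*` calls, whose exact signatures are not shown in this log. The `toy_datatype_t` type and both functions are hypothetical and only illustrate the pattern, not the actual change.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a datatype that caches a packed description. */
typedef struct {
    _Atomic(char *) packed_desc;  /* NULL until some thread publishes it */
} toy_datatype_t;

/* Hypothetical builder: returns a freshly allocated description. */
static char *build_description(void)
{
    const char *text = "packed-description";
    char *copy = malloc(strlen(text) + 1);
    if (copy != NULL) {
        strcpy(copy, text);
    }
    return copy;
}

/* Return the cached description, building and publishing it on first use.
 * If two threads race, exactly one pointer is stored; the loser frees its
 * own copy and returns the winner's. */
static const char *get_pack_description(toy_datatype_t *dt)
{
    assert(dt != NULL);

    char *desc = atomic_load(&dt->packed_desc);
    if (desc != NULL) {
        return desc;
    }

    char *fresh = build_description();
    if (fresh == NULL) {
        return NULL;
    }

    char *expected = NULL;
    if (atomic_compare_exchange_strong(&dt->packed_desc, &expected, fresh)) {
        return fresh;          /* this thread published the description */
    }
    free(fresh);               /* another thread won the race */
    return expected;           /* updated to the published pointer */
}
```

The design trades a possible duplicate build for the absence of a lock, which matches the commit's remark that the path is not performance-critical.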
Nathan Hjelm
6180386bea osc/rdma: disable put aggregation when using threads
Optimizing put aggregation in the presence of threads will require a
redesign of the code. For now just ensure that put aggregation is
turned off when MPI_THREAD_MULTIPLE is enabled.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-01-21 15:50:35 -07:00
Ralph Castain
ae3df2968a Add the 1.10.2 NEWS items 2016-01-21 10:00:41 -08:00
Edgar Gabriel
b253d4e887 fix CID 1349739, CID 1349738, CID 1349736 and (probably) CID 1349740 (not entirely sure about the last one, since I don't understand why block[i] is a problem but max_len[i] allocated and treated exactly the same way 1 line later is not). 2016-01-21 08:32:23 -06:00
Alex Mikheev
f627608e42 OSHMEM/UCX: implements atomic support
The ucx atomic component has real code now.
fixes bug in spml ucx arr_procs
removes redundant parameter checks from atomic components.
2016-01-21 16:02:28 +02:00
Jeff Squyres
655b4be97c find_common_syms: update for OS X symbol naming
OS X tends to prefix its symbols with "_".
2016-01-20 16:18:43 -05:00
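
The underscore prefix mentioned in the commit above means that a symbol name read from an OS X object file has to be normalized before it is compared with names from other platforms. The following C sketch shows one way to do that. It is not code from find_common_syms, and the function name and the `prefixed_platform` flag are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return `symbol` without the single leading underscore that the platform
 * adds. When `prefixed_platform` is false the name is returned unchanged,
 * so names that legitimately start with "_" are left alone. */
static const char *normalize_symbol(const char *symbol, bool prefixed_platform)
{
    assert(symbol != NULL);
    if (prefixed_platform && symbol[0] == '_') {
        return symbol + 1;
    }
    return symbol;
}
```

The returned pointer aliases the input string, so no allocation or cleanup is needed.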
Edgar Gabriel
9b8d769e41 will revisit the addproc component later in spring; right now it is constantly in the way of my tests. 2016-01-20 15:05:51 -06:00
Edgar Gabriel
1671604dbc Merge pull request #1307 from edgargabriel/fcoll-dynamic_gen2
Fcoll dynamic gen2
2016-01-20 10:19:56 -06:00
Igor Ivanov
34d861dfe9 orte/oob: Fix issue #1301
Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com>
2016-01-20 12:08:00 +02:00
Gilles Gouaillardet
2adbe273d6 mpi: have MPI_Wtick() return the period (and not the frequency) if OPAL_TIMER_CYCLE_NATIVE 2016-01-20 14:14:47 +09:00
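
MPI_Wtick is specified to return the timer resolution in seconds, so a timer that reports its rate in ticks per second has to be inverted before the value is returned. A minimal C sketch of that conversion follows. The function name and the `frequency_hz` parameter are hypothetical and do not reflect how the Open MPI timer code is structured.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a timer frequency in Hz (ticks per second) into the tick
 * period in seconds, which is the quantity MPI_Wtick reports. */
static double tick_period_seconds(uint64_t frequency_hz)
{
    assert(frequency_hz > 0);
    return 1.0 / (double)frequency_hz;
}
```

Returning the frequency by mistake would make a 1 GHz timer claim a resolution of a billion seconds instead of one nanosecond.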
Jeff Squyres
bd04192087 Merge pull request #1234 from ggouaillardet/poc/travis_gcc5
Poc/travis gcc5
2016-01-18 09:26:56 -05:00
Gilles Gouaillardet
c0f8f2ce32 ompi/dpm: correctly handle sentinels in construct peers
This fix is similar to open-mpi/ompi@4c1ea4a171
and open-mpi/ompi@213b2abde4
2016-01-18 09:57:38 +09:00
Gilles Gouaillardet
7d6b75f3b2 orte_util_snprintf_jobid: return ORTE_SUCCESS or ORTE_ERROR 2016-01-18 09:44:33 +09:00
Edgar Gabriel
a9ca37059a improve the communication abstraction. This commit also allows all aggregators to work simultaneously, instead of the slightly staggered way of the previous version. 2016-01-17 09:48:49 -06:00
Edgar Gabriel
56e11bfc97 initialize the stripe_size variable as well. 2016-01-17 09:48:49 -06:00
Edgar Gabriel
26c57ef374 separate the size of the buffer used for the shuffle step and the size of the buffer used for a pwritev operation. 2016-01-17 09:48:49 -06:00
Edgar Gabriel
39d5c8c281 further bug fixes silencing a compiler warning and fixing a memory overrun 2016-01-17 09:48:49 -06:00
Edgar Gabriel
2bcae84e11 further debugging 2016-01-17 09:48:49 -06:00
Edgar Gabriel
2bdd6ba17a correctly free some buffers, and ensure that lustre_stripe_size and stripe_count are always read from the file system. 2016-01-17 09:48:49 -06:00
Edgar Gabriel
4bbb22bd0b add a new field to the ompio data structure (stripe_count) and set it correctly on pvfs2 and lustre. 2016-01-17 09:48:49 -06:00
Edgar Gabriel
d282e94b67 add the new dynamic_gen2 component, designed to coexist for now with the original dynamic component 2016-01-17 09:48:49 -06:00
rhc54
b172b8599b Merge pull request #1285 from ggouaillardet/topic/pmix_dist_fix
pmix: do not include automatically generated include/private/autogen/…
2016-01-16 20:49:41 -08:00
Ralph Castain
fc6b260146 Protect against PMIx-based requests that don't come thru the MPI comm_spawn interface 2016-01-16 13:36:06 -08:00
Ralph Castain
4dad5de8ff Silence a couple of warnings - strncpy returns a char*, not an int 2016-01-16 09:44:52 -08:00
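
For context on the warning fixed above: the standard C `strncpy` returns its destination pointer, not a status code, so its result should not be assigned to or compared with an `int`. The sketch below shows a correct use. The wrapper name `copy_name` is hypothetical and is not taken from the Open MPI sources.

```c
#include <assert.h>
#include <string.h>

/* Copy `src` into `dst` (capacity `dst_size`), truncating if needed.
 * Returns the destination pointer, exactly as strncpy does. */
static char *copy_name(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    /* strncpy returns dst; it never reports an error through its result. */
    char *result = strncpy(dst, src, dst_size - 1);

    /* strncpy does not add a terminator when src is truncated. */
    dst[dst_size - 1] = '\0';
    return result;
}
```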
Jeff Squyres
348ac507c2 usnic: explain why we still have OPAL_HAVE_HWLOC
Put in a comment explaining why btl_usnic_compat.h still defines
OPAL_HAVE_HWLOC, even though master/v2.x no longer does.
2016-01-16 04:11:05 -08:00
Jeff Squyres
0f5fcf9029 usnic: fix common symbol 2016-01-16 03:55:27 -08:00
Jeff Squyres
6c96cb1ad0 find_common_syms: arrgh -- re-add the x bit
Previous commit accidentally removed the x bit from this script.  This
commit puts it back.
2016-01-16 03:53:43 -08:00
Jeff Squyres
60ffe713b8 common syms: whitelist bison-generated common symbols
Bison generates some common symbols that we can't do anything about,
so whitelist them.
2016-01-16 03:53:14 -08:00
Jeff Squyres
96f94f8228 fortran: whitelist deliberate common symbols
The Fortran library has a number of common symbols that are
deliberate, so whitelist them.
2016-01-16 03:53:14 -08:00
Jeff Squyres
c43d4fd915 find_common_syms: trivial updates
Look for "common_sym_whitelist.txt" files (not
"common_sym_whitelist").  Also, skip blank lines in the
whitelist files, too.
2016-01-16 03:53:14 -08:00
rhc54
ef24f710a7 Merge pull request #1303 from timattox/remove_unused_var
hwloc_base_util.c: Remove newly unused variable 'i'.
2016-01-15 00:35:14 -08:00
Tim Mattox
958de82471 hwloc_base_util.c: Remove newly unused variable 'i'. 2016-01-14 16:35:47 -05:00
Joshua Ladd
18c5a21562 Fix typo in error handling flow. 2016-01-14 22:28:54 +02:00
Joshua Ladd
afa62d8ca1 Addressing reviewers' comments for https://github.com/open-mpi/ompi-release/pull/891 2016-01-14 19:22:27 +02:00
Gilles Gouaillardet
1d38430e43 opal: replace opal_convert_jobid_to_string with opal_snprintf_jobid 2016-01-14 10:39:03 +09:00
Jeff Squyres
e5cf2db3b7 Merge pull request #1291 from jsquyres/pr/hotel-fix
opal hotel: only delete events that have not yet fired
2016-01-13 14:51:51 -05:00