Commit graph

3940 commits

Author SHA1 Message Date
Vasily Filipov
99bd5977bd MTL MXM: small fix in the mxm_req_probe func interface.
This commit was SVN r26850.
2012-07-24 08:46:38 +00:00
George Bosilca
6ebbacb054 Complete the dump function for the SM BTL. Now we can see all fragments in all
the queues as long as the BTL is dump-friendly (only SM right now).

This commit was SVN r26849.
2012-07-24 00:22:22 +00:00
George Bosilca
55bc3c4763 Fix the copyright.
This commit was SVN r26848.
2012-07-24 00:20:24 +00:00
George Bosilca
1ad6c82015 Implement the dump function for the PML OB1.
This commit was SVN r26847.
2012-07-24 00:19:18 +00:00
Samuel Gutierrez
76d94bf9bf Plug leak. Thanks, Nathan.
This commit was SVN r26846.
2012-07-23 21:11:21 +00:00
Samuel Gutierrez
8096852a16 Towards RML-less shared-memory initialization (primarily for eventual BTL
move).  Extended common sm API with: mca_common_sm_module_create and
mca_common_sm_module_attach. Please note that the new routines aren't currently
used -- but will be...

This commit was SVN r26845.
2012-07-23 19:38:13 +00:00
Eugene Loh
10e3dc396b Add a missing return value.
This commit was SVN r26815.
2012-07-20 01:32:06 +00:00
Brian Barrett
2518014037 Fix a number of issues with IN_PLACE
This commit was SVN r26814.
2012-07-19 21:29:43 +00:00
Nathan Hjelm
cd2cbdca09 btl/openib: limit each process to a ppn fraction of the available registered memory when using mellanox hardware (mlx4 and mthca). fixed
This commit was SVN r26811.
2012-07-19 17:52:21 +00:00
Ralph Castain
66fe57f746 Revert r26804 so openib can build again
This commit was SVN r26810.

The following SVN revision numbers were found above:
  r26804 --> open-mpi/ompi@610be870f9
2012-07-19 16:16:38 +00:00
Ralph Castain
44bd855717 Silence warnings
This commit was SVN r26808.
2012-07-19 14:29:32 +00:00
Vasily Filipov
597a422272 MTL: make MXM work with read (in blocking send case) call-backs.
This commit was SVN r26807.
2012-07-19 13:28:06 +00:00
Nathan Hjelm
610be870f9 btl/openib: limit each process to a ppn fraction of the available registered memory when using mellanox hardware (mlx4 and mthca)
This commit was SVN r26804.
2012-07-18 17:29:48 +00:00
Nathan Hjelm
4a97ecbdd2 btl/openib: remove tab characters
This commit was SVN r26803.
2012-07-18 17:29:37 +00:00
Eugene Loh
a3e02fdaff With non-blocking collectives, a "round schedule" could fall on any address
alignment, which typically causes problems on SPARC.  Further, the pointer
manipulation to access elements in a round schedule was clumsy.  This change
introduces macros to facilitate addressing and make it more portable.

This commit was SVN r26802.
2012-07-18 17:08:24 +00:00
Nathan Hjelm
771b427027 udcm: unmonitor the fd BEFORE tearing down the listen qp
This commit was SVN r26800.
2012-07-18 14:22:45 +00:00
Nathan Hjelm
35de50b823 remove the elan btl
This commit was SVN r26798.
2012-07-17 14:51:41 +00:00
Nathan Hjelm
fc1b295606 udcm: evict from the lru of the openib device's grdma mpool if a qp can not be created. Note: there doesn't appear to be a standard way to differentiate between ibv_create_qp failing because the node is out of registered memory and failing because no more qps are available
This commit was SVN r26797.
2012-07-14 01:58:29 +00:00
Nathan Hjelm
3798f38386 do not print out an error message if ibv_reg_mr fails
This commit was SVN r26796.
2012-07-14 01:35:45 +00:00
Abhishek Kulkarni
1878f276cd Replace the pattern while(flag) { opal_progress() }; in the C/R code
with the ORTE_WAIT_FOR_COMPLETION macro.

This commit was SVN r26794.
2012-07-13 23:31:56 +00:00
Nathan Hjelm
4d1920ee87 Fix a bug on 32-bit systems introduced by r26626. This fix ensures that all supported btls (with exception of wv-- shiqing will need to help bring that one up to date with r26626) set the lval in prepare_src/dst when preparing a put or get segment. This fix also ensures a consistent use of lval in put and get for both local and remote segments.
This commit was SVN r26793.

The following SVN revision numbers were found above:
  r26626 --> open-mpi/ompi@249066e06d
2012-07-13 21:19:16 +00:00
Nathan Hjelm
344fe61616 remove assertion in udcm
This commit was SVN r26790.
2012-07-13 15:14:48 +00:00
Jeff Squyres
e719d6ab78 It turns out that "sppp" on the Oracle Mx000 series of servers (where x =
{3, 4, 5, 9}, SPARC VI-based machines) is not a 127.x.y.z interface,
so it needs to stay in the exclude list.

This commit was SVN r26789.
2012-07-13 12:11:41 +00:00
Jeff Squyres
196bc0a53e Update the TCP BTL MCA param btl_tcp_if_exclude default value to use
CIDR notation 127.0.0.1/8 to ignore localhost devices instead of the
imprecise (and not always correct!) "lo,sppp".

This commit was SVN r26788.
2012-07-12 15:13:08 +00:00
Edgar Gabriel
92271c571d set the status field for collective read and write operations.
This commit was SVN r26786.
2012-07-12 10:26:27 +00:00
Nathan Hjelm
b79a61a360 move btl_vader.c to btl_vader_module.c
This commit was SVN r26785.
2012-07-11 20:14:19 +00:00
Terry Dontje
6f3195faca add some missing casts
This commit was SVN r26779.
2012-07-10 18:03:29 +00:00
Nathan Hjelm
05c5c1f412 remove unused i_initiate function from udcm
This commit was SVN r26778.
2012-07-10 17:22:19 +00:00
Jeff Squyres
bb13e21538 Roll back r26730, but bump the default CQ length base up to 1500, not
1000.  Refs trac:3154.

IB/iWarp vendors need to get together to figure out a real fix.

This commit was SVN r26777.

The following SVN revision numbers were found above:
  r26730 --> open-mpi/ompi@5315c91baf

The following Trac tickets were found above:
  Ticket 3154 --> https://svn.open-mpi.org/trac/ompi/ticket/3154
2012-07-10 16:53:27 +00:00
Nathan Hjelm
4c0c937953 Remove use of ompi_ptr_ltop in BTLs. This fixes a crash seen on big-endian 32-bit platforms with MPI one-sided.
This commit was SVN r26776.
2012-07-10 16:18:53 +00:00
George Bosilca
7d6006a5a6 Fix various compiler warnings.
This commit was SVN r26774.
2012-07-10 15:57:15 +00:00
Abhishek Kulkarni
2ca8292f46 Fix a typo in the sm btl (related to CMA support).
This commit was SVN r26772.
2012-07-10 00:12:05 +00:00
Abhishek Kulkarni
5c58a1c9c1 Fix C/R support in the trunk.
Among other things, this patch deals with the following issues:
* fix ompi-checkpoint argument parsing
* ompi-restart -showme prints an extraneous "Restarted child with PID" 
  message. Move around the debug statement to avoid this.
* fixes for the state machine changes

This commit was SVN r26770.
2012-07-09 23:34:13 +00:00
Terry Dontje
43314776ae add cast to correct a type mismatch warning
This commit was SVN r26767.
2012-07-09 18:32:39 +00:00
Edgar Gabriel
d18dad7109 remove the file io_ompio_coll_offset. These routines were the predecessors of
the routines in io_ompio_coll_array. No need to keep both versions around.

This commit was SVN r26766.
2012-07-09 17:12:46 +00:00
Edgar Gabriel
8ae22cacc1 - remove two functions that were not used anymore
- change the location where we mark the file view as contiguous and the
  condition on how it is determined to be contiguous
- remove the unnecessary include statements

This commit was SVN r26763.
2012-07-08 12:57:17 +00:00
George Bosilca
46e808c940 Release the lock a little later.
This commit was SVN r26762.
2012-07-08 12:57:00 +00:00
George Bosilca
ee1410a8d2 A size_t variable should not start at a negative value.
This commit was SVN r26761.
2012-07-08 12:56:33 +00:00
George Bosilca
4326537fe9 Remove compiler warning about uninitialized variable.
This commit was SVN r26760.
2012-07-08 00:07:52 +00:00
George Bosilca
57f08ec2c8 Make it compile!
This commit was SVN r26759.
2012-07-08 00:06:13 +00:00
Brian Barrett
58413fa1e4 * properly setup communication infrastructure for libnbc.
* Prevent infinite recursion in progress loop.

Should fix improper barrier eugene was seeing.

This commit was SVN r26758.
2012-07-06 13:59:03 +00:00
Brian Barrett
e0ceabd486 Need to set MPI_ERROR in the status before calling ompi_request_complete.
This commit was SVN r26757.
2012-07-06 01:14:35 +00:00
Brian Barrett
27d45ad550 Implement reduce_scatter_block and ireduce_scatter_block, although possibly
not nearly as optimal as they should be.

This commit was SVN r26756.
2012-07-05 22:11:48 +00:00
Terry Dontje
025b42bbb7 corrected the change of pval to lval introduced in r26626
This commit was SVN r26751.

The following SVN revision numbers were found above:
  r26626 --> open-mpi/ompi@249066e06d
2012-07-05 13:31:24 +00:00
George Bosilca
ec760454a6 Cleaning ...
This commit was SVN r26747.
2012-07-04 21:22:13 +00:00
George Bosilca
63278df92d Prevent the coll SM from looking for information about remote procs
during the init phase. This information is only available at a 
later stage.

This commit was SVN r26746.
2012-07-04 21:15:40 +00:00
Terry Dontje
95a3b4a423 corrected the change of pval to lval introduced in r26626
This commit was SVN r26732.

The following SVN revision numbers were found above:
  r26626 --> open-mpi/ompi@249066e06d
2012-07-03 18:52:18 +00:00
Terry Dontje
1895ca2bc4 corrected a typo (lval instead of pval) introduced in r26626
This commit was SVN r26731.

The following SVN revision numbers were found above:
  r26626 --> open-mpi/ompi@249066e06d
2012-07-03 17:46:43 +00:00
Jeff Squyres
5315c91baf Fixes trac:3152: slightly more advanced than the patch on the ticket:
* If the MCA param btl_openib_cq_size is set to 0 (which is the
   default), use the device CQ max size. Otherwise, use the MCA param
   value (and never adjust it again).
 * Remove the CQ size adjustment code. Since we default to max CQ
   size, there really isn't much point in having it any more. I think
   people setting an absolute CQ size is going to be rare, so let's
   not do anything fancy with it.
 * If the MCA param value is larger than what the device supports,
   print a warning (only once per process) and default to using the
   device max
 * Add a BTL_VERBOSE displaying which CQ size we used

This commit was SVN r26730.

The following Trac tickets were found above:
  Ticket 3152 --> https://svn.open-mpi.org/trac/ompi/ticket/3152
2012-07-03 16:49:59 +00:00
Nathan Hjelm
9f3717959e remove sync step from udcm as it really isn't necessary
This commit was SVN r26724.
2012-07-02 22:54:44 +00:00
Brian Barrett
d56de80b5d * Properly initialize handle variable as a request (since the coll_libnbc_request contains everything an NBC_Handle used to contain). Not sure how this slipped through...
This commit was SVN r26710.
2012-07-02 16:39:42 +00:00
Brian Barrett
7e67bfa175 Use OMPI's ops instead of the libnbc ops.
This commit was SVN r26708.
2012-07-02 15:47:22 +00:00
Pavel Shamis
f7664b3814 1. Adding 2 new components:
ofacm - generic connection manager for IB interconnects.
ofautils - IB common utilities and compatibility code

2. Updating OpenIB configure code

- ORNL & Mellanox Teams 

This commit was SVN r26707.
2012-07-02 15:20:12 +00:00
Yevgeny Kliteynik
0e28fa984b Remove dead code that was related to ticket #2971
This commit was SVN r26701.
2012-07-02 11:19:09 +00:00
Nathan Hjelm
a847df9ba5 ugni: fix eager get
This commit was SVN r26699.
2012-06-29 15:43:29 +00:00
Jeff Squyres
5d030278e1 Refs trac:3130: Per comment 8 on the ticket, this MX patch fixes the cases
where the MX BTL and MTL are stepping on each other regarding the
mpool.  Thanks to Yong Qin for assistance in tracking this down.

This commit was SVN r26698.

The following Trac tickets were found above:
  Ticket 3130 --> https://svn.open-mpi.org/trac/ompi/ticket/3130
2012-06-29 13:52:40 +00:00
Jeff Squyres
b936229b54 Refs trac:3130: fix the openib BTL to properly set the memalign malloc
hook early in the setup, but ''not'' during the component register
function.  And then properly unset it if it was set.

This commit was SVN r26697.

The following Trac tickets were found above:
  Ticket 3130 --> https://svn.open-mpi.org/trac/ompi/ticket/3130
2012-06-29 13:51:36 +00:00
Jeff Squyres
f3a8722360 Fix comment.
This commit was SVN r26696.
2012-06-29 01:38:04 +00:00
Brian Barrett
0b887ab5a1 * Remove unneeded prototype that was causing compile issues anyway
* Use proper tag space (the negatives below the blocking communicators)
  instead of the point-to-point space
* Use the PML interface instead of the MPI interface, since the MPI
  interface 1) shouldn't be used by components and 2) doesn't like
  negative tags

This commit was SVN r26693.
2012-06-28 16:52:03 +00:00
Edgar Gabriel
b0954a6a3e set the internal OMPIO file pointer to the end of the file if file has been
opened using the APPEND mode.

This commit was SVN r26692.
2012-06-28 15:15:47 +00:00
Edgar Gabriel
32b0dfed31 * set the status _ucount field correctly for individual read and write
operations
* removing a lingering reference to the ylib fcoll component, which will not
be part of the 1.7 branch.

This commit was SVN r26691.
2012-06-28 14:43:56 +00:00
Edgar Gabriel
be6ea52bb4 some further cleanup of resources in case of an error.
This commit was SVN r26690.
2012-06-28 13:58:23 +00:00
Ralph Castain
a1344bc5c0 Add missing header to tarball
This commit was SVN r26689.
2012-06-28 13:07:18 +00:00
Brian Barrett
32e70b691a Re-enable non-blocking collectives in libnbc after finding issue with the definition of
NBC_CACHE_SCHEDULE not being propagated to all uses.

This commit was SVN r26686.
2012-06-27 22:08:19 +00:00
Edgar Gabriel
b7a72feb1d minor code cleanup and make the MPI_MODE_DELETE_ON_CLOSE work
This commit was SVN r26685.
2012-06-27 20:54:58 +00:00
Brian Barrett
d85fdd2605 temporarily back out r26682 and r26683 until I can figure out why they cause crashes during shutdown
This commit was SVN r26684.

The following SVN revision numbers were found above:
  r26682 --> open-mpi/ompi@15a30af11f
  r26683 --> open-mpi/ompi@f6ea4b7234
2012-06-27 19:32:53 +00:00
Brian Barrett
f6ea4b7234 Remove now unneeded header file
This commit was SVN r26683.
2012-06-27 18:43:40 +00:00
Brian Barrett
15a30af11f Turn on all the non-blocking collectives provided by libnbc...
This commit was SVN r26682.
2012-06-27 18:32:57 +00:00
Brian Barrett
3933d0a8f0 Ibarrier works! :)
This commit was SVN r26680.
2012-06-27 15:58:17 +00:00
Ralph Castain
0dfe29b1a6 Roll in the rest of the modex change. Eliminate all non-modex API access of RTE info from the MPI layer - in some cases, the info was already present (either in the ompi_proc_t or in the orte_process_info struct) and no call was necessary. This removes all calls to orte_ess from the MPI layer. Calls to orte_grpcomm remain required.
Update all the orte ess components to remove their associated APIs for retrieving proc data. Update the grpcomm API to reflect transfer of set/get modex info to the db framework.

Note that this doesn't recreate the old GPR. This is strictly a local db storage that may (at some point) obtain any missing data from the local daemon as part of an async methodology. The framework allows us to experiment with such methods without perturbing the default one.

This commit was SVN r26678.
2012-06-27 14:53:55 +00:00
Josh Hursey
28681deffa Backout the ORCA commit. :(
There is a linking issue on Mac OSX that needs to be addressed before this is able to come back into the trunk.

This commit was SVN r26676.
2012-06-27 01:28:28 +00:00
Josh Hursey
32050f026f protect the ORTE_CHECK_PMI define in the OMPI layer for --no-orte builds
This commit was SVN r26674.
2012-06-27 00:28:37 +00:00
Josh Hursey
542330e3a7 Commit of ORCA: Open MPI Runtime Collaborative Abstraction
This is a runtime interposition project that sits between the OMPI and ORTE layers in Open MPI.

The project is described on the wiki:
  https://svn.open-mpi.org/trac/ompi/wiki/Runtime_Interposition

And on this email thread:
  http://www.open-mpi.org/community/lists/devel/2012/06/11109.php

This commit was SVN r26670.
2012-06-26 21:42:16 +00:00
Edgar Gabriel
288d044097 get rid of the fcache framework. It was not being used as originally intended.
This commit was SVN r26668.
2012-06-26 19:53:26 +00:00
Nathan Hjelm
086000ce8d remove mpool/rdma
This commit was SVN r26665.
2012-06-26 15:56:07 +00:00
Nathan Hjelm
37c624ee43 prepare to delete mpool/rdma
This commit was SVN r26664.
2012-06-26 15:55:23 +00:00
Brian Barrett
7bdeafb772 Start bringing in libnbc. .ompi_ignored, as there's still a long way to go
This commit was SVN r26658.
2012-06-25 22:38:06 +00:00
Edgar Gabriel
6a2dd16ee3 cleaning up the usage of CFLAGS vs. CPPFLAGS. Thanks Jeff for helping with
that!

This commit was SVN r26655.
2012-06-25 20:32:58 +00:00
Nathan Hjelm
2dbe630138 fix more udapl warnings/errors
This commit was SVN r26648.
2012-06-25 15:18:50 +00:00
Brian Barrett
b9e8e4aeb9 * Initial merge of the non-blocking collectives interface. No implementation of
the back-end yet, coming real soon now, need to solve some tag issues first.

This commit was SVN r26641.
2012-06-22 20:54:12 +00:00
Nathan Hjelm
6a0ccf41e6 one more file
This commit was SVN r26638.
2012-06-22 18:21:57 +00:00
Ralph Castain
e6f3586415 Remove the orte notifier framework, per discussion at the devel meeting and follow-up with Jeff (who took the action item)
This commit was SVN r26637.
2012-06-22 18:09:23 +00:00
Nathan Hjelm
03f00c42b8 fix udapl compile problems from r26626
This commit was SVN r26635.

The following SVN revision numbers were found above:
  r26626 --> open-mpi/ompi@249066e06d
2012-06-22 14:20:45 +00:00
Nathan Hjelm
77f7171186 remove hdr_segkey from OMPI_OSC_RDMA_BASE_HDR_NTOH and OMPI_OSC_RDMA_BASE_HDR_HTON
This commit was SVN r26634.
2012-06-22 14:15:26 +00:00
Nathan Hjelm
249066e06d Timeout! Per RFC update the BTL interface to hide segment keys. All BTLs (with the exception of wv), all relevant PMLs, and osc/rdma have been updated for the new interface.
This commit was SVN r26626.
2012-06-21 17:09:12 +00:00
Ralph Castain
1e1c755fbc Remove nonexistent Windows file
This commit was SVN r26624.
2012-06-21 01:37:36 +00:00
Nathan Hjelm
e3bc6c0f73 btl/ugni: use grdma mpool to take advantage of shared lru
This commit was SVN r26623.
2012-06-20 23:03:59 +00:00
Nathan Hjelm
3d86b5055e btl/ugni: don't call opal_convertor_pack if there is nothing to pack
This commit was SVN r26622.
2012-06-20 23:01:37 +00:00
Nathan Hjelm
f5fd87a446 mpool/grdma: temporarily remove support for remote (local) process eviction and remove ignore.
This commit was SVN r26621.
2012-06-20 23:00:25 +00:00
Yevgeny Kliteynik
df783c0472 Precise speed of FDR and EDR
This commit was SVN r26614.
2012-06-17 07:06:37 +00:00
Nathan Hjelm
fbd1636ea4 fix seg fault when size == 0
This commit was SVN r26612.
2012-06-15 16:58:23 +00:00
Rolf vandeVaart
d6881f3a4f Rename one function. Add some new functions that can support asynchronous CUDA copies.
This commit was SVN r26611.
2012-06-15 16:56:30 +00:00
Brian Barrett
defaefd59e Clean up resources from flowcontrol on shutdown
This commit was SVN r26605.
2012-06-14 22:38:35 +00:00
Brian Barrett
946ec4cd97 * Update usage of PtlHandleIsEqual to match new semantic
* Properly set message to MPI_MESSAGE_NULL in the right places
* Fix double free of buffer for non-contiguous blocking sends
* Remove useless debugging output

This commit was SVN r26604.
2012-06-14 22:24:23 +00:00
Nathan Hjelm
0d13cbf11c ob1: bug fix. put fallback on send never actually worked. fixed.
This commit was SVN r26602.
2012-06-14 17:29:58 +00:00
Terry Dontje
634fc278d9 Fix issue with sctp config scripts not detecting netinet/in.h dependency. Also removing tabs from sctp m4 file
This commit was SVN r26599.
2012-06-13 10:38:28 +00:00
Nathan Hjelm
a809881f78 ob1: reset the converter after a failed sendi before trying send
This commit was SVN r26597.
2012-06-12 15:44:47 +00:00
Ralph Castain
269cb2b8d9 Some cleanup to remove calls to opal_progress when running with orte progress threads, and to ensure that all orte-related events are in the orte event base.
This commit was SVN r26591.
2012-06-11 19:59:53 +00:00
Brian Barrett
31279eb641 Fix segfault with long expected messages when using the rndv protocol. We were
freeing the ME before the get to grab the long part of the message.

This commit was SVN r26589.
2012-06-11 16:37:01 +00:00
Brian Barrett
7406ef1241 Make all the PMI components depend on the common pmi library and properly
install the common pmi library

This commit was SVN r26588.
2012-06-11 15:58:09 +00:00