Commit graph

4237 commits

Author SHA1 Message Date
Joshua Ladd
16beaa3878 This fixes the nasty configure.m4 hack that was added long ago and not removed. My fault for not catching earlier. I've also removed the '.ompi_ignore' in coll/hcoll. Throwing this to Nathan for review. Upon successful review, this should be added to cmr:v1.7:reviewer=hjelmn
This commit was SVN r28753.
2013-07-11 09:55:46 +00:00
Jeff Squyres
28dac8010b The hcoll component configure.m4 commits multiple sins, and breaks
many builds.  I am temporarily .ompi_ignore'ing this component until
it can be fixed by its owner.

 * It calls AC_MSG_ERROR, which configure.m4 scripts are ''never''
   supposed to do.  If you don't want to build, then call $2.
 * All static and --disable-dlopen builds are broken; they fall afoul
   of whatever test configure.m4 is doing and therefore error out of
   configure entirely (vs. simply disabling the hcoll component).
 * There appear to be multiple shell scripting errors in the
   configure.m4.  Here's the output of "./configure --disable-dlopen":
{{{
--- MCA component coll:hcoll (m4 configuration macro)
checking for MCA component coll:hcoll compile mode... static
checking --with-hcoll value... simple ok (unspecified)
./configure: line 421: test: basic: integer expression expected
configure: error: Can not use coll/hcoll and coll/ml (static build)
   simultaneously. You have two options:
                1. Use static build & disable ml with:
   --enable-mpi-no-build=coll-ml
                2. Use dso build for ML & disable ml at runtime: -mca
   coll self
./configure: line 310: return: basic: numeric argument required
./configure: line 320: exit: basic: numeric argument required
}}}

Finally, all of these configure.m4 errors aside, I don't understand
why there is a ''compile-time'' exclusion between the hcoll and ml
components.  Why isn't this a ''run-time'' decision?  Having what
seems to be an unnecessary compile-time exclusion goes against the
general Open MPI philosophy.

Note: Open MPI 1.7 is also broken in all the same ways.  I suggest
that the RM's .ompi_ignore hcoll over there, too.

Mellanox: please fix.

This commit was SVN r28748.
2013-07-10 16:03:15 +00:00
Jeff Squyres
80145742a3 Fix typo in comment
This commit was SVN r28747.
2013-07-10 15:13:08 +00:00
Jeff Squyres
ea94936531 First cut at assigning some fine-grained "levels" to MCA parameters
for the SM and TCP BTLs, as well as the mca_btl_base_param_register()
function (which registers MCA params for all BTLs).

The guidelines in
https://svn.open-mpi.org/trac/ompi/wiki/MCAParamLevels were used to
pick these levels.

This commit was SVN r28746.
2013-07-10 00:47:52 +00:00
Aurelien Bouteiller
e1066143a4 rename ompi_free_list operations to _mt, as per discussions at last face to face meeting
This commit was SVN r28734.
2013-07-08 22:07:52 +00:00
Brian Barrett
ecbbf888d3 * Update Portals 4 MTL's multi-md code to be a bit cleaner (no if statements
in the path) and not create MDs due to boundary crossing
* Add the same logic to the Coll component

This commit was SVN r28733.
2013-07-08 21:27:37 +00:00
Brian Barrett
84aeb6a6a5 Update request alloc to use free list get instead of free list wait.
This commit was SVN r28729.
2013-07-05 20:24:43 +00:00
George Bosilca
dc9352faf6 Remove some unused variables.
This commit was SVN r28726.
2013-07-05 13:31:54 +00:00
George Bosilca
8b01c3da33 Slightly reorder the code.
This commit was SVN r28725.
2013-07-05 13:29:29 +00:00
George Bosilca
483ed8da8c Remove an unused variable resulting from the removal of the last parameter of
the OMPI_FREE_LIST_GET macro.

This commit was SVN r28723.
2013-07-04 09:19:00 +00:00
George Bosilca
c9e5ab9ed1 Our macros for the OMPI-level free list had one extra argument, a possible return
value to signal that the operation of retrieving the element from the free list
failed. However in this case the returned pointer was set to NULL as well, so the
error code was redundant. Moreover, this was a continuous source of warnings when
the picky mode is on.

The attached patch removes the rc argument from the OMPI_FREE_LIST_GET and
OMPI_FREE_LIST_WAIT macros, and changes the code to check whether the item is
NULL instead of using the return code.

This commit was SVN r28722.
2013-07-04 08:34:37 +00:00
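The idea in the commit above (drop the redundant return code and treat a NULL item as the only failure signal) can be sketched roughly as follows. This is a minimal illustration, not the Open MPI implementation: `demo_item_t`, `demo_free_list_t`, `DEMO_FREE_LIST_GET`, and `demo_take_payload` are hypothetical names invented for this sketch, and the real macros operate on different types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real Open MPI types or macros. */
typedef struct demo_item {
    struct demo_item *next;
    int payload;
} demo_item_t;

typedef struct {
    demo_item_t *head;   /* singly linked stack of free items */
} demo_free_list_t;

/* Two-argument form: no rc output. If the list is empty, item is set to
 * NULL, and that NULL is the only thing the caller has to check. */
#define DEMO_FREE_LIST_GET(list, item)                  \
    do {                                                \
        (item) = (list)->head;                          \
        if (NULL != (item)) {                           \
            (list)->head = (item)->next;                \
        }                                               \
    } while (0)

/* Returns the payload of the next free item, or -1 if the list is empty. */
int demo_take_payload(demo_free_list_t *list)
{
    demo_item_t *item;

    assert(NULL != list);
    DEMO_FREE_LIST_GET(list, item);
    if (NULL == item) {          /* replaces the old "if (rc != SUCCESS)" test */
        return -1;
    }
    return item->payload;
}
```

With a single output there is no second variable that can be left unused, which is presumably the kind of picky-mode warning the commit message refers to.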
Brian Barrett
d3b49535b5 Only allow communication from the same user, since we don't have job-level
protection.

This commit was SVN r28715.
2013-07-03 17:29:02 +00:00
Jeff Squyres
d1ce64f049 Fix some "malloc of 0 bytes" warnings
This commit was SVN r28713.
2013-07-03 12:05:33 +00:00
Brian Barrett
81efd0e3cf Properly shut down Portals collective component
This commit was SVN r28707.
2013-07-02 22:07:27 +00:00
Brian Barrett
133dafd3dc First take at Barrier and Ibarrier, both of which seem to work.
This commit was SVN r28706.
2013-07-02 21:42:10 +00:00
Brian Barrett
c4577723ed fix misuse of param api
This commit was SVN r28705.
2013-07-02 21:41:42 +00:00
Brian Barrett
c9a8217af6 Portals 4 doesn't have a BTL, so we need to default to the MTL rather than finding some stupid slow BTL. This selection logic sucks.
This commit was SVN r28704.
2013-07-02 21:18:04 +00:00
Brian Barrett
e4698f5cd4 Shell of the Portals 4 collectives component
This commit was SVN r28703.
2013-07-02 15:23:55 +00:00
Joshua Ladd
5d2d5e958c Deleting garbage I accidentally committed. Thanks, Nathan!
This commit was SVN r28698.
2013-07-01 22:50:54 +00:00
Joshua Ladd
d7a50343bf Per the details and schedule outlined in the attached RFC, Mellanox Technologies would like to CMR the new 'coll/hcoll' component. This component enables Mellanox Technologies' latest HPC middleware offering - 'Hcoll'. 'Hcoll' is a high-performance, standalone collectives library with support for truly asynchronous, non-blocking, hierarchical collectives via hardware offload on supporting Mellanox HCAs (ConnectX-3 and above.) To build the component, libhcoll must first be installed on your system, then you must configure OMPI with the configure flag: '--with-hcoll=/path/to/libhcoll'. Subsequent to installing, you may select the 'coll/hcoll' component at runtime as you would any other coll component, e.g. '-mca coll hcoll,tuned,libnbc'. This has been reviewed by Josh Ladd and should be added to cmr:v1.7:reviewer=jladd
This commit was SVN r28694.
2013-07-01 22:39:43 +00:00
George Bosilca
ae190246df Oops, thanks Jeff for noticing.
This commit was SVN r28693.
2013-07-01 17:51:52 +00:00
George Bosilca
e665cda6c2 Add the empty basic component where the function pointer from the
base will be copied over. Without such a decoy component the
entire framework will not function correctly.

This commit was SVN r28692.
2013-07-01 17:47:44 +00:00
George Bosilca
dc1e68c3c1 Remove the item from the list before releasing it.
This commit was SVN r28691.
2013-07-01 16:54:48 +00:00
George Bosilca
702e669636 Remove a [very] annoying warning.
This commit was SVN r28690.
2013-07-01 16:49:13 +00:00
George Bosilca
5fae72b9aa Add the MPI 2.2 MPI_Dist_graph functionality.
This patch completely reshapes the way we deal with topologies. Where
our topologies were mainly storage components (they were not capable
of creating the new communicator), the new version is built around a
[possibly] common representation (in mca/topo/topo.h), but the functions
to attach and retrieve the topological information are specific to each
component. As a result the ompi_create_cart and ompi_create_graph functions
become useless and have been removed.

In addition to adding the internal infrastructure to manage the topology
information, it updates the MPI interface, and the debuggers support and
provides all Fortran interfaces.

This commit was SVN r28687.
2013-07-01 12:40:08 +00:00
George Bosilca
b82abf6bef Silence a compiler warning.
This commit was SVN r28686.
2013-07-01 11:40:42 +00:00
Rolf vandeVaart
adda653fc1 Fix two bugs from previous commit.
This commit was SVN r28684.
2013-06-28 16:32:51 +00:00
Rolf vandeVaart
850d325f32 Adjust how search is done for dynamic load of library. CUDA only.
This commit was SVN r28683.
2013-06-27 22:13:25 +00:00
Jeff Squyres
e3d0782788 Move the assignment after the bozo check.
This commit was SVN r28669.
2013-06-22 12:38:32 +00:00
Rolf vandeVaart
5ebb74bee3 Fix case where amount of data sent is less than expected. Otherwise, we will get a hang when running the RGET protocol.
Reviewed by hjelm,bosilca.

This commit was SVN r28667.
2013-06-21 18:35:16 +00:00
Joshua Ladd
0b5c1f2ea8 Add 'generic' support for PMI2 (previously, we checked for PMI2 only on Cray systems.) If your resource manager (e.g. SLURM) has support for PMI2, then the --with-pmi configure flag will enable its usage. If you don't have PMI2, then you will fallback to regular old PMI1. This patch was submitted by Ralph Castain and reviewed and pushed by Josh Ladd. This should be added to cmr:v1.7:reviewer=jladd
This commit was SVN r28666.
2013-06-21 15:28:14 +00:00
Mike Dubman
d1c82994be fix: detect threading model to take appropriate flow in mxm
This commit was SVN r28648.
2013-06-16 08:40:06 +00:00
Jeff Squyres
a0b27f5b28 Better comment than what was submitted in r28614.
This commit was SVN r28631.

The following SVN revision numbers were found above:
  r28614 --> open-mpi/ompi@9556310bd0
2013-06-13 20:52:44 +00:00
Mike Dubman
9556310bd0 cosmetic: add comment with rationale for malloc.h include
This commit was SVN r28614.
2013-06-12 05:58:32 +00:00
Nathan Hjelm
9b1f32bf12 BTL: add flags for signaled BTL operations
As per discussion in the June 2013 developer meeting these
flags will be used by the PML in the future to request
asynchronous progress on an operation. The naming was chosen
to reflect that a BTL supports this mode (MCA_BTL_FLAG_SIGNALED)
and that a descriptor should "signal" the remote side to wake
up and progress the message (MCA_BTL_DES_FLAG_SIGNAL).

Future commits will update OB1 to take advantage of this
feature when performing the RDMA get or RDMA rendezvous
protocols.

This commit was SVN r28612.
2013-06-11 21:52:20 +00:00
Mike Dubman
d18b3ae1a7 fix malloc deprecation error with gcc 4.6.3 on ubuntu/fedora
This commit was SVN r28605.
2013-06-09 18:13:16 +00:00
George Bosilca
d789423d34 Typo.
This commit was SVN r28603.
2013-06-08 10:44:02 +00:00
Vishwanath Venkatesan
0b727f84da Avoid malloc of zero bytes, add a check and avoid it.
This commit was SVN r28597.
2013-06-06 14:08:57 +00:00
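A guard of the kind the commit above describes might look like the following. This is a generic sketch, not the code from the commit: `demo_alloc_array` is a hypothetical helper, and the overflow check is an addition of this sketch rather than something the commit message mentions.

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocates room for count elements of elem_size bytes each.
 * Returns NULL without calling malloc when the request is for zero bytes,
 * since malloc(0) may return either NULL or a unique pointer depending on
 * the implementation. */
void *demo_alloc_array(size_t count, size_t elem_size)
{
    if (0 == count || 0 == elem_size) {
        return NULL;                      /* nothing to allocate */
    }
    if (count > SIZE_MAX / elem_size) {
        return NULL;                      /* count * elem_size would overflow */
    }
    return malloc(count * elem_size);
}
```

Callers then have to distinguish "zero elements requested" from "out of memory" themselves, for example by checking the count before treating a NULL result as an error.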
Edgar Gabriel
2d4655a05a Logic has been revised compared to the previous implementation.
This commit was SVN r28594.
2013-06-05 23:47:42 +00:00
Edgar Gabriel
03c1db7a3a fix the calculation of the UNIFORM flag.
This commit was SVN r28593.
2013-06-05 23:18:50 +00:00
Vishwanath Venkatesan
7d6a05982a Removing the gather_array based on the flag UNIFORM FVIEW for read all operations (dynamic/static),
+ Disabling Timing data extraction by default in dynamic write all

This commit was SVN r28592.
2013-06-05 21:35:37 +00:00
Vishwanath Venkatesan
55878674d7 1. Removing the allgather_array based on the flag UNIFORM FVIEW. This is not really an optimization.
2. Fixing some of the debug printf's; these are outdated.

This commit was SVN r28591.
2013-06-05 21:30:15 +00:00
Jeff Squyres
713e3aa3db Refs trac:3626: that ticket specifically refers to the v1.6 branch; this
commit is the trunk version of what is needed for #3626.

Add the "ignore_device" field to the INI file.  This allows us to
specifically list devices that should be ignored by the openib BTL
(such as the Intel Phi, at least as of May 2013 -- see #3626).  

Also add the Intel Phi to the ini file, and set its ignore_device=1.

Finally, add the concept of counting intentionally ignored verbs
devices.  Devices are ignored for one of two reasons:

 * If the number of allowed ports on that device is 0 (i.e., if
   if_include/if_exclude was set such that we're intentionally
   ignoring this device).
 * If the INI ignore_device field for this device is set to 1.

Once we have the count of devices that were intentionally ignored,
only show the "Hey, there's verbs devices that you're not using!"
show_help message if there are devices that were ''unintentionally''
ignored.

This commit was SVN r28589.

The following Trac tickets were found above:
  Ticket 3626 --> https://svn.open-mpi.org/trac/ompi/ticket/3626
2013-06-05 12:12:09 +00:00
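The counting rule from the commit above can be sketched as below. It assumes a simplified device record: `demo_device_t`, its fields, and `demo_should_warn_unused` are hypothetical names for this illustration and are not taken from the openib BTL source. In particular, the `in_use` field is an assumption about how "unused" would be tracked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a verbs device. */
typedef struct {
    int  allowed_ports;  /* ports left after if_include/if_exclude filtering */
    bool ini_ignore;     /* true when the INI file sets ignore_device=1 */
    bool in_use;         /* true when the BTL actually uses the device */
} demo_device_t;

/* Returns true only if at least one unused device was NOT intentionally
 * ignored, i.e. the "devices that you're not using" message is warranted. */
bool demo_should_warn_unused(const demo_device_t *devs, size_t n)
{
    size_t unused = 0;
    size_t intentionally_ignored = 0;

    assert(NULL != devs || 0 == n);
    for (size_t i = 0; i < n; ++i) {
        if (devs[i].in_use) {
            continue;
        }
        ++unused;
        /* The two intentional reasons listed in the commit message. */
        if (0 == devs[i].allowed_ports || devs[i].ini_ignore) {
            ++intentionally_ignored;
        }
    }
    return unused > intentionally_ignored;
}
```

For example, a node with one active HCA and one device marked `ignore_device=1` produces no warning under this rule, while an unused device with available ports and no INI entry still does.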
Jeff Squyres
3019b7a3f8 Oops! Remove duplicate registration.
This commit was SVN r28588.
2013-06-05 11:55:19 +00:00
Jeff Squyres
1de00b17ad Properly check the return status from registering the MCA params.
This commit was SVN r28587.
2013-06-05 11:53:18 +00:00
Jeff Squyres
d692aba672 Remove the DR PML. It was abandoned long ago. It had a nice life,
a few papers, and now a decent demise with respect.  

This commit was SVN r28582.
2013-06-04 19:36:16 +00:00
Edgar Gabriel
87b3782b7f arghh, copy-and-paste error, status->_ucount has to be set to 0 not max_data for count=0.
This commit was SVN r28576.
2013-05-30 22:00:29 +00:00
Edgar Gabriel
9daec82f17 - make a fileview of 0 bytes work in ompio
- fixes the bug reported in ticket 3619 (which is already closed) also for ompio

This commit was SVN r28575.
2013-05-30 21:33:13 +00:00
Rolf vandeVaart
3d1d158a80 Do not abort in BTL. Rather, callback into PML error function. Thanks George for review.
This commit was SVN r28559.
2013-05-23 18:45:23 +00:00
Nathan Hjelm
721779d7ab Per RFC: remove old MCA parameter system.
This commit was SVN r28541.
2013-05-20 15:36:13 +00:00