1
1
Граф коммитов

20516 Коммитов

Автор SHA1 Сообщение Дата
Jeff Squyres
75230ee574 spml_yoda_getreq.c: fix compile error related to r32196
This commit was SVN r32197.

The following SVN revision numbers were found above:
  r32196 --> open-mpi/ompi@a14e0f10d4
2014-07-10 17:17:19 +00:00
Nathan Hjelm
a14e0f10d4 Per RFC: Remove des_src and des_dst members from the
mca_btl_base_segment_t and replace them with des_local and des_remote

This change also updates the BTL version to 3.0.0. This commit does
not represent the final version of BTL 3.0.0. More changes are coming.

In making this change I updated all of the BTLs as well as BTL user's
to use the new structure members. Please evaluate your component to
ensure the changes are correct.

RFC text:

This is the first of several BTL interface changes I am proposing for
the 1.9/2.0 release series.

What: Change naming of btl descriptor members. I propose we change
des_src and des_dst (and their associated counts) to be des_local and
des_remote. For receive callbacks the des_local member will be used to
communicate the segment information to the callback. The proposed change
will include updating all of the doxygen in btl.h as well as updating
all BTLs and BTL users to use the new naming scheme.

Why: My btl usage makes use of both put and get operations on the same
descriptor. With the current naming scheme I need to ensure that there
is consistency beteen the segments described in des_src and des_dst
depending on whether a put or get operation is executed. Additionally,
the current naming prevents BTLs that do not require prepare/RMA matched
operations (do not set MCA_BTL_FLAGS_RDMA_MATCHED) from executing
multiple simultaneous put AND get operations. At the moment the
descriptor can only be used with one or the other. The naming change
makes it easier for BTL users to setup/modify descriptors for RMA
operations as the local segment and remote segment are always in the
same member field. The only issue I forsee with this change is that it
will require a little more work to move BTL fixes to the 1.8 release
series.

This commit was SVN r32196.
2014-07-10 16:31:15 +00:00
Mike Dubman
1b51e472ac MXM: update min required version for mxm as 2.1+
This commit was SVN r32193.
2014-07-10 13:16:57 +00:00
Jeff Squyres
9428d1c6d2 AUTHORS: Add Howard Pritchard
This commit was SVN r32192.
2014-07-10 11:16:49 +00:00
George Bosilca
2861419661 Correct trivial typos in man files and FUNC_NAME variables.
Patch provided by Fujitsu (Kawashima, Takahiro)

cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r32190.
2014-07-10 01:47:23 +00:00
Ralph Castain
796f57f709 Protect against problems if someone passes us thru a pipe and then abnormally terminates the pipe early
This commit was SVN r32189.
2014-07-09 22:41:53 +00:00
Ralph Castain
7beb8f6799 Silence used-before-set var warning
This commit was SVN r32188.
2014-07-09 22:37:47 +00:00
Ralph Castain
60da1456d9 Silence unused var warning
This commit was SVN r32187.
2014-07-09 22:37:22 +00:00
Ralph Castain
354a1d07a6 Silence warning about set-but-unused var
This commit was SVN r32186.
2014-07-09 22:37:05 +00:00
Howard Pritchard
0bc7405e07 Subject: fix name conflict when both ugni and scif installed on system
Description: This mod fixes two name conflicts between the ugni and scif btls.
References:4771
Closes:4771

cmr=v1.8.2:reviewer=hjelmn

This commit was SVN r32183.
2014-07-09 19:33:58 +00:00
Mike Dubman
6efc9a7329 opal mca: externalize delimiter for base_env_list mca parameters
fixed by Elena, reviewed by MikeD

This commit was SVN r32182.
2014-07-09 18:55:49 +00:00
Nathan Hjelm
8ab9b628b5 Add missing file to tarball
This commit was SVN r32181.
2014-07-09 18:01:05 +00:00
Joshua Ladd
801e2cb544 Fix error and warning messages after reverting
the mca_base_env_list to being semicolon delimited.

This commit was SVN r32179.
2014-07-09 14:46:19 +00:00
Nathan Hjelm
56ad231b7c coll/ml: temporarily disable binding check
This commit was SVN r32178.
2014-07-09 14:39:49 +00:00
Alex Mikheev
0dfd321b59 OSHMEM: fixes error handling in memheap
Memory registration is aborted on first failure.
Already registered memory is freed and
correct error code is returned.

Memory deregistration always suceeeds 

reviewed by miked
cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32175.
2014-07-09 09:26:25 +00:00
Alex Mikheev
c3e017c190 OSHMEM: refactoring of fix wrong btl/sm processing
Use exising fields of mkey struct to identify 'shared memory'
segments.

mkey.u.key is now always initialized to MAP_SEGMENT_SHM_INVALID instead
of 0

reviewed by Mike and Igor
cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32174.
2014-07-09 08:57:27 +00:00
Gilles Gouaillardet
3743c27c7a Handle error case in mca_spml_yoda_register
this commit fixes error propagation in :
 - mca_memheap_base_reg
 - mca_memheap_base_dereg

cmr=v1.8.2:reviewer=amikheev:ticket=4747

This commit was SVN r32173.

The following Trac tickets were found above:
  Ticket 4747 --> https://svn.open-mpi.org/trac/ompi/ticket/4747
2014-07-09 07:19:25 +00:00
Mike Dubman
32aeba4bdf opal: revert env_list separator to ";" from "+"
This commit was SVN r32172.
2014-07-09 05:38:51 +00:00
Joshua Ladd
057370364d Opal: Add a new MCA variable type "version_string". Also add a
new flag to ompi_info that allows a user to print all MCA variables of a specific type.  

 --type version_string

This command will print all MCA variables of type version_string.

This feature was developed by Elena Shipunova and was reviewed by Josh Ladd.

This commit was SVN r32166.
2014-07-09 01:37:23 +00:00
Jeff Squyres
cde2b67284 VERSION: remove libmpi_usempi_ignore_tkr_so_version; it isn't used anywhere
This commit was SVN r32164.
2014-07-09 01:15:25 +00:00
Joshua Ladd
30da6d3a17 Opal: add a new MCA parameter that allows the user to specify a list of environment variables. This parameter will become the standard mechanism by which environment variables are set for OMPI applications replacing the -x option.
mpirun ... -x env_foo1=val1 -x env_foo2 -x env_foo3=val3  should now be expressed as

mpirun ... -mca mca_base_env_list env_foo1=val1+env_foo2+env_foo3=val3. 

The motivation for doing this is so that a list of environment variables may be set via standard MCA mechanisms such as mca parameter files, amca lists, etc. 

This feature was developed by Elena Shipunova and was reviewed by Josh Ladd.

This commit was SVN r32163.
2014-07-09 00:38:25 +00:00
Jeff Squyres
5081f958a6 fortran: fix MPI_Win_allocate_shared and MPI_Win_shared_query
Several problems with MPI_Win_allocate_shared and MPI_Win_shared_query
were discovered in a code review.  This commit fixes them:

* Add _cptr versions of both subroutines in mpif-h, use-mpi-tkr, and
  use-mpi-ignore-tkr directories
* Fix case of PMPI weak symbols for both C implementations
* Add MPI and PMPI f08 implementations of both subroutines (there is
  no _cptr version in the mpi_f08 module)
* Fixed _f08 suffix on the f08 module of both subroutines

cmr=v1.8.2:ticket=trac:4736

This commit was SVN r32162.

The following Trac tickets were found above:
  Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
2014-07-09 00:10:04 +00:00
Nathan Hjelm
1eb6ac5e80 mpit: update the return code check for mca_base_var_get
mca_base_var_get now can return OPAL_ERR_NOT_FOUND if a variable no
longer exists. This commit updates the return code check to ensure
the correct MPI_T error code is returned to the user.

cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r32161.
2014-07-08 21:17:47 +00:00
Nathan Hjelm
b6abe68972 osc/rdma: check for more types of window access violations
This commit adds a check to see if the target is in an access epoch. If
not we return OMPI_ERR_RMA_SYNC. This fixes test_start3 in the onesided
test suite. The cost of this extra check is 1 byte/peer for the boolean
flag indicating that the peer is in an access epoch.

I also fixed a problem where mupliple unexpected post messages are not
correctly handled.

cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r32160.
2014-07-08 21:11:12 +00:00
Jeff Squyres
d63cf04d2e btl_usnic_map.c: Arrgh! Forgot to svn add this file.
cmr=v1.8.2:ticket=trac:4773

This commit was SVN r32159.

The following Trac tickets were found above:
  Ticket 4773 --> https://svn.open-mpi.org/trac/ompi/ticket/4773
2014-07-08 20:09:31 +00:00
Jeff Squyres
86f747c627 mca_base_var: use ERR_NOT_FOUND when var is inactive
After discussion with Nathan, change the ERR_VALUE_OUT_OF_BOUNDS to be
ERR_NOT_FOUND, for two reasons:

1. It's consistent with other uses of ERR_NOT_FOUND in the code.
1. In this case, we could have just looked up a variable that is
   basically a "hole" -- e.g., var indexes 0, 1, 2 are valid, and 4,
   5, and 5 are valid, but 3 is invalid (e.g., 3 was de-registered --
   remember that MPI_T explicitly does not allow re-using indexes).
   So returning ERR_OUT_OF_BOUNDS seems weird -- returning
   ERR_NOT_FOUND seems a bit more natural.

cmr=v1.8.2:ticket=trac:4587

This commit was SVN r32158.

The following Trac tickets were found above:
  Ticket 4587 --> https://svn.open-mpi.org/trac/ompi/ticket/4587
2014-07-08 20:00:18 +00:00
Jeff Squyres
1e17ab461b usnic: add btl_usnic_connectivity_map MCA param to output link information
If the btl_usnic_connectivity_map MCA param is set to a non-NULL
value, then each MPI process will output a file named
<prefix>-<hostname>.pid<pid>.job<jobid>.mcwrank<MCW rank>.txt.  Its
contents will detail which usNIC device(s) (and therefore which
link(s)) are being used to communicate with each peer MPI process.

Here is a sample output file (named
mpi005.pid26071.job1640759297.mcwrank0.txt):

{{{
device=usnic_0,interface=eth4,ip=10.10.0.5/16,mac=24:57:20:05:20:00,mtu=9000
device=usnic_1,interface=eth5,ip=10.2.0.5/16,mac=24:57:20:05:21:00,mtu=9000
device=usnic_2,interface=eth6,ip=10.3.0.5/16,mac=24:57:20:05:50:00,mtu=9000
peer=1,hostname=mpi006,device=usnic_0@peer_ip=10.10.0.6/16@peer_mac=24:57:20:06:20:00,device=usnic_1@peer_ip=10.2.0.6/16@peer_mac=24:57:20:06:21:00,device=usnic_2@peer_ip=10.3.0.6/16@peer_mac=24:57:20:06:50:00
peer=2,hostname=mpi007,device=usnic_0@peer_ip=10.10.0.7/16@peer_mac=24:57:20:07:20:00,device=usnic_1@peer_ip=10.2.0.7/16@peer_mac=24:57:20:07:21:00,device=usnic_2@peer_ip=10.3.0.7/16@peer_mac=24:57:20:07:50:00
peer=3,hostname=mpi008,device=usnic_0@peer_ip=10.10.0.8/16@peer_mac=24:57:20:08:20:00,device=usnic_1@peer_ip=10.2.0.8/16@peer_mac=24:57:20:08:21:00,device=usnic_2@peer_ip=10.3.0.8/16@peer_mac=24:57:20:08:50:00
}}}

Reviewed by Reese Faucette

cmr=v1.8.2

This commit was SVN r32156.
2014-07-08 19:14:46 +00:00
Jeff Squyres
df82810d03 opal_path_nfs.c test: skip fuse filesystems
Linux statfs(2) lies about the type of fuse filesystems (it reports
fuse.encfs as an NFS filesystem).  So just skip fuse filesystems in
this test until/if we ever care to add some kind of workaround.

Refs trac:4767

cmr=v1.8.2:reviewer=rhc

This commit was SVN r32152.

The following Trac tickets were found above:
  Ticket 4767 --> https://svn.open-mpi.org/trac/ompi/ticket/4767
2014-07-08 13:30:49 +00:00
Nathan Hjelm
309a6cf951 coll/ml: set n_resources to 0 when destructing an lmngr
Also keep track of the allocation base so we free the correct pointer
when cleaning up.

cmr=v1.8.2:reviewer=manjugv

This commit was SVN r32151.
2014-07-07 15:11:26 +00:00
Ralph Castain
1d5f8297c8 In case someone wants to use the pstat framework, we need to have the usual open/close functions that were lost in the conversion to the new framework system
This commit was SVN r32150.
2014-07-07 13:53:29 +00:00
Ralph Castain
b85471ac13 Add support for "double" to the DSS
This commit was SVN r32149.
2014-07-07 13:51:58 +00:00
Jeff Squyres
e498e20d46 openmpi.spec: several minor fixes
Thanks to Oliver Lahaye for supplying the patch.

This commit was SVN r32147.
2014-07-07 12:06:55 +00:00
Ralph Castain
5bb5b22573 When a user asks for cpus/rank > 1 and only has one slot, we need to ensure we always map at least one process when they don't tell us -np
cmr=v1.8.2:reviewer=rhc:subject=correct num_procs in corner case

This commit was SVN r32142.
2014-07-04 17:00:35 +00:00
Ralph Castain
832fa4a028 Ensure that the progress thread tracker properly cleans up the blocking event, if set. Also, use the blocking event to help wake up the progress thread for quick shutdown as some threads can be blocked in a long-running call to select.
This commit was SVN r32141.
2014-07-04 14:55:51 +00:00
Gilles Gouaillardet
bd72628a9d Cleanup : fix the cornercase with MPI_PROC_NULL persistent requests.
This commit was SVN r32140.
2014-07-04 07:20:44 +00:00
Gilles Gouaillardet
8d3bea2771 Fix the cornercase with MPI_PROC_NULL persistent requests.
This corner case is now handled in the pml so the same code
is invoked for both MPI_Start and MPI_Startall.
This also correctly report an error if MPI_Startall is invoked twice
on a MPI_PROC_NULL persistent request.

This commit was SVN r32139.
2014-07-04 04:58:52 +00:00
Edgar Gabriel
a16e4c5bf9 As discussed during the Open MPI meeting, make ompio the default parallel I/O
library on the trunk in order to expose it to more testing.

This commit was SVN r32138.
2014-07-03 20:04:58 +00:00
Ralph Castain
f6d4b4c11b As discussed at the OMPI developer's meeting, add functions to start, stop, and restart libevent-driven progress threads. Critical NOTE: if you don't have a file descriptor event defined for your progress thread, it will spin hard! Accordingly, the "start progress thread" function has a boolean parameter you can use to request that the function automatically create one for you.
This commit was SVN r32137.
2014-07-03 18:56:46 +00:00
Ralph Castain
9f97c74ba3 Silence warning
This commit was SVN r32136.
2014-07-03 17:29:04 +00:00
Jeff Squyres
e022dd30bc usnic: EHOSTUNREACH means there is no route
ibv_create_ah() can also return EHOSTUNREACH, which means that there
is no route to the peer.  Treat that as a non-fatal warning.

Reviewed by Reese Faucette.

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32135.
2014-07-03 17:19:30 +00:00
Jeff Squyres
81edddff61 usnic: make this show_help message like the others
There's no need for the port number (since usNIC has no port numbers),
and make the wording the same as other help messages.

Reviewed by Reese Faucette.

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32134.
2014-07-03 17:17:34 +00:00
Ralph Castain
356e7ea904 Move all collective id's into the attributes and let the job pack/unpack take care of them instead of singling them out. Add the envars just prior to forking the children instead of into the launch message itself. Remove a few #if CR as the attributes functionality can handle this condition now.
This commit was SVN r32133.
2014-07-03 15:58:13 +00:00
Ralph Castain
0a4639308e Remove a potential race condition - we'll cleanup the local children when we are all done
This commit was SVN r32132.
2014-07-03 14:13:43 +00:00
Jeff Squyres
852af8b834 ompi_mpi_abort: fix corner cases, simplify logic
I recently found a case where ompi_mpi_abort() segv's:

{{{
$ mpirun --mca btl non_existent_btl_name ...
}}}

In this case, the BML init fails because we have no paths to any
peers.  It calls ompi_mpi_abort(), but this is before ompi_comm_self
has been setup.  ompi_mpi_abort() assumes that if the comm parameter
is != NULL, it can be used.  But since we aborted so early in
MPI_INIT, that's a false assumption.

(note that this isn't happening on v1.8 because the check for
INIT/FINALIZE in ompi_mpi_abort() is a little different.  Hence: this
is a trunk issue -- at least for now)

When fixing this problem, I noticed a few other problems in ompi_mpi_abort():

* the group access was incorrect (it didn't use accessor functions)
* it wasn't clear that ORTE's ompi_rte_abort_peers() returns
  NOT_IMPLEMENTED and falls through down to ompi_rte_abort()
* the check for my proc in the communicator was a little more
  complicated than necessary
* the logic for checking for aborts early in MPI_INIT wasn't right
* some comments were stale
* the hostname output in error messages would be NULL if MPI_FINALIZE
  had been invoked
* it was possible to abort, but still exit with a 0 status

This commit fixes all of the above problems, and makes the logic a
little more straightforward.  Thanks to Ralph Castain and George
Bosilca for the assists with this patch.

This commit was SVN r32125.
2014-07-03 02:38:27 +00:00
George Bosilca
843ef1fcb0 ompi_mpi_abort had one extra argument that was never used. Clean it up.
This commit was SVN r32124.
2014-07-03 00:34:44 +00:00
George Bosilca
2883adcdf3 Remove useless variables.
This commit was SVN r32123.
2014-07-03 00:30:54 +00:00
Ralph Castain
149810f02c Per request from Jeff, slightly modify the show_help message as the precise name of the NUMA-containing packages differs based on OS and distro
cmr=v1.8.2:reviewer=jsquyres:subject=modify show_help message

This commit was SVN r32122.
2014-07-02 14:46:00 +00:00
Gilles Gouaillardet
8a2a0293fd fix sort_devs_by_distance in btl/openib
no need to #include <math.h> ...

cmr=v1.8.2:reviewer=miked:ticket=4759

This commit was SVN r32121.

The following Trac tickets were found above:
  Ticket 4759 --> https://svn.open-mpi.org/trac/ompi/ticket/4759
2014-07-02 08:08:10 +00:00
Gilles Gouaillardet
134eee1c4f fix sort_devs_by_distance in btl/openib
The distances as returned by hwloc_get_whole_distance_matrix_by_type are typ float.
This patch handle all distances as float.

cmr=v1.8.2:reviewer=miked

This commit was SVN r32120.
2014-07-02 07:56:40 +00:00
Ryan Grant
a1d312343b This commit fixes trac:4681 - ibm c_fence_lock hangs
cmr=v1.8.2:reviewer=tkordenbrock:subject=Portals4/MTL hanging fix

This commit was SVN r32113.

The following Trac tickets were found above:
  Ticket 4681 --> https://svn.open-mpi.org/trac/ompi/ticket/4681
2014-07-01 17:03:03 +00:00