Ralph Castain
dc0bb0571b
Record the number of heartbeats received each period for diagnostic purposes
This commit was SVN r24714.
2011-05-20 00:21:33 +00:00
Ralph Castain
69dce0ec10
Minor heartbeat cleanups
This commit was SVN r24713.
2011-05-19 21:27:44 +00:00
Ralph Castain
c3df95dd13
Prevent failure due to race condition during abnormal termination
This commit was SVN r24712.
2011-05-19 21:27:05 +00:00
Ralph Castain
b0f47e6f59
Allow orte_iof to not be opened
This commit was SVN r24711.
2011-05-19 21:26:30 +00:00
Ralph Castain
1f3911cc8b
Add a new proc state
This commit was SVN r24710.
2011-05-19 21:25:58 +00:00
Ralph Castain
b47ec2ee87
Remove lingering references to opal_profile option
This commit was SVN r24709.
2011-05-18 18:27:29 +00:00
Ralph Castain
9678e62613
Fix possible corruption of environ. Thanks to Ariel Burton and Peter Thompson for finding it!
This commit was SVN r24708.
2011-05-18 16:25:35 +00:00
Ralph Castain
ddf4914094
Plug fd leak
This commit was SVN r24707.
2011-05-18 13:46:27 +00:00
Ralph Castain
502cc0747f
My my...clean up a disconnect between the man pages and how we implemented comm_spawn_multiple: we allow an info key per executable. Also fix the -host and -add-host info keys, which are supposed to accept comma-separated lists.
This commit was SVN r24706.
2011-05-17 20:12:31 +00:00
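To illustrate the comma-separated host lists mentioned in r24706 above, here is a minimal Python sketch. It is not Open MPI's C implementation: the helper names `parse_host_list` and `hosts_per_executable` are hypothetical, and the info key name `"host"` (without the leading dash) is an assumption about how the key is spelled. Each executable's info object is modeled as a plain dict.

```python
def parse_host_list(value):
    """Split a comma-separated host string into a list of host names.

    Surrounding whitespace and empty entries are dropped.
    (Hypothetical helper for illustration only.)
    """
    return [host.strip() for host in value.split(",") if host.strip()]


def hosts_per_executable(infos, key="host"):
    """Return one host list per executable.

    `infos` holds one dict per executable, mirroring the idea that
    comm_spawn_multiple accepts an info key per executable.
    The key name "host" is an assumption.
    """
    return [parse_host_list(info.get(key, "")) for info in infos]
```

For example, `hosts_per_executable([{"host": "node1,node2"}, {}])` yields one list with two hosts for the first executable and an empty list for the second.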
Ralph Castain
d34bab541d
Remove the ompi-profiler tool and its attendant ompi-probe program. Also remove the grpcomm basic component since its only function was to support profiled clusters, which nobody was doing. :-(
This commit was SVN r24704.
2011-05-17 03:30:25 +00:00
Ralph Castain
486041f89d
Get rid of the annoying error messages when setrlimit fails, which seems to be a constant problem on the Mac. Don't use the changed values for max limits if the setrlimit call failed.
This commit was SVN r24703.
2011-05-17 03:27:43 +00:00
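The setrlimit handling described in r24703 above can be sketched as follows. This is a Python illustration using the standard `resource` module, not Open MPI's C code, and the helper name `raise_soft_limit` is hypothetical. It tries to raise a soft limit to the hard limit; if the call fails, it stays silent and keeps reporting the old value rather than the one it asked for.

```python
import resource


def raise_soft_limit(which):
    """Try to raise the soft limit for `which` to its hard limit.

    Returns the soft limit that is in effect afterwards.
    (Hypothetical helper for illustration only.)
    """
    soft, hard = resource.getrlimit(which)
    try:
        resource.setrlimit(which, (hard, hard))
    except (ValueError, OSError):
        # The call failed: print nothing and keep using the old
        # value instead of the one we tried to set.
        return soft
    return hard
```

A caller would use the returned value, for example `raise_soft_limit(resource.RLIMIT_CORE)`, instead of assuming the requested limit took effect.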
Mike Dubman
36db9c6233
* updated copyrights
* added support for non-contig data layout in FCA
This commit was SVN r24702.
2011-05-16 14:43:11 +00:00
Ralph Castain
4083e23073
Complete cleanup of pstat linux
This commit was SVN r24701.
2011-05-16 14:08:08 +00:00
Ralph Castain
08c3ecd608
Handle the case where memory stats are in different order, or don't exist on that platform
This commit was SVN r24700.
2011-05-16 13:32:42 +00:00
Brian Barrett
d8b7ea315e
First take at implementing rndv and triggered protocols
This commit was SVN r24699.
2011-05-13 05:57:16 +00:00
Brian Barrett
43902221cc
* Fix bad argument to PtlGet in long receive
* Fix bad params when configuring ME for long unexpected
This commit was SVN r24698.
2011-05-13 03:56:03 +00:00
Ralph Castain
a3e43594a4
Extend node stats to include additional memory info. Change "darwin" pstat module to "test" as we don't really know how to get all the stat info for darwin.
Add a new OPAL_ERROR_LOG macro similar to the ORTE_ERROR_LOG one.
This commit was SVN r24692.
2011-05-08 14:45:16 +00:00
Ralph Castain
c160f5d5a2
Add ability to specify mcast interfaces by name
This commit was SVN r24691.
2011-05-08 14:42:48 +00:00
Brian Barrett
be8a126600
At Josh's request, make example MPI extension use the init/fini so that
the feature is actually documented.
This commit was SVN r24686.
2011-05-05 18:31:07 +00:00
Brian Barrett
8376e0e507
Use free list get instead of wait; this is a constrained resource that will never come back, as it scales with the number of windows and not some more dynamic resources...
This commit was SVN r24685.
2011-05-05 17:19:59 +00:00
Jeff Squyres
d1d2cd0a87
Make the description of mca_btl_openib_cq_size more accurately reflect
what it really is/does.
cmr:v1.5.4:kliteyn cmr:v1.4.4:reviewer=kliteyn
This commit was SVN r24684.
2011-05-05 13:10:11 +00:00
Christopher Yeoh
bab59bda76
Fixes trac:2767: Recursive locking when ROMIO is used with THREAD_MULTIPLE
This commit was SVN r24681.
The following Trac tickets were found above:
Ticket 2767 --> https://svn.open-mpi.org/trac/ompi/ticket/2767
2011-05-04 06:31:42 +00:00
George Bosilca
34abbce82c
More accurate and trustworthy descriptions of the netmask exist.
Interested readers can quench their curiosity either with one
of the Richard Stevens books (ISBN 9780201633467) or the
Wikipedia page (http://en.wikipedia.org/wiki/Subnetwork ).
This commit was SVN r24680.
2011-05-03 21:59:51 +00:00
Thomas Herault
fb3fd8fd0e
Items belonging to peer_send_queue are mca_oob_tcp_msg_t *, which are obtained through an opal_freelist.
They shouldn't be released, but returned to the freelist.
This commit was SVN r24679.
2011-05-03 21:03:09 +00:00
Brian Barrett
3fed6053a4
don't build BTLs when using portals SHMEM. It breaks things :)
This commit was SVN r24678.
2011-05-03 20:17:19 +00:00
George Bosilca
c3c231b5ae
Unsigned datatypes should be redirected to their unsigned counterparts
in the OPAL layer. Thanks to Yossi Etigin for the patch.
cmr:v1.5
This commit was SVN r24677.
2011-05-03 12:53:52 +00:00
Shiqing Fan
b4e5826403
Exclude two non-mca files that shouldn't be compiled under windows.
This commit was SVN r24669.
2011-05-02 14:39:22 +00:00
Ralph Castain
9df207aa51
Fix the case where a user supplies the -xterm option, which requires that we leave ssh sessions attached.
This commit was SVN r24668.
2011-05-02 12:39:55 +00:00
Ralph Castain
257473ebca
Remove an extra "break" - thanks to Rainer for pointing it out.
This commit was SVN r24667.
2011-05-02 12:20:37 +00:00
Ralph Castain
138928fcf4
Use ports as multicast channels instead of networks so we avoid stepping into reserved spaces.
This commit was SVN r24666.
2011-04-29 18:46:40 +00:00
Ralph Castain
7b29a6153e
Cover all the netmask values
This commit was SVN r24665.
2011-04-29 17:56:15 +00:00
Shiqing Fan
9e90ade864
Missed one file from the last commit.
This commit was SVN r24664.
2011-04-29 14:44:02 +00:00
Shiqing Fan
4490fdbd34
Add the initial support for MinGW and MSYS.
Correctly check the dependencies of MSYS env.
Set up configure include and lib path for building the package.
Update a few more CMake scripts.
This commit was SVN r24663.
2011-04-29 14:42:07 +00:00
Rolf vandeVaart
3e8878f556
Add in missing header file
This commit was SVN r24662.
2011-04-29 13:20:59 +00:00
Ralph Castain
c78531ce8a
Don't free the envar that gets putenv'd as that messes up the environ
This commit was SVN r24660.
2011-04-29 08:50:29 +00:00
Rolf vandeVaart
2634f6401a
Add some basic support for sending and receiving CUDA device memory. Feature is disabled by default and has no effect on default code paths.
This commit was SVN r24659.
2011-04-28 23:05:55 +00:00
Ralph Castain
0ff0d20e72
Grr...get the prefix right - need to strip the bin out of absolute path to mpirun.
This commit was SVN r24658.
2011-04-28 22:20:55 +00:00
Ralph Castain
6af2677fb8
Check for both absolute-path-to-mpirun and -prefix being specified. If the two differ, print out a warning and ignore -prefix. If they are the same, or only one was given, then proceed as directed.
This commit was SVN r24657.
2011-04-28 22:12:41 +00:00
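The prefix rule described in r24657 above, together with the "strip the bin" fix from r24658, can be sketched as follows. This is a Python illustration of the decision logic only, not the C code in mpirun. The function name `resolve_prefix` is hypothetical, and the sketch assumes POSIX-style paths with the executable installed as `<prefix>/bin/mpirun`.

```python
import posixpath
import warnings


def resolve_prefix(mpirun_path, prefix_opt):
    """Pick the install prefix to use.

    mpirun_path: how mpirun was invoked (may or may not be absolute).
    prefix_opt:  value given to -prefix, or None if it was not given.
    (Hypothetical helper for illustration only.)
    """
    path_prefix = None
    if mpirun_path and posixpath.isabs(mpirun_path):
        # /opt/ompi/bin/mpirun -> /opt/ompi (strip "mpirun", then "bin")
        path_prefix = posixpath.dirname(posixpath.dirname(mpirun_path))

    if path_prefix and prefix_opt:
        if posixpath.normpath(path_prefix) != posixpath.normpath(prefix_opt):
            # Both were given and they differ: warn and ignore -prefix.
            warnings.warn("absolute path to mpirun and -prefix differ; "
                          "ignoring -prefix")
        return path_prefix

    # Only one (or neither) was given: proceed as directed.
    return path_prefix or prefix_opt
```

With these assumptions, `resolve_prefix("/opt/ompi/bin/mpirun", None)` gives `/opt/ompi`, and a bare `mpirun` with `-prefix /usr/local` gives `/usr/local`.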
Jeff Squyres
c8c8044b92
Minor updates to NEWS
This commit was SVN r24654.
2011-04-28 21:40:44 +00:00
Brad Benton
0876e3549d
Added a couple more items for 1.4.4.
This commit was SVN r24651.
2011-04-28 18:47:34 +00:00
Brad Benton
359de0bde6
Add 1.4.4 items.
This commit was SVN r24650.
2011-04-28 15:48:51 +00:00
Jeff Squyres
0882d636a6
Oops -- need string.h, too (for strcasecmp).
This commit was SVN r24649.
2011-04-28 15:42:35 +00:00
Jeff Squyres
7362a0730a
Change the default to "none". David Singleton raises a good point
that enabling "local_only" by default could cause excessive
by-NUMA-node paging and/or OOMs (rather than allowing memory
allocations to spill over to other NUMA nodes).
This brought home the very real-world example of people buying servers
with more processors/cores than they need, just to get more memory.
We wouldn't want Badness to occur in such scenarios by default.
Instead, let people turn on "only allow memory allocations on my local
NUMA node" if their application would benefit from it.
This commit was SVN r24648.
2011-04-28 15:16:39 +00:00
Ralph Castain
b586f2952e
Arggg...revert r24645. I knew those fields were there for a reason...sigh.
This commit was SVN r24647.
The following SVN revision numbers were found above:
r24645 --> open-mpi/ompi@e4732110da
2011-04-28 15:07:00 +00:00
Ralph Castain
859aaab93d
In the case of direct-launched processes running under slurm, psm requires that the pre_condition_transports MCA param be set. This is normally computed by mpirun and inserted into each proc's environ, but that doesn't work here.
So separate out the printing of that key, and let the individual procs generate it in a way that ensures they all get the same result.
This commit was SVN r24646.
2011-04-28 13:54:33 +00:00
Ralph Castain
e4732110da
Remove a couple more stale fields
This commit was SVN r24645.
2011-04-28 00:26:38 +00:00
Ralph Castain
39369f8807
Remove stale fields from global objects - have been moved to the layer that actually uses them
This commit was SVN r24644.
2011-04-28 00:20:49 +00:00
Ralph Castain
8858d9a40e
Add a marker for other layers to use in defining data types
This commit was SVN r24643.
2011-04-28 00:19:35 +00:00
Jeff Squyres
7b48042ffd
Commit patch from upstream hwloc: r3482. Fixes some compiler
warnings.
This commit was SVN r24641.
The following SVN revision numbers were found above:
r3482 --> open-mpi/ompi@2435be8d49
2011-04-27 17:08:15 +00:00
Jeff Squyres
d134ff9b4d
Refs trac:2698
After a long period of development with many starts and stops, we
finally got this where we wanted it.
This commit introduces 2 new MCA params (note that the
"maffinity_libnuma_policy" MCA param introduced by r24290 was removed
when libnuma support was removed). Remember that maffinity policies
are only in effect when paffinity is enabled -- i.e., when processes
are bound to processors!
* '''maffinity_base_alloc_policy:''' Policy that determines how
general memory allocations are bound after MPI_INIT. A value of
"none" means that no memory policy is applied. A value of
"local_only" means that all memory allocations will be restricted
to the local NUMA node where each process is placed. Note that
operating system paging policies are unaffected by this setting.
For example, if "local_only" is used and local NUMA node memory is
exhausted, a new memory allocation may cause paging.
* '''maffinity_base_bind_failure_action:''' What Open MPI will do if
it explicitly tries to bind memory to a specific NUMA location, and
fails. Note that this is a different case than the general
allocation policy described by maffinity_base_alloc_policy. A
value of "warn" means that Open MPI will warn the first time this
happens, but allow the job to continue (possibly with degraded
performance). A value of "error" means that Open MPI will abort
the job if this happens.
This needs at least a little soak time on the trunk before going to
v1.5.
This commit was SVN r24639.
The following SVN revision numbers were found above:
r24290 --> open-mpi/ompi@afa654746c
The following Trac tickets were found above:
Ticket 2698 --> https://svn.open-mpi.org/trac/ompi/ticket/2698
2011-04-26 13:31:07 +00:00
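As a rough illustration of the two MCA params introduced in r24639 above, the following Python sketch builds the environment variables a launcher script might export before starting a job. The parameter names and their allowed values come from the commit message; the `OMPI_MCA_` prefix is Open MPI's usual convention for passing MCA params through the environment. The table `ALLOWED_VALUES` and the helper `mca_environment` are hypothetical and not part of Open MPI.

```python
# Allowed values as described in the r24639 commit message.
ALLOWED_VALUES = {
    "maffinity_base_alloc_policy": {"none", "local_only"},
    "maffinity_base_bind_failure_action": {"warn", "error"},
}


def mca_environment(params):
    """Map MCA param names to OMPI_MCA_* environment variables.

    Raises ValueError if one of the two maffinity params is given a
    value the commit message does not list.
    (Hypothetical helper for illustration only.)
    """
    env = {}
    for name, value in params.items():
        allowed = ALLOWED_VALUES.get(name)
        if allowed is not None and value not in allowed:
            raise ValueError(
                "%s must be one of %s" % (name, sorted(allowed)))
        env["OMPI_MCA_" + name] = value
    return env
```

For example, `mca_environment({"maffinity_base_alloc_policy": "local_only"})` returns a dict with the single key `OMPI_MCA_maffinity_base_alloc_policy`. As the commit message notes, these policies only take effect when processes are bound to processors.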