Commit graph

15645 commits

Author SHA1 Message Date
Ralph Castain
1f3911cc8b Add a new proc state
This commit was SVN r24710.
2011-05-19 21:25:58 +00:00
Ralph Castain
b47ec2ee87 Remove lingering references to opal_profile option
This commit was SVN r24709.
2011-05-18 18:27:29 +00:00
Ralph Castain
9678e62613 Fix possible corruption of environ. Thanks to Ariel Burton and Peter Thompson for finding it!
This commit was SVN r24708.
2011-05-18 16:25:35 +00:00
Ralph Castain
ddf4914094 Plug fd leak
This commit was SVN r24707.
2011-05-18 13:46:27 +00:00
Ralph Castain
502cc0747f My my...cleanup a disconnect between the man pages and how we implemented comm_spawn_multiple. We allow an info key per executable. Also fix the -host and -add-host info keys - they are supposed to accept comma-separated lists.
This commit was SVN r24706.
2011-05-17 20:12:31 +00:00
Ralph Castain
d34bab541d Remove the ompi-profiler tool and its attendant ompi-probe program. Also remove the grpcomm basic component since its only function was to support profiled clusters, which nobody was doing. :-(
This commit was SVN r24704.
2011-05-17 03:30:25 +00:00
Ralph Castain
486041f89d Get rid of the annoying error messages when setrlimit fails, which seems to be a constant problem on the Mac. Don't use the changed values for max limits if the setrlimit call failed.
This commit was SVN r24703.
2011-05-17 03:27:43 +00:00
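The behavior described in r24703 (try to change a limit, stay quiet if that fails, and keep using the old value) can be sketched roughly as follows. This is a minimal Python illustration, not the Open MPI code: the helper name `raise_soft_limit` and its injectable `getter`/`setter` parameters are assumptions made for the example, and only the standard `resource` module calls are real.

```python
import resource


def raise_soft_limit(which, wanted,
                     getter=resource.getrlimit,
                     setter=resource.setrlimit):
    """Try to set the soft limit for `which` to `wanted`.

    Returns the soft limit the caller should rely on afterwards.
    Hypothetical helper for illustration only.
    """
    soft, hard = getter(which)
    try:
        setter(which, (wanted, hard))
    except (ValueError, OSError):
        # The call failed: emit no error message and keep the
        # previous value instead of the one we asked for.
        return soft
    return wanted


# Harmless real call: re-apply the current soft limit for open files.
current_soft = resource.getrlimit(resource.RLIMIT_NOFILE)[0]
effective = raise_soft_limit(resource.RLIMIT_NOFILE, current_soft)
```

The point of returning the old soft limit on failure is that later code never sizes anything from a value the operating system refused to grant.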
Mike Dubman
36db9c6233 * updated copyrights
* added support for non-contig data layout in FCA

This commit was SVN r24702.
2011-05-16 14:43:11 +00:00
Ralph Castain
4083e23073 Complete cleanup of pstat linux
This commit was SVN r24701.
2011-05-16 14:08:08 +00:00
Ralph Castain
08c3ecd608 Handle the case where memory stats are in different order, or don't exist on that platform
This commit was SVN r24700.
2011-05-16 13:32:42 +00:00
Brian Barrett
d8b7ea315e First take at implementing rndv and triggered protocols
This commit was SVN r24699.
2011-05-13 05:57:16 +00:00
Brian Barrett
43902221cc * Fix bad argument to PtlGet in long receive
* Fix bad params when configuring ME for long unexpected

This commit was SVN r24698.
2011-05-13 03:56:03 +00:00
Ralph Castain
a3e43594a4 Extend node stats to include additional memory info. Change "darwin" pstat module to "test" as we don't really know how to get all the stat info for darwin.
Add a new OPAL_ERROR_LOG macro similar to the ORTE_ERROR_LOG one.

This commit was SVN r24692.
2011-05-08 14:45:16 +00:00
Ralph Castain
c160f5d5a2 Add ability to specify mcast interfaces by name
This commit was SVN r24691.
2011-05-08 14:42:48 +00:00
Brian Barrett
be8a126600 At Josh's request, make example MPI extension use the init/fini so that
the feature is actually documented.

This commit was SVN r24686.
2011-05-05 18:31:07 +00:00
Brian Barrett
8376e0e507 Use free list get instead of wait; this is a constrained resource that will never come back, as it scales with the number of windows and not some more dynamic resources...
This commit was SVN r24685.
2011-05-05 17:19:59 +00:00
Jeff Squyres
d1d2cd0a87 Make the description of mca_btl_openib_cq_size be more accurate of
what it really is/does.

cmr:v1.5.4:kliteyn cmr:v1.4.4:reviewer=kliteyn

This commit was SVN r24684.
2011-05-05 13:10:11 +00:00
Christopher Yeoh
bab59bda76 Fixes trac:2767: Recursive locking when ROMIO used with THREAD_MULTIPLE
This commit was SVN r24681.

The following Trac tickets were found above:
  Ticket 2767 --> https://svn.open-mpi.org/trac/ompi/ticket/2767
2011-05-04 06:31:42 +00:00
George Bosilca
34abbce82c More accurate and trustworthy descriptions of the netmask exist.
Interested readers can quench their curiosity either with one
of the Richard Stevens books (ISBN 9780201633467) or the
Wikipedia page (http://en.wikipedia.org/wiki/Subnetwork).

This commit was SVN r24680.
2011-05-03 21:59:51 +00:00
Thomas Herault
fb3fd8fd0e Items belonging to peer_send_queue are mca_oob_tcp_msg_t *, which are obtained through an opal_freelist.
They shouldn't be released, but returned to the freelist.

This commit was SVN r24679.
2011-05-03 21:03:09 +00:00
Brian Barrett
3fed6053a4 don't build BTLs when using portals SHMEM. It breaks things :)
This commit was SVN r24678.
2011-05-03 20:17:19 +00:00
George Bosilca
c3c231b5ae Unsigned datatypes should be redirected to their unsigned counterparts
in the OPAL layer. Thanks to Yossi Etigin for the patch.

cmr:v1.5

This commit was SVN r24677.
2011-05-03 12:53:52 +00:00
Shiqing Fan
b4e5826403 Exclude two non-mca files that shouldn't be compiled under windows.
This commit was SVN r24669.
2011-05-02 14:39:22 +00:00
Ralph Castain
9df207aa51 Fix the case where a user supplies the -xterm option, which requires that we leave ssh sessions attached.
This commit was SVN r24668.
2011-05-02 12:39:55 +00:00
Ralph Castain
257473ebca Remove an extra "break" - thanks to Rainer for pointing it out.
This commit was SVN r24667.
2011-05-02 12:20:37 +00:00
Ralph Castain
138928fcf4 Use ports as multicast channels instead of networks so we avoid stepping into reserved spaces.
This commit was SVN r24666.
2011-04-29 18:46:40 +00:00
Ralph Castain
7b29a6153e Cover all the netmask values
This commit was SVN r24665.
2011-04-29 17:56:15 +00:00
Shiqing Fan
9e90ade864 Missed one file from the last commit.
This commit was SVN r24664.
2011-04-29 14:44:02 +00:00
Shiqing Fan
4490fdbd34 Add the initial support for MinGW and MSYS.
Correctly check the dependencies of MSYS env.
Set up configure include and lib path for building the package.
update a few more CMake scripts.

This commit was SVN r24663.
2011-04-29 14:42:07 +00:00
Rolf vandeVaart
3e8878f556 Add in missing header file
This commit was SVN r24662.
2011-04-29 13:20:59 +00:00
Ralph Castain
c78531ce8a Don't free the envar that gets putenv'd as that messes up the environ
This commit was SVN r24660.
2011-04-29 08:50:29 +00:00
Rolf vandeVaart
2634f6401a Add some basic support for sending and receiving CUDA device memory. Feature is disabled by default and has no effect on default code paths.
This commit was SVN r24659.
2011-04-28 23:05:55 +00:00
Ralph Castain
0ff0d20e72 Grr...get the prefix right - need to strip the bin out of absolute path to mpirun.
This commit was SVN r24658.
2011-04-28 22:20:55 +00:00
Ralph Castain
6af2677fb8 Check for both absolute-path-to-mpirun and -prefix being specified. If the two differ, print out a warning and ignore -prefix. If they are the same, or only one was given, then proceed as directed.
This commit was SVN r24657.
2011-04-28 22:12:41 +00:00
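Taken together, r24657 and r24658 describe a small decision rule: derive a prefix from the absolute path to mpirun by stripping the trailing `bin` directory, compare it with `-prefix`, and on a mismatch warn and ignore `-prefix`. A minimal sketch of that rule, assuming POSIX-style paths, might look like this. The function name `resolve_prefix` and the warning text are hypothetical, and the real implementation may compare paths differently.

```python
import posixpath
from typing import List, Optional, Tuple


def resolve_prefix(mpirun_path: str,
                   prefix_opt: Optional[str]) -> Tuple[Optional[str], List[str]]:
    """Return (prefix, warnings). Hypothetical helper, not Open MPI code."""
    warnings: List[str] = []

    path_prefix = None
    if posixpath.isabs(mpirun_path):
        # /opt/ompi/bin/mpirun -> /opt/ompi/bin -> /opt/ompi
        normalized = posixpath.normpath(mpirun_path)
        path_prefix = posixpath.dirname(posixpath.dirname(normalized))

    if prefix_opt is not None:
        prefix_opt = posixpath.normpath(prefix_opt)

    if path_prefix is not None and prefix_opt is not None \
            and path_prefix != prefix_opt:
        # Both were given and they disagree: warn and ignore -prefix.
        warnings.append(
            "-prefix %s ignored; using %s from the path to mpirun"
            % (prefix_opt, path_prefix))
        return path_prefix, warnings

    # Same value, or only one was given: proceed as directed.
    chosen = path_prefix if path_prefix is not None else prefix_opt
    return chosen, warnings


prefix, notes = resolve_prefix("/opt/ompi/bin/mpirun", "/usr/local")
```

In this sketch a relative `mpirun` yields no path-derived prefix, so `-prefix` alone decides the result.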
Jeff Squyres
c8c8044b92 Minor updates to NEWS
This commit was SVN r24654.
2011-04-28 21:40:44 +00:00
Brad Benton
0876e3549d Added a couple more items for 1.4.4.
This commit was SVN r24651.
2011-04-28 18:47:34 +00:00
Brad Benton
359de0bde6 Add 1.4.4 items.
This commit was SVN r24650.
2011-04-28 15:48:51 +00:00
Jeff Squyres
0882d636a6 Oops -- need string.h, too (for strcasecmp).
This commit was SVN r24649.
2011-04-28 15:42:35 +00:00
Jeff Squyres
7362a0730a Change the default to "none". David Singleton raises a good point
that enabling "local_only" by default could cause excessive
by-NUMA-node paging and/or OOMs (rather than allowing memory
allocations to spill over to other NUMA nodes).

This brought home the very real-world example of people buying servers
with more processors/cores than they need, just to get more memory.
We wouldn't want Badness to occur in such scenarios by default.
Instead, let people turn on "only allow memory allocations on my local
NUMA node" if their application would benefit from it.

This commit was SVN r24648.
2011-04-28 15:16:39 +00:00
Ralph Castain
b586f2952e Arggg...revert r24645. I knew those fields were there for a reason...sigh.
This commit was SVN r24647.

The following SVN revision numbers were found above:
  r24645 --> open-mpi/ompi@e4732110da
2011-04-28 15:07:00 +00:00
Ralph Castain
859aaab93d In the case of direct-launched processes running under slurm, psm requires that the pre_condition_transports MCA param be set. This is normally computed by mpirun and inserted into each proc's environ, but that doesn't work here.
So separate out the printing of that key, and let the individual procs generate it in a way that ensures they all get the same result.

This commit was SVN r24646.
2011-04-28 13:54:33 +00:00
Ralph Castain
e4732110da Remove a couple more stale fields
This commit was SVN r24645.
2011-04-28 00:26:38 +00:00
Ralph Castain
39369f8807 Remove stale fields from global objects - have been moved to the layer that actually uses them
This commit was SVN r24644.
2011-04-28 00:20:49 +00:00
Ralph Castain
8858d9a40e Add a marker for other layers to use in defining data types
This commit was SVN r24643.
2011-04-28 00:19:35 +00:00
Jeff Squyres
7b48042ffd Commit patch from upstream hwloc: r3482. Fixes some compiler
warnings. 

This commit was SVN r24641.

The following SVN revision numbers were found above:
  r3482 --> open-mpi/ompi@2435be8d49
2011-04-27 17:08:15 +00:00
Jeff Squyres
d134ff9b4d Refs trac:2698
After a long period of development with many starts and stops, we
finally got this where we wanted it.

This commit introduces 2 new MCA params (note that the
"maffinity_libnuma_policy" MCA param introduced by r24290 was removed
when libnuma support was removed).  Remember that maffinity policies
are only in effect when paffinity is enabled -- i.e., when processes
are bound to processors!

 * '''maffinity_base_alloc_policy:''' Policy that determines how
   general memory allocations are bound after MPI_INIT.  A value of
   "none" means that no memory policy is applied.  A value of
   "local_only" means that all memory allocations will be restricted
   to the local NUMA node where each process is placed.  Note that
   operating system paging policies are unaffected by this setting.
   For example, if "local_only" is used and local NUMA node memory is
   exhausted, a new memory allocation may cause paging.
 * '''maffinity_base_bind_failure_action:''' What Open MPI will do if
   it explicitly tries to bind memory to a specific NUMA location, and
   fails.  Note that this is a different case than the general
   allocation policy described by maffinity_base_alloc_policy.  A
   value of "warn" means that Open MPI will warn the first time this
   happens, but allow the job to continue (possibly with degraded
   performance).  A value of "error" means that Open MPI will abort
   the job if this happens.

This needs at least a little soak time on the trunk before going to
v1.5.

This commit was SVN r24639.

The following SVN revision numbers were found above:
  r24290 --> open-mpi/ompi@afa654746c

The following Trac tickets were found above:
  Ticket 2698 --> https://svn.open-mpi.org/trac/ompi/ticket/2698
2011-04-26 13:31:07 +00:00
Matthias Jurenz
a1e304b2d6 Removed redundant debug message
This commit was SVN r24638.
2011-04-26 08:02:46 +00:00
Jeff Squyres
926af377fe Refs trac:2778.
Upgrade to hwloc 1.2 (from hwloc 1.1.2).  This should fix the problems
Nathan's seeing in #2778.

Let's let this soak on the trunk for a little while and see how LANL's
MTT's work out.  If that works, then we can CMR this to v1.5.

This commit was SVN r24635.

The following Trac tickets were found above:
  Ticket 2778 --> https://svn.open-mpi.org/trac/ompi/ticket/2778
2011-04-25 19:31:49 +00:00
Jeff Squyres
b8af3b7c4a New comment explains it all -- previous code was failing to find the
Nth core, so it fell over to try to find the Nth PU.

-----

hwloc isn't able to find cores on all platforms.  Example: PPC64
running RHEL 5.4 (linux kernel 2.6.18) only reports NUMA nodes and
PU's.  Fine.

However, note that hwloc_get_obj_by_type() will return NULL in 2
(effectively) different cases:

- no objects of the requested type were found
- the Nth object of the requested type was not found

So first we have to see if we can find *any* cores by looking for the
0th core.  If we find it, then try to find the Nth core.  Otherwise,
try to find the Nth PU.

This commit was SVN r24632.
2011-04-25 16:55:27 +00:00
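The lookup order described in r24632 can be modeled without hwloc itself. The sketch below uses a toy dictionary in place of a real topology, and `get_obj_by_type` here is a hypothetical stand-in that only mimics the "returns nothing in two different cases" behavior the commit message describes. It is not the hwloc API.

```python
from typing import Dict, List, Optional

# Toy topology: object type name -> objects of that type.
# Illustration only; this is not the hwloc data model.
Topology = Dict[str, List[str]]


def get_obj_by_type(topo: Topology, obj_type: str, index: int) -> Optional[str]:
    """Hypothetical lookup: None if the type is absent OR the index is
    out of range, so the caller cannot tell the two cases apart."""
    objs = topo.get(obj_type, [])
    return objs[index] if 0 <= index < len(objs) else None


def find_nth_core_or_pu(topo: Topology, n: int) -> Optional[str]:
    # Probe for core 0 to learn whether any cores are reported at all.
    if get_obj_by_type(topo, "core", 0) is not None:
        # Cores exist, so None now really means "there is no Nth core".
        return get_obj_by_type(topo, "core", n)
    # No cores on this platform: fall back to the Nth PU.
    return get_obj_by_type(topo, "pu", n)


with_cores = {"core": ["core0", "core1"], "pu": ["pu0", "pu1", "pu2", "pu3"]}
pus_only = {"pu": ["pu0", "pu1", "pu2", "pu3"]}

print(find_nth_core_or_pu(with_cores, 1))  # core1
print(find_nth_core_or_pu(with_cores, 3))  # None
print(find_nth_core_or_pu(pus_only, 3))    # pu3
```

The second call shows the case the commit addresses: when cores exist but the Nth one does not, the lookup reports a miss rather than silently returning a PU.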
Jeff Squyres
16d8e9216b Ran across this comment about i18n support, so I figured I'd update
it.  :-)

This commit was SVN r24631.
2011-04-22 12:14:20 +00:00