Commit graph

17573 commits

Author SHA1 Message Date
Samuel Gutierrez
42280e2af5 Temporarily make routed binomial the default. We are experiencing issues with
debruijn when launching fewer processes than are actually available within an
allocation. When this is fixed, please revert this change.

This commit was SVN r27376.
2012-09-26 16:08:12 +00:00
Pavel Shamis
4935b9d930 Fixing compilation error. Adding missing output.h.
This commit was SVN r27375.
2012-09-26 15:52:09 +00:00
Samuel Gutierrez
ba470dcec9 Add eth0 to oob_tcp_if_include
This commit was SVN r27374.
2012-09-26 14:47:00 +00:00
George Bosilca
48f528f142 icc complains about the missing prototype.
This commit was SVN r27373.
2012-09-26 09:56:14 +00:00
George Bosilca
890dedf13f Cleanup.
This commit was SVN r27372.
2012-09-26 09:44:46 +00:00
Brian Barrett
cb6831830a Remove the TSD_HACKS macro. The TSD hack is only for non-glibc libraries
and we only build the linux memory component on glibc, so this shouldn't
be needed.

This commit was SVN r27371.
2012-09-26 07:42:43 +00:00
Vishwanath Venkatesan
c6751eaf70 Fixing all two_phase read_all bugs,
1. Multiple aggregator with non-contiguous datatype,
2. Memory corruption bugs.

Cleaned version, with proper initialization and memory management.

This commit was SVN r27370.
2012-09-26 00:16:08 +00:00
Vishwanath Venkatesan
2e2a46b2be Fixing two-phase write all bug for non-contiguous file-type.
This commit was SVN r27369.
2012-09-26 00:10:40 +00:00
Ralph Castain
01504239e6 Enable some debugging in the if discovery code
This commit was SVN r27367.
2012-09-25 20:23:37 +00:00
Jeff Squyres
cb65a44c6c Fix the component priority assignment. Thanks to Alex Margolin for
the patch.

This commit was SVN r27363.
2012-09-25 07:13:23 +00:00
George Bosilca
6ec41400b3 Fix the error message in case a daemon does not succeed at killing the
local offspring.

This commit was SVN r27362.
2012-09-24 15:25:21 +00:00
Ralph Castain
d5279b0dc8 Make an attempt to protect hwloc cset2str from segfaulting in weird scenario
This commit was SVN r27361.
2012-09-23 16:51:51 +00:00
Ralph Castain
1ddb334a52 Put the specified JDK path first so we find that one
This commit was SVN r27360.
2012-09-22 19:08:52 +00:00
Jeff Squyres
9dbbddc89a Update the script to make PHP-ized man pages to be a bit more
automatic. 

This commit was SVN r27359.
2012-09-21 06:43:53 +00:00
Ralph Castain
662bc05aa6 Refs trac:3322
Cannot start the data clearing at the root object level as the root object has a different struct attached to userdata.

This commit was SVN r27357.

The following Trac tickets were found above:
  Ticket 3322 --> https://svn.open-mpi.org/trac/ompi/ticket/3322
2012-09-20 23:30:32 +00:00
Ralph Castain
d95025f53a Ensure we clear the usage numbers when binding on multiple nodes so we don't "carry over" info from one node to the next. Use the same tracking mechanism for binding upwards and in-place to avoid doing a bunch of mallocs.
Refs trac:3322

This commit was SVN r27356.

The following Trac tickets were found above:
  Ticket 3322 --> https://svn.open-mpi.org/trac/ompi/ticket/3322
2012-09-20 15:16:06 +00:00
Ralph Castain
90d7b5fdca Update test
This commit was SVN r27354.
2012-09-20 02:51:27 +00:00
Ralph Castain
445161cd2e Correctly count the total number of allocated slots
This commit was SVN r27353.
2012-09-20 02:50:14 +00:00
Ralph Castain
f592967685 Add missing retain to maintain correct accounting on nodes
This commit was SVN r27352.
2012-09-20 02:30:53 +00:00
Ralph Castain
e309db0be9 Ensure file descriptors are closed upon completion of transfer
This commit was SVN r27349.
2012-09-18 18:39:29 +00:00
Ralph Castain
11305109e1 Track positioned files so we avoid re-positioning them across jobs
This commit was SVN r27347.
2012-09-18 15:56:21 +00:00
Jeff Squyres
3fc3dd9dfa Sync NEWS with v1.6 branch.
This commit was SVN r27345.
2012-09-18 10:03:09 +00:00
Ralph Castain
a3060cdd15 Fix the bind_downward code - it was incorrectly looking across the entire node instead of only looking below the locale to which the proc had been assigned. In other words, if the proc was mapped to a core, then the only hwthreads that should be considered for binding are those directly below that core. The binding algo was incorrectly looking at ALL hwthreads in that scenario, causing the proc to be bound to an HT outside of the mapped location.
This now results in the procs being bound within their assigned location. It also causes us to use only the 0th HT on a core unless --use-hwthread-cpus has been specified (in which case, we use all the HTs in a core). Bind to core binds you to all HTs regardless - the --use-hwthread-cpus only impacts the oversubscribed determination and when binding to HT.

cmr:v1.7

This commit was SVN r27342.
2012-09-14 22:01:19 +00:00
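The selection rule described in r27342 can be sketched as follows. This is a minimal illustration under stated assumptions, not the Open MPI code: the flat `struct hwthread` model, the function name `eligible_hwthreads`, and its parameters are all hypothetical, and the array is assumed to list each core's hardware threads in order. It covers only the bind-to-HT case; per the message above, binding to core uses all HTs of the core regardless.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flat model of a node: each hardware thread records the
 * core it sits under. Not an hwloc or Open MPI type. */
struct hwthread {
    int id;
    int core;
};

/* Collect the ids of the hardware threads that may be considered when
 * binding a proc that was mapped to `core`. Only threads directly below
 * that core qualify. Unless use_hwthread_cpus is set, only the first
 * (0th) thread of the core is returned. Returns the number of ids
 * written to `out`. */
static size_t eligible_hwthreads(const struct hwthread *hts, size_t n_hts,
                                 int core, bool use_hwthread_cpus,
                                 int *out, size_t out_cap)
{
    assert(hts != NULL && out != NULL);
    size_t n = 0;
    for (size_t i = 0; i < n_hts && n < out_cap; i++) {
        if (hts[i].core != core) {
            continue;               /* outside the mapped locale */
        }
        out[n++] = hts[i].id;
        if (!use_hwthread_cpus) {
            break;                  /* use only the 0th HT on the core */
        }
    }
    return n;
}
```

The bug described in the message corresponds to dropping the `hts[i].core != core` check, which lets a thread from any core on the node be selected.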
Jeff Squyres
0b45d4ba86 Per discussion with Ralph and Nathan, disable UDCM for now. It's
borked and needs some surgery to get back on its feet.

This commit was SVN r27335.
2012-09-13 18:25:10 +00:00
Jeff Squyres
30d9c36275 FreeBSD detection improvement. Thanks to Brooks Davis for the patch.
This commit was SVN r27334.
2012-09-13 13:25:04 +00:00
Ralph Castain
a6329ba1b6 Fix makefile
This commit was SVN r27333.
2012-09-13 03:20:05 +00:00
Jeff Squyres
3cc8b0461a More updates to common verbs infrastructure:
 * Moved "check basics" sanity check from openib BTL to common/verbs
   (which also allows us to have openib ''not'' include
   <infiniband/driver.h>, which is a Very Good Thing)
 * Add new ompi_common_verbs_qp_test() function, which tests to see
   whether a device supports RC and/or UD QPs.  The openib BTL now
   uses this function to ensure that the device supports RC QPs.
 * Rename ompi_common_verbs_find_ibv_ports() to be
   ompi_common_verbs_find_ports() -- the "ibv" was redundant.
 * Re-work ompi_common_verbs_find_ports() to use
   ompi_common_verbs_qp_test() instead of testing for RC/UD QPs itself
 * Add bunches of opal_output_verbose() to the find_ports() routine
   (to help diagnosing connectivity problems -- imagine running with
   --mca btl_base_verbose 10; you'll see all the find_ports() test
   results)
 * Make ompi_common_verbs_qp_test() warn if devices/ports are supplied
   in the if_include/if_exclude strings that do not exist (quite
   similar to what the openib BTL does today).
 * Add ompi_common_verbs_mca_register() function, which registers
   common verbs MCA params.  It will also register MCA param synonyms
   for these MCA params to upper-level components (e.g.,
   btl_<upper-level-component>_<the-mca-param>).
   * common_verbs_warn_nonexistent_if: warn if
     if_include/if_exclude-specified devices or ports do not exist.

This commit was SVN r27332.
2012-09-12 20:47:47 +00:00
Pavel Shamis
1e7b958c2a Cleaning warning in collectives code
This commit was SVN r27331.
2012-09-12 19:47:23 +00:00
Jeff Squyres
171f6efd70 Don't free heap objects!
This commit was SVN r27326.
2012-09-12 15:11:56 +00:00
Jeff Squyres
c4d00bc476 Sync README with v1.6 branch: add bullet about Intel compilers and the
IA64 platform.

This commit was SVN r27324.
2012-09-12 14:59:32 +00:00
Jeff Squyres
3a4b92dbb7 If we get a filesystem type of "none", skip it.
This commit was SVN r27322.
2012-09-12 14:38:37 +00:00
Ralph Castain
9057e84ec1 Correct test statement
This commit was SVN r27321.
2012-09-12 14:30:03 +00:00
Ralph Castain
c4fd3df2df Remove unused variables
This commit was SVN r27319.
2012-09-12 12:03:24 +00:00
Ralph Castain
c82cfecc1c Cleanup comm_spawn for the multi-node case where at least one new process isn't spawned on every node. Avoid the complexities of trying to execute a daemon collective across the dynamic spawn as it becomes too hard to ensure that all daemons participate or are accounted for - instead, use a less scalable but workable solution of sending the data directly between the participating procs. Ensure that singletons get their collectives properly defined at startup so the spawned "HNP" is ready for them.
As a secondary cleanup, the HNP doesn't need to update its nidmap during an xcast as it already has an up-to-date picture of the situation. So just dump that data and move along.

This commit was SVN r27318.
2012-09-12 11:31:36 +00:00
Ralph Castain
6b5f9d7767 Some cleanups for staged execution
This commit was SVN r27317.
2012-09-12 09:15:33 +00:00
Matthias Jurenz
e643c09dee Changes to OTF:
	- otfaux:
		- fixed build error on Solaris and NetBSD (removed -lm from library dependencies)
Changes to VT:
	- vtunify:
		- disable OpenMP parallelization if PGI compiler version < 9 is used (threadprivate not supported)

This commit was SVN r27316.
2012-09-12 09:03:10 +00:00
Ralph Castain
b22fc54d9b Remove unused variable
This commit was SVN r27312.
2012-09-12 08:41:48 +00:00
Ralph Castain
5f7a5c4793 Update test to include all keys
This commit was SVN r27311.
2012-09-12 05:02:51 +00:00
Jeff Squyres
a7ea880d0a Refs trac:3309
 * Minor man page tweaks
 * Use existing ompi_mpi_thread_requested global

This commit was SVN r27308.

The following Trac tickets were found above:
  Ticket 3309 --> https://svn.open-mpi.org/trac/ompi/ticket/3309
2012-09-11 21:12:06 +00:00
Ralph Castain
bc1300f5cc Remove debug
This commit was SVN r27307.
2012-09-11 20:54:04 +00:00
Jeff Squyres
fb2e543a57 Refs trac:3275.
We ran into a case where the OMPI SVN trunk grew a new acceptable MCA
parameter value, but this new value was not accepted on the v1.6
branch (hwloc_base_mem_bind_failure_action -- on the trunk it accepts
the value "silent", but on the older v1.6 branch, it doesn't).  If you
set "hwloc_base_mem_bind_failure_action=silent" in the default MCA
params file and then accidentally ran with the v1.6 branch, every OMPI
executable (including ompi_info) just failed because hwloc_base_open()
would say "hey, 'silent' is not a valid value for
hwloc_base_mem_bind_failure_action!".  Kaboom.

The only problem is that it didn't give you any indication of where
this value was being set.  Quite maddening, from a user perspective.

So we changed how ompi_info handles this case.  If any framework open
function returns OMPI_ERR_BAD_PARAM (either because its base MCA params
got a bad value or because one of its component register/open
functions returns OMPI_ERR_BAD_PARAM), ompi_info will stop, print out
a warning that it received an error, and then dump out the parameters
that it has received so far in the framework that had a problem.

At a minimum, this will show the user the MCA param that had an error
(it's usually the last one), and ''where it was set from'' (so that
they can go fix it).  

We updated ompi_info to check for O???_ERR_BAD_PARAM from each of
the framework opens.  Also updated the doxygen docs in mca.h for this
O???_BAD_PARAM behavior.  And we noticed that mca.h had MCA_SUCCESS
and MCA_ERR_??? codes.  Why?  I think we used them in exactly one
place in the code base (mca_base_components_open.c).  So we deleted
those and just used the normal OPAL_* codes instead.

While we were doing this, we also cleaned up a little memory
management during ompi_info/orte-info/opal-info finalization.
Valgrind still reports a truckload of memory still in use at ompi_info
termination, but they mostly look to be components not freeing
memory/resources properly (and outside the scope of this fix).

This commit was SVN r27306.

The following Trac tickets were found above:
  Ticket 3275 --> https://svn.open-mpi.org/trac/ompi/ticket/3275
2012-09-11 20:47:24 +00:00
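The stop-and-dump behaviour described in r27306 can be sketched roughly as below. This is an illustration under stated assumptions, not the ompi_info source: the `SKETCH_*` codes, `struct param`, `struct framework`, `open_all`, and the two stub open functions are all hypothetical names introduced here. Errors other than the bad-parameter code are out of scope for the sketch and are simply ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical return codes standing in for the real error constants. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_BAD_PARAM = -1 };

/* A parameter seen so far, with a note on where its value came from. */
struct param {
    const char *name;
    const char *value;
    const char *source;
};

/* A framework with an open function and the params it has registered. */
struct framework {
    const char *name;
    int (*open)(void);
    const struct param *params;
    size_t n_params;
};

/* Stub open functions used only to exercise the sketch. */
static int open_ok(void)  { return SKETCH_SUCCESS; }
static int open_bad(void) { return SKETCH_ERR_BAD_PARAM; }

/* Open each framework in order. On the first bad-parameter error, stop,
 * print a warning, dump that framework's parameters with their sources,
 * and return its index. Returns -1 if no framework reported the error. */
static int open_all(const struct framework *fw, size_t n, FILE *out)
{
    assert(fw != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (fw[i].open() != SKETCH_ERR_BAD_PARAM) {
            continue;
        }
        fprintf(out, "warning: framework %s reported a bad parameter\n",
                fw[i].name);
        for (size_t j = 0; j < fw[i].n_params; j++) {
            fprintf(out, "  %s = %s (set from: %s)\n",
                    fw[i].params[j].name, fw[i].params[j].value,
                    fw[i].params[j].source);
        }
        return (int)i;
    }
    return -1;
}
```

Printing the source next to each value is the point of the change: the user sees not only which parameter was rejected but also where to go to fix it.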
Ralph Castain
a0ffeb205a Add an orted component for staged operations and rename the staged component to "staged_hnp".
This commit was SVN r27305.
2012-09-11 20:35:46 +00:00
Ralph Castain
387f657fc2 Nuts - forgot to include this with the MPI Ticket 313 stuff. Set some of the envars needed for MPI_INFO_ENV
This commit was SVN r27304.
2012-09-11 20:35:09 +00:00
Ralph Castain
cd8aff675b Update test
This commit was SVN r27303.
2012-09-11 20:32:43 +00:00
Ralph Castain
e8ecd67d53 Once again, bloody SLURM changes the envars and breaks things. Try and track their changes so we get a correct allocation.
This commit was SVN r27302.
2012-09-11 20:31:33 +00:00
Ralph Castain
a08c23dfdc Actually, do the right thing - leave the test alone, but just turn it "off" for now until someone, someday fixes it to work with bind mounts.
This commit was SVN r27301.
2012-09-11 19:56:58 +00:00
Ralph Castain
3c016d79db Soft mounts are okay
This commit was SVN r27300.
2012-09-11 19:48:24 +00:00
Jeff Squyres
a8f8064d8b Add a missing free(). Refs trac:3292.
This commit was SVN r27298.

The following Trac tickets were found above:
  Ticket 3292 --> https://svn.open-mpi.org/trac/ompi/ticket/3292
2012-09-11 17:59:40 +00:00
Ralph Castain
ffb8c2a2ba Add the MPI_INFO_ENV man page
This commit was SVN r27293.
2012-09-11 17:35:32 +00:00
Ralph Castain
fb4af5e29c Implement the rest of MPI-3 ticket #313 based on side-bar agreement with MPICH2 folks. Fix a bug in the original ompi_info code that put the NULL terminator one position too far if the returned string exceeded MPI_MAX_INFO_VAL in length in ompi_info_get.
This commit was SVN r27292.
2012-09-11 17:03:49 +00:00
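The terminator bug mentioned in r27292 belongs to a common class of off-by-one errors in bounded string copies. A minimal sketch of the correct pattern follows; `copy_info_value` is a hypothetical helper written for illustration, not the ompi_info_get code, and it assumes the destination buffer holds `max_len` characters plus one byte for the terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the Open MPI source. Copies at most max_len
 * characters of src into dst and always terminates the result. dst must
 * have room for max_len + 1 bytes. Returns the number of characters
 * copied, excluding the terminator. */
static size_t copy_info_value(char *dst, size_t max_len, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t n = strlen(src);
    if (n > max_len) {
        n = max_len;        /* truncate to the caller's limit */
    }
    memcpy(dst, src, n);
    dst[n] = '\0';          /* terminator goes at index n, not n + 1 */
    return n;
}
```

Writing the terminator at `dst[n + 1]` in the truncation case would leave `dst[n]` uninitialized and write one byte past a buffer of `max_len + 1` bytes.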