322 Commits

Author SHA1 Message Date
Jeff Squyres
6fbbfd0f7a Gah! r25545 accidentally included ''waaaay'' more stuff than it was
supposed to.  I.e., half-baked/not complete stuff.

This commit backs out all of r25545.  Sorry folks!

This commit was SVN r25546.

The following SVN revision numbers were found above:
  r25545 --> open-mpi/ompi@7f9ae11faf
2011-11-29 23:24:52 +00:00
Jeff Squyres
7f9ae11faf Per http://www.open-mpi.org/community/lists/users/2011/11/17862.php,
to make MPI_IN_PLACE (and other sentinel Fortran constants) work on OS
X, we need to use the following compiler (linker) flag:

    -Wl,-commons,use_dylibs 

So if we're compiling on OS X, test to see if that flag works with the
compiler.  If so, add it to the wrapper FFLAGS and FCFLAGS (note that
per a future update, we'll only have one Fortran compiler anyway).

Fixes trac:1982.  

This commit was SVN r25545.

The following Trac tickets were found above:
  Ticket 1982 --> https://svn.open-mpi.org/trac/ompi/ticket/1982
2011-11-29 23:05:54 +00:00
Ralph Castain
9b59d8de6f This is actually a much smaller commit than it appears at first glance - it just touches a lot of files. The --without-rte-support configuration option has never really been implemented completely. The option caused various objects not to be defined and conditionally compiled some base functions, but did nothing to prevent build of the component libraries. Unfortunately, since many of those components use objects covered by the option, it caused builds to break if those components were allowed to build.
Brian dealt with this in the past by creating platform files and using "no-build" to block the components. This was clunky, but acceptable when only one organization was using that option. However, that number has now expanded to at least two more locations.

Accordingly, make --without-rte-support actually work by adding appropriate configury to prevent components from building when they shouldn't. While doing so, remove two frameworks (db and rmcast) that are no longer used as ORCM comes to a close (besides, they belonged in ORCM now anyway). Do some minor cleanups along the way.

This commit was SVN r25497.
2011-11-22 21:24:35 +00:00
George Bosilca
88d32312d6 The bind_level should be initialized to zero or weird things happen. I'm
not yet sure how and why, but packing a uint8_t with opal_dss led to
weird values during unpack (except if the original value is already
set to zero).

This commit was SVN r25490.
2011-11-18 10:22:58 +00:00
Ralph Castain
6310361532 At long last, the fabled revision to the affinity system has arrived. A more detailed explanation of how this all works will be presented here:
https://svn.open-mpi.org/trac/ompi/wiki/ProcessPlacement

The wiki page is incomplete at the moment, but I hope to complete it over the next few days. I will provide updates on the devel list. As the wiki page states, the default and most commonly used options remain unchanged (except as noted below). New, esoteric and complex options have been added, but unless you are a true masochist, you are unlikely to use many of them beyond perhaps an initial curiosity-motivated experimentation.

In a nutshell, this commit revamps the map/rank/bind procedure to take into account topology info on the compute nodes. I have, for the most part, preserved the default behaviors, with three notable exceptions:

1. I have at long last bowed my head in submission to the system admins of managed clusters. For years, they have complained about our default of allowing users to oversubscribe nodes - i.e., to run more processes on a node than allocated slots. Accordingly, I have modified the default behavior: if you are running off of hostfile/dash-host allocated nodes, then the default is to allow oversubscription. If you are running off of RM-allocated nodes, then the default is to NOT allow oversubscription. Flags to override these behaviors are provided, so this only affects the default behavior.

2. both cpus/rank and stride have been removed. The latter was demanded by those who didn't understand the purpose behind it - and I agreed as the users who requested it are no longer using it. The former was removed temporarily pending implementation.

3. vm launch is now the sole method for starting OMPI. It was just too darned hard to maintain multiple launch procedures - maybe someday, provided someone can demonstrate a reason to do so.

As Jeff stated, it is impossible to fully test a change of this size. I have tested it on Linux and Mac, covering all the default and simple options, singletons, and comm_spawn. That said, I'm sure others will find problems, so I'll be watching MTT results until this stabilizes.

This commit was SVN r25476.
2011-11-15 03:40:11 +00:00
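
The oversubscription default described in 6310361532 can be read as a small decision rule. The following Python sketch is only an illustration of that rule, not the ORTE implementation: the names `AllocationSource`, `oversubscribe_allowed`, and `user_override` are hypothetical, and the real override flags are not named in the commit message.

```python
from enum import Enum
from typing import Optional


class AllocationSource(Enum):
    """Where the node list came from (names invented for this sketch)."""
    HOSTFILE = "hostfile"        # hostfile or dash-host on the command line
    RESOURCE_MANAGER = "rm"      # nodes handed out by a resource manager


def oversubscribe_allowed(source: AllocationSource,
                          user_override: Optional[bool] = None) -> bool:
    """Return True if more processes than allocated slots may run on a node."""
    # An explicit user choice always wins over the default.
    if user_override is not None:
        return user_override
    # Default: allowed for hostfile/dash-host nodes, not for RM-allocated nodes.
    return source is AllocationSource.HOSTFILE
```

Only the default changes with the allocation source; an explicit override behaves the same in both cases.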
Ralph Castain
34f0a27cb6 Initialize the locality info - at time of pmap creation, we at least know node locality
This commit was SVN r25446.
2011-11-06 17:06:41 +00:00
Ralph Castain
b44f8d4b28 Complete implementation of the ess.proc_get_locality API. Up to this point, the API was only capable of telling if the specified proc was sharing a node with you. However, the returned value was capable of telling you much more detailed info - e.g., if the proc shares a socket, a cache, or numa node. We just didn't have the data to provide that detail.
Use hwloc to obtain the cpuset for each process during mpi_init, and share that info in the modex. As it arrives, use a new opal_hwloc_base utility function to parse the value against the local proc's cpuset and determine where they overlap. Cache the value in the pmap object as it may be referenced multiple times.

Thus, the return value from orte_ess.proc_get_locality is a 16-bit bitmask that describes the resources being shared with you. This bitmask can be tested using the macros in opal/mca/paffinity/paffinity.h

Locality is available for all procs, whether launched via mpirun or directly with an external launcher such as slurm or aprun.

This commit was SVN r25331.
2011-10-19 20:18:14 +00:00
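
As a rough illustration of the locality bitmask from b44f8d4b28, the Python sketch below compares two cpusets against a toy topology and sets one bit per shared resource. Everything here is an assumption made for illustration: the flag names and bit values are invented (the real macros live in opal/mca/paffinity/paffinity.h), and the topology is a plain dictionary rather than an hwloc object.

```python
from enum import IntFlag


class Locality(IntFlag):
    """Hypothetical bit layout; values are made up for this sketch."""
    NONE = 0
    ON_NODE = 0x0001
    ON_NUMA = 0x0002
    ON_SOCKET = 0x0004
    ON_CACHE = 0x0008
    ON_CORE = 0x0010


def relative_locality(topology, my_cpus, peer_cpus):
    """Return the resources shared by two processes on the same node.

    topology maps a Locality flag to the cpu sets of the objects at that
    level, e.g. one set of cpu ids per socket.
    """
    locality = Locality.ON_NODE  # both cpusets were reported from this node
    for level, objects in topology.items():
        for cpus in objects:
            # The level is shared if one object contains cpus of both procs.
            if cpus & my_cpus and cpus & peer_cpus:
                locality |= level
                break
    return locality
```

A caller would test individual bits, for example `if loc & Locality.ON_SOCKET:`, much as the C macros test the returned mask.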
Ralph Castain
8f0ef54130 Complete implementation of pmi support. Ensure we support both mpirun and direct launch within same configuration to avoid requiring separate builds. Add support for generic pmi, not just under slurm. Add publish/subscribe support, although slurm's pmi implementation will just return an error as it hasn't been done yet.
This commit was SVN r25303.
2011-10-17 20:51:22 +00:00
Swen Boehm
08b4322a1a patched the lex files to not issue the following compiler warning:
'yyunput' defined but not used

This commit was SVN r25246.
2011-10-10 18:13:04 +00:00
George Bosilca
80c02647c8 Each level (OPAL/ORTE/OMPI) should only return its own constants,
instead of the current mismatch.

This commit was SVN r25230.
2011-10-04 14:50:31 +00:00
Ralph Castain
3c4f04f4d9 Ensure opal_hwloc_topology is NULL after being destroyed
This commit was SVN r25138.
2011-09-13 19:21:10 +00:00
Ralph Castain
92c7372e20 Per the RFC from Jeff, move hwloc from opal/mca/common to its own static framework ala libevent. Have ORTE daemons collect the topology info at startup and, if --enable-hwloc-xml is set, send that info back to the HNP for later use. The HNP only retains unique topology "templates" to reduce memory footprint. Have the daemon include the local topology info in the nidmap buffer sent to each app so the apps don't all hammer the local system to discover it for themselves.
Remove the sysinfo framework as hwloc replaces that functionality.

This commit was SVN r25124.
2011-09-11 19:02:24 +00:00
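
The idea in 92c7372e20 of retaining only unique topology "templates" on the HNP can be sketched as a deduplicating store. This Python sketch is an assumption-laden illustration: the class `TopologyStore`, its methods, and the use of a SHA-1 digest of the XML text as the signature are all hypothetical and say nothing about how ORTE actually compares topologies.

```python
import hashlib


class TopologyStore:
    """Keep one copy of each distinct topology and map nodes onto it."""

    def __init__(self):
        self._templates = {}   # signature -> topology description (XML text)
        self._node_sig = {}    # node name -> signature of its topology

    def add(self, node, topology_xml):
        # Assumption: identical XML text means identical topology.
        sig = hashlib.sha1(topology_xml.encode("utf-8")).hexdigest()
        self._templates.setdefault(sig, topology_xml)
        self._node_sig[node] = sig

    def topology_for(self, node):
        return self._templates[self._node_sig[node]]

    def template_count(self):
        return len(self._templates)
```

On a homogeneous cluster this keeps a single template no matter how many daemons report in, which is the memory saving the commit message refers to.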
Rainer Keller
9d5afc58c6 - Fix breakage of the epoch changes with PGI:
Don't just include pre-processor macros between two strings ("s1" #if 0 ... "s2")...
   Rather print out the epoch as 0 always...

This commit was SVN r25110.
2011-08-31 08:40:31 +00:00
Wesley Bland
4e7ff0bd5e By popular demand the epoch code is now disabled by default.
To enable the epochs and the resilient orte code, use the configure flag:

--enable-resilient-orte

This will define both:

ORTE_ENABLE_EPOCH
ORTE_RESIL_ORTE

This commit was SVN r25093.
2011-08-26 22:16:14 +00:00
Ralph Castain
e58623cd5b Bring alps back to full operations by correctly computing daemon names. Unfortunately, alps doesn't assign cnos rank in node-based order - i.e., cnos rank=0 isn't necessarily on the first node of the execution. So adjust when using static ports.
Add some debug to nidmap

Ensure that the HNP's node name is not included in the regex when launching via rshbase as that node is automatically included in the daemon map.

This commit was SVN r25063.
2011-08-18 14:59:18 +00:00
Ralph Castain
7b9f958dcf Add some missing error strings. Update test to show silent errors
This commit was SVN r25010.
2011-08-08 04:21:02 +00:00
Ralph Castain
7b307d5bf0 Cleanup handling of all-numerical node names
This commit was SVN r25000.
2011-08-05 14:59:14 +00:00
Ralph Castain
157bad5435 If we can't compress the name, that's fine - but still have to move to next posn
This commit was SVN r24999.
2011-08-05 14:43:36 +00:00
Ralph Castain
3199663613 Correctly handle the case of mixes of character-based names and all-number names
This commit was SVN r24998.
2011-08-05 14:37:36 +00:00
Ralph Castain
5a634caad9 Cleanly handle the case where the node "name" is just a number, and avoid the N-N output when the number is not part of a sequence.
This commit was SVN r24992.
2011-08-05 03:36:30 +00:00
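
The node-name handling touched by 5a634caad9 and the neighbouring commits (all-numeric names, and no "N-N" output for a number that is not part of a sequence) can be illustrated with a small range-folding routine. This Python sketch does not reproduce ORTE's regex format: the bracketed output syntax and the function names are invented for illustration.

```python
import re


def fold_ranges(numbers):
    """Turn integers into [start, end] runs of consecutive values."""
    runs = []
    for n in sorted(set(numbers)):
        if runs and n == runs[-1][1] + 1:
            runs[-1][1] = n
        else:
            runs.append([n, n])
    return runs


def compress_node_names(names):
    """Fold names such as node1,node2,node3 into node[1-3]."""
    groups = {}  # (prefix, digit width) -> numbers, in first-seen order
    for name in names:
        m = re.fullmatch(r"(.*?)(\d+)", name)
        if m:
            key = (m.group(1), len(m.group(2)))
            groups.setdefault(key, []).append(int(m.group(2)))
        else:
            groups.setdefault((name, 0), [])  # no numeric suffix: keep as is
    parts = []
    for (prefix, width), numbers in groups.items():
        if not numbers:
            parts.append(prefix)
            continue
        ranges = []
        for start, end in fold_ranges(numbers):
            lo = f"{start:0{width}d}"
            hi = f"{end:0{width}d}"
            # A lone number is written once, never as "N-N".
            ranges.append(lo if start == end else f"{lo}-{hi}")
        if len(ranges) == 1 and "-" not in ranges[0]:
            parts.append(prefix + ranges[0])
        else:
            parts.append(f"{prefix}[{','.join(ranges)}]")
    return ",".join(parts)
```

An all-numeric name is treated as an empty prefix followed by a number, so it goes through the same path as `node12`. Names whose digit counts differ (such as `node9` and `node10`) land in separate groups in this sketch.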
Ralph Castain
8853e0e80a Fix regular expression analyzer for slurmd - use a slurm-specific version
Fix multi-node routing for daemon startup when static ports are not set

This commit was SVN r24898.
2011-07-13 22:49:56 +00:00
Ralph Castain
1ee7c39982 Fix some major bit-rot on scalable launch. If static ports are provided, then daemons can connect back to the HNP via the routed connection tree instead of doing so directly. In order to do that at scale, the node list must be passed as a regular expression - otherwise, the orted command line gets too long.
Over the course of time, usage of static ports got corrupted in several places, the "parent" info got incorrectly reset, etc. So correct all that and get the regex-based wireup going again.

Also, don't pass node lists if static ports aren't enabled - they are of no value to the orted and just create the possibility of overly-long cmd lines.

This commit was SVN r24860.
2011-07-07 18:54:30 +00:00
Ralph Castain
418229c71c Define a new error constant
This commit was SVN r24833.
2011-06-28 19:47:16 +00:00
Wesley Bland
84be81df95 Standardize the initialization of the EPOCH's.
Everyone will be starting at MIN anyway (until we implement restart of course)
so there's no reason to set the epoch to INVALID and then immediately reset them
to MIN. This way there's less room to make mistakes later.

This commit was SVN r24829.
2011-06-28 14:20:33 +00:00
Ralph Castain
c203eee223 Since process names now have three fields, be sure to initialize all three of them
This commit was SVN r24828.
2011-06-27 20:50:08 +00:00
Wesley Bland
e1ba09ad51 Add resilience to ORTE. Allows the runtime to continue after a process (or
ORTED) failure. Note that more work will be necessary to allow the MPI layer to
take advantage of this.

Per RFC:
http://www.open-mpi.org/community/lists/devel/2011/06/9299.php

This commit was SVN r24815.
2011-06-23 20:38:02 +00:00
Ralph Castain
042ee3ec48 Support the option of outputting error_log messages with something other than the process name
This commit was SVN r24784.
2011-06-17 14:50:00 +00:00
Ralph Castain
1f3911cc8b Add a new proc state
This commit was SVN r24710.
2011-05-19 21:25:58 +00:00
Ralph Castain
b47ec2ee87 Remove lingering references to opal_profile option
This commit was SVN r24709.
2011-05-18 18:27:29 +00:00
Ralph Castain
d34bab541d Remove the ompi-profiler tool and its attendant ompi-probe program. Also remove the grpcomm basic component since its only function was to support profiled clusters, which nobody was doing. :-(
This commit was SVN r24704.
2011-05-17 03:30:25 +00:00
Ralph Castain
138928fcf4 Use ports as multicast channels instead of networks so we avoid stepping into reserved spaces.
This commit was SVN r24666.
2011-04-29 18:46:40 +00:00
Shiqing Fan
9e90ade864 Missed one file from the last commit.
This commit was SVN r24664.
2011-04-29 14:44:02 +00:00
Ralph Castain
859aaab93d In the case of direct-launched processes running under slurm, psm requires that the pre_condition_transports MCA param be set. This is normally computed by mpirun and inserted into each proc's environ, but that doesn't work here.
So separate out the printing of that key, and let the individual procs generate it in a way that ensures they all get the same result.

This commit was SVN r24646.
2011-04-28 13:54:33 +00:00
Ralph Castain
3a28556472 Expand our handling of non-zero exit status. If a process exits with non-zero status, pass that info along to the user in case it means something to them, even if the process also exited without calling MPI_Finalize. If the process calls MPI_Abort, that trumps the exit status question.
Provide a new MCA param that allows the user to direct that we abort the job once a process exits with non-zero status. No recovery is allowed in such cases to avoid trying to restart a process that has already exited MPI.

This commit was SVN r24614.
2011-04-14 15:04:21 +00:00
Jeff Squyres
06d5c59115 Fix a few valgrind-reported memory leaks
This commit was SVN r24498.
2011-03-08 17:37:28 +00:00
Jeff Squyres
79cf382ff3 Fix a few issues with error messages:
 * If something goes wrong during ompi_mpi_init, don't erroneously
   report that it is illegal to invoke MPI_INIT* before MPI_INIT
 * Aggregate help messages when possible when something goes wrong
   during ompi_mpi_init

This commit was SVN r24492.
2011-03-07 16:45:45 +00:00
Ralph Castain
5120e6aec3 Redefine the rmaps framework to allow multiple mapper modules to be active at the same time. This allows users to map the primary job one way, and map any comm_spawn'd job in a different way. Modules are given the opportunity to map a job in priority order, with the round-robin mapper having the highest default priority. Priority of each module can be defined using mca param.
When called, each mapper checks to see if it can map the job. If npernode is provided, for example, then the loadbalance mapper accepts the assignment and performs the operation - all mappers before it will "pass" as they can't map npernode requests.

Also remove the stale and never completed topo mapper.

This commit was SVN r24393.
2011-02-15 23:24:31 +00:00
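
The priority-ordered selection described in 5120e6aec3 can be sketched as follows. This is a Python illustration under stated assumptions, not the rmaps framework: the `Job` and `Mapper` types, the mapper names, and the priority numbers are hypothetical. Only the ordering (round-robin has the highest default priority) and the npernode example come from the commit message.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Job:
    num_procs: int
    npernode: Optional[int] = None


@dataclass
class Mapper:
    name: str
    priority: int                      # higher value is asked first
    can_map: Callable[[Job], bool]     # False means the mapper "passes"


# Priority values are invented; in the commit they are set via MCA params.
MAPPERS = [
    Mapper("load_balance", 80, lambda job: job.npernode is not None),
    Mapper("round_robin", 100, lambda job: job.npernode is None),
]


def select_mapper(mappers, job):
    """Offer the job to each mapper in priority order; the first taker wins."""
    for mapper in sorted(mappers, key=lambda m: m.priority, reverse=True):
        if mapper.can_map(job):
            return mapper.name
    return None
```

Because each job is offered to the mappers independently, a primary job and a later comm_spawn'd job can end up with different mappers.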
Ralph Castain
b5de068533 Clean up an error in r24371 - can't use a const parameter as target in asprintf as it changes the value of the address.
Add some new proc/job states

Rename a constant to reflect coming change - remove the arbitrary difference between restarting a proc locally and relocating it to another node in terms of the number of restarts allowed.

Add pretty-print of signals for "proc aborted due to signal" reports.

This commit was SVN r24378.

The following SVN revision numbers were found above:
  r24371 --> open-mpi/ompi@93d28a5792
2011-02-14 19:29:09 +00:00
Abhishek Kulkarni
93d28a5792 Change opal_err2str_fn_t to return the error string as an argument.
This means that the converters (opal_err2str, orte_err2str) can now
return NULL as a "silent error". The return value of opal_err2str_fn_t
is the status of the operation (OPAL_SUCCESS or OPAL_ERROR).

This fixes the "Unknown error" message issues on the trunk.

This commit was SVN r24371.
2011-02-13 16:09:17 +00:00
Ralph Castain
33b68132cc Update the rmcast framework
This commit was SVN r24370.
2011-02-12 16:52:03 +00:00
Ralph Castain
b09f57b03d Update the multicast subsystem - ported from Cisco branch
This commit was SVN r24246.
2011-01-13 01:54:05 +00:00
Jeff Squyres
a525e70f46 Convert "opal_show_help" to be a global variable pointer.
It is statically initialized to the real back-end OPAL show_help
function.  During orte_show_help_init(), the variable is re-assigned
with the value of the back-end ORTE show_help function (the one that
does error message aggregation).  

Therefore, anything that calls opal_show_help() after a certain point
in orte_init() will have their show_help messages be aggregated.
w00t!  Even code down in OPAL -- that has no knowledge of ORTE -- will
have their messages aggregated.  '''Double w00t!'''

During orte_show_help_finalize(), we restore the original pointer
value so that if something calls opal_show_help() after
orte_finalize(), it'll still work properly (but it won't be
aggregated).  

This commit was SVN r24185.
2010-12-16 23:00:25 +00:00
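
The pointer swap described in a525e70f46 is a general pattern: callers go through a rebindable reference, so replacing the target changes behaviour for all of them at once. The Python sketch below only mimics that pattern. The `Runtime` class, both back ends, and the "print once, count repeats" aggregation policy are assumptions for illustration, and the real code is C using a global function pointer.

```python
from collections import Counter


def basic_show_help(topic):
    """Plain back end: every call produces a message."""
    return f"help: {topic}"


class AggregatingShowHelp:
    """Aggregating back end: emit a topic once, then only count repeats."""

    def __init__(self):
        self.counts = Counter()

    def __call__(self, topic):
        self.counts[topic] += 1
        if self.counts[topic] == 1:
            return f"help: {topic}"
        return None  # duplicate: suppressed, but counted


class Runtime:
    def __init__(self):
        # Statically initialized to the plain back end.
        self.show_help = basic_show_help
        self._saved = None

    def init(self):
        # From here on, every caller gets aggregation without knowing it.
        self._saved = self.show_help
        self.show_help = AggregatingShowHelp()

    def finalize(self):
        # Restore the original so later calls still work, unaggregated.
        self.show_help = self._saved
```

Code that calls `rt.show_help(...)` needs no knowledge of which back end is active, which mirrors the point made in the commit about OPAL code having no knowledge of ORTE.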
Jeff Squyres
de97962aac Fixes trac:2651.
Fix off-by-one error when /dev/urandom doesn't exist.  Thanks to "pth"
for the patch.

This commit was SVN r24170.

The following Trac tickets were found above:
  Ticket 2651 --> https://svn.open-mpi.org/trac/ompi/ticket/2651
2010-12-14 14:52:51 +00:00
Ralph Castain
b251a59cdf Cleanup nidmap finalize
This commit was SVN r24164.
2010-12-11 16:42:06 +00:00
Ralph Castain
eba65e97f3 Extend the rmcast APIs to allow enable/disable of comm, required for clean termination by upper layer users.
Point the recv thread event base to the right place so it can wakeup when required.

Add a new error code for "comm disabled" when attempting to communicate after disabling comm.

This commit was SVN r24129.
2010-12-01 13:41:19 +00:00
Ralph Castain
30c37ea536 Ensure that the oversubscribed condition of nodes is accurately reported by the mapper, and that the results are communicated and used by the backend orteds when setting sched_yield on local procs. Restores prior behavior that was somehow lost along the way.
Includes a patch from Damien Guinier to fix vpid assignments when cpus-per-task is specified.

This commit was SVN r24126.
2010-12-01 12:51:39 +00:00
Nathan Hjelm
986265fc6e fixed crash in orte-ps caused by calls to OBJ_RELEASE on an opal_event_t object.
This commit was SVN r24020.
2010-11-09 18:41:43 +00:00
Ralph Castain
9ea2b196ce Convert the opal_event framework to use direct function calls instead of hiding functions behind function pointers. Eliminate the opal_object_t abstraction of libevent's event struct so it can be directly passed to the libevent functions.
Note: the ompi_check_libfca.m4 file had to be modified to avoid it stomping on global CPPFLAGS and the like. The file was also relocated to the ompi/config directory as it pertains solely to an ompi-layer component.

Forgive the mid-day configure change, but I know Shiqing is working the windows issues and don't want to cause him unnecessary redo work.

This commit was SVN r23966.
2010-10-28 15:22:46 +00:00
Ralph Castain
86c7365e8e Clean up a few initialization issues - don't think these are impacting the shared memory situation as it didn't fix the problem.
Setup the event API to support multiple bases in preparation for splitting the OMPI and ORTE events. Holding here pending shared memory resolution.

This commit was SVN r23943.
2010-10-26 02:41:42 +00:00
Ralph Castain
aaec8ec426 Fix orte-ps so it correctly reports out on processes within a job
This commit was SVN r23933.
2010-10-25 17:53:53 +00:00