openmpi

Author	SHA1	Message	Date
Ralph Castain	0209cddb5b	Revert r31596 and r31595 as they recreate the "abort" problem - all they did was move the blocking send to another point in the code. An alternative solution to the "show_help and abort" problem will come in another commit. Refs trac:4576 This commit was SVN r31599. The following SVN revision numbers were found above: r31595 --> open-mpi/ompi@2b61f22973 r31596 --> open-mpi/ompi@712634efd3 The following Trac tickets were found above: Ticket 4576 --> https://svn.open-mpi.org/trac/ompi/ticket/4576	2014-05-02 10:38:30 +00:00
Ralph Castain	712634efd3	Silence warning Refs trac:4576 This commit was SVN r31596. The following Trac tickets were found above: Ticket 4576 --> https://svn.open-mpi.org/trac/ompi/ticket/4576	2014-05-01 23:58:03 +00:00
Ralph Castain	2b61f22973	Now that the abort code no longer involves a blocking rml send section, apps that call show_help followed by abort are not printing their error message. So block them in show_help until that message gets out. This commit was SVN r31595.	2014-05-01 22:57:17 +00:00
Ralph Castain	238ecea311	When we comm_spawn, we really want to respect the original -host directives and not expand the daemon virtual machine unless directed to do so in the comm_spawn command. Otherwise, we will automatically launch daemons on every node in the allocation. cmr=v1.8.2:reviewer=rhc:subject=respect vm boundaries during comm_spawn This commit was SVN r31578.	2014-04-30 22:26:18 +00:00
Ralph Castain	087b84b0ef	Add some further debug to the dstore framework. When doing comm_spawn, we have to exchange any provided cpu bitmaps to ensure both sides compute the same locality, else various mpi frameworks can go bonkers. This commit was SVN r31572.	2014-04-30 19:29:00 +00:00
Ralph Castain	8cda1b3dc6	Don't store cpu_bitmap unless it is non-NULL This commit was SVN r31570.	2014-04-30 18:12:48 +00:00
Ralph Castain	7a79b25577	Ensure we cleanup some files so session dirs can be rolled up cmr=v1.8.2:reviewer=jsquyres This commit was SVN r31569.	2014-04-30 17:52:10 +00:00
Ralph Castain	c4c9bc1573	As per the RFC: http://www.open-mpi.org/community/lists/devel/2014/04/14496.php Revamp the opal database framework, including renaming it to "dstore" to reflect that it isn't a "database". Move the "db" framework to ORTE for now, soon to move to ORCM This commit was SVN r31557.	2014-04-29 21:49:23 +00:00
Jeff Squyres	38a27b858d	Protect for the CLEANUP case where tmp hasn't been set yet Refs trac:4536 This commit was SVN r31438. The following Trac tickets were found above: Ticket 4536 --> https://svn.open-mpi.org/trac/ompi/ticket/4536	2014-04-18 23:34:53 +00:00
Jeff Squyres	530f22c403	proc_info.c: uncomment C99 struct member initialization usage The C99 usage to initialize via struct member names was already there, but commented out. This commit doesn't fix any known problem; it simply uncomments the C99 code, because it's safer/better. This commit was SVN r31425.	2014-04-18 17:26:07 +00:00
Ralph Castain	12094eb7b2	Add some further protections after discussion with Jeff Refs trac:4536 This commit was SVN r31422. The following Trac tickets were found above: Ticket 4536 --> https://svn.open-mpi.org/trac/ompi/ticket/4536	2014-04-18 16:21:55 +00:00
Ralph Castain	8d72633acf	Ensure that the session directory fields of orte_process_info have been initialized prior to cleaning up those directories as part of the initialization process that deals with stale session directory trees. Fixes trac:4534 cmr=v1.8.1:reviewer=jsquyres This commit was SVN r31421. The following Trac tickets were found above: Ticket 4534 --> https://svn.open-mpi.org/trac/ompi/ticket/4534	2014-04-18 14:25:48 +00:00
Ralph Castain	deff85ffc3	Prevent a segfault if we encounter an error while parsing a hostfile. Don't issue an error_log output as the hostfile code already prints an error message. Thanks to Tetsuya Mishima for the patch. Reviewed ok by rhc. RM-approved cmr=v1.8.1:reviewer=ompi-gk1.8 This commit was SVN r31377.	2014-04-12 21:32:10 +00:00
Ralph Castain	61d94fcee2	Fix the sequential mapper - it was out-of-sync with the hostfile changes, and we missed the "seq" policy when parsing the --map-by option. Thanks to Bill Chen for reporting it cmr=v1.8.1:reviewer=jsquyres This commit was SVN r31333.	2014-04-08 03:38:25 +00:00
Ralph Castain	3fdcaeab97	Fix a problem where we need to abort due to a mapping failure, but we are in a managed environment and thus the orteds have not wired up. Thus, if we send the exit message across the routed network, the remote daemons won't have a way to relay the message along - and we won't exit. If we are aborting, then set the flags so the HNP directly sends an exit command to each daemon. Make it the halt_vm command so the remote daemon doesn't try to relay it, but instead just exits without waiting for its routed children to exit first. cmr=v1.8.1:reviewer=jsquyres:subject=fix hangs due to abort prior to daemon wireup This commit was SVN r31304.	2014-04-02 04:17:55 +00:00
Jeff Squyres	173c046617	build: add Automake-like silent/verbose macros for "ln -s ..." operations Also, since I put some of the macros for these silent/verbose rules up in the top-level Makefile.man-page-rules file, I renamed it to Makefile.ompi-rules. I've had this sitting around for a while; now seems like as good a time as any to commit it. This commit was SVN r31271.	2014-03-28 18:24:32 +00:00
Ralph Castain	5a868028a8	Revert r31091 - the functionality didn't disappear, but moved into the MPI layer :-( This commit was SVN r31093. The following SVN revision numbers were found above: r31091 --> open-mpi/ompi@edf680855e	2014-03-17 22:30:03 +00:00
Ralph Castain	edf680855e	Restore locality computation to the nidmap code - don't know how/when it was removed, but that was not good cmr=v1.7.5:reviewer=hjelmn This commit was SVN r31091.	2014-03-17 21:59:25 +00:00
Ralph Castain	7bb8dbade6	Extend the regular expression parsing support This commit was SVN r31088.	2014-03-17 21:25:05 +00:00
Adrian Reber	8d40cd53ae	use the existing pretty-print function for information about the job state This commit was SVN r31020.	2014-03-12 12:34:25 +00:00
Joshua Ladd	9ea9bec4ad	Addressing Jeff's comments: 1. Changed rng_buff_t --> opal_rng_buff_t 2. All global variables obey the prefix rule 3. Old code has been removed 4. Found a couple of unnecessary includes Refs trac:4298 This commit was SVN r30807. The following Trac tickets were found above: Ticket 4298 --> https://svn.open-mpi.org/trac/ompi/ticket/4298	2014-02-24 23:18:35 +00:00
Joshua Ladd	e39d9f4080	Per the RFC schedule, add an additive lagged Fibonacci parallel random number generator to OPAL. In order to use, please add the following header to your code: opal/util/alfg.h. See ompi/mca/btl/openib/connect/btl_openib_connect_udcm.c for an example of how to seed with opal_srand and invoke the generator with opal_rand. This should be added to cmr=v1.7.5:reviewer=rhc:subject=Add an OPAL RNG This commit was SVN r30801.	2014-02-23 21:41:38 +00:00
Ralph Castain	418ca60776	Since we don't know the name of the local leader, store that info under our own name :-) This commit was SVN r30777.	2014-02-20 01:39:52 +00:00
Ralph Castain	262c927778	Define a new key and store the process name of the local_rank=0 process on each node so that the MPI layer can retrieve it as desired. This commit was SVN r30759.	2014-02-18 00:32:58 +00:00
Ralph Castain	c3df744a3b	Shift the orte_db_localrank key to the opal level. Add the job and proc-level session directory names to the database using opal_db keys. This commit was SVN r30746.	2014-02-17 01:40:56 +00:00
Ralph Castain	449cd8f3d7	Update a couple of fields, add a scheduler field to proc_info This commit was SVN r30718.	2014-02-13 23:30:04 +00:00
Ralph Castain	1565816988	Do a little better job of cleaning up the session directory left by mpirun by ensuring we delete the event associated with debugger attachment and unlinking the pipe used for that purpose. Also, we no longer leave "abort" files around, so remove that check when deleting session directory trees cmr=v1.7.5:reviewer=jsquyres:subject=cleanup session directories better This commit was SVN r30689.	2014-02-11 22:16:17 +00:00
Adrian Reber	fde1040d2f	Use unique collective ids for the checkpoint/restart code This commit was SVN r30552.	2014-02-04 14:03:05 +00:00
Ralph Castain	e3cb4b4a5b	Grant Nathan his wish - add a --disable-getpwuid to the configure options and protect all users of that code so it disappears if disabled. cmr=v1.7.5:reviewer=hjelmn:subject=disable getpwuid if requested This commit was SVN r30413.	2014-01-24 19:18:37 +00:00
Ralph Castain	14bf1c9463	Some minor cleanups: * don't return null if someone wants to print ORTE_SUCCESS * rename some stale process types * keep show_help local if we are in standalone operation as there is nobody to send it to cmr=v1.7.5:reviewer=jsquyres This commit was SVN r30400.	2014-01-23 21:35:20 +00:00
Ralph Castain	a01470190d	Allow a little more flexibility - if getpwuid fails, just use the return from getuid to define the session directory cmr=v1.7.5:reviewer=jsquyres This commit was SVN r30388.	2014-01-23 05:00:05 +00:00
Ralph Castain	3e9c8497e0	Shift the verbose output a bit Refs trac:4136 This commit was SVN r30332. The following Trac tickets were found above: Ticket 4136 --> https://svn.open-mpi.org/trac/ompi/ticket/4136	2014-01-20 14:41:37 +00:00
Ralph Castain	5ad9795bd8	Cleanup some potential memory overruns cmr=v1.7.5:reviewer=jsquyres This commit was SVN r30331.	2014-01-19 16:31:26 +00:00
Ralph Castain	9f6fd7b98d	A few corrections to hostfile parsing - thanks to Tetsuya Mishima for the review Refs trac:4136 This commit was SVN r30330. The following Trac tickets were found above: Ticket 4136 --> https://svn.open-mpi.org/trac/ompi/ticket/4136	2014-01-19 16:26:12 +00:00
Ralph Castain	fcdd904af4	Simplify and update hostfile handling to correctly support hostfiles that list nodes multiple times, once for each slot, and those that list a host once and include an explicit slot count. Eliminate support for mixing those two modes as this logic became just too complex when attempting to handle all the corner cases. cmr=v1.7.4:reviewer=jsquyres This commit was SVN r30325.	2014-01-18 16:08:40 +00:00
Ralph Castain	d5647394d8	Initialize variable so dash-host option gets correctly parsed cmr=v1.7.4:reviewer=rolfv This commit was SVN r30159.	2014-01-08 15:17:16 +00:00
Brian Barrett	8b778903d8	Fix longstanding issue with our multi-project support. Rather than using pkg{data,lib,includedir}, use our own ompi{data,lib,includedir}, which is always set to {datadir,libdir,includedir}/openmpi. This will keep us from having help files in prefix/share/open-rte when building without Open MPI, but in prefix/share/openmpi when building with Open MPI. This commit was SVN r30140.	2014-01-07 22:11:15 +00:00
Ralph Castain	3f2b3c53ea	Ensure that rankfile-provided allocations are correctly handled Fixes trac:4043 cmr=v1.7.4:reviewer=jsquyres:subject=Ensure that rankfile-provided allocations are correctly handled This commit was SVN r30106. The following Trac tickets were found above: Ticket 4043 --> https://svn.open-mpi.org/trac/ompi/ticket/4043	2014-01-02 16:07:16 +00:00
Ralph Castain	bb80625a8a	Add missing var initialization cmr=v1.7.4:reviewer=ompi-gk1.7 This commit was SVN r30063.	2013-12-24 00:02:22 +00:00
Ralph Castain	9c768df8b8	Resolve an unexpected behavior in hostfile allocations. Now that we filter allocations to determine what will be used for mapping, let the initial global pool be the union of nodes from all sources (default hostfile, hostfiles, and dash-hosts). Each app will filter down to only those specified for it using its own hostfile and dash-host options. cmr=v1.7.4:reviewer=jsquyres:subject=Resolve an unexpected behavior in hostfile allocations This commit was SVN r30040.	2013-12-21 01:38:27 +00:00
Ralph Castain	d47d2569f3	We stripped the process info packing routine to minimize message size when sending the launch message, but tools still require all the info. So modify the tool-hnp handshake to explicitly add the missing info Refs trac:3992 This commit was SVN r29989. The following Trac tickets were found above: Ticket 3992 --> https://svn.open-mpi.org/trac/ompi/ticket/3992	2013-12-19 20:42:20 +00:00
Ralph Castain	6239e64f36	Further cleanup of orte-ps so it doesn't abort when hitting a stale HNP - only report that event once and just keep working. Refs trac:3992 This commit was SVN r29974. The following Trac tickets were found above: Ticket 3992 --> https://svn.open-mpi.org/trac/ompi/ticket/3992	2013-12-19 03:28:05 +00:00
Jeff Squyres	0ab48ad0d2	Fix some annoying flex warnings that have been there for years. Many thanks to Tom Fogal for the initial patch. cmr=v1.7.4:reviewer=rhc:subject=Fix annoying flex warnings This commit was SVN r29904.	2013-12-14 00:36:12 +00:00
Ralph Castain	617a0edbb8	Fix hostfile parsing for the case where RMs count slots by listing the node multiple times. Thanks to Tetsuya Mishima for reporting the problem and providing a patch. cmf=v1.7.4:reviewer=rhc This commit was SVN r29748.	2013-11-24 16:17:52 +00:00
Ralph Castain	7480beb7f0	Per request from Nathan, add an offset value to the job struct so we can construct a "global rank" that spans multiple jobs during dynamic launch operations. Store a new ORTE_DB_GLOBAL_RANK value for each process in the database, and ensure that we share our own value during connect_accept so both sides can see it. This isn't being used yet - just enabling Nathan to do what he needs. *** NOTE: any use of the OMPI_DB_GLOBAL_RANK database key must be protected by #ifdef OMPI_DB_GLOBAL_RANK as not all RTE's will define this key. *** This commit was SVN r29708.	2013-11-14 17:01:43 +00:00
Ralph Castain	46f633883b	Correct the error check on rml.send cmr=v1.7.4:reviewer=jsquyres This commit was SVN r29660.	2013-11-11 23:23:12 +00:00
Ralph Castain	24c811805f	************************************************************** This change contains a non-mandatory modification of the MPI-RTE interface. Anyone wishing to support coprocessors such as the Xeon Phi may wish to add the required definition and underlying support ************************************************************** Add locality support for coprocessors such as the Intel Xeon Phi. Detecting that we are on a coprocessor inside of a host node isn't straightforward. There are no good "hooks" provided for programmatically detecting that "we are on a coprocessor running its own OS", and the ORTE daemon just thinks it is on another node. However, in order to properly use the Phi's public interface for MPI transport, it is necessary that the daemon detect that it is colocated with procs on the host. So we have to split the locality to separately record "on the same host" vs "on the same board". We already have the board-level locality flag, but not quite enough flexibility to handle this use-case. Thus, do the following: 1. add OPAL_PROC_ON_HOST flag to indicate we share a host, but not necessarily the same board 2. modify OPAL_PROC_ON_NODE to indicate we share both a host AND the same board. Note that we have to modify the OPAL_PROC_ON_LOCAL_NODE macro to explicitly check both conditions 3. add support in opal/mca/hwloc/base/hwloc_base_util.c for the host to check for coprocessors, and for daemons to check to see if they are on a coprocessor. The former is done via hwloc, but support for the latter is not yet provided by hwloc. So the code for detecting we are on a coprocessor currently is Xeon Phi specific - hopefully, we will find more generic methods in the future. 4. modify the orted and the hnp startup so they check for coprocessors and to see if they are on a coprocessor, and have the orteds pass that info back in their callback message. 
Automatically detect that coprocessors have been found and identify which coprocessors are on which hosts. Note that this algo isn't scalable at the moment - this will hopefully be improved over time. 5. modify the ompi proc locality detection function to look for coprocessor host info IF the OMPI_RTE_HOST_ID database key has been defined. RTE's that choose not to provide this support do not have to do anything - the associated code will simply be ignored. 6. include some cleanup of the hwloc open/close code so it conforms to how we did things in other frameworks (e.g., having a single "frame" file instead of open/close). Also, fix the locality flags - e.g., being on the same node means you must also be on the same cluster/cu, so ensure those flags are also set. cmr:v1.7.4:reviewer=hjelmn This commit was SVN r29435.	2013-10-14 16:52:58 +00:00
Ralph Castain	5ec422dbc1	Correctly compute num local peers when launched via mpirun This commit was SVN r29327.	2013-10-02 01:46:09 +00:00
Ralph Castain	d565a76814	Do some cleanup of the way we handle modex data. Identify data that needs to be shared with peers in my job vs data that needs to be shared with non-peers - no point in sharing extra data. When we share data with some process(es) from another job, we cannot know in advance what info they have or lack, so we have to share everything just in case. This limits the optimization we can do for things like comm_spawn. Create a new required key in the OMPI layer for retrieving a "node id" from the database. ALL RTE'S MUST DEFINE THIS KEY. This allows us to compute locality in the MPI layer, which is necessary when we do things like intercomm_create. cmr:v1.7.4:reviewer=rhc:subject=Cleanup handling of modex data This commit was SVN r29274.	2013-09-27 00:37:49 +00:00
Ralph Castain	a200e4f865	As per the RFC, bring in the ORTE async progress code and the rewrite of OOB: * THIS RFC INCLUDES A MINOR CHANGE TO THE MPI-RTE INTERFACE * Note: during the course of this work, it was necessary to completely separate the MPI and RTE progress engines. There were multiple places in the MPI layer where ORTE_WAIT_FOR_COMPLETION was being used. A new OMPI_WAIT_FOR_COMPLETION macro was created (defined in ompi/mca/rte/rte.h) that simply cycles across opal_progress until the provided flag becomes false. Places where the MPI layer blocked waiting for RTE to complete an event have been modified to use this macro. *************************************************************************************** I am reissuing this RFC because of the time that has passed since its original release. Since its initial release and review, I have debugged it further to ensure it fully supports tests like loop_spawn. It therefore seems ready for merge back to the trunk. Given its prior review, I have set the timeout for one week. The code is in https://bitbucket.org/rhc/ompi-oob2 WHAT: Rewrite of ORTE OOB WHY: Support asynchronous progress and a host of other features WHEN: Wed, August 21 SYNOPSIS: The current OOB has served us well, but a number of limitations have been identified over the years. Specifically: * it is only progressed when called via opal_progress, which can lead to hangs or recursive calls into libevent (which is not supported by that code) * we've had issues when multiple NICs are available as the code doesn't "shift" messages between transports - thus, all nodes had to be available via the same TCP interface. 
* the OOB "unloads" incoming opal_buffer_t objects during the transmission, thus preventing use of OBJ_RETAIN in the code when repeatedly sending the same message to multiple recipients * there is no failover mechanism across NICs - if the selected NIC (or its attached switch) fails, we are forced to abort * only one transport (i.e., component) can be "active" The revised OOB resolves these problems: * async progress is used for all application processes, with the progress thread blocking in the event library * each available TCP NIC is supported by its own TCP module. The ability to asynchronously progress each module independently is provided, but not enabled by default (a runtime MCA parameter turns it "on") * multi-address TCP NICs (e.g., a NIC with both an IPv4 and IPv6 address, or with virtual interfaces) are supported - reachability is determined by comparing the contact info for a peer against all addresses within the range covered by the address/mask pairs for the NIC. * a message that arrives on one TCP NIC is automatically shifted to whatever NIC that is connected to the next "hop" if that peer cannot be reached by the incoming NIC. If no TCP module will reach the peer, then the OOB attempts to send the message via all other available components - if none can reach the peer, then an "error" is reported back to the RML, which then calls the errmgr for instructions. * opal_buffer_t now conforms to standard object rules re OBJ_RETAIN as we no longer "unload" the incoming object * NIC failure is reported to the TCP component, which then tries to resend the message across any other available TCP NIC. If that doesn't work, then the message is given back to the OOB base to try using other components. 
If all that fails, then the error is reported to the RML, which reports to the errmgr for instructions * obviously from the above, multiple OOB components (e.g., TCP and UD) can be active in parallel * the matching code has been moved to the RML (and out of the OOB/TCP component) so it is independent of transport * routing is done by the individual OOB modules (as opposed to the RML). Thus, both routed and non-routed transports can simultaneously be active * all blocking send/recv APIs have been removed. Everything operates asynchronously. KNOWN LIMITATIONS: * although provision is made for component failover as described above, the code for doing so has not been fully implemented yet. At the moment, if all connections for a given peer fail, the errmgr is notified of a "lost connection", which by default results in termination of the job if it was a lifeline * the IPv6 code is present and compiles, but is not complete. Since the current IPv6 support in the OOB doesn't work anyway, I don't consider this a blocker * routing is performed at the individual module level, yet the active routed component is selected on a global basis. We probably should update that to reflect that different transports may need/choose to route in different ways * obviously, not every error path has been tested nor necessarily covered * determining abnormal termination is more challenging than in the old code as we now potentially have multiple ways of connecting to a process. Ideally, we would declare "connection failed" when all transports can no longer reach the process, but that requires some additional (possibly complex) code. For now, the code replicates the old behavior only somewhat modified - i.e., if a module sees its connection fail, it checks to see if it is a lifeline. If so, it notifies the errmgr that the lifeline is lost - otherwise, it notifies the errmgr that a non-lifeline connection was lost. 
* reachability is determined solely on the basis of a shared subnet address/mask - more sophisticated algorithms (e.g., the one used in the tcp btl) are required to handle routing via gateways * the RML needs to assign sequence numbers to each message on a per-peer basis. The receiving RML will then deliver messages in order, thus preventing out-of-order messaging in the case where messages travel across different transports or a message needs to be redirected/resent due to failure of a NIC This commit was SVN r29058.	2013-08-22 16:37:40 +00:00
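
The last limitation listed in the r29058 message (per-peer sequence numbers so that the receiving RML can deliver messages in order, even when they arrive via different transports or after a resend) can be illustrated with a small sketch. It is written in Python for brevity and is not Open MPI code: the class names Sender and Receiver, their methods, and the (seq, payload) message format are all hypothetical, and the commit itself describes this as work still to be done.

```python
from collections import defaultdict


class Sender:
    """Hypothetical sender side: stamps each message with a per-peer sequence number."""

    def __init__(self):
        self._next_seq = defaultdict(int)  # peer -> next sequence number to assign

    def stamp(self, peer, payload):
        seq = self._next_seq[peer]
        self._next_seq[peer] += 1
        return (seq, payload)


class Receiver:
    """Hypothetical receiver side: releases messages from each peer strictly in order."""

    def __init__(self):
        self._expected = defaultdict(int)  # peer -> next sequence number to deliver
        self._pending = defaultdict(dict)  # peer -> {seq: payload} held back for now

    def receive(self, peer, message):
        """Accept one (seq, payload) message and return the payloads now deliverable."""
        seq, payload = message
        # Ignore duplicates, e.g. a resend after a NIC failure.
        if seq < self._expected[peer] or seq in self._pending[peer]:
            return []
        self._pending[peer][seq] = payload
        delivered = []
        # Drain every message that is now contiguous with what was already delivered.
        while self._expected[peer] in self._pending[peer]:
            delivered.append(self._pending[peer].pop(self._expected[peer]))
            self._expected[peer] += 1
        return delivered
```

In this sketch a message that arrives early is simply held until the gap before it is filled, and a duplicate is dropped silently. How the actual RML would bound the held-back buffer or time out on a missing message is not stated in the commit message.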


448 commits