openmpi

Автор	SHA1	Сообщение	Дата
Ralph Castain	c146c4969b	Revert part of open-mpi/ompi@c1bbbb5e2f to restore the usock component, thus fixing show_help aggregation. Fixes #1467 Restore debugger attach operations Fixes #1225	2016-03-18 21:49:04 -07:00
Ralph Castain	60a7bc2e50	Enable the PMIx notification callback system. This currently is only supported by the pmix120 component, which is not selected by default. All other components will ignore error registration requests, and thus do not support debugger attach when launched via mpirun. Note that direct launched applications will support such attachment, but may not do so in a scalable fashion. Fixes ##1225	2016-02-18 09:29:12 -08:00
Ralph Castain	06c3dfc052	Refactor the ORTE DVM code so that external codes can submit multiple jobs using only a single connection to the HNP. * Clean up the DVM so it continues to run even when applications error out and we would ordinarily abort the daemons. * Create a new errmgr component for the DVM to handle the differences. * Cleanup the DVM state component. * Add ORTE bindings directory and brief README * Pass a local tool index around to match jobs. * Pass the jobid on job completion. * Fix initialization logic. * Add framework for python wrapper. * Fix terminate-with-non-zero-exit behavior so it properly terminates only the indicated procs, notifies orte-submit, and orte-dvm continues executing. * Add some missing options to orte-dvm * Fix a bug in -host processing that caused us to ignore the #slots designator. Add a new attribute to indicate "do not expand the DVM" when submitting job spawn requests. * It actually makes no sense that we treat the termination of all children differently than terminating the children of a specific job - it only creates confusion over the difference in behavior. So terminate children the same way regardless. Extend the cmd_line utility to easily allow layering of command line definitions Catch up with ORTE interface change and make build more generic. Disable "fixed dvm" logic for now. Add another cmd_line function to merge a table of cmd line options with another one, reporting as errors any duplicate entries. Use this to allow orterun to reuse the orted_submit code Fix the "fixed_dvm" logic by ensuring we reset num_new_daemons to zero. Also ensure that the nidmap is sent with the first job so the downstream daemons get the node info. Remove a duplicate cmd line entry in orterun. Revise the DVM startup procedure to pass the nidmap only once, at the startup of the DVM. This reduces the overhead on each job launch and ensures that the nidmap doesn't get overwritten. Add new commands to get_orted_comm_cmd_str(). Move ORTE command line options to orte_globals.[ch]. Catch up with extra orte_submit_init parameter. Add example code. Add documentation. Bump version. The nidmap and routing data must be updated prior to propagating the xcast or else the xcast will fail. Fix the return code so it is something more expected when an error occurs. Ensure we get an error returned to us when we fail to launch for some reason. In this case, we will always get a launch_cb as we did indeed attempt to spawn it. The error code will be returned in the complete_cb. Fix the return code from orte_submit_job - it was returning the tracker index instead of "success". Take advantage of ORTE's pretty-print capabilities to provide a nice error output explaining why we failed to launch. Ensure we always get a launch_cb when we fail to launch, but no complete_cb as the job never launched. Extend the error reporting capability to job completion as well. Add index parameter to orte_submit_job(). Add orte_job_cancel and implement ORTE_DAEMON_TERMINATE_JOB_CMD. Factor out dvm termination. Parse the terminate option at tool level. Add error string for ORTE_ERR_JOB_CANCELLED. Add some safeguards. Cleanup and/of comments. Enable the return. Properly ORTE_DECLSPEC orte_submit_halt. Add orte_submit_halt and orte_submit_cancel to interface. Use the plm interface to terminate the job	2016-02-13 08:10:44 -08:00
Jeff Squyres	7850517215	brucks: rename the "brks" component to be "brucks" After hearing the 3rd person ask what "brks" stood for, I'm renaming this component to be "brucks" (because it uses a Bruck-based algorithm).	2016-02-09 13:17:11 -08:00
Ralph Castain	64b695669a	Cleanup warnings in opal and orte layers when building optimized on Mac	2015-12-17 07:51:24 -08:00
Jeff Squyres	00c5dc9449	rml oob: C99-ification of structure member assignment	2015-12-08 17:05:16 -08:00
Ralph Castain	0d5814b5ca	Cleanup Coverity issues	2015-08-29 21:19:27 -07:00
Ralph Castain	cf6137b530	Integrate PMIx 1.0 with OMPI. Bring Slurm PMI-1 component online Bring the s2 component online Little cleanup - let the various PMIx modules set the process name during init, and then just raise it up to the ORTE level. Required as the different PMI environments all pass the jobid in different ways. Bring the OMPI pubsub/pmi component online Get comm_spawn working again Ensure we always provide a cpuset, even if it is NULL pmix/cray: adjust cray pmix component for pmix Make changes so cray pmix can work within the integrated ompi/pmix framework. Bring singletons back online. Implement the comm_spawn operation using pmix - not tested yet Cleanup comm_spawn - procs now starting, error in connect_accept Complete integration	2015-08-29 16:04:10 -07:00
Ralph Castain	4853457b93	The RML posted recvs are controlled by the async progress thread when in an application process. The call to finalize and close the RML is done from the main thread, and so we need to shift the actual destruct of the posted recv list to the async thread for handling or else we encounter a race condition when accessing the posted recvs. Thanks to Gilles for providing the required debug info	2015-07-21 08:44:23 -07:00
Gilles Gouaillardet	409874eb47	remove trigraph '??)' from comment Fujitsu compilers issue way too many warnings because of this trigraph	2015-07-07 11:00:13 +09:00
Nathan Hjelm	4d92c9989e	more c99 updates This commit does two things. It removes checks for C99 required headers (stdlib.h, string.h, signal.h, etc). Additionally it removes definitions for required C99 types (intptr_t, int64_t, int32_t, etc). Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2015-06-25 10:14:13 -06:00
Ralph Castain	869041f770	Purge whitespace from the repo	2015-06-23 20:59:57 -07:00
Ralph Castain	b5382c9bf9	Rework the OOB selection logic to allow a component (e.g., usock) to direct that it be the sole active component. Remove prior disqualifying code in the oob/tcp component as it was too restrictive - if usock wasn't able to run, it left apps with no way to communicate to their daemon. Have the local daemon check the global modex for the RML URI info of the local procs so it can route messages between them when tcp is the primary channel. A few other minor cleanups included.	2015-05-08 11:15:21 -07:00
Gilles Gouaillardet	2e384a3b65	initialize common symbols from orte A few uninitialized common symbols are remaining (generated by flex) : * orte/mca/rmaps/rank_file/rmaps_rank_file_lex.c: orte_rmaps_rank_file_leng * orte/mca/rmaps/rank_file/rmaps_rank_file_lex.c: orte_rmaps_rank_file_text * orte/util/hostfile/hostfile_lex.c: orte_util_hostfile_leng * orte/util/hostfile/hostfile_lex.c: orte_util_hostfile_text	2015-05-08 10:11:58 +09:00
Ralph Castain	9cb2fcfa5c	Cleanup the qos code when --enable-timings is given	2015-05-06 20:24:27 -07:00
Ralph Castain	1f8de276de	Consolidate all the QOS changes into one clean commit	2015-05-06 19:48:42 -07:00
Nathan Hjelm	45e053dbce	orte: use C99 subobject naming for component initialization This commit helps future-proof orte components by initializing each component member by name. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-04-18 10:29:58 -06:00
Nathan Hjelm	b68d66bb9b	MCA: Add the project/project version to the MCA base component This commit adds support for project_framework_component_* parameter matching. This is the first step in allowing the same framework name in multiple projects. This change also bumps the MCA component version to 2.1.0. All master frameworks have been updated to use the new component versioning macro. An mca.h has been added to each project to add a project specific versioning macro of the form PROJECT_MCA_VERSION_2_1_0. Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2015-03-27 10:59:04 -06:00
Howard Pritchard	bf89131f9e	add owner files to opa/ompi/orte mca directories This commit adds an owner file in each of the component directories for each framework. This allows for a simple script to parse the contents of the files and generate, among other things, tables to be used on the project's wiki page. Currently there are two "fields" in the file, an owner and a status. A tool to parse the files and generate tables for the wiki page will be added in a subsequent commit.	2015-02-22 15:10:23 -07:00
Ralph Castain	2b0b012460	Continue refinement of the DVM operations. Send the spawn request to the right place (it helps) as it isn't a comm_spawn request and has to be treated a little differently. Ensure IO gets forwarded back to the tool. Ensure the tool outputs show_help locally as there is no place to send it.	2015-02-04 06:21:54 -08:00
Ralph Castain	ec5ccb76cf	Enable persistent ORTE DVM so users can execute multiple OMPI jobs within an allocation without restarting the DVM every time.	2015-01-30 11:00:43 -08:00
Howard Pritchard	2809c21e0f	rml/oob: check peer param in send methods The rml/oob was not doing sanity checks on the input peer parameter for the orte_rml_oob_send_nb and orte_rml_oob_send_buffer_nd. Owing to the fact that there are places in the ompi/orte stack where things like orte_show_help_norender are called way before ORTE_PROC_MY_HNP, are setup properly, all kinds of weird startup failures can occur as the rml/oob tries to process send requests where the peer is junk. Rather than try to expand this kind of thing: /* if we are the HNP, or the RML has not yet been setup, * or ROUTED has not been setup, * or we weren't given an HNP, or we are running in standalone * mode, then all we can do is process this locally */ if (ORTE_PROC_IS_HNP \|\| orte_standalone_operation \|\| NULL == orte_rml.send_buffer_nb \|\| NULL == orte_routed.get_route \|\| NULL == orte_process_info.my_hnp_uri) { rc = show_help(filename, topic, output, ORTE_PROC_MY_NAME); } do the right thing in the rml level and return an error rather than eventually failing in the send owing to peer not being valid.	2015-01-22 06:12:39 -08:00
Artem Polyakov	8ffad75a0a	Introduce timing interval measurement facility in timing framework	2014-12-10 16:47:49 +06:00
Ralph Castain	780c93ee57	Per the PR and discussion on today's telecon, extend the process name definition as a two-field struct of uint32_t's down to the OPAL layer. This resolves issues created by prior commits that impacted both heterogeneous and SPARC support. This also simplifies the OMPI code base by removing the need for frequent memcpy's when transitioning between the OMPI/ORTE layers and OPAL. We recognize that this means other users of OPAL will need to "wrap" the opal_process_name_t if they desire to abstract it in some fashion. This is regrettable, and we are looking at possible alternatives that might mitigate that requirement. Meantime, however, we have to put the needs of the OMPI community first, and are taking this step to restore hetero and SPARC support.	2014-11-11 17:00:42 -08:00
Ralph Castain	dfb952fa78	[Contribution from Artem - moved it to svn from git for him] Replace our old, clunky timing setup with a much nicer one that is only available if configured with --enable-timing. Add a tool for profiling clock differences between the nodes so you can get more precise timing measurements. I'll ask Artem to update the Github wiki with full instructions on how to use this setup. This commit was SVN r32738.	2014-09-15 18:00:46 +00:00
Ralph Castain	4eb6291334	Avoid conflicts when multiple collectives are underway in ORTE by giving each grpcomm component its own RML tag and posting persistent receives. We use the signature anyway to determine which collective the received message is addressing, so there is no need to post non-persistent receives. This commit was SVN r32703.	2014-09-10 17:36:16 +00:00
Ralph Castain	8faabed2cd	Add some further initialization and protection for zero-byte messages This commit was SVN r32644.	2014-08-29 17:24:55 +00:00
Ralph Castain	aec5cd08bd	Per the PMIx RFC: WHAT: Merge the PMIx branch into the devel repo, creating a new OPAL “lmix” framework to abstract PMI support for all RTEs. Replace the ORTE daemon-level collectives with a new PMIx server and update the ORTE grpcomm framework to support server-to-server collectives WHY: We’ve had problems dealing with variations in PMI implementations, and need to extend the existing PMI definitions to meet exascale requirements. WHEN: Mon, Aug 25 WHERE: https://github.com/rhc54/ompi-svn-mirror.git Several community members have been working on a refactoring of the current PMI support within OMPI. Although the APIs are common, Slurm and Cray implement a different range of capabilities, and package them differently. For example, Cray provides an integrated PMI-1/2 library, while Slurm separates the two and requires the user to specify the one to be used at runtime. In addition, several bugs in the Slurm implementations have caused problems requiring extra coding. All this has led to a slew of #if’s in the PMI code and bugs when the corner-case logic for one implementation accidentally traps the other. Extending this support to other implementations would have increased this complexity to an unacceptable level. Accordingly, we have: * created a new OPAL “pmix” framework to abstract the PMI support, with separate components for Cray, Slurm PMI-1, and Slurm PMI-2 implementations. * Replaced the current ORTE grpcomm daemon-based collective operation with an integrated PMIx server, and updated the grpcomm APIs to provide more flexible, multi-algorithm support for collective operations. At this time, only the xcast and allgather operations are supported. * Replaced the current global collective id with a signature based on the names of the participating procs. The allows an unlimited number of collectives to be executed by any group of processes, subject to the requirement that only one collective can be active at a time for a unique combination of procs. Note that a proc can be involved in any number of simultaneous collectives - it is the specific combination of procs that is subject to the constraint * removed the prior OMPI/OPAL modex code * added new macros for executing modex send/recv to simplify use of the new APIs. The send macros allow the caller to specify whether or not the BTL supports async modex operations - if so, then the non-blocking “fence” operation is used, if the active PMIx component supports it. Otherwise, the default is a full blocking modex exchange as we currently perform. * retained the current flag that directs us to use a blocking fence operation, but only to retrieve data upon demand This commit was SVN r32570.	2014-08-21 18:56:47 +00:00
Ralph Castain	8ea576c870	I have no idea how they did it, but someone managed to write a test that circled around and around and eventually reached this point with a NULL pointer. So protect against that possibility. This commit was SVN r32434.	2014-08-05 16:20:46 +00:00
Ralph Castain	6c5e592785	Revert r32222, r32210, and r32203 as they created a problem when daemon collectives did not involve app procs on every node. Instead, modify the ompi/mca/rte/orte/rte_orte.h to add a new function that allows apps to request new daemon collective ids for use in barrier and modex operations. This will only appear in ORTE-based installations, but it is only being used by a couple of researchers at the moment. Update the orte/test/mpi/coll_test.c test to show the revised example. This commit was SVN r32234. The following SVN revision numbers were found above: r32203 --> open-mpi/ompi@a523dba41d r32210 --> open-mpi/ompi@2ce11ed5c4 r32222 --> open-mpi/ompi@d55f16db50	2014-07-15 03:48:00 +00:00
Ralph Castain	42bf7466fc	This isn't as big a change as it appears - a change in one place caused a whole bunch of files to require updated #include's due to some arcane linkage. Rework the orte_wait code to reflect the introduction of the state machine. If we are in cleanup mode and just want to kill all our local children, then there is no reason to be polite about it as that introduces very long delays at scale. Just kill the procs and move on. Refs trac:4717 This commit was SVN r32019. The following Trac tickets were found above: Ticket 4717 --> https://svn.open-mpi.org/trac/ompi/ticket/4717	2014-06-17 17:57:51 +00:00
Ralph Castain	b7c08582ba	Add new tag to avoid conflicts This commit was SVN r31960.	2014-06-06 17:23:35 +00:00
Nathan Hjelm	59d09ad9de	orte: fix several small memory leaks grpcomm: fix memory leaks We were leaking the caddy object used to pass data to the callback function. This commit fixes these leaks. oob,rml: fix memory leaks This commit fixes several leaks: - Both the oob/base and oob/tcp were leaking objects on their peer hash tables. Iterate on the hash tables and free any objects. - Leaked sent messages because of missing OBJ_RELEASE. I placed the release in ORTE_RML_SEND_COMPLETE to catch all the possible paths. ess/base: close the state framework cmr=v1.8.2:reviewer=rhc This commit was SVN r31776.	2014-05-15 15:06:27 +00:00
Ralph Castain	11faab1091	The final step of the RFC: convert the <foo>libdir and friends to fit their respective code areas, and equate them all at the top. Note that we can't entirely separate things as the opal_install_dirs framework can't handle separated locations for the various trees. This commit was SVN r31679.	2014-05-08 02:01:35 +00:00
Adrian Reber	34625b360b	use the newly created JOB_STATE_FT_* events This commit was SVN r31021.	2014-03-12 12:37:14 +00:00
Ralph Castain	956aab03a7	Track the origin of a message so it can be passed across transports Refs trac:4184 This commit was SVN r30433. The following Trac tickets were found above: Ticket 4184 --> https://svn.open-mpi.org/trac/ompi/ticket/4184	2014-01-26 21:09:26 +00:00
Ralph Castain	657796f9e0	Revert r30327 - turns out it isn't quite right just yet. :-( Closes trac:4138 This commit was SVN r30328. The following SVN revision numbers were found above: r30327 --> open-mpi/ompi@87d5f86025 The following Trac tickets were found above: Ticket 4138 --> https://svn.open-mpi.org/trac/ompi/ticket/4138	2014-01-18 23:38:39 +00:00
Ralph Castain	87d5f86025	Enable use of unix domain sockets for local OOB communications, thereby removing the requirement for an active network interface when running strictly on a single node. Update the overall OOB system to support cross-transport movement of messages so that the OOB can move a received message to another transport for transmission. cmr=v1.7.5:reviewer=jsquyres:subject=Enable use of unix domain sockets for local OOB communications This commit was SVN r30327.	2014-01-18 21:36:49 +00:00
Ralph Castain	286ff6d552	For large scale systems, we would like to avoid doing a full modex during MPI_Init so that launch will scale a little better. At the moment, our options are somewhat limited as only a few BTLs don't immediately call modex_recv on all procs during startup. However, for those situations where someone can take advantage of it, add the ability to do a "modex on demand" retrieval of data from remote procs when we launch via mpirun. NOTE: launch performance will be absolutely awful if you do this with BTLs that aren't configured to modex_recv on first message! Even with "modex on demand", we still have to do a barrier in place of the modex - we simply don't move any data around, which does reduce the time impact. The barrier is required to ensure that the other proc has in fact registered all its BTL info and therefore is prepared to hand over a complete data package. Otherwise, you may not get the info you need. In addition, the shared memory BTL can fail to properly rendezvous as it expects the barrier to be in place. This behavior will only take effect under the following conditions: 1. launched via mpirun 2. #procs is greater than ompi_hostname_cutoff, which defaults to UINT32_MAX 3. mca param rte_orte_direct_modex is set to 1. At the moment, we are having problems getting this param to register properly, so only the first two conditions are in effect. Still, the bottom line is you have to want this behavior to get it. The planned next evolution of this will be to make the direct modex be non-blocking - this will require two fixes: 1. if the remote proc doesn't have the required info, then let it delay its response until it does. This means we need a way for the MPI layer to tell the RTE "I am done entering modex data". 2. adjust the SM rendezvous logic to loop until the required file has been created Creating a placeholder to bring this over to 1.7.5 when ready. cmr=v1.7.5:reviewer=hjelmn:subject=Enable direct modex at scale This commit was SVN r30259.	2014-01-11 17:36:06 +00:00
Brian Barrett	8b778903d8	Fix longstanding issue with our multi-project support. Rather than using pkg{data,lib,includedir}, use our own ompi{data,lib,includedir}, which is always set to {datadir,libdir,includedir}/openmpi. This will keep us from having help files in prefix/share/open-rte when building without Open MPI, but in prefix/share/openmpi when building with Open MPI. This commit was SVN r30140.	2014-01-07 22:11:15 +00:00
George Bosilca	38cbaeaa82	Try to impose a little bit of consistency on how we parse lists of modules by enforcing the use of OPAL list accessors. This commit was SVN r30045.	2013-12-21 23:23:33 +00:00
Adrian Reber	53a70fe87f	Trying to get the C/R code to compile again. (send__nb) This patch changes all send/send_buffer occurrences in the C/R code to send_nb/send_buffer_nb. The new code compiles but does not work. Changes from V1: #ifdef out the code (so it is preserved for later re-design) * marked the broken C/R code with ENABLE_FT_FIXED Changes from V2: * just replace the blocking calls with the non-blocking calls * all #ifdef's introduced in V1 are gone * send_* returns error code or ORTE_SUCCESS (not the number of bytes) This commit was SVN r30036.	2013-12-20 21:58:28 +00:00
Adrian Reber	a3813d37c7	Trying to get the C/R code to compile again. (recv__nb) This patch changes all recv/recv_buffer occurrences in the C/R code to recv_nb/recv_buffer_nb. The old code is still there but disabled using ifdefs (ENABLE_FT_FIXED). The new code compiles but does not work. Changes from V1: #ifdef out the code (so it is preserved for later re-design) * marked the broken C/R code with ENABLE_FT_FIXED Changes from V2: * only #ifdef out the code where the behaviour is changed (used to be blocking; now non-blocking) This commit was SVN r30035.	2013-12-20 21:05:40 +00:00
Adrian Reber	b42aad44a3	Trying to get the C/R code to compile again. This patch includes various fixes all over the C/R code which are hard to group like the other patches. Changes from V1: * explain why mca_base_component_distill_checkpoint_ready no longer works * compare return result of opal functions with OPAL_* values Changes from V2: * use orte_rml_oob_ft_event() instead of referencing through the modules * properly protect variable (thanks to --enable-picky) This commit was SVN r29922.	2013-12-16 15:35:28 +00:00
Ralph Castain	1ff12362da	Cleanup merge conflict that was incorrectly committed This commit was SVN r29851.	2013-12-09 20:20:14 +00:00
Jeff Squyres	ed9aba3896	This patch fixes error: void value not ignored as it ought to be in the C/R code by ignoring the return value of functions which no longer return a value (only void). Signed-off-by: Adrian Reber <adrian.reber@hs-esslingen.de> This commit was SVN r29816.	2013-12-06 14:40:10 +00:00
Ralph Castain	7c23a5ad65	Fix headers when building with ft enabled. Thanks to Adrian Reber for the patch! This commit was SVN r29743.	2013-11-23 22:58:32 +00:00
Ralph Castain	99611ac1d2	Revert r29166 in favor of a better solution from George This commit was SVN r29199. The following SVN revision numbers were found above: r29166 --> open-mpi/ompi@497c7e6abb	2013-09-18 01:41:26 +00:00
Ralph Castain	497c7e6abb	Fixes trac:2904 The intercomm "merge" function can create a linkage between procs that was not reflected anywhere in a modex, and so at least some of the procs in the resulting communicator don't know how to talk to some of the new communicator's peers. For example, consider the case where: 1. parent job A comm_spawns a process (job B) - these processes exchange modex and can communicate 2. parent job A now comm_spawns another process (job C) - again, these can communicate, but the proc in C knows nothing of B 3. do an intercomm merge across the communicators created by the two comm_spawns. This puts B and C into the same communicator, but they know nothing about how to talk to each other as they were not involved in any exchange of contact info. Hence, collectives on that communicator now fail. This fix adds an API to the ompi/dpm framework that (a) exchanges the modex info across the procs in the merge to ensure all procs know how to communicate, and (b) calls add_procs to give the btl's a chance to select transports to any new procs. cmr:v1.7.3:reviewer=jsquyres This commit was SVN r29166. The following Trac tickets were found above: Ticket 2904 --> https://svn.open-mpi.org/trac/ompi/ticket/2904	2013-09-15 15:00:40 +00:00
Ralph Castain	13ae51a91b	Protect against possible race conditions and threads by ensuring that rml send always occurs inside an event. cmr:v1.7.4:reviewer=jsquyres:subject=Protect against race conditions in rml send This commit was SVN r29128.	2013-09-05 01:16:32 +00:00

1 2 3 4 5 ...

265 Коммитов