openmpi

Author	SHA1	Message	Date
Ralph Castain	ae2af61ee3	Update the session dir structure. Restore the creation of a top-level dir based on userid so that everything is contained under the user's top-level dir. Make the next level down (the "job family" level) be either the pid (indicated by a name of "pid.N") or the job family if not launched by mpirun. This allows for proper rendezvous by direct-launched procs.	2016-08-15 22:46:46 -05:00
Ralph Castain	d4327fd973	The node index isn't normally passed with the packed node object, so we need to set it on the remote end as the orted needs to pass it down to the procs. Refactor the registration code to better package proc-level info - we will separate out the node and app levels in a subsequent change.	2016-08-12 12:06:23 -07:00
Ralph Castain	99f7096031	Fix permissions	2016-07-16 21:03:55 -07:00
Ralph Castain	20a91c2baf	Add a new --continuous flag to mpirun that directs ORTE to let a job continue running as app procs terminate. Don't attempt to restart them. Add event notification of abnormally terminating procs, and demonstrate that in the mpi_spin test program. Cleanup debug message	2016-07-13 15:28:33 -07:00
Ralph Castain	ae8444682f	Remove stale variable	2016-07-05 20:07:16 -07:00
Ralph Castain	ee56d9dc1a	Shorten the session directory name as some OS's are now providing unusually long temp directory names, causing us to overflow the sockaddr field	2016-07-05 14:59:50 -07:00
Ralph Castain	6e434d6785	Add support for PMIx tool connections and queries. Initially only support a request to list all known namespaces (jobids) from ORTE, but other folks will extend that support to include additional information Update to match PMIx RFC Fix configury to point to correct libevent and hwloc locations	2016-06-29 19:19:19 -07:00
Jeff Squyres	98a2f5248d	orte: add missing break statement This seems like an obvious typo: insert a missing "break" statement so that we don't fall through to the next case. Fixes CIDs 1362756 and 1362764. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-06-18 07:48:45 -07:00
Ralph Castain	5d330d5220	Enable the PMIx event notification capability and use that for all error notifications, including debugger release. This capability requires use of PMIx 2.0 or above as the features are not available with earlier PMIx releases. When OMPI master is built against an earlier external version, it will fall back to the prior behavior - i.e., debugger will be released via RML and all notifications will go strictly to the default error handler. Add PMIx 2.0 Remove PMIx 1.1.4 Cleanup copying of component Add missing file Touchup a typo in the Makefile.am Update the pmix ext114 component Minor cleanups and resync to master Update to latest PMIx 2.x Update to the PMIx event notification branch latest changes	2016-06-14 13:08:41 -07:00
Gilles Gouaillardet	5f565dfec3	configury: clean the flex generated .c files	2016-06-01 11:13:31 +09:00
Ralph Castain	3913595e10	Enable simulation of large-scale clusters by allowing multiple daemons/node. Specifying the ras_base_multiplier parameter to be greater than 1 will cause ORTE to replicate each allocated node by that factor. A daemon will be spawned for each replica, thus letting ORTE function as if it were on a much larger cluster. Note that this cannot be used for MPI performance testing. It is really only useful for ORTE scaling tests. It also only works with the rsh/ssh launcher.	2016-05-29 18:56:18 -07:00
Ralph Castain	ca69403cc8	In MPMD case, add slots given to each of the executables instead of overwriting	2016-05-15 08:55:43 -07:00
Ralph Castain	58dd41facf	Repair the processing of cmd line options that mapped to MCA params. This was responsible for breaking things like map-by <foo>. Remove debug, let orterun send terminate cmd to DVM Recover the DVM support	2016-05-06 13:14:03 -07:00
rhc54	ff8518853e	Merge pull request #1604 from rhc54/topic/psm2 Improve the transport key print statement to ensure that we don't get…	2016-05-03 13:43:10 -07:00
Jeff Squyres	265e5b9795	Merge pull request #1552 from kmroz/wip-hostname-len-cleanup-1 ompi/opal/orte/oshmem/test: max hostname length cleanup	2016-05-02 09:44:18 -04:00
Ralph Castain	6ac7929bd0	Extend the schizo framework to allow definition of CLI options by environment. Refactor orterun to mesh with the orted_submit code, thus improving code reuse. Eliminate the orte-submit tool as orterun can now meet that need. Cleanups per @jjhursey review	2016-05-01 11:30:25 -07:00
Ralph Castain	29bc24bdd5	Improve the transport key print statement to ensure that we don't get zero fields as this can be a problem for PSM	2016-04-28 20:11:12 -07:00
Karol Mroz	5c11bdb251	orte: fixup hostname max length usage Also removes orte specific max hostname value. Signed-off-by: Karol Mroz <mroz.karol@gmail.com>	2016-04-25 07:08:23 +02:00
Ralph Castain	503e1274a9	Per the discussion on the telecon, change the -host behavior so we only run one instance if no slots were provided and the user didn't specify #procs to run. However, if no slots are given and the user does specify #procs, then let the number of slots default to the #found processing elements Ensure the returned exit status is non-zero if we fail to map If no -np is given, but either -host and/or -hostfile was given, then error out with a message telling the user that this combination is not supported. If -np is given, and -host is given with only one instance of each host, then default the #slots to the detected #pe's and enforce oversubscription rules. If -np is given, and -host is given with more than one instance of a given host, then set the #slots for that host to the number of times it was given and enforce oversubscription rules. Alternatively, the #slots can be specified via "-host foo:N". I therefore believe that row #7 on Jeff's spreadsheet is incorrect. With that one correction, this now passes all the given use-cases on that spreadsheet. Make things behave under unmanaged allocations more like their managed cousins - if the #slots is given, then no-np shall fill things up. Fixes #1344	2016-03-29 11:21:57 -07:00
Ralph Castain	8c14df2328	Revert "Modify singularity support per patch from Greg Kurtzer" This reverts commit open-mpi/ompi@f7257a8310. Ensure that we properly cleanup the session directory tree. Prior code had issues with symlinks, especially if the file that the link points to was already removed as we traverse the tree. Also found that the dirent checks for directory type weren't fully portable, and so fall back to the stat-based approach which is known to be portable. Fix singularity singletons by detecting we are in a container and properly setting the pmix selection to pick the isolated component. Remove a stale restriction blocking use of the sm btl	2016-03-24 11:27:18 -07:00
Ralph Castain	6d7ada9675	Silence Coverity warning	2016-03-14 09:42:43 -07:00
Ralph Castain	4d0cc27eb7	Update the singularity support to match that of the latest singularity master. Remove the restriction on shared memory components by instructing singularity to not isolate the PID space. Add a new schizo API to allow setting up the original app_context. Ensure the container is installed prior to execution.	2016-03-05 21:47:42 -08:00
Ralph Castain	50431001a3	Modify the IOF subsystem to handle per-job directives for redirecting IO to files, tagging IO, and timestamping IO. Fix stdin reader	2016-02-16 18:54:38 -08:00
Ralph Castain	06c3dfc052	Refactor the ORTE DVM code so that external codes can submit multiple jobs using only a single connection to the HNP. * Clean up the DVM so it continues to run even when applications error out and we would ordinarily abort the daemons. * Create a new errmgr component for the DVM to handle the differences. * Cleanup the DVM state component. * Add ORTE bindings directory and brief README * Pass a local tool index around to match jobs. * Pass the jobid on job completion. * Fix initialization logic. * Add framework for python wrapper. * Fix terminate-with-non-zero-exit behavior so it properly terminates only the indicated procs, notifies orte-submit, and orte-dvm continues executing. * Add some missing options to orte-dvm * Fix a bug in -host processing that caused us to ignore the #slots designator. Add a new attribute to indicate "do not expand the DVM" when submitting job spawn requests. * It actually makes no sense that we treat the termination of all children differently than terminating the children of a specific job - it only creates confusion over the difference in behavior. So terminate children the same way regardless. Extend the cmd_line utility to easily allow layering of command line definitions Catch up with ORTE interface change and make build more generic. Disable "fixed dvm" logic for now. Add another cmd_line function to merge a table of cmd line options with another one, reporting as errors any duplicate entries. Use this to allow orterun to reuse the orted_submit code Fix the "fixed_dvm" logic by ensuring we reset num_new_daemons to zero. Also ensure that the nidmap is sent with the first job so the downstream daemons get the node info. Remove a duplicate cmd line entry in orterun. Revise the DVM startup procedure to pass the nidmap only once, at the startup of the DVM. This reduces the overhead on each job launch and ensures that the nidmap doesn't get overwritten. Add new commands to get_orted_comm_cmd_str(). 
Move ORTE command line options to orte_globals.[ch]. Catch up with extra orte_submit_init parameter. Add example code. Add documentation. Bump version. The nidmap and routing data must be updated prior to propagating the xcast or else the xcast will fail. Fix the return code so it is something more expected when an error occurs. Ensure we get an error returned to us when we fail to launch for some reason. In this case, we will always get a launch_cb as we did indeed attempt to spawn it. The error code will be returned in the complete_cb. Fix the return code from orte_submit_job - it was returning the tracker index instead of "success". Take advantage of ORTE's pretty-print capabilities to provide a nice error output explaining why we failed to launch. Ensure we always get a launch_cb when we fail to launch, but no complete_cb as the job never launched. Extend the error reporting capability to job completion as well. Add index parameter to orte_submit_job(). Add orte_job_cancel and implement ORTE_DAEMON_TERMINATE_JOB_CMD. Factor out dvm termination. Parse the terminate option at tool level. Add error string for ORTE_ERR_JOB_CANCELLED. Add some safeguards. Cleanup and/of comments. Enable the return. Properly ORTE_DECLSPEC orte_submit_halt. Add orte_submit_halt and orte_submit_cancel to interface. Use the plm interface to terminate the job	2016-02-13 08:10:44 -08:00
Gilles Gouaillardet	b55b9e6aee	sentinel: fix sentinel to proc_name conversion converting an opal_process_name_t means the loss of one bit, it was decided to restrict the local job id to 15 bits, so the useful information of an opal_process_name_t can fit in 63 bits.	2016-02-10 15:44:07 +09:00
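The bit budget described in the sentinel commit above can be illustrated with a small sketch. It assumes a process name is a 32-bit jobid plus a 32-bit vpid, that the jobid splits into a 16-bit job family and a local job id, and that the sentinel tags itself with its lowest bit. The field order and the names `name_to_sentinel` and `sentinel_to_name` are hypothetical, chosen for illustration; they are not taken from the OMPI sources.

```python
# Assumed field widths; only the 15-bit local job id comes from the commit message.
LOCAL_JOBID_BITS = 15
FAMILY_BITS = 16
VPID_BITS = 32


def name_to_sentinel(job_family, local_jobid, vpid):
    """Pack the three fields into 63 bits, then spend the last bit on a tag."""
    if not 0 <= job_family < (1 << FAMILY_BITS):
        raise ValueError("job family does not fit in 16 bits")
    if not 0 <= local_jobid < (1 << LOCAL_JOBID_BITS):
        raise ValueError("local job id does not fit in 15 bits")
    if not 0 <= vpid < (1 << VPID_BITS):
        raise ValueError("vpid does not fit in 32 bits")
    packed = (
        (local_jobid << (FAMILY_BITS + VPID_BITS))
        | (job_family << VPID_BITS)
        | vpid
    )
    # Shifting left by one costs a bit, which is why only 63 bits are usable.
    return (packed << 1) | 1


def sentinel_to_name(sentinel):
    """Reverse the packing; the low bit must be set for a valid sentinel."""
    if not sentinel & 1:
        raise ValueError("value is not tagged as a sentinel")
    packed = sentinel >> 1
    vpid = packed & ((1 << VPID_BITS) - 1)
    job_family = (packed >> VPID_BITS) & ((1 << FAMILY_BITS) - 1)
    local_jobid = packed >> (FAMILY_BITS + VPID_BITS)
    return job_family, local_jobid, vpid
```

With these widths the largest possible sentinel still fits in an unsigned 64-bit word, which is the point of capping the local job id at 15 bits.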
Ralph Castain	3fbad2e2bd	Transfer across the -host number of slots	2016-02-08 10:38:03 -08:00
Gilles Gouaillardet	7d6b75f3b2	orte_util_snprintf_jobid: return ORTE_SUCCESS or ORTE_ERROR	2016-01-18 09:44:33 +09:00
Ralph Castain	4dad5de8ff	Silence a couple of warnings - strncpy returns a char*, not an int	2016-01-16 09:44:52 -08:00
Gilles Gouaillardet	1d38430e43	opal: replace opal_convert_jobid_to_string with opal_snprintf_jobid	2016-01-14 10:39:03 +09:00
Ralph Castain	64b695669a	Cleanup warnings in opal and orte layers when building optimized on Mac	2015-12-17 07:51:24 -08:00
Jeff Squyres	8bd356549a	orte proc_info.h: use symbolic names This fix was actually applied in the v2.x branch first (as commit open-mpi/ompi-release@a9b22afc1a).	2015-11-10 13:39:21 -08:00
Ralph Castain	f1483eb2dc	Need to delay registration of the waitpid callback until after the fork/exec of the child process. Fix the bit testing of process type so that the proper state component gets selected for HNP.	2015-11-06 21:35:24 -08:00
Ralph Castain	68996d6858	Move the argv_free back to the correct place - I blame Jeff for suggesting it was wrong to begin with	2015-11-05 07:57:54 -08:00
Ralph Castain	fe0c995f6b	Fix a couple of minor issues identified by Jeff	2015-11-03 17:30:51 -08:00
Ralph Castain	24419b6523	Fix relative node syntax for dash-host option	2015-10-31 19:00:46 -07:00
Ralph Castain	0140ff048d	Now that we have an "isolated" PLM component, we cannot just let rsh silently decline to run when it cannot find a launch agent - if we do, then we will -always- run on the local node. So if the user specifies a launch agent and we can't find it, then generate a pretty error message, report a fatal error back to the component select, and exit out. This required modifying the mca_component_select function to actually check the return code on a component query - it was blissfully ignoring it. Also do a little cleanup to avoid bombarding the user with multiple error messages. Thanks to Patrick Begou for reporting the problem	2015-09-24 07:16:48 -07:00
Ralph Castain	749bd4e6fe	Plug a few memory leaks identified by valgrind	2015-09-23 15:21:04 -07:00
Ralph Castain	e6add86e4f	Deal with connect/accept between two jobs from different mpirun's. Somewhat optimize connect/accept by using MPI bcast to distribute the participants instead of another PMIx lookup. Cleanup some Coverity issues.	2015-09-07 09:19:24 -07:00
Ralph Castain	d97bc29102	Remove OPAL_HAVE_HWLOC qualifier and error out if --without-hwloc is given	2015-09-04 16:54:40 -07:00
Ralph Castain	cf6137b530	Integrate PMIx 1.0 with OMPI. Bring Slurm PMI-1 component online Bring the s2 component online Little cleanup - let the various PMIx modules set the process name during init, and then just raise it up to the ORTE level. Required as the different PMI environments all pass the jobid in different ways. Bring the OMPI pubsub/pmi component online Get comm_spawn working again Ensure we always provide a cpuset, even if it is NULL pmix/cray: adjust cray pmix component for pmix Make changes so cray pmix can work within the integrated ompi/pmix framework. Bring singletons back online. Implement the comm_spawn operation using pmix - not tested yet Cleanup comm_spawn - procs now starting, error in connect_accept Complete integration	2015-08-29 16:04:10 -07:00
Ralph Castain	bc7815e178	Adjust the process type flags to remove confusion between orted and dvm state machines	2015-08-21 07:50:08 -07:00
Ralph Castain	0b1d4b62be	Cleanup some cruft and update to coordinate with CM operations: * don't pass --tree-spawn to the orted cmd line. If someone doesn't want tree-spawn, it shows up as an MCA param anyway * ensure state/orted component disqualifies itself from CM operations * clarify the DVM proc_type definitions * ensure we stop littering the tmp dir with session directories	2015-08-12 10:32:14 -07:00
Nathan Hjelm	4d92c9989e	more c99 updates This commit does two things. It removes checks for C99 required headers (stdlib.h, string.h, signal.h, etc). Additionally it removes definitions for required C99 types (intptr_t, int64_t, int32_t, etc). Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2015-06-25 10:14:13 -06:00
Ralph Castain	869041f770	Purge whitespace from the repo	2015-06-23 20:59:57 -07:00
Gilles Gouaillardet	ac5921d7da	orte/util: fix misc memory leak as reported by Coverity with CID 1196738-1196739	2015-06-17 11:17:55 +09:00
Gilles Gouaillardet	67638690ea	orte/util: fix a misc memory leak as reported by Coverity with CID 710652	2015-06-17 11:17:54 +09:00
Ralph Castain	ea35e47228	Fat SMPs (i.e., systems with nodes containing large numbers of cpus) were failing to start due to connection failures of the opal/pmix support. Root cause was that (a) we were setting the client socket to non-blocking before calling connect, and (b) the server was using the event library to harvest the accepts, and also did the handshake while in that event. So the server would backup beyond the connection backlog limit, and we would fail. Changing the client to leave its socket as blocking during the connect doesn't solve the problem by itself - you also have to introduce a sleep delay once the backlog is hit to avoid simply machine-gunning your way thru retries. This gets somewhat difficult to adjust as you don't want to unnecessarily prolong startup time. We've solved this before by adding a listening thread that simply reaps accepts and shoves them into the event library for subsequent processing. This would resolve the problem, but meant yet another daemon-level thread. So I centralized the listening thread support and let multiple elements register listeners on it. Thus, each daemon now has a single listening thread that reaps accepts from multiple sources - for now, the orte/pmix server and the oob/usock support are using it. I'll add in the oob/tcp component later. This still didn't fully resolve the SMP problem, especially on coprocessor cards (e.g., KNC). Removing the shared memory dstore support helped further improve the behavior - it looks like there is some kind of memory paging issue there that needs further understanding. Given that the shared memory support was about to be lost when I bring over the PMIx integration (until it is restored in that library), it seemed like a reasonable thing to just remove it at this point.	2015-05-29 14:37:14 -07:00
Ralph Castain	b5382c9bf9	Rework the OOB selection logic to allow a component (e.g., usock) to direct that it be the sole active component. Remove prior disqualifying code in the oob/tcp component as it was too restrictive - if usock wasn't able to run, it left apps with no way to communicate to their daemon. Have the local daemon check the global modex for the RML URI info of the local procs so it can route messages between them when tcp is the primary channel. A few other minor cleanups included.	2015-05-08 11:15:21 -07:00
Gilles Gouaillardet	2e384a3b65	initialize common symbols from orte A few uninitialized common symbols are remaining (generated by flex) : * orte/mca/rmaps/rank_file/rmaps_rank_file_lex.c: orte_rmaps_rank_file_leng * orte/mca/rmaps/rank_file/rmaps_rank_file_lex.c: orte_rmaps_rank_file_text * orte/util/hostfile/hostfile_lex.c: orte_util_hostfile_leng * orte/util/hostfile/hostfile_lex.c: orte_util_hostfile_text	2015-05-08 10:11:58 +09:00
Ralph Castain	1f8de276de	Consolidate all the QOS changes into one clean commit	2015-05-06 19:48:42 -07:00
Ralph Castain	7d1980ba83	Add the ability to specify the number of desired slots in the --host option. Just giving a host name => one slot (multiple copies of the name yield one slot per copy). Giving "foo:3" indicates you want three slots - a shorthand notation for saying "foo" three times. Giving "foo:*" indicates you want the topology to set the number of slots based on the orte_set_slots param.	2015-04-30 20:35:23 -07:00
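The slot rules in the --host commit above can be sketched in a few lines. This is an illustrative Python sketch, not ORTE's C implementation: `count_host_slots` and the `TOPOLOGY` marker are hypothetical names, and summing the counts when a host appears in more than one entry is an assumption.

```python
# Hypothetical marker meaning "let the topology decide" (the "foo:*" form).
TOPOLOGY = "*"


def count_host_slots(host_arg):
    """Turn a comma-separated --host value into {hostname: slots}.

    Rules from the commit message:
      "foo"    -> one slot per occurrence of the name
      "foo:3"  -> three slots, shorthand for listing "foo" three times
      "foo:*"  -> slot count is deferred to the topology
    """
    slots = {}
    for entry in host_arg.split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, sep, count = entry.partition(":")
        if sep and count == "*":
            slots[name] = TOPOLOGY
            continue
        if slots.get(name) == TOPOLOGY:
            # Assumption: an earlier "name:*" wins over later explicit counts.
            continue
        add = int(count) if sep else 1
        if add < 1:
            raise ValueError("slot count must be positive: %r" % entry)
        slots[name] = slots.get(name, 0) + add
    return slots
```

For example, `count_host_slots("foo,foo,bar:2")` gives two slots on `foo` and two on `bar`.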
Ralph Castain	a013f3059f	For scalability reasons, and to make life easier for the poor Cray-ites, don't bang on the system for the username - we'll just use the uid.	2015-03-19 21:24:13 -07:00
Ralph Castain	b01e8c1063	Include the FQDN version and non-stripped version of the hostname in our list of aliases as these (plus localhost) are the most common aliases we see.	2015-03-17 06:26:26 -07:00
Ralph Castain	a0487e014c	Further reduce the RARP load by removing getaddrinfo for IPv6 connections. Correct typo when checking return on inet_pton. Don't consider the TCP component for apps that are launched via mpirun as it will never be used.	2015-03-16 19:42:05 -07:00
Ralph Castain	5ae42c816e	Attempt to reduce the RARP traffic during definition of allocations	2015-03-16 16:26:40 -07:00
Gilles Gouaillardet	89806c6261	orte/util: fix memory leaks as reported by Coverity with CIDs 70845, 71855, 710652, 1196738, 1196739, 1196757, 1196758, 1269863 and 1269883	2015-03-05 14:06:18 +09:00
Ralph Castain	332e4fa7aa	Minor fix - relative host name syntax cannot support usernames as you can't know which hosts will be selected	2015-02-24 12:15:28 -08:00
Ralph Castain	f7c28ea706	Fix bad test - opal_buffer and opal_ptr can support NULL locations	2015-02-17 21:46:23 -08:00
Ralph Castain	c1282d5b99	The opal_buffer type also generates its own alloc, so need to let it pass thru the check	2015-02-17 21:06:19 -08:00
Ralph Castain	624b16e070	Protect the unload attribute function	2015-02-17 14:21:23 -08:00
Ralph Castain	78245e8a33	Continue massaging of the notifier framework. Convert it to an event-driven interface. Add the ability to report job state if requested. Cleanup object declarations.	2015-02-17 12:51:11 -08:00
Gilles Gouaillardet	b762766969	orte/util: fix misc memory leaks as reported by Coverity with CIDs 70314, 710653-710657 and 1196741-1196744	2015-02-17 12:27:23 +09:00
Ralph Castain	46fb850bb0	Continue adding support for options on orte-submit - still need to shift some of the MCA params to job object attributes	2015-02-10 13:56:14 -08:00
Ralph Castain	2b0b012460	Continue refinement of the DVM operations. Send the spawn request to the right place (it helps) as it isn't a comm_spawn request and has to be treated a little differently. Ensure IO gets forwarded back to the tool. Ensure the tool outputs show_help locally as there is no place to send it.	2015-02-04 06:21:54 -08:00
Howard Pritchard	1e94d84ae6	orte/util: minor improvement to show_help Make sure the show help gives it a good try to print an error message locally if the send_buffer_nb method returns an error.	2015-01-23 13:54:03 -08:00
Ralph Castain	9d5135e6cd	Function definition should use the correct type	2014-12-09 01:04:31 -08:00
Ralph Castain	b1bf557024	Fix the hostfile parser so it correctly ignores binding directives that are just integers. Fix the create_dmns function so we don't hang if we can't get an error before creating the job map for an application.	2014-12-05 15:47:09 -08:00
Ralph Castain	c88f181efe	Fix singleton comm-spawn, yet again. The new grpcomm collectives require a complete knowledge of every active proc in the system in case they participate in a collective. So ensure we pass the required job info when we spawn new daemons, and construct the necessary connections to allow grpcomm to operate.	2014-12-03 18:11:17 -08:00
Ralph Castain	3f9d9ae8b6	Provide tighter LSF integration by correctly handling scenarios where the user has asked LSF to assign bindings. Fix a couple of typos in lex parser definitions. Tell hostfile parser to ignore binding designations in hostfiles. Add an attribute to indicate that cpusets were provided as physical cpu ids. Once validated, a version of this will be backported to the v1.8.4 release.	2014-11-30 11:50:31 -08:00
Ralph Castain	f48b9012cb	Some minor cleanup. We really don't need another peer error constant to indicate that a peer closed as we already have one for "connection failed", and that's all we really know. Update the orte constants to track their opal equivalents.	2014-11-25 08:02:29 -08:00
Ralph Castain	48f702827e	First part of memory leak cleanups from Gilles	2014-11-24 16:53:33 -08:00
rhc54	7c0273ecb3	Merge pull request #276 from teng-lin/master Fixed a bug that fails to parse hostname starting with numbers.	2014-11-19 16:39:00 -08:00
Teng Lin	07ff51f43f	Fixed a bug that fails to parse hostname starting with numbers. According to RFC 1123, hostnames that begin with numbers are valid.	2014-11-19 16:03:55 -08:00
Ralph Castain	bb91517349	Allow other layers to register their own print-attribute functions so we can maintain pretty-print capabilities as the attributes are extended.	2014-11-19 09:37:59 -08:00
Ralph Castain	37593b232d	Add a marker for the max attr value being used by ORTE so that other, higher-levels can also use the attribute system	2014-11-19 09:37:59 -08:00
Gilles Gouaillardet	84b21d726e	orte/util: add OPAL_{VPID,JOBID} types to orte_attr_{load,unload}	2014-11-14 15:55:25 +09:00
Ralph Castain	780c93ee57	Per the PR and discussion on today's telecon, extend the process name definition as a two-field struct of uint32_t's down to the OPAL layer. This resolves issues created by prior commits that impacted both heterogeneous and SPARC support. This also simplifies the OMPI code base by removing the need for frequent memcpy's when transitioning between the OMPI/ORTE layers and OPAL. We recognize that this means other users of OPAL will need to "wrap" the opal_process_name_t if they desire to abstract it in some fashion. This is regrettable, and we are looking at possible alternatives that might mitigate that requirement. Meantime, however, we have to put the needs of the OMPI community first, and are taking this step to restore hetero and SPARC support.	2014-11-11 17:00:42 -08:00
Elena	03fc809bc9	This commit contains new dstore component sm which is used for communication between pmix server and clients at the same node via shared memory.	2014-11-06 16:01:19 +02:00
Ralph Castain	526682e2f9	Add the ability for a tool that requests spawn of a job to also request forwarding of all output to the tool. The tool is responsible for its own call to push its stdin to the new job. The push request can come -after- the job is started, but the pull request has to be done during the spawn procedure or else output can be lost.	2014-10-23 08:16:49 -07:00
Jeff Squyres	c22e1ae33b	configury: new OPAL_SET_LIB_PREFIX/ORTE_SET_LIB_PREFIX macros These two macros set the prefix for the OPAL and ORTE libraries, respectively. Specifically, the OPAL library will be named libPREFIXopen-pal.la and the ORTE library will be named libPREFIXopen-rte.la. These macros must be called, even if the prefix argument is empty. The intent is that Open MPI will call these macros with an empty prefix, but other projects (such as ORCM) will call these macros with a non-empty prefix. For example, ORCM libraries can be named liborcm-open-pal.la and liborcm-open-rte.la. This scheme is necessary to allow running Open MPI applications under systems that use their own versions of ORTE and OPAL. For example, when running MPI applications under ORTE, if the ORTE and OPAL libraries between OMPI and ORCM are not identical (which, because they are released at different times, are likely to be different), we need to ensure that the OMPI applications link against their ORTE and OPAL libraries, but the ORCM executables link against their ORTE and OPAL libraries.	2014-10-22 10:32:19 -07:00
Jeff Squyres	01fd96bfa5	Revert "Provide a mechanism by which an upstream project can rename the OPAL and ORTE libraries. This is required by projects such as ORCM that have their own ORTE and OPAL libraries in order to avoid library confusion. By renaming their version of the libraries, the OMPI applications can correctly dynamically load the correct one for their build." This reverts commit `63f619f871`.	2014-10-22 10:32:11 -07:00
Ralph Castain	63f619f871	Provide a mechanism by which an upstream project can rename the OPAL and ORTE libraries. This is required by projects such as ORCM that have their own ORTE and OPAL libraries in order to avoid library confusion. By renaming their version of the libraries, the OMPI applications can correctly dynamically load the correct one for their build.	2014-10-10 11:39:08 -07:00
Gilles Gouaillardet	63209eac5b	orte/util: use ORTE_JOB_FAMILY and ORTE_LOCAL_JOBID macros This commit was SVN r32688.	2014-09-09 05:13:00 +00:00
Ralph Castain	2bfb18e004	Resolve some race conditions when async pmix modex modes are invoked. Since calls to "get" data can come both locally and remotely before data for a given proc has actually been received, we have to track all requests that cannot be immediately fulfilled and provide the data once it has been received. This commit was SVN r32664.	2014-09-02 20:04:17 +00:00
Ralph Castain	cb0739dfd4	Update the regex to resolve a bug This commit was SVN r32647.	2014-08-29 22:24:20 +00:00
Ralph Castain	5a13cdb739	Fix a race condition caused by a bad attribute flag that created an OR instead of an AND condition check This commit was SVN r32587.	2014-08-22 22:48:16 +00:00
Ralph Castain	aec5cd08bd	Per the PMIx RFC: WHAT: Merge the PMIx branch into the devel repo, creating a new OPAL “pmix” framework to abstract PMI support for all RTEs. Replace the ORTE daemon-level collectives with a new PMIx server and update the ORTE grpcomm framework to support server-to-server collectives WHY: We’ve had problems dealing with variations in PMI implementations, and need to extend the existing PMI definitions to meet exascale requirements. WHEN: Mon, Aug 25 WHERE: https://github.com/rhc54/ompi-svn-mirror.git Several community members have been working on a refactoring of the current PMI support within OMPI. Although the APIs are common, Slurm and Cray implement a different range of capabilities, and package them differently. For example, Cray provides an integrated PMI-1/2 library, while Slurm separates the two and requires the user to specify the one to be used at runtime. In addition, several bugs in the Slurm implementations have caused problems requiring extra coding. All this has led to a slew of #if’s in the PMI code and bugs when the corner-case logic for one implementation accidentally traps the other. Extending this support to other implementations would have increased this complexity to an unacceptable level. Accordingly, we have: * created a new OPAL “pmix” framework to abstract the PMI support, with separate components for Cray, Slurm PMI-1, and Slurm PMI-2 implementations. * Replaced the current ORTE grpcomm daemon-based collective operation with an integrated PMIx server, and updated the grpcomm APIs to provide more flexible, multi-algorithm support for collective operations. At this time, only the xcast and allgather operations are supported. * Replaced the current global collective id with a signature based on the names of the participating procs. 
This allows an unlimited number of collectives to be executed by any group of processes, subject to the requirement that only one collective can be active at a time for a unique combination of procs. Note that a proc can be involved in any number of simultaneous collectives - it is the specific combination of procs that is subject to the constraint * removed the prior OMPI/OPAL modex code * added new macros for executing modex send/recv to simplify use of the new APIs. The send macros allow the caller to specify whether or not the BTL supports async modex operations - if so, then the non-blocking “fence” operation is used, if the active PMIx component supports it. Otherwise, the default is a full blocking modex exchange as we currently perform. * retained the current flag that directs us to use a blocking fence operation, but only to retrieve data upon demand This commit was SVN r32570.	2014-08-21 18:56:47 +00:00
Gilles Gouaillardet	f24699623f	check-help-strings cleanup This commit was SVN r32495.	2014-08-11 03:25:22 +00:00
Ralph Castain	42c5073aa3	Safely cleanup the opal_proc_t structure for non-MPI procs. This commit was SVN r32402.	2014-08-01 16:38:49 +00:00
Ralph Castain	7758528d72	Apparently, someone else is destructing the opal_proc_t, so don't destruct it ourselves This commit was SVN r32400.	2014-08-01 14:54:22 +00:00
Ralph Castain	daeb9b6c4f	Some more cleanups. Remove direct references to ORTE by changing OMPI_CAST_ORTE_NAME -> OMPI_CAST_RTE_NAME. Ensure that ORTE tools (mpirun, orted, tools) set the OPAL proc structure fields so OPAL knows what is going on and uses the correct print functions (still need to fix the problem for non-MPI apps). Properly return uint32_t from the opal utilities instead of int32_t as that is what the ORTE process name fields contain. Thanks to Gilles for pointing out some of the discrepancies. This commit was SVN r32398.	2014-08-01 14:44:11 +00:00
Ralph Castain	6c5e592785	Revert r32222, r32210, and r32203 as they created a problem when daemon collectives did not involve app procs on every node. Instead, modify the ompi/mca/rte/orte/rte_orte.h to add a new function that allows apps to request new daemon collective ids for use in barrier and modex operations. This will only appear in ORTE-based installations, but it is only being used by a couple of researchers at the moment. Update the orte/test/mpi/coll_test.c test to show the revised example. This commit was SVN r32234. The following SVN revision numbers were found above: r32203 --> open-mpi/ompi@a523dba41d r32210 --> open-mpi/ompi@2ce11ed5c4 r32222 --> open-mpi/ompi@d55f16db50	2014-07-15 03:48:00 +00:00
Ralph Castain	1feaffbb15	Get the blasted singleton comm_spawn working again. There remain problems with the Slurm interaction in this use-case as the PMI components (if configured to build) try to run even when a Slurm allocation hasn't been made, but I leave that to someone else to resolve. I did, however, tell the Slurm ess to quit interfering with applications launched in this use-case by ORTE daemons, so things do work when inside a Slurm allocation. Also discovered that the rsh launcher is not picking up --enable-orterun-prefix-by-default when invoked during singleton comm_spawn, but I was unable to see why that was happening and ran out of time. cmr=v1.8.2:reviewer=rhc This commit was SVN r32229.	2014-07-13 14:47:22 +00:00
Ralph Castain	a523dba41d	NOTE: this modifies the MPI-RTE interface. We have been getting several requests for new collectives that need to be inserted in various places of the MPI layer, all in support of either checkpoint/restart or various research efforts. Until now, this would require that the collective id's be generated at launch, which required modifications to ORTE and other places. We chose not to make collectives reusable as the race conditions associated with resetting collective counters are daunting. This commit extends the collective system to allow self-generation of collective id's that the daemons need to support, thereby allowing developers to request any number of collectives for their work. There is one restriction: RTE collectives must occur at the process level - i.e., we don't currently have a way of tagging the collective to a specific thread. From the comment in the code: In order to allow scalable generation of collective id's, they are formed as: (1) top 32-bits are the jobid of the procs involved in the collective. For collectives across multiple jobs (e.g., in a connect_accept), the daemon jobid will be used as the id will be issued by mpirun. This won't cause problems because daemons don't use the collective_id; (2) bottom 32-bits are a rolling counter that recycles when the max is hit. The daemon will cleanup each collective upon completion, so this means a job can never have more than 2^32 collectives going on at a time. If someone needs more than that - they've got a problem. Note that this means (for now) that RTE-level collectives cannot be done by individual threads - they must be done at the overall process level. This is required as there is no guaranteed ordering for the collective id's, and all the participants must agree on the id of the collective they are executing. So if thread A on one process asks for a collective id before thread B does, but B asks before A on another process, the collectives will be mixed and not result in the expected behavior. We may find a way to relax this requirement in the future by adding a thread context id to the jobid field (maybe taking the lower 16-bits of that field). This commit includes a test program (orte/test/mpi/coll_test.c) that cycles 100 times across barrier and modex collectives. This commit was SVN r32203.	2014-07-10 18:53:12 +00:00
Ralph Castain	356e7ea904	Move all collective id's into the attributes and let the job pack/unpack take care of them instead of singling them out. Add the envars just prior to forking the children instead of into the launch message itself. Remove a few #if CR as the attributes functionality can handle this condition now. This commit was SVN r32133.	2014-07-03 15:58:13 +00:00
Adrian Reber	47b118c0ae	fix FT compilation. This commit was SVN r32094.	2014-06-26 03:40:07 +00:00
Adrian Reber	72f1c7941f	use a consistent naming scheme for the SNAPSHOT attributes. This commit was SVN r32083.	2014-06-25 15:26:24 +00:00
Ralph Castain	34e5573988	Resolve the MTT timeout problem. This appears to have largely been caused by missing sigchld notifications, thus causing the daemons to believe that not all procs had exited. Let comm failure also serve as notification of process termination, and add appropriate flags/attributes to avoid multiple reporting of proc termination. This won't transition cleanly to the 1.8 series, and may represent too much change, so we'll have to (a) evaluate whether or not to bring it over (once it demonstrates that it does indeed solve the problem), and (b) develop a custom patch for that purpose. Refs trac:4717 This commit was SVN r32063. The following Trac tickets were found above: Ticket 4717 --> https://svn.open-mpi.org/trac/ompi/ticket/4717	2014-06-21 17:09:02 +00:00
Ralph Castain	42bf7466fc	This isn't as big a change as it appears - a change in one place caused a whole bunch of files to require updated #include's due to some arcane linkage. Rework the orte_wait code to reflect the introduction of the state machine. If we are in cleanup mode and just want to kill all our local children, then there is no reason to be polite about it as that introduces very long delays at scale. Just kill the procs and move on. Refs trac:4717 This commit was SVN r32019. The following Trac tickets were found above: Ticket 4717 --> https://svn.open-mpi.org/trac/ompi/ticket/4717	2014-06-17 17:57:51 +00:00
Gilles Gouaillardet	d26ac02b4a	#if OPAL_HAVE_HWLOC protect access to orte_proc_info_t.cpuset. Fix a bug when trunk is configured with --without-hwloc. v1.8 is safe so no cmr. This commit was SVN r31957.	2014-06-06 07:25:39 +00:00
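
Note on commit a523dba41d: its message describes collective ids as a 64-bit value with the jobid in the top 32 bits and a rolling counter in the bottom 32 bits. The following is a minimal Python sketch of that layout only. ORTE itself is written in C, and every name here (`make_collective_id`, `split_collective_id`, `CollectiveIdGenerator`) is hypothetical and chosen for illustration, not taken from the ORTE source.

```python
from typing import Tuple

# Hypothetical constant: both halves of the id are 32 bits wide.
MASK_32 = (1 << 32) - 1


def make_collective_id(jobid: int, counter: int) -> int:
    """Pack a 32-bit jobid (top half) and a 32-bit counter (bottom half)."""
    return ((jobid & MASK_32) << 32) | (counter & MASK_32)


def split_collective_id(coll_id: int) -> Tuple[int, int]:
    """Recover (jobid, counter) from a packed 64-bit collective id."""
    return (coll_id >> 32) & MASK_32, coll_id & MASK_32


class CollectiveIdGenerator:
    """Hands out ids for one job; the counter recycles when the max is hit."""

    def __init__(self, jobid: int) -> None:
        self.jobid = jobid
        self._counter = 0

    def next_id(self) -> int:
        coll_id = make_collective_id(self.jobid, self._counter)
        # Roll over to 0 after 2**32 - 1, as the commit message describes.
        self._counter = (self._counter + 1) & MASK_32
        return coll_id
```

Because each process only increments a local counter, all participants agree on an id only if they request ids in the same order. That is why the commit message restricts RTE-level collectives to the process level rather than individual threads.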


607 commits