openmpi

Автор	SHA1	Сообщение	Дата
Gilles Gouaillardet	9c77c6b66d	fortran: fix f08 bindings only define the unique fortran symbol depending on - CAPS - PLAIN - SINGLE_UNDERSCORE - DOUBLE_UNDERSCORE and bind the f08 symbol to the uniquely defined C symbol. Use real data structures to make the code simpler. (perl script written by Jeff)	2015-07-27 16:28:57 +09:00
Ralph Castain	869041f770	Purge whitespace from the repo	2015-06-23 20:59:57 -07:00
Gilles Gouaillardet	9d56b85b55	initialize common symbols from ompi	2015-05-08 10:11:58 +09:00
Nadezhda Kogteva	116169c38a	opal timing: added ability to choose the timer type	2015-04-17 11:15:55 +03:00
Nathan Hjelm	a7b0c00ab6	fix memory leaks and valgrind errors This commit fixes several vagrind errors. Included: - installdirs did not correctly reinitialize all pointers to NULL at close. This causes valgrind errors on a subsequent call to opal_init_tool. - several opal strings were leaked by opal_deregister_params which was setting them to NULL instead of letting them be freed by the MCA variable system. - move opal_net_init to AFTER the variable system is initialized and opal's MCA variables have been registered. opal_net_init uses a variable registered by opal_register_params! - do not leak ompi_mpi_main_thread when it is allocated by MPI_T_init_thread. - do not overwrite ompi_mpi_main_thread if it is already set (by MPI_T_init_thread). - mca_base_var: read_files was overwritting mca_base_var_file_list even if it was non-NULL. - mca_base_var: set all file global variables to initial states on finalize. - btl/vader: decrement enumerator reference count to ensure that it is freed. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-04-11 09:28:35 -06:00
Artem Polyakov	01601f3284	Merge pull request #305 from artpol84/timing Timing framework improvement	2014-12-16 15:13:48 +06:00
George Bosilca	3430714989	Correctly propagate the requested level of thread support during the component init calls.	2014-12-13 02:36:21 -05:00
Artem Polyakov	8ffad75a0a	Introduce timing interval measurement facility in timing framework	2014-12-10 16:47:49 +06:00
Ralph Castain	06e49d0e92	Per contribution from Pascal Deveze of Bull: move opal_set_using_threads earlier in MPI_Init (before datatype init) so the value gets set in time to be properly used.	2014-12-09 00:37:57 -08:00
Ralph Castain	780c93ee57	Per the PR and discussion on today's telecon, extend the process name definition as a two-field struct of uint32_t's down to the OPAL layer. This resolves issues created by prior commits that impacted both heterogeneous and SPARC support. This also simplifies the OMPI code base by removing the need for frequent memcpy's when transitioning between the OMPI/ORTE layers and OPAL. We recognize that this means other users of OPAL will need to "wrap" the opal_process_name_t if they desire to abstract it in some fashion. This is regrettable, and we are looking at possible alternatives that might mitigate that requirement. Meantime, however, we have to put the needs of the OMPI community first, and are taking this step to restore hetero and SPARC support.	2014-11-11 17:00:42 -08:00
Ralph Castain	fd6a044b7f	Cleanup some cruft resulting from the move of the btl's to opal. We had created the ability to delay modex operations, which included a need to delay retrieving hostname info for remote procs. This allowed us to not retrieve the modex info until first message unless required - the hostname is generally only required for debug and error messages. Properly setup the opal_process_info structure early in the initialization procedure. Define the local hostname right at the beginning of opal_init so all parts of opal can use it. Overlay that during orte_init as the user may choose to remove fqdn and strip prefixes during that time. Setup the job_session_dir and other such info immediately when it becomes available during orte_init.	2014-10-03 16:02:57 -06:00
Ralph Castain	dfb952fa78	[Contribution from Artem - moved it to svn from git for him] Replace our old, clunky timing setup with a much nicer one that is only available if configured with --enable-timing. Add a tool for profiling clock differences between the nodes so you can get more precise timing measurements. I'll ask Artem to update the Github wiki with full instructions on how to use this setup. This commit was SVN r32738.	2014-09-15 18:00:46 +00:00
Ralph Castain	aec5cd08bd	Per the PMIx RFC: WHAT: Merge the PMIx branch into the devel repo, creating a new OPAL “lmix” framework to abstract PMI support for all RTEs. Replace the ORTE daemon-level collectives with a new PMIx server and update the ORTE grpcomm framework to support server-to-server collectives WHY: We’ve had problems dealing with variations in PMI implementations, and need to extend the existing PMI definitions to meet exascale requirements. WHEN: Mon, Aug 25 WHERE: https://github.com/rhc54/ompi-svn-mirror.git Several community members have been working on a refactoring of the current PMI support within OMPI. Although the APIs are common, Slurm and Cray implement a different range of capabilities, and package them differently. For example, Cray provides an integrated PMI-1/2 library, while Slurm separates the two and requires the user to specify the one to be used at runtime. In addition, several bugs in the Slurm implementations have caused problems requiring extra coding. All this has led to a slew of #if’s in the PMI code and bugs when the corner-case logic for one implementation accidentally traps the other. Extending this support to other implementations would have increased this complexity to an unacceptable level. Accordingly, we have: * created a new OPAL “pmix” framework to abstract the PMI support, with separate components for Cray, Slurm PMI-1, and Slurm PMI-2 implementations. * Replaced the current ORTE grpcomm daemon-based collective operation with an integrated PMIx server, and updated the grpcomm APIs to provide more flexible, multi-algorithm support for collective operations. At this time, only the xcast and allgather operations are supported. * Replaced the current global collective id with a signature based on the names of the participating procs. The allows an unlimited number of collectives to be executed by any group of processes, subject to the requirement that only one collective can be active at a time for a unique combination of procs. Note that a proc can be involved in any number of simultaneous collectives - it is the specific combination of procs that is subject to the constraint * removed the prior OMPI/OPAL modex code * added new macros for executing modex send/recv to simplify use of the new APIs. The send macros allow the caller to specify whether or not the BTL supports async modex operations - if so, then the non-blocking “fence” operation is used, if the active PMIx component supports it. Otherwise, the default is a full blocking modex exchange as we currently perform. * retained the current flag that directs us to use a blocking fence operation, but only to retrieve data upon demand This commit was SVN r32570.	2014-08-21 18:56:47 +00:00
Jeff Squyres	132375f07f	helpfiles: fix filenames referenced by calls to show_help() This commit was SVN r32453.	2014-08-08 13:34:15 +00:00
Ralph Castain	daeb9b6c4f	Some more cleanups. Remove direct references to ORTE by changing OMPI_CAST_ORTE_NAME -> OMPI_CAST_RTE_NAME. Ensure that ORTE tools (mpirun, orted, tools) set the OPAL proc structure fields so OPAL knows what is going on and uses the correct print functions (still need to fix the problem for non-MPI apps). Properly return uint32_t from the opal utilities instead of int32_t as that is what the ORTE process name fields contain. Thanks to Gilles for pointing out some of the discrepancies. This commit was SVN r32398.	2014-08-01 14:44:11 +00:00
George Bosilca	9b2fcd898e	No more ORTE specifics in this file. This commit was SVN r32384.	2014-07-31 22:34:16 +00:00
George Bosilca	f39abb9e69	Reverting r32355: a number of processes is not a notion that a low level communication library should use to initialize itself. Ralph will champion this change back with an RFC if there is a realistic need/use case from the community. This commit was SVN r32361. The following SVN revision numbers were found above: r32355 --> open-mpi/ompi@c903917f47	2014-07-30 20:11:35 +00:00
Ralph Castain	c903917f47	Expose the num_procs information to the opal layer as the info is needed in several BTLs This commit was SVN r32355.	2014-07-30 09:33:41 +00:00
Ralph Castain	bcade48e27	Move the opal_process_info initialization to the right place - all that info is known (for ourselves only) immediately after rte_init. This commit was SVN r32329.	2014-07-28 19:19:35 +00:00
George Bosilca	a3feb627cf	Move some of the ompi_process_info down in OPAL. This commit was SVN r32324.	2014-07-26 21:43:34 +00:00
Ralph Castain	552c9ca5a0	George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-) WHAT: Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have express interest in having a more generic communication infrastructure, without all the OMPI layer dependencies. This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP. Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose. UTK with support from Sandia, developped a version of Open MPI where the entire communication infrastucture has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to completing this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs. A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic. This commit was SVN r32317.	2014-07-26 00:47:28 +00:00
Ralph Castain	6c5e592785	Revert r32222, r32210, and r32203 as they created a problem when daemon collectives did not involve app procs on every node. Instead, modify the ompi/mca/rte/orte/rte_orte.h to add a new function that allows apps to request new daemon collective ids for use in barrier and modex operations. This will only appear in ORTE-based installations, but it is only being used by a couple of researchers at the moment. Update the orte/test/mpi/coll_test.c test to show the revised example. This commit was SVN r32234. The following SVN revision numbers were found above: r32203 --> open-mpi/ompi@a523dba41d r32210 --> open-mpi/ompi@2ce11ed5c4 r32222 --> open-mpi/ompi@d55f16db50	2014-07-15 03:48:00 +00:00
Ralph Castain	a523dba41d	NOTE: this modifies the MPI-RTE interface We have been getting several requests for new collectives that need to be inserted in various places of the MPI layer, all in support of either checkpoint/restart or various research efforts. Until now, this would require that the collective id's be generated at launch. which required modification s to ORTE and other places. We chose not to make collectives reusable as the race conditions associated with resetting collective counters are daunti ng. This commit extends the collective system to allow self-generation of collective id's that the daemons need to support, thereby allowing developers to request any number of collectives for their work. There is one restriction: RTE collectives must occur at the process level - i.e., we don't curren tly have a way of tagging the collective to a specific thread. From the comment in the code: * In order to allow scalable * generation of collective id's, they are formed as: * * top 32-bits are the jobid of the procs involved in * the collective. For collectives across multiple jobs * (e.g., in a connect_accept), the daemon jobid will * be used as the id will be issued by mpirun. This * won't cause problems because daemons don't use the * collective_id * * bottom 32-bits are a rolling counter that recycles * when the max is hit. The daemon will cleanup each * collective upon completion, so this means a job can * never have more than 2*32 collectives going on at a time. If someone needs more than that - they've got * a problem. * * Note that this means (for now) that RTE-level collectives * cannot be done by individual threads - they must be * done at the overall process level. This is required as * there is no guaranteed ordering for the collective id's, * and all the participants must agree on the id of the * collective they are executing. So if thread A on one * process asks for a collective id before thread B does, * but B asks before A on another process, the collectives will * be mixed and not result in the expected behavior. We may * find a way to relax this requirement in the future by * adding a thread context id to the jobid field (maybe taking the * lower 16-bits of that field). This commit includes a test program (orte/test/mpi/coll_test.c) that cycles 100 times across barrier and modex collectives. This commit was SVN r32203.	2014-07-10 18:53:12 +00:00
Ralph Castain	f3cb124e50	Revert r32082 and r32070 - the developer's conference has decided to go a different direction on the threaded progress effort. This will involve some degree of prototyping to understand the tradeoffs prior to making a final design decision, and so we'll hold off on the final change until that is completed. This commit was SVN r32089. The following SVN revision numbers were found above: r32070 --> open-mpi/ompi@12d92d0c22 r32082 --> open-mpi/ompi@aa6438ef7a	2014-06-25 20:43:28 +00:00
Ralph Castain	12d92d0c22	Per the OMPI developer conference, remove the last vestiges of OMPI_USE_PROGRESS_THREADS This commit was SVN r32070.	2014-06-24 17:05:11 +00:00
Ralph Castain	248a4b100f	Per Artem, we don't know our VPID at the time of getting the initial timing mark, so just get it if timing is requested This commit was SVN r31951.	2014-06-04 16:28:41 +00:00
George Bosilca	750c6c7861	Update the UTK copyright on the topology related files. This commit was SVN r31805.	2014-05-16 22:23:52 +00:00
Ralph Castain	ab4f8585b0	When we abort during MPI_Init, we currently emit a totally incorrect error message stating that we were unable to aggregate error messages and cannot guarantee all other processes were killed. This simply isn't true IF the rte has been initialized. So track that the rte has reached that point, and only emit the new message if it is accurate. Note that we still generate a TON of output for a minor error: Ralphs-iMac:examples rhc$ mpirun -n 3 -mca btl sm ./hello_c -------------------------------------------------------------------------- At least one pair of MPI processes are unable to reach each other for MPI communications. This means that no Open MPI device has indicated that it can be used to communicate between these processes. This is an error; Open MPI requires that all MPI processes be able to reach each other. This error can sometimes be the result of forgetting to specify the "self" BTL. Process 1 ([[50239,1],2]) is on host: Ralphs-iMac Process 2 ([[50239,1],2]) is on host: Ralphs-iMac BTLs attempted: sm Your MPI job is now going to abort; sorry. -------------------------------------------------------------------------- * An error occurred in MPI_Init * on a NULL communicator * MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, * and potentially your MPI job) * An error occurred in MPI_Init * on a NULL communicator * MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, * and potentially your MPI job) * An error occurred in MPI_Init * on a NULL communicator * MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, * and potentially your MPI job) -------------------------------------------------------------------------- MPI_INIT has failed because at least one MPI process is unreachable from another. This usually means that an underlying communication plugin -- such as a BTL or an MTL -- has either not loaded or not allowed itself to be used. Your MPI job will now abort. You may wish to try to narrow down the problem; * Check the output of ompi_info to see which BTL/MTL plugins are available. * Run your application with MPI_THREAD_SINGLE. * Set the MCA parameter btl_base_verbose to 100 (or mtl_base_verbose, if using MTL-based communications) to see exactly which communication plugins were considered and/or discarded. -------------------------------------------------------------------------- ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- -------------------------------------------------------------------------- mpirun detected that one or more processes exited with non-zero status, thus causing the job to be terminated. The first process to do so was: Process name: [[50239,1],2] Exit code: 1 -------------------------------------------------------------------------- [Ralphs-iMac.local:23227] 2 more processes have sent help message help-mca-bml-r2.txt / unreachable proc [Ralphs-iMac.local:23227] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages [Ralphs-iMac.local:23227] 2 more processes have sent help message help-mpi-runtime / mpi_init:startup:pml-add-procs-fail Ralphs-iMac:examples rhc$ Hopefully, we can agree on a way to reduce this verbage! This commit was SVN r31686. The following SVN revision numbers were found above: r2 --> open-mpi/ompi@58fdc18855	2014-05-08 15:48:16 +00:00
Nathan Hjelm	595a6e94e6	Fix typos in r31260 Also added some missing values and sentinels. cmr=v1.8:ticket=trac:4470 This commit was SVN r31263. The following SVN revision numbers were found above: r31260 --> open-mpi/ompi@69036437b7 The following Trac tickets were found above: Ticket 4470 --> https://svn.open-mpi.org/trac/ompi/ticket/4470	2014-03-27 22:34:28 +00:00
George Bosilca	bde9619386	Various minor cleanups. This commit was SVN r30431.	2014-01-26 17:27:12 +00:00
Mike Dubman	2af0f878bc	remove bml_init call, called from btl add_proc. Refs trac:3763 This commit was SVN r30310. The following Trac tickets were found above: Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763	2014-01-17 16:52:20 +00:00
Mike Dubman	b7750ccbf4	OSHMEM: bml initialization is moved into ompi_init it fixes race of mca_var segfault in finalization of shmem based on this thread: http://www.open-mpi.org/community/lists/devel/2014/01/13778.php Refs trac:3763 fixed by Igor, reviewed by Brian This commit was SVN r30304. The following Trac tickets were found above: Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763	2014-01-17 06:09:29 +00:00
Ralph Castain	e7710873a1	Open/close the RTE framework cmr=v1.7.4:reviewer=hjelmn This commit was SVN r30270.	2014-01-13 17:43:24 +00:00
Nathan Hjelm	c17b21b11d	Due to MPI_Comm_idup we can no longer use the communicator's CID as the fortran handle. Use a seperate opal_pointer_array to keep track of the fortran handles of communicators. This commit also fixes a bug in ompi_comm_idup where the newcomm was not set until after the operation completed. cmr=v1.7.4:reviewer=jsquyres:ticket=trac:3796 This commit was SVN r29342. The following Trac tickets were found above: Ticket 3796 --> https://svn.open-mpi.org/trac/ompi/ticket/3796	2013-10-03 01:11:28 +00:00
Nathan Hjelm	195c892a9d	Add support for MPI_Comm_dup_with_info, MPI_Comm_create_group, and MPI_Comm_idup. As part of this work I implemented a basic request scheduler in ompi/comm/comm_request.c. This scheduler might be useful for more than just communicator requests and could be moved to ompi/request if there is a demand. Otherwise I will leave it where it is. Added a non-blocking version of ompi_comm_set to support ompi_comm_idup. The call makes a recursive call to comm_dup and a non-blocking version was needed. To simplify the code the blocking version calls the nonblocking version and waits on the resulting request if one exists. cmr=v1.7.4:reviewer=jsquyres:ticket=trac:3796 This commit was SVN r29334. The following Trac tickets were found above: Ticket 3796 --> https://svn.open-mpi.org/trac/ompi/ticket/3796	2013-10-02 14:26:40 +00:00
George Bosilca	2aed876be6	Fix the bug where MPI_is_thread_main returns true for all threads when not in MPI_THREAD_MULTIPLE mode. Thanks to Lisandro Dalcin for pointing it out. This commit was SVN r29113.	2013-09-04 11:10:51 +00:00
Ralph Castain	a200e4f865	As per the RFC, bring in the ORTE async progress code and the rewrite of OOB: * THIS RFC INCLUDES A MINOR CHANGE TO THE MPI-RTE INTERFACE * Note: during the course of this work, it was necessary to completely separate the MPI and RTE progress engines. There were multiple places in the MPI layer where ORTE_WAIT_FOR_COMPLETION was being used. A new OMPI_WAIT_FOR_COMPLETION macro was created (defined in ompi/mca/rte/rte.h) that simply cycles across opal_progress until the provided flag becomes false. Places where the MPI layer blocked waiting for RTE to complete an event have been modified to use this macro. *************************************************************************************** I am reissuing this RFC because of the time that has passed since its original release. Since its initial release and review, I have debugged it further to ensure it fully supports tests like loop_spawn. It therefore seems ready for merge back to the trunk. Given its prior review, I have set the timeout for one week. The code is in https://bitbucket.org/rhc/ompi-oob2 WHAT: Rewrite of ORTE OOB WHY: Support asynchronous progress and a host of other features WHEN: Wed, August 21 SYNOPSIS: The current OOB has served us well, but a number of limitations have been identified over the years. Specifically: * it is only progressed when called via opal_progress, which can lead to hangs or recursive calls into libevent (which is not supported by that code) * we've had issues when multiple NICs are available as the code doesn't "shift" messages between transports - thus, all nodes had to be available via the same TCP interface. * the OOB "unloads" incoming opal_buffer_t objects during the transmission, thus preventing use of OBJ_RETAIN in the code when repeatedly sending the same message to multiple recipients * there is no failover mechanism across NICs - if the selected NIC (or its attached switch) fails, we are forced to abort * only one transport (i.e., component) can be "active" The revised OOB resolves these problems: * async progress is used for all application processes, with the progress thread blocking in the event library * each available TCP NIC is supported by its own TCP module. The ability to asynchronously progress each module independently is provided, but not enabled by default (a runtime MCA parameter turns it "on") * multi-address TCP NICs (e.g., a NIC with both an IPv4 and IPv6 address, or with virtual interfaces) are supported - reachability is determined by comparing the contact info for a peer against all addresses within the range covered by the address/mask pairs for the NIC. * a message that arrives on one TCP NIC is automatically shifted to whatever NIC that is connected to the next "hop" if that peer cannot be reached by the incoming NIC. If no TCP module will reach the peer, then the OOB attempts to send the message via all other available components - if none can reach the peer, then an "error" is reported back to the RML, which then calls the errmgr for instructions. * opal_buffer_t now conforms to standard object rules re OBJ_RETAIN as we no longer "unload" the incoming object * NIC failure is reported to the TCP component, which then tries to resend the message across any other available TCP NIC. If that doesn't work, then the message is given back to the OOB base to try using other components. If all that fails, then the error is reported to the RML, which reports to the errmgr for instructions * obviously from the above, multiple OOB components (e.g., TCP and UD) can be active in parallel * the matching code has been moved to the RML (and out of the OOB/TCP component) so it is independent of transport * routing is done by the individual OOB modules (as opposed to the RML). Thus, both routed and non-routed transports can simultaneously be active * all blocking send/recv APIs have been removed. Everything operates asynchronously. KNOWN LIMITATIONS: * although provision is made for component failover as described above, the code for doing so has not been fully implemented yet. At the moment, if all connections for a given peer fail, the errmgr is notified of a "lost connection", which by default results in termination of the job if it was a lifeline * the IPv6 code is present and compiles, but is not complete. Since the current IPv6 support in the OOB doesn't work anyway, I don't consider this a blocker * routing is performed at the individual module level, yet the active routed component is selected on a global basis. We probably should update that to reflect that different transports may need/choose to route in different ways * obviously, not every error path has been tested nor necessarily covered * determining abnormal termination is more challenging than in the old code as we now potentially have multiple ways of connecting to a process. Ideally, we would declare "connection failed" when all transports can no longer reach the process, but that requires some additional (possibly complex) code. For now, the code replicates the old behavior only somewhat modified - i.e., if a module sees its connection fail, it checks to see if it is a lifeline. If so, it notifies the errmgr that the lifeline is lost - otherwise, it notifies the errmgr that a non-lifeline connection was lost. * reachability is determined solely on the basis of a shared subnet address/mask - more sophisticated algorithms (e.g., the one used in the tcp btl) are required to handle routing via gateways * the RML needs to assign sequence numbers to each message on a per-peer basis. The receiving RML will then deliver messages in order, thus preventing out-of-order messaging in the case where messages travel across different transports or a message needs to be redirected/resent due to failure of a NIC This commit was SVN r29058.	2013-08-22 16:37:40 +00:00
Ralph Castain	e4e678e234	Per the RFC and discussion on the devel list, update the RTE-MPI error handling interface. There are a few differences in the code from the original RFC that came out of the discussion - I've captured those in the following writeup George and I were talking about ORTE's error handling the other day in regards to the right way to deal with errors in the updated OOB. Specifically, it seemed a bad idea for a library such as ORTE to be aborting the job on its own prerogative. If we lose a connection or cannot send a message, then we really should just report it upwards and let the application and/or upper layers decide what to do about it. The current code base only allows a single error callback to exist, which seemed unduly limiting. So, based on the conversation, I've modified the errmgr interface to provide a mechanism for registering any number of error handlers (this replaces the current "set_fault_callback" API). When an error occurs, these handlers will be called in order until one responds that the error has been "resolved" - i.e., no further action is required - by returning OMPI_SUCCESS. The default MPI layer error handler is specified to go "last" and calls mpi_abort, so the current "abort" behavior is preserved unless other error handlers are registered. In the register_callback function, I provide an "order" param so you can specify "this callback must come first" or "this callback must come last". Seemed to me that we will probably have different code areas registering callbacks, and one might require it go first (the default "abort" will always require it go last). So you can append and prepend, or go first. Note that only one registration can declare itself "first" or "last", and since the default "abort" callback automatically takes "last", that one isn't available. :-) The errhandler callback function passes an opal_pointer_array of structs, each of which contains the name of the proc involved (which can be yourself for internal errors) and the error code. This is a change from the current fault callback which returned an opal_pointer_array of just process names. Rationale is that you might need to see the cause of the error to decide what action to take. I realize that isn't a requirement for remote procs, but remember that we will use the SAME interface to report RTE errors internal to the proc itself. In those cases, you really do need to see the error code. It is legal to pass a NULL for the pointer array (e.g., when reporting an internal failure without error code), so handlers must be prepared for that possibility. If people find that too burdensome, we can remove it. Should we ever decide to create a separate callback path for internal errors vs remote process failures, or if we decide to do something different based on experience, then we can adjust this API. This commit was SVN r28852.	2013-07-19 01:08:53 +00:00
George Bosilca	5fae72b9aa	Add the MPI 2.2 MPI_Dist_graph functionality. This patch reshape the way we deal with topologies completely. Where our topologies were mainly storage components (they were not capable of creating the new communicator), the new version is built around a [possibly] common representation (in mca/topo/topo.h), but the functions to attach and retrieve the topological information are specific to each component. As a result the ompi_create_cart and ompi_create_graph functions become useless and have been removed. In addition to adding the internal infrastructure to manage the topology information, it updates the MPI interface, and the debuggers support and provides all Fortran interfaces. This commit was SVN r28687.	2013-07-01 12:40:08 +00:00
Ralph Castain	5d7a93c032	Add the ability to use an external version of libevent. Clearly not recommended at this time. I've verified that it works in limited scenarios, but more thorough testing and performance impacts need to be assessed. Interesting how many includes had to be fixed here and there to fill in missing dependencies :-) This commit was SVN r28411.	2013-04-29 17:02:37 +00:00
Nathan Hjelm	bccf8c657a	Per RFC add initial support for the MPI 3.0 tools interface. Current MPI_T support: - Full cvar interface. - Full categories interface. - No pvar support at this time. This commit was SVN r28376.	2013-04-24 15:59:23 +00:00
Nathan Hjelm	17315bf360	Now that the entire codebase has been updated to use the MCA framework system remove the last calls to the MCA parameter system. This commit was SVN r28242.	2013-03-27 21:17:53 +00:00
Nathan Hjelm	9d4a26f47d	Update OMPI frameworks to use the MCA framework system. Notes: - This commit also eliminates the need for an available components list in use in several frameworks. None of the code in question was making use of the priority field of the priority component list item so these extra lists were removed. - Cleaned up selection code in several frameworks to sort lists using opal_list_sort. - Cleans up the ompi/orte-info functions. Expose the functions that construct the list of params so they can be used elsewhere. patches for mtl/portals4 from brian missed a few output variables in openib This commit was SVN r28241.	2013-03-27 21:17:31 +00:00
Nathan Hjelm	cf377db823	MCA/base: Add new MCA variable system Features: - Support for an override parameter file (openmpi-mca-param-override.conf). Variable values in this file can not be overridden by any file or environment value. - Support for boolean, unsigned, and unsigned long long variables. - Support for true/false values. - Support for enumerations on integer variables. - Support for MPIT scope, verbosity, and binding. - Support for command line source. - Support for setting variable source via the environment using OMPI_MCA_SOURCE_<var name>=source (either command or file:filename) - Cleaner API. - Support for variable groups (equivalent to MPIT categories). Notes: - Variables must be created with a backing store (char *, int , or bool *) that must live at least as long as the variable. - Creating a variable with the MCA_BASE_VAR_FLAG_SETTABLE enables the use of mca_base_var_set_value() to change the value. - String values are duplicated when the variable is registered. It is up to the caller to free the original value if necessary. The new value will be freed by the mca_base_var system and must not be freed by the user. - Variables with constant scope may not be settable. - Variable groups (and all associated variables) are deregistered when the component is closed or the component repository item is freed. This prevents a segmentation fault from accessing a variable after its component is unloaded. - After some discussion we decided we should remove the automatic registration of component priority variables. Few component actually made use of this feature. - The enumerator interface was updated to be general enough to handle future uses of the interface. - The code to generate ompi_info output has been moved into the MCA variable system. See mca_base_var_dump(). opal: update core and components to mca_base_var system orte: update core and components to mca_base_var system ompi: update core and components to mca_base_var system This commit also modifies the rmaps framework. The following variables were moved from ppr and lama: rmaps_base_pernode, rmaps_base_n_pernode, rmaps_base_n_persocket. Both lama and ppr create synonyms for these variables. This commit was SVN r28236.	2013-03-27 21:09:41 +00:00
Brian Barrett	6c3f986d79	* Fix issue with duplicate symbol for the initialize hook due to it existing in both libmpi and libopen-pal by removing the one for libopen-pal. This won't work if we eventually need registration caching in opal/orte, but I'm hoping that by that point, OFED will have gotten off its butt and properly integrated ummunotify into the verbs layer so that this code can go away. At the same time, fix a minor issue where the init hook was being called twice, once by the libc malloc and once by our malloc by removing the call from our malloc. This commit was SVN r28202.	2013-03-21 23:05:54 +00:00
Brian Barrett	fc2b3b8d46	Ugh. Work around an issue with memory hooks and the change from one big library to multiple libraries that are implicitly sucked into the executable as a dependency of libmpi. The initialize hook isn't visible to libc on some linux distributions when it's in libopal and libopal isn't explicity linked into the executable. The fix is to have a duplicate initialize hook in libmpi as well as libopal. sigh. This commit was SVN r28164.	2013-03-11 19:22:24 +00:00
Brian Barrett	312f37706e	In talking about this with Jeff and Ralph, we don't actually need ompi_show_help, because opal_show_help is replaced with an aggregating version when using ORTE, so there's no reason to directly call orte_show_help. This commit was SVN r28051.	2013-02-12 21:10:11 +00:00
Brian Barrett	f42783ae1a	Move the RTE framework change into the trunk. With this change, all non-CR runtime code goes through one of the rte, dpm, or pubsub frameworks. This commit was SVN r27934.	2013-01-27 23:25:10 +00:00
Brian Barrett	fc3df11e08	Remove the (only two) fortran constants from OPAL. The only places that actually care if opal_pointer_array is limited to handle_max already passes that in as the max_size during init, so don't need it there. The arch constant was a bit more difficult, so pass that in during MPI init and leave empty otherwise. This is to help with the effort to allow building ompi against an external opal or orte. This commit was SVN r27817.	2013-01-15 01:27:36 +00:00
Jeff Squyres	a7ea880d0a	Refs trac:3309 * Minor man page tweaks * Use existing ompi_mpi_thread_requested global This commit was SVN r27308. The following Trac tickets were found above: Ticket 3309 --> https://svn.open-mpi.org/trac/ompi/ticket/3309	2012-09-11 21:12:06 +00:00
Ralph Castain	fb4af5e29c	Implement the rest of MPI-3 ticket #313 based on side-bar agreement with MPICH2 folks. Fix a bug in the original ompi_info code that put the NULL terminator one position too far if the returned string exceeded MPI_MAX_INFO_VAL in length in ompi_info_get. This commit was SVN r27292.	2012-09-11 17:03:49 +00:00
Jeff Squyres	7a4d6f05cf	It is valid to call MPI_Init(NULL, NULL), so be sure to handle the case where the passed-in argv is NULL. This commit was SVN r27029.	2012-08-14 16:33:53 +00:00
Ralph Castain	589acf550c	Improve the new MPI_INFO_ENV to better handle Java applications and to correctly report the info for singletons. This commit was SVN r27025.	2012-08-13 22:13:49 +00:00
Ralph Castain	cb48fd52d4	Implement the MPI_Info part of MPI-3 Ticket 313. Add an MPI_info object MPI_INFO_GET_ENV that contains a number of run-time related pieces of info. This includes all the required ones in the ticket, plus a few that specifically address recent user questions: "num_app_ctx" - the number of app_contexts in the job "first_rank" - the MPI rank of the first process in each app_context "np" - the number of procs in each app_context Still need clarification on the MPI_Init portion of the ticket. Specifically, does the ticket call for returning an error is someone calls MPI_Init more than once in a program? We set a flag to tell us that we have been initialized, but currently never check it. This commit was SVN r27005.	2012-08-12 01:28:23 +00:00
Jeff Squyres	7390ab8a23	Many updates and bug fixes for the Fortran bindings. Sorry these aren't separated out into individual commits; they represent a few months of work in the Mercurial branch, and it seemed error-prone to try to break them up into multiple SVN commits. * Remove 2nd overloaded interfaces for MPI_TESTALL, MPI_TESTSOME, MPI_WAITALL, and MPI_WAITSOME in the "mpi" module implementations (because we're not allowed to have them, anyway -- it causes complications in the profiling interface). This forced an MPI-2.2 errata in the MPI Forum; we applied the errata here (the array of statuses parameter could not have a specific dimension specified in the dummy argument). Fixes trac:3166. * Similarly, fix type for MPI_ARGVS_NULL in Fortran * Add MPI_3.0 function MPI_F_SYNC_REG (Fortran interfaces only). * Add MPI-3.0 MPI_MESSAGE_NO_PROC in the mpi_f08 module. * Added mpi_f08 handle comparison operators, per MPI-3.0 addendum to the F08 proposal at the last Forum meeting. * Added missing type(MPI_File) and type(Message) in mpi_f08 module. * Fix --disable-mpi-io configure switch with all Fortran interfaces * Re-factor the Fortran header files to be fundamentally simpler and easier to maintain. Fortran constant values in the header files are now generated by a script named mpif-values.pl during autogen.pl (they were previously generated by mpif-common.pl, but it was quite a bit more subtle/complex). A second commit will follow this one to update svn:ignore values (just to ensure we don't muck up the first commit with the SVN client getting confused by the changed ignore values and new/changed files). * Fix some dependencies for compile ordering in ompi/mpi/fortran/use-mpi-ignore-tkr/Makefile.am. * Fix bad wording in several places (.m4 file name, ompi_info output, etc.): we previoulsy said "F08 assumed shape" when we really meant "F08 assumed rank" (for Fortran gurus, those are very different things). * Removed the GREEK/SVN version string from mpif.h. It really had no purpose being there. Still to be done: * Handling of 2D array of strings in MPI_COMM_SPAWN_MULTIPLE still isn't right yet. Not sure how many people really care about this :-), but it is still broken. This commit was SVN r26997. The following Trac tickets were found above: Ticket 3166 --> https://svn.open-mpi.org/trac/ompi/ticket/3166	2012-08-10 21:19:47 +00:00
Josh Hursey	28681deffa	Backout the ORCA commit. :( There is a linking issue on Mac OSX that needs to be addressed before this is able to come back into the trunk. This commit was SVN r26676.	2012-06-27 01:28:28 +00:00
Josh Hursey	542330e3a7	Commit of ORCA: Open MPI Runtime Collaborative Abstraction This is a runtime interposition project that sits between the OMPI and ORTE layers in Open MPI. The project is described on the wiki: https://svn.open-mpi.org/trac/ompi/wiki/Runtime_Interposition And on this email thread: http://www.open-mpi.org/community/lists/devel/2012/06/11109.php This commit was SVN r26670.	2012-06-26 21:42:16 +00:00
Ralph Castain	e6f3586415	Remove the orte notifier framework, per discussion at the devel meeting and follow-up with Jeff (who took the action item) This commit was SVN r26637.	2012-06-22 18:09:23 +00:00
Jeff Squyres	2ba10c37fe	Per RFC, bring in the following changes: * Remove paffinity, maffinity, and carto frameworks -- they've been wholly replaced by hwloc. * Move ompi_mpi_init() affinity-setting/checking code down to ORTE. * Update sm, smcuda, wv, and openib components to no longer use carto. Instead, use hwloc data. There are still optimizations possible in the sm/smcuda BTLs (i.e., making multiple mpools). Also, the old carto-based code found out how many NUMA nodes were ''available'' -- not how many were used ''in this job''. The new hwloc-using code computes the same value -- it was not updated to calculate how many NUMA nodes are used ''by this job.'' * Note that I cannot compile the smcuda and wv BTLs -- I ''think'' they're right, but they need to be verified by their owners. * The openib component now does a bunch of stuff to figure out where "near" OpenFabrics devices are. '''THIS IS A CHANGE IN DEFAULT BEHAVIOR!!''' and still needs to be verified by OpenFabrics vendors (I do not have a NUMA machine with an OpenFabrics device that is a non-uniform distance from multiple different NUMA nodes). * Completely rewrite the OMPI_Affinity_str() routine from the "affinity" mpiext extension. This extension now understands hyperthreads; the output format of it has changed a bit to reflect this new information. * Bunches of minor changes around the code base to update names/types from maffinity/paffinity-based names to hwloc-based names. * Add some helper functions into the hwloc base, mainly having to do with the fact that we have the hwloc data reporting ''all'' topology information, but sometimes you really only want the (online \| available) data. This commit was SVN r26391.	2012-05-07 14:52:54 +00:00
Jeff Squyres	253444c6d0	== Highlights == 1. New mpifort wrapper compiler: you can utilize mpif.h, use mpi, and use mpi_f08 through this one wrapper compiler 1. mpif77 and mpif90 still exist, but are sym links to mpifort and may be removed in a future release 1. The mpi module has been re-implemented and is significantly "mo' bettah" 1. The mpi_f08 module offers many, many improvements over mpif.h and the mpi module This stuff is coming from a VERY long-lived mercurial branch (3 years!); it'll almost certainly take a few SVN commits and a bunch of testing before I get it correctly committed to the SVN trunk. == More details == Craig Rasmussen and I have been working with the MPI-3 Fortran WG and Fortran J3 committees for a long, long time to make a prototype MPI-3 Fortran bindings implementation. We think we're at a stable enough state to bring this stuff back to the trunk, with the goal of including it in OMPI v1.7. Special thanks go out to everyone who has been incredibly patient and helpful to us in this journey: * Rolf Rabenseifner/HLRS (mastermind/genius behind the entire MPI-3 Fortran effort) * The Fortran J3 committee * Tobias Burnus/gfortran * Tony !Goetz/Absoft * Terry !Donte/Oracle * ...and probably others whom I'm forgetting :-( There's still opportunities for optimization in the mpi_f08 implementation, but by and large, it is as far along as it can be until Fortran compilers start implementing the new F08 dimension(..) syntax. Note that gfortran is currently unsupported for the mpi_f08 module and the new mpi module. gfortran users will a) fall back to the same mpi module implementation that is in OMPI v1.5.x, and b) not get the new mpi_f08 module. The gfortran maintainers are actively working hard to add the necessary features to support both the new mpi_f08 module and the new mpi module implementations. This will take some time. As mentioned above, ompi/mpi/f77 and ompi/mpi/f90 no longer exist. All the fortran bindings implementations have been collated under ompi/mpi/fortran; each implementation has its own subdirectory: {{{ ompi/mpi/fortran/ base/ - glue code mpif-h/ - what used to be ompi/mpi/f77 use-mpi-tkr/ - what used to be ompi/mpi/f90 use-mpi-ignore-tkr/ - new mpi module implementation use-mpi-f08/ - new mpi_f08 module implementation }}} There's also a prototype 6-function-MPI implementation under use-mpi-f08-desc that emulates the new F08 dimension(..) syntax that isn't fully available in Fortran compilers yet. We did that to prove it to ourselves that it could be done once the compilers fully support it. This directory/implementation will likely eventually replace the use-mpi-f08 version. Other things that were done: * ompi_info grew a few new output fields to describe what level of Fortran support is included * Existing Fortran examples in examples/ were renamed; new mpi_f08 examples were added * The old Fortran MPI libraries were renamed: * libmpi_f77 -> libmpi_mpifh * libmpi_f90 -> libmpi_usempi * The configury for Fortran was consolidated and significantly slimmed down. Note that the F77 env variable is now IGNORED for configure; you should only use FC. Example: {{{ shell$ ./configure CC=icc CXX=icpc FC=ifort ... }}} All of this work was done in a Mercurial branch off the SVN trunk, and hosted at Bitbucket. This branch has got to be one of OMPI's longest-running branches. Its first commit was Tue Apr 07 23:01:46 2009 -0400 -- it's over 3 years old! :-) We think we've pulled in all relevant changes from the OMPI trunk (e.g., Fortran implementations of the new MPI-3 MPROBE stuff for mpif.h, use mpi, and use mpi_f08, and the recent Fujitsu Fortran patches). I anticipate some instability when we bring this stuff into the trunk, simply because it touches a LOT of code in the MPI layer in the OMPI code base. We'll try our best to make it as pain-free as possible, but please bear with us when it is committed. This commit was SVN r26283.	2012-04-18 15:57:29 +00:00
Ralph Castain	bd8b4f7f1e	Sorry for mid-day commit, but I had promised on the call to do this upon my return. Roll in the ORTE state machine. Remove last traces of opal_sos. Remove UTK epoch code. Please see the various emails about the state machine change for details. I'll send something out later with more info on the new arch. This commit was SVN r26242.	2012-04-06 14:23:13 +00:00
Ralph Castain	534d70025f	Cleanup the detection of process binding during mpi_init. There are several cases that need to be checked: 1. no binding support - indicated by a negative return code from get_cpubind 2. binding supported, but not bound - the bitset returned by get_cpubind is the same as the available cpuset 3. binding supported and bound - bitset from get_cpubind is a subset of available cpuset 4. only one cpu is available - in this case, get_cpubind matches the available cpuset, but we are effectively bound This commit was SVN r25957.	2012-02-17 21:18:53 +00:00
Brian Barrett	25d48e22fa	Implementation of the MPI-3 Matched Probe functionality. Currently only implemented in the OB1 PML, will return NOT_SUPPORTED in other PMLs. This commit was SVN r25865.	2012-02-06 17:35:21 +00:00
Ralph Castain	7b65af28c6	Correct ordering in MPI_Init so that we do the modex prior to attempting to bind ourselves in the direct launch case as the modex contains info required for self-binding. This commit was SVN r25742.	2012-01-19 18:38:58 +00:00
George Bosilca	e3a373335d	If threads are not supported but were required we will go for SERIALIZED, otherwise whatever was required. This commit was SVN r25653.	2011-12-15 02:23:21 +00:00
George Bosilca	61f273b987	Do not tolerate uninitialized variables. This commit was SVN r25489.	2011-11-18 10:19:24 +00:00
Ralph Castain	6310361532	At long last, the fabled revision to the affinity system has arrived. A more detailed explanation of how this all works will be presented here: https://svn.open-mpi.org/trac/ompi/wiki/ProcessPlacement The wiki page is incomplete at the moment, but I hope to complete it over the next few days. I will provide updates on the devel list. As the wiki page states, the default and most commonly used options remain unchanged (except as noted below). New, esoteric and complex options have been added, but unless you are a true masochist, you are unlikely to use many of them beyond perhaps an initial curiosity-motivated experimentation. In a nutshell, this commit revamps the map/rank/bind procedure to take into account topology info on the compute nodes. I have, for the most part, preserved the default behaviors, with three notable exceptions: 1. I have at long last bowed my head in submission to the system admin's of managed clusters. For years, they have complained about our default of allowing users to oversubscribe nodes - i.e., to run more processes on a node than allocated slots. Accordingly, I have modified the default behavior: if you are running off of hostfile/dash-host allocated nodes, then the default is to allow oversubscription. If you are running off of RM-allocated nodes, then the default is to NOT allow oversubscription. Flags to override these behaviors are provided, so this only affects the default behavior. 2. both cpus/rank and stride have been removed. The latter was demanded by those who didn't understand the purpose behind it - and I agreed as the users who requested it are no longer using it. The former was removed temporarily pending implementation. 3. vm launch is now the sole method for starting OMPI. It was just too darned hard to maintain multiple launch procedures - maybe someday, provided someone can demonstrate a reason to do so. As Jeff stated, it is impossible to fully test a change of this size. I have tested it on Linux and Mac, covering all the default and simple options, singletons, and comm_spawn. That said, I'm sure others will find problems, so I'll be watching MTT results until this stabilizes. This commit was SVN r25476.	2011-11-15 03:40:11 +00:00
Ralph Castain	b44f8d4b28	Complete implementation of the ess.proc_get_locality API. Up to this point, the API was only capable of telling if the specified proc was sharing a node with you. However, the returned value was capable of telling you much more detailed info - e.g., if the proc shares a socket, a cache, or numa node. We just didn't have the data to provide that detail. Use hwloc to obtain the cpuset for each process during mpi_init, and share that info in the modex. As it arrives, use a new opal_hwloc_base utility function to parse the value against the local proc's cpuset and determine where they overlap. Cache the value in the pmap object as it may be referenced multiple times. Thus, the return value from orte_ess.proc_get_locality is a 16-bit bitmask that describes the resources being shared with you. This bitmask can be tested using the macros in opal/mca/paffinity/paffinity.h Locality is available for all procs, whether launched via mpirun or directly with an external launcher such as slurm or aprun. This commit was SVN r25331.	2011-10-19 20:18:14 +00:00
George Bosilca	c453614f8b	A more meaningful name for this function (mpi_proc_complete_init instead of ompi_proc_set_arch). Change the comment to reflect the real behavior of the function. This commit was SVN r25312.	2011-10-18 02:54:38 +00:00
Jeff Squyres	951c745590	We always have hwloc xml support (now that it's built into to hwloc without needing libxml2). So OPAL_HAVE_HWLOC_XML is no longer necessary. This commit was SVN r25263.	2011-10-11 20:20:59 +00:00
George Bosilca	80c02647c8	Each level (OPAL/ORTE/OMPI) should only return it's own constants, instead of the current mismatch. This commit was SVN r25230.	2011-10-04 14:50:31 +00:00
Wesley Bland	5fde3e0e00	Move the resilient orte errmgr code into a seperate errmgr for now while it's still unstable. Reverted errmgr modules back to the original errmgr (with the updates since the resilient code was brought into the trunk). This commit was SVN r24958.	2011-07-28 21:24:34 +00:00
Wesley Bland	e1ba09ad51	Add a resilience to ORTE. Allows the runtime to continue after a process (or ORTED) failure. Note that more work will be necessary to allow the MPI layer to take advantage of this. Per RFC: http://www.open-mpi.org/community/lists/devel/2011/06/9299.php This commit was SVN r24815.	2011-06-23 20:38:02 +00:00
Brian Barrett	be8a126600	At Josh's request, make example MPI extension use the init/fini so that the feature is actually documented. This commit was SVN r24686.	2011-05-05 18:31:07 +00:00
Eugene Loh	2770a12beb	Continue clean up of thread options started in r22841, 22842, and 22849. No need for any CMRs to 1.5... that was already done in CMR 2728. This commit was SVN r24545. The following SVN revision numbers were found above: r22841 --> open-mpi/ompi@b400b84162	2011-03-18 21:36:35 +00:00
Jeff Squyres	79cf382ff3	Fix a few issues with error messages: * If something goes wrong during ompi_mpi_init, don't erroneously report that it is illegal to invoke MPI_INIT* before MPI_INIT * Aggregate help messages when possible when something goes wring during ompi_mpi_init This commit was SVN r24492.	2011-03-07 16:45:45 +00:00
Ralph Castain	83723037d5	Restore the reset of progress event counter to avoid over-sampling the event lib This commit was SVN r24114.	2010-11-30 17:15:08 +00:00
Brian Barrett	3ed00ba148	More fixes to make OMPI compile with minimal ORTE support again This commit was SVN r23962.	2010-10-27 20:40:39 +00:00
Ralph Castain	fceabb2498	Update libevent to the 2.0 series, currently at 2.0.7rc. We will update to their final release when it becomes available. Currently known errors exist in unused portions of the libevent code. This revision passes the IBM test suite on a Linux machine and on a standalone Mac. This is a fairly intrusive change, but outside of the moving of opal/event to opal/mca/event, the only changes involved (a) changing all calls to opal_event functions to reflect the new framework instead, and (b) ensuring that all opal_event_t objects are properly constructed since they are now true opal_objects. Note: Shiqing has just returned from vacation and has not yet had a chance to complete the Windows integration. Thus, this commit almost certainly breaks Windows support on the trunk. However, I want this to have a chance to soak for as long as possible before I become less available a week from today (going to be at a class for 5 days, and thus will only be sparingly available) so we can find and fix any problems. Biggest change is moving the libevent code from opal/event to a new opal/mca/event framework. This was done to make it much easier to update libevent in the future. New versions can be inserted as a new component and tested in parallel with the current version until validated, then we can remove the earlier version if we so choose. This is a statically built framework ala installdirs, so only one component will build at a time. There is no selection logic - the sole compiled component simply loads its function pointers into the opal_event struct. I have gone thru the code base and converted all the libevent calls I could find. However, I cannot compile nor test every environment. It is therefore quite likely that errors remain in the system. Please keep an eye open for two things: 1. compile-time errors: these will be obvious as calls to the old functions (e.g., opal_evtimer_new) must be replaced by the new framework APIs (e.g., opal_event.evtimer_new) 2. run-time errors: these will likely show up as segfaults due to missing constructors on opal_event_t objects. It appears that it became a typical practice for people to "init" an opal_event_t by simply using memset to zero it out. This will no longer work - you must either OBJ_NEW or OBJ_CONSTRUCT an opal_event_t. I tried to catch these cases, but may have missed some. Believe me, you'll know when you hit it. There is also the issue of the new libevent "no recursion" behavior. As I described on a recent email, we will have to discuss this and figure out what, if anything, we need to do. This commit was SVN r23925.	2010-10-24 18:35:54 +00:00
Brian Barrett	13c827dda8	Make trunk compile on Red Storm again This commit was SVN r23622.	2010-08-17 21:51:38 +00:00
Brian Barrett	6ae9790d19	* Add option of init/fini hooks for MPI extensions to be called at the end of MPI_INIT and start of MPI_FINALIZE. * Clean up MPI Extensions build system to acknowledge that OMPI's the only project with extensions, as well as remove some build artifacts necessary for more general components. This commit was SVN r23616.	2010-08-17 04:44:22 +00:00
Jeff Squyres	a8f69c9e3b	This is no longer necessary; the orte DPM does the necessary opal_progress_increment() and opal_progress_decrement(). This commit was SVN r23434.	2010-07-19 19:34:10 +00:00
Jeff Squyres	fec7918eea	Some paffinity functions had their return status overloaded: * If < 0, it's an OPAL_ERR_* value * If >= 0, it's the actual output value of the function This is problematic for the OPAL_SOS stuff. This commit changes those functions to always return OPAL_* statuses and send the output value back through output parameters (like 95% of the rest of the code base). This avoids the confusion with OPAL_SOS stuff and makes paffinity work again (e.g., mpirun --bind-to-core ...). I updated all paffinitiy modules for the new function signatures, and bumped the paffinity API version up to 2.0.1. I don't think the version change will matter, though, because we'll be introducing support for hardware threads soon, which will either bump the paffinity version again or we'll replace paffinity with a new framework. This commit was SVN r23197.	2010-05-21 16:55:28 +00:00
Abhishek Kulkarni	afbe3e99c6	* Wrap all the direct error-code checks of the form (OMPI_ERR_* == ret) with (OMPI_ERR_* = OPAL_SOS_GET_ERR_CODE(ret)), since the return value could be a SOS-encoded error. The OPAL_SOS_GET_ERR_CODE() takes in a SOS error and returns back the native error code. * Since OPAL_SUCCESS is preserved by SOS, also change all calls of the form (OPAL_ERROR == ret) to (OPAL_SUCCESS != ret). We thus avoid having to decode 'ret' to get the native error code. This commit was SVN r23162.	2010-05-17 23:08:56 +00:00
Ralph Castain	4d06125a33	Establish a method by which a process knows if it has been bound by mpirun. This helps resolve a problem where a process gets "bound" to all available resources, which looks to the opal paffinity system as "not bound". This can cause mpi_init to attempt to "bind" the process itself, causing unintended behavior. This commit was SVN r22985.	2010-04-17 01:58:26 +00:00
Ralph Castain	41428e6b61	Issue a warning if a requested binding operation results in processes being bound to all available processes, which is the equivalent of not being bound at all. See the following email thread for further details: http://www.open-mpi.org/community/lists/devel/2010/04/7745.php This commit was SVN r22984.	2010-04-17 01:02:41 +00:00
Ralph Castain	b400b84162	Merge in the modified thread configure option branch per today's telecon. Remove the --enable-progress-threads option as this is no longer functional, and hardcode OPAL_ENABLE_PROGRESS_THREADS to 0. Replace the --enable-mpi-threads option with --enable-mpi-thread-multiple as this is clearer as to meaning. This option automatically turns "on" opal thread support if it wasn't already so specified. If the user specifies --disable-opal-multi-threads --enable-mpi-thread-multiple, we will error out with a message Add a new --enable-opal-multi-threads option that turns "on" opal thread support without doing anything wrt mpi-thread-multiple This commit was SVN r22841.	2010-03-16 23:10:50 +00:00
Jeff Squyres	bb314911b3	If we get OMPI_ERR_UNREACH from the PML, print a slightly more specific error. Suggested by Nick Edmonds: http://www.open-mpi.org/community/lists/users/2010/03/12339.php This commit was SVN r22828.	2010-03-14 00:09:55 +00:00
Josh Hursey	e9b5162d79	Fix the configure logic for --with-ft so that it properly takes a comma separated list. Many of the OPAL_ENABLE_FT should be OPAL_ENABLE_FT_CR, so fix those. The OPAL Layer INC should call opal_output on restart so that it can refresh the string it prints to reflect the current pid/hostname which may have changed. This commit was SVN r22824.	2010-03-12 23:57:50 +00:00
Ralph Castain	40be3d896c	Ensure we set an error code when leaving, correctly check for slot_list_set return status This commit was SVN r22643.	2010-02-17 22:59:19 +00:00
Ralph Castain	ab5ceb3d5f	Ensure we return the error code when something fails. Thanks to Guillaume Thouvenin for finding it. This commit was SVN r22588.	2010-02-09 16:48:55 +00:00
Jeff Squyres	16b100219d	A patch from UTK to allow orte_init(), opal_init(), and associated friends also receive &argc and &argv (George asked Jeff to Ralph to review before committing). The thought is that passing argv and argc to opal/orte_init be useful to other projects outside of OMPI that are using OPAL and/or ORTE (especially in conjunction with some other bootstrapping code where it is helpful to modify argv). It's such a small thing that it's easy to apply here to make others' lives a little easier. Ask George for more details; I'm just the messenger. :-) Judging by the copyrights on this patch, it's been around for a while. :-) This commit was SVN r22260.	2009-12-04 00:51:15 +00:00
Shiqing Fan	fb4be6fad7	First step to enable DSO build on Windows. - add a cmake module for searching libltdl libraries and headers - a configure option to enable DSO build, default OFF. - update a few source files for including correct header and loading correct mca libraries path/suffix. This commit was SVN r21804.	2009-08-12 08:52:48 +00:00
George Bosilca	3e971e61f3	The system headers are supposed to be protected by #ifdef and not by #if. This commit was SVN r21700.	2009-07-16 18:27:33 +00:00
Rainer Keller	c971c09eb6	- Runtime and include files missed in last commit This commit was SVN r21642.	2009-07-13 04:59:13 +00:00
Edgar Gabriel	b6f292f794	add a uint8_t to the startup modex which allows us to recognize whether different processes have requested different levels of thread support. This verification is restricted to MPI_COMM_WORLD. In case one ore more processes have requested support for MPI_THREAD_MULTIPLE, the cid selection algorithm will fall back to the original, thread safe approach. Else, it uses the block-algorithm. For dynamic communicators, we always fall back now to the original algorithm. This has been tested for homogeneous and heterogeneous settings for MCW. However, I could not test yet the dynamic comm scenario for technical reasons, and that's why I don't close yet ticket 1949. This commit was SVN r21613.	2009-07-07 18:32:14 +00:00
Ralph Castain	e30826c6e1	Quiet some compiler warnings This commit was SVN r21591.	2009-07-02 17:48:36 +00:00
Ralph Castain	f832352b45	Clean up some compiler warnings This commit was SVN r21577.	2009-07-01 16:51:11 +00:00
Ralph Castain	c3c1ab1337	Correct a comment in paffinity.h about what paffinity_get returns - it was inaccurate. Revamp the affinity detection/set procedure in mpi_init to correctly detect when we have already been bound to processors, given the revised understanding of paffinity_get. Add a new paffinity macro to make checking for already bound a little nicer. This commit was SVN r21402.	2009-06-09 14:33:35 +00:00
Ralph Castain	d396f0a6fc	Per the discussion on the devel list, move the binding of processes to processors from MPI_Init to process start. This involves: 1. replacing mpi_paffinity_alone with opal_paffinity_alone - for back-compatibility, I have aliased mpi_paffinity_alone to the new param name. This caus es a mild abstraction break in the opal/mca/paffinity framework - per the devel discussion...live with it. :-) I also moved the ompi_xxx global variable that tracked maffinity setup so it could be properly closed in MPI_Finalize to the opal/mca/maffinity framework to avoid an abstraction break. 2. Added code to the odls/default module to perform paffinity binding and maffinity init between process fork and exec. This has been tested on IU's odi n cluster and works for both MPI and non-MPI apps. 3. Revise MPI_Init to detect if affinity has already been set, and to attempt to set it if not already done. I have not tested this as I haven't yet f igured out a way to do so - I couldn't get slurm to perform cpu bindings, even though it supposedly does do so. This has only been lightly tested and would definitely benefit from a wider range of evaluation... This commit was SVN r21209.	2009-05-12 02:18:35 +00:00

1 2 3 4 5 ...

280 Коммитов