openmpi

Author	SHA1	Message	Date
Gilles Gouaillardet	2dd345465f	ompi/communicator: optimize ompi_comm_split() set grp_local_rank as MPI_UNDEFINED before invoking ompi_comm_nextcid() in order to benefit from the optimizations introduced in open-mpi/ompi@68167ec879 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-12-28 15:34:04 +09:00
Gilles Gouaillardet	1ba4c185bc	ompi/communicator: destruct ompi_communicator_t's c_lock in the destructor Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-12-01 16:06:27 +09:00
Gilles Gouaillardet	f1778d2778	communicator: remove the USE_MUTEX_FOR_COMMS macro It should have always been #define'd in order to correctly handle the multi-threaded case. Also fix indentation in ompi/mpi/c/comm_get_errhandler.c Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-11-30 14:29:11 +09:00
Gilles Gouaillardet	7c3e675479	fix communicator's c_lock usage - initialize c_lock in the ompi_communicator_t constructor - USE_OPAL_THREAD_[UN]LOCK(c_lock) - #ifdef USE_MUTEX_FOR_COMMS protect c_lock access Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-11-30 14:27:59 +09:00
Ralph Castain	7ad6886a30	Add a new OMPI rte component to support direct-launch using PMIx. Cleanup several places where abstraction violations crept into OMPI layer (direct reference of ORTE). Add some missing includes that were exposed by this change. Note that this compiles, but I haven't tested it for execution yet. Handing it over to Noah Evans for completion Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-11-28 12:05:01 -08:00
Aurelien Bouteiller	3ef23f41a3	Bugfix a crash when a comm cannot be initialized Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>	2017-10-18 11:32:37 -04:00
Ralph Castain	d80b0c7990	If the HWLOC shared memory system is unable to connect, then fallback to providing the topology via XML. Do not automatically provide the XML to every process as that defeats the purpose of the shared memory system. Instead, use PMIx_Query_info_nb to get the info from the server when required. Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-08-22 18:12:26 -07:00
Gilles Gouaillardet	a3e31fa8d0	ompi/communicator: plug a memory leak in ompi_comm_init() Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-08-21 11:47:11 +09:00
Edgar Gabriel	92eff9050c	communicator/comm_init.c: add a new flag indicating binding policy Check for the binding policy used. We are only interested in whether mapby-node has been set right now (could be extended later) and only on MPI_COMM_WORLD, since for all other sub-communicators it is virtually impossible to identify their layout across nodes in the most generic sense. This is used by OMPIO for deciding which ranks to use for aggregators Signed-off-by: Edgar Gabriel <gabriel@cs.uh.edu>	2017-08-15 09:50:41 -05:00
KAWASHIMA Takahiro	3eac4b0c9a	communicator: Refine `ompi_comm_set` error check The `ompi_comm_set` function never sets `NULL` to its first argument `ncomm`. So `NULL` check is unnecessary in its callers. Furthermore, `NULL` check may obscure a real return code when an error occurs if the variable is initialized to a `NULL` value. Also, `NULL` check is added in the `ompi_comm_set` function to avoid segmentation fault in an out-of-memory condition. Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>	2017-07-31 20:26:51 +09:00
Nathan Hjelm	9b702fb9bd	ompi: clean up topo helper functions This commit removes the communicator topo helper functions in favor of functions in mca/topo/base. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2017-07-17 10:30:42 -05:00
Nathan Hjelm	db2204f2f3	ompi: add support for new communicator info assertions This commit adds code to allow support for the info assertions added by mpi-forum/mpi-issues#11. The assertions added are: mpi_assert_no_any_tag, mpi_assert_no_any_source, mpi_assert_exact_length, and mpi_assert_allow_overtaking. This commit also adds support for the mpi_assert_no_any_source and mpi_assert_allow_overtaking info keys to the ob1 pml. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2017-06-08 15:52:12 -06:00
Jeff Squyres	d520c24f3a	predefined MPI object padding: set to fixed number of bytes (#3634) Convert the predefined MPI object padding to a fixed number of bytes (vs. a multiple of sizeof(void*)) so that the padding is the same size between 32 and 64 bit builds. I.e., we won't have a situation where we've run out of padding in 32 bit builds but still have more space available in 64 bit builds. Fixes #3610 Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-06-01 15:28:23 -04:00
Mark Allen	482d84b6e5	fixes for Dave's get/set info code The expected sequence of events for processing info during object creation is that if there's an incoming info arg, it is opal_info_dup()ed into the obj at obj->s_info first. Then interested components register callbacks for keys they want to know about using opal_infosubscribe_infosubscribe(). Inside info_subscribe_subscribe() the specified callback() is called with whatever matching k/v is in the object's info, or with the default. The return string from the callback goes into the new k/v stored in info, and the input k/v is saved as __IN_<key>/<val>. It's saved the same way whether the input came from info or whether it was a default. A null return from the callback indicates an ignored key/val, and no k/v is stored for it, but an __IN_<key>/<val> is still kept so we still have access to the original. At MPI__set_info() time, opal_infosubscribe_change_info() is used. That function calls the registered callbacks for each item in the provided info. If the callback returns non-null, the info is updated with that k/v, or if the callback returns null, that key is deleted from info. An __IN_<key>/<val> is saved either way, and overwrites any previously saved value. When MPI__get_info() is called, opal_info_dup_mpistandard() is used, which allows relatively easy changes in interpretation of the standard, by looking at both the <key>/<val> and __IN_<key>/<val> in info. Right now it does 1. includes system extras, eg k/v defaults not explicitly set by the user 2. omits ignored keys 3. shows input values, not callback modifications, eg not the internal values Currently the callbacks are doing things like return some_condition ? "true" : "false" that is, returning static strings that are not to be freed. If the return strings start becoming more dynamic in the future I don't see how unallocated strings could support that, so I'd propose a change for the future that the callback()s registered with info_subscribe_subscribe() do a strdup on their return, and we change the callers of callback() to free the strings it returns (there are only two callers). Rough outline of the smaller changes spread over the less central files: comm.c initialize comm->super.s_info to NULL copy into comm->super.s_info in comm creation calls that provide info OBJ_RELEASE comm->super.s_info at free time comm_init.c initialize comm->super.s_info to NULL file.c copy into file->super.s_info if file creation provides info OBJ_RELEASE file->super.s_info at free time win.c copy into win->super.s_info if win creation provides info OBJ_RELEASE win->super.s_info at free time comm_get_info.c file_get_info.c win_get_info.c change_info() if there's no info attached (shouldn't happen if callbacks are registered) copy the info for the user The other category of change is generally addressing compiler warnings where ompi_info_t and opal_info_t were being used a little too interchangeably. An ompi_info_t* contains an opal_info_t*, at &(ompi_info->super) Also this commit updates the copyrights. Signed-off-by: Mark Allen <markalle@us.ibm.com>	2017-05-17 01:12:49 -04:00
David Solt	50aa143ab6	Major structural changes to data types: .super infosubscriber ompi_communicator_t, ompi_win_t, ompi_file_t all have a super class of type opal_infosubscriber_t instead of a base/super type of opal_object_t (in previous code comm used c_base, but file used super). It may be a bit bold to say that being a subscriber of MPI_Info is the foundational piece that ties these three things together, but if you object, then I would prefer to turn infosubscriber into a more general name that encompasses other common features rather than create a different super class. The key here is that we want to be able to pass comm, win and file objects as if they were opal_infosubscriber_t, so that one routine can handle all 3 types of objects being passed to it. MPI_INFO_NULL is still an ompi_predefined_info_t type since an MPI_Info is part of ompi but the internal details of the underlying information concept is part of opal. An ompi_info_t type still exists for exposure to the user, but it is simply a wrapper for the opal object. Routines such as ompi_info_dup, etc have all been moved to opal_info_dup and related to the opal directory. Fortran to C translation tables are only used for MPI_Info that is exposed to the application and are therefore part of the ompi_info_t and not the opal_info_t The data structure changes are primarily in the following files: communicator/communicator.h ompi/info/info.h ompi/win/win.h ompi/file/file.h The following new files were created: opal/util/info.h opal/util/info.c opal/util/info_subscriber.h opal/util/info_subscriber.c This infosubscriber concept is that communicators, files and windows can have subscribers that subscribe to any changes in the info associated with the comm/file/window. When xxx_set_info is called, the new info is presented to each subscriber who can modify the info in any way they want. The new value is presented to the next subscriber and so on until all subscribers have had a chance to modify the value. Therefore, the order of subscribers can make a difference but we hope that there is generally only one subscriber that cares or modifies any given key/value pair. The final info is then stored and returned by a call to xxx_get_info. The new model can be seen in the following files: ompi/mpi/c/comm_get_info.c ompi/mpi/c/comm_set_info.c ompi/mpi/c/file_get_info.c ompi/mpi/c/file_set_info.c ompi/mpi/c/win_get_info.c ompi/mpi/c/win_set_info.c The current subscribers were changed as follows: mca/io/ompio/io_ompio_file_open.c mca/io/ompio/io_ompio_module.c mca/osc/rdma/osc_rdma_component.c (This one actually subscribes to "no_locks") mca/osc/sm/osc_sm_component.c (This one actually subscribes to "blocking_fence" and "alloc_shared_contig") Signed-off-by: Mark Allen <markalle@us.ibm.com> Conflicts: AUTHORS ompi/communicator/comm.c ompi/debuggers/ompi_mpihandles_dll.c ompi/file/file.c ompi/file/file.h ompi/info/info.c ompi/mca/io/ompio/io_ompio.h ompi/mca/io/ompio/io_ompio_file_open.c ompi/mca/io/ompio/io_ompio_file_set_view.c ompi/mca/osc/pt2pt/osc_pt2pt.h ompi/mca/sharedfp/addproc/sharedfp_addproc.h ompi/mca/sharedfp/addproc/sharedfp_addproc_file_open.c ompi/mca/topo/treematch/topo_treematch_dist_graph_create.c ompi/mpi/c/lookup_name.c ompi/mpi/c/publish_name.c ompi/mpi/c/unpublish_name.c opal/mca/mpool/base/mpool_base_alloc.c opal/util/Makefile.am	2017-05-12 14:41:05 -04:00
Artem Polyakov	68167ec879	ompi/comm: Improve MPI_Comm_create algorithm Force only procs that are participating in the new Comm to decide what CID is appropriate. This will have 2 advantages: * Speedup Comm creation for small communicators: non-participating procs will not interfere * Reduce CID fragmentation: non-overlapping groups will be allowed to use same CID. Signed-off-by: Artem Polyakov <artpol84@gmail.com>	2017-04-21 08:33:29 +07:00
bosilca	872cf44c28	Improve the opal_pointer_array & more (#3369) * Complete rewrite of opal_pointer_array Instead of a cache oblivious linear search use a bits array to speed up the management of the free space. As a result we slightly increase the memory used by the structure, but we get a significant boost in performance. Signed-off-by: George Bosilca <bosilca@icl.utk.edu> * Do not register datatypes in the f2c translation table. The registration is now done up into the Fortran layer, by forcing a call to MPI_Type_c2f. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2017-04-18 21:41:26 -04:00
Noah Evans	ef29fb13cb	de-ORTEfy the ompi tree The ompi tree should be runtime independent, but over time a few ORTE dependent definitions and functions have escaped into the ompi tree. I'm working on my own runtime so I've used this as an opportunity to get rid of ORTE dependencies in the ompi/ tree. I still need to go back and change orte to conform to the new world and these changes are untested, but I can now compile (but not link) without orte so I'm committing this changeset. Signed-off-by: Noah Evans <noah.evans@gmail.com>	2017-04-07 12:35:58 -06:00
George Bosilca	366d64b7e5	Move the collective structure outside the communicator. As we changed the ABI (forcing a major release), we can limit the size of the predefined communicators by moving the collective structure outside the communicator. This might have a minimal, but unnoticeable, impact on performance. This approach has been discussed during the January 2017 devel meeting. Signed-off-by: George Bosilca <bosilca@icl.utk.edu> Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2017-02-27 11:54:17 -06:00
Sylvain Jeaugey	f827b6b8dd	Fix more typos using the allgather module for allreduce operations, causing a crash when CUDA collectives are enabled. Signed-off-by: Sylvain Jeaugey <sjeaugey@nvidia.com> Signed-off-by: Akshay Venkatesh <akvenkatesh@nvidia.com>	2017-02-24 16:35:29 -08:00
Joshua Hursey	78006f93a4	coll: Move reduce_local into the coll framework * Since we are adding a new function to `mca_coll_base_module_2_1_0_t` we need to increase the version of the module structure to `2_2_0`. * Add a comment just above the PREDEFINED_COMMUNICATOR_PAD describing its purpose and when it should change. To help future developers trying to answer the question noted in the comment. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2017-02-14 08:56:07 -06:00
Joshua Hursey	a2d45f6e9f	communicator: Fix uninitialized variable Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2017-01-24 16:46:13 -06:00
Joshua Gerrard	d5a45bc12e	Swapped use of fprintf for opal_output_verbose Signed-off-by: Joshua Gerrard <enquiries@joshuagerrard.com>	2016-12-01 19:56:06 +00:00
Joshua Gerrard	7cf5de12b9	Fixed -Werror=unused-result warnings in comm_cid.c by adding error checking Signed-off-by: Joshua Gerrard <enquiries@joshuagerrard.com>	2016-11-29 21:08:12 +00:00
Ralph Castain	1e2019ce2a	Revert "Update to sync with OMPI master and cleanup to build" This reverts commit `cb55c88a8b`.	2016-11-22 15:03:20 -08:00
Ralph Castain	cb55c88a8b	Update to sync with OMPI master and cleanup to build Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2016-11-22 14:24:54 -08:00
Ralph Castain	a2919174d0	Bring the RML modifications across. This is the first step in a revamp of the ORTE messaging subsystem to support fabric-based communications during launch and wireup phases. When completed, the grpcomm and plm frameworks will each have their own "conduit" for communication - each conduit corresponds to a particular RML messaging transport. This can be the active OOB-based component, or a provider from within the RML/OFI component. Messages sent down the conduit will flow across the associated transport. Multiple conduits can exist at the same time, and can even point to the same base transport. Each conduit can have its own characteristics (e.g., flow control) based on the info keys provided to the "open_conduit" call. For ease during the transition period, the "legacy" RML interfaces remain as wrappers over the new conduit-based APIs using a default conduit opened during orte_init - this default conduit is tied to the OOB framework so that current behaviors are preserved. Once the transition has been completed, a one-time cleanup will be done to update all RML calls to the new APIs and the "legacy" interfaces will be deleted. While we are at it: Remove oob/usock component to eliminate the TMPDIR length problem - get all working, including oob_stress	2016-10-11 16:01:02 -07:00
Gilles Gouaillardet	6c6e35bb40	ompi/communicator: silence warnings	2016-10-06 15:03:06 +09:00
Joshua Hursey	f6f24a4f67	build: Custom libmpi(_FOO) name option in configure * Add a configure time option to rename libmpi(_FOO).* - `--with-libmpi-name=STRING` * This commit only impacts the installed libraries. Internal, temporary libraries have not been renamed to limit the scope of the patch to only what is needed. For example: ```shell shell$ ./configure --with-libmpi-name=wookie ... shell$ find . -name "libmpi" shell$ find . -name "libwookie" ./lib/libwookie.so.0.0.0 ./lib/libwookie.so.0 ./lib/libwookie.so ./lib/libwookie.la ./lib/libwookie_mpifh.so.0.0.0 ./lib/libwookie_mpifh.so.0 ./lib/libwookie_mpifh.so ./lib/libwookie_mpifh.la ./lib/libwookie_usempi.so.0.0.0 ./lib/libwookie_usempi.so.0 ./lib/libwookie_usempi.so ./lib/libwookie_usempi.la shell$ ```	2016-09-29 21:47:24 -05:00
George Bosilca	803897a915	Correctly indent the code.	2016-09-21 07:46:53 -04:00
Gilles Gouaillardet	3b968ec6bb	ompi/communicator: fix typos in CID generation use MPI_MIN instead of MPI_MAX when appropriate, otherwise a currently used CID can be reused, and bad things will likely happen. Refs open-mpi/ompi#2061	2016-09-09 10:10:35 +09:00
Nathan Hjelm	54cc829aab	comm/cid: use ibcast to distribute result in intercomm case This commit updates the intercomm allgather to do a local comm bcast as the final step. This should resolve a hang seen in intercomm tests. Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2016-09-07 10:49:04 -06:00
Nathan Hjelm	f3e9a72f1a	Merge pull request #1987 from hjelmn/cid comm/cid: fix threaded CID allocation	2016-08-19 14:26:39 -06:00
Nathan Hjelm	fbbf743c36	comm/cid: fix threaded CID allocation This commit should restore the pre-non-blocking behavior of the CID allocator when threads are used. There are two primary changes: 1) do not hold the cid allocator lock past the end of a request callback, and 2) if a lower id communicator is detected during CID allocation back off and let the lower id communicator finish before continuing. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-08-19 11:47:19 -06:00
Sylvain Jeaugey	61e900eea5	Fix typo calling allreduce with the allgather module. That was causing CUDA collective to crash.	2016-08-17 17:05:13 -07:00
Ralph Castain	ba77d9beff	Remove forced debugs	2016-08-08 13:20:24 -07:00
Ralph Castain	cacb582ecd	Support timeout values when performing connect/accept operations. Bump default timeout to 10 minutes so folks have time to start the partnering application	2016-07-28 14:09:06 -07:00
Gilles Gouaillardet	bbc6d4b3d4	ompi/communicator: remove another debug print statement in ompi_comm_allreduce_intra_pmix_nb()	2016-07-22 15:42:56 +09:00
Ralph Castain	01a653d50a	Remove a debug print in comm_cid.c. Update PMIx2 to include the revised PMIx_Get logic for higher performance by reducing the number of hash table lookups. Fix a bug where requests for data from a proc in another nspace could hang, or result in "not found". Remove stale file reference Restore autogen pass thru pmix Remove generated file	2016-07-20 00:58:19 -07:00
Ralph Castain	36a9063466	Silence warnings	2016-07-19 17:36:13 -07:00
Nathan Hjelm	4c49c42dd0	ompi/comm: improve comm_split_type scalability This commit introduces a new algorithm for MPI_Comm_split_type. The old algorithm performed an allgather on the communicator to decide which processes were part of the new communicators. This does not scale well in either time or memory. The new algorithm performs a couple of all reductions to determine the global parameters of the MPI_Comm_split_type call. If any rank gives an inconsistent split_type (as defined by the standard) an error is returned without proceeding further. The algorithm then creates a communicator with all the ranks that match the split_type (no communication required) in the same order as the original communicator. It then does an allgather on the new communicator (which should be much smaller) to determine 1) if the new communicator is in the correct order, and 2) if any ranks in the new communicator supplied MPI_UNDEFINED as the split_type. If either of these conditions are detected the new communicator is split using ompi_comm_split and the intermediate communicator is freed. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-07-18 12:47:05 -06:00
Nathan Hjelm	035c2e2e2a	ompi/comm: refactor communicator cid code This commit simplifies the communicator context ID generation by removing the blocking code. The high level calls: ompi_comm_nextcid and ompi_comm_activate remain but now call the non-blocking variants and wait on the resulting request. This was done to remove the parallel paths for context ID generation in preparation for further improvements of the CID generation code. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-07-18 12:47:05 -06:00
George Bosilca	73972768f8	Remove an apparently useless function.	2016-07-05 18:30:11 +02:00
Nathan Hjelm	65be935676	comm/split_type: allow MPI_UNDEFINED for split_type It is valid for any rank to deviate on the split_type argument if they specify MPI_UNDEFINED. The code was incorrectly not allowing this condition. Changed the split_type uniformity check and allow local_size to be 0 if the local split_type is MPI_UNDEFINED. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-06-16 17:42:28 -06:00
bosilca	b90c83840f	Refactor the request completion (#1422) * Remodel the request. Added the wait sync primitive and integrate it into the PML and MTL infrastructure. The multi-threaded requests are now significantly less heavy and less noisy (only the threads associated with completed requests are signaled). * Fix the condition to release the request.	2016-05-24 18:20:51 -05:00
Nathan Hjelm	ae0ffbb67f	Merge pull request #1397 from hjelmn/enable_thread_multiple ompi: always enable MPI_THREAD_MULTIPLE support	2016-04-23 08:40:22 -06:00
Nathan Hjelm	a1420003b6	ompi/comm: clean up includes in comm_request.h Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-03-22 09:17:38 -06:00
Nathan Hjelm	230d04327e	ompi: always enable MPI_THREAD_MULTIPLE support This commit removes the --with-mpi-thread-multiple option and forces MPI_THREAD_MULTIPLE support. This cleans up an abstraction violation in opal where OMPI_ENABLE_THREAD_MULTIPLE determines whether the opal_using_threads is meaningful. To reduce the performance hit on MPI_THREAD_SINGLE programs an OPAL_UNLIKELY is used for the check on opal_using_threads in OPAL_THREAD_* macros. This commit does not clean up the arguments to the various functions that take whether multi-threading support is enabled. That should be done at a later time. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-02-23 10:02:14 -07:00
Gilles Gouaillardet	030a5f2054	sentinel: use type uintptr_t for sentinel MSB is now automatically cleared when right shifting Thanks George for pointing this	2016-02-10 11:28:56 +09:00
Artem Polyakov	2abb2972ac	Fix Mellanox copyrights with respect to the following PRs: * https://github.com/open-mpi/ompi/pull/1184 * https://github.com/open-mpi/ompi/pull/1188 * https://github.com/open-mpi/ompi/pull/1197 * https://github.com/open-mpi/ompi/pull/1202 * https://github.com/open-mpi/ompi/pull/1210 * https://github.com/open-mpi/ompi/pull/1216 * https://github.com/open-mpi/ompi/pull/1236 * https://github.com/open-mpi/ompi/pull/1237 * https://github.com/open-mpi/ompi/pull/1248 * https://github.com/open-mpi/ompi/pull/1260 * https://github.com/open-mpi/ompi/pull/1264	2015-12-30 00:12:19 +06:00


280 Commits