openmpi

Author	SHA1	Message	Date
Gilles Gouaillardet	638a59adf3	fix compilation in heterogeneous mode use OPAL_PMIX_GLOBAL instead of PMIX_GLOBAL	2015-09-11 09:23:21 +09:00
Nathan Hjelm	ad3a2ef6cc	silence warnings introduced by add_procs merge Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-09-10 16:33:52 -06:00
Ralph Castain	f94f3cda21	Fix the handling of cpusets so we get the correct cpuset for each local peer. Add the ability to indicate that a modex request is "optional" so we don't call the server if we don't find the value. Take advantage of that to allow the MPI layer to decide that the lack of locality info indicates non-local	2015-09-10 10:25:30 -07:00
Nathan Hjelm	ed005f2a61	ompi/dpm: improve scalability of ompi_dpm_mark_dyncomm This commit removes the use of ompi_group_peer_lookup in the ompi_dpm_mark_dyncomm function. The function now uses ompi_group_get_proc_name which does not allocate an ompi_proc_t if one does not already exist. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-09-10 10:50:58 -06:00
Nathan Hjelm	987e865c99	mtl/psm2: add support for dynamic add_procs Add an accessor for the proc_endpoints[OMPI_PROC_ENDPOINT_TAG_MTL] member of the ompi_proc_t structure. This accessor calls add_procs with the ompi_proc_t if the member is NULL. Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2015-09-10 08:55:55 -06:00
Nathan Hjelm	8df9b1d40d	mtl/psm: add support for dynamic add_procs Add an accessor for the proc_endpoints[OMPI_PROC_ENDPOINT_TAG_MTL] member of the ompi_proc_t structure. This accessor calls add_procs with the ompi_proc_t if the member is NULL. Tested on an infinipath system with InfiniPath_QLE7340 HCAs. Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2015-09-10 08:55:55 -06:00
Nathan Hjelm	0a0e6d8eef	ompi/group: clean up union/difference code Updated the union/difference code to remove an extra n^2 translation of ranks. This comes at the cost of extra memory but greatly simplifies the code. Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2015-09-10 08:55:55 -06:00
Nathan Hjelm	5b7943db78	ompi/group: do not allocate ompi_proc_t's on group union/difference This commit modifies the ompi_group_t union/difference code to compare/copy the raw group values. Each value will be either an ompi_proc_t or a sentinel value. This commit also adds helper functions to convert between opal process names and sentinel values. Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2015-09-10 08:55:55 -06:00
Nathan Hjelm	d8b0a6efda	Remove use of ompi_comm_peer_lookup in osc/sm Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-09-10 08:55:54 -06:00
Nathan Hjelm	a41889112c	Remove calls to ompi_group_peer_lookup in coll/sm and coll/fca Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-09-10 08:55:54 -06:00
Nathan Hjelm	0bf06de3f1	group|comm: add initial support for group sentinel values This commit modifies ompi's process list group object to support a sentinel value for nonexistent ompi_proc_t objects. The sentinel was chosen to be the negative of the opal_process_name_t of the associated ompi_proc_t. This takes advantage of the fact that on most (all?) systems the top bit of a user-space pointer is never set. If this changes then a new sentinel will be needed. In addition this commit modifies the way ompi_mpi_comm_world is initialized to fill in the group with sentinel values if the number of processes exceeds the new add_procs behavior cutoff. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-09-10 08:55:54 -06:00
Nathan Hjelm	408da16d50	ompi/proc: add proc hash table for ompi_proc_t objects This commit adds an opal hash table to keep track of the mapping between process identifiers and ompi_proc_t's. This hash table is used by the ompi_proc_by_name() function to look up (in O(1) time) a given process. This can be used by a BTL or other component to get an ompi_proc_t when handling an incoming message from an as yet unknown peer. Additionally, this commit adds a new MCA variable to control the new add_procs behavior: mpi_add_procs_cutoff. If the number of ranks in the process falls below the threshold an ompi_proc_t is created for every process. If the number of ranks is above the threshold then an ompi_proc_t is only created for the local rank. The code needed to generate additional ompi_proc_t's for a communicator is not yet complete. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-09-10 08:55:54 -06:00
Nathan Hjelm	b4a0d40915	pml/ob1: Add support for dynamically calling add_procs This commit contains the following changes: - pml/ob1: use the bml accessor function when requesting a bml endpoint. this will ensure that bml endpoints are only created when needed. for example, a bml endpoint is not requested and not allocated when receiving an eager message from a peer. - pml/ob1: change the pml_procs array in the ob1 communicator to a proc pointer array. at the cost of a single level of extra redirection this will allow us to allocate pml procs on demand. - pml/ob1: add an accessor function to access the pml proc structure for a given peer. this function will allocate the proc if it doesn't already exist. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-09-10 08:55:54 -06:00
Nathan Hjelm	6fa6513003	bml: Add support for dynamically calling add_procs This commit contains the following changes: - bml: add a function to add a single process. this function is intended to remove the need to maintain an opal_bitmap_t as it is irrelevant for a single proc. BTLs will need to be updated to either 1) ignore the return code from opal_bitmap_set_bit or 2) not call the function if the reachability bitmap is NULL. - bml: add an inline accessor function for getting the bml endpoint for a peer proc. this function will either 1) return the cached bml endpoint, or 2) create the endpoint and call add_proc with all available BTL modules. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-09-10 08:55:54 -06:00
Ralph Castain	b79cffc73b	Protect ourselves - if the active pmix component doesn't have some optional functions, then gracefully decline to perform the operation OR use a required alternative (e.g., fence in place of disconnect) This fixes the Slurm pmi2 support - still something wrong in pmi1	2015-09-09 02:29:00 -07:00
Gilles Gouaillardet	fe351f6801	io: do not cast away the const modifier when this is not necessary update the io framework and mpi c bindings	2015-09-09 09:18:58 +09:00
Gilles Gouaillardet	e01bac962f	coll: do not cast away the const modifier when this is not necessary update the coll framework and mpi c bindings	2015-09-09 09:18:57 +09:00
Gilles Gouaillardet	6e6a3e965c	pml: do not cast away the const modifier when this is not necessary update the pml framework and mpi c bindings	2015-09-09 09:18:57 +09:00
Gilles Gouaillardet	43ef261d46	topo: do not cast away the const modifier when this is not necessary update the topo framework and mpi c bindings	2015-09-09 09:18:57 +09:00
rhc54	3a446c9797	Merge pull request #876 from rhc54/topic/hnp Fix segfault upon job error	2015-09-08 15:10:51 -07:00
Ralph Castain	459f169e06	Fix segfault upon job error Silence some unnecessary error-logs	2015-09-08 14:03:06 -07:00
Ralph Castain	ae7156cabb	Stop a segfault in the test by correctly passing all the argv during spawn	2015-09-08 13:42:46 -07:00
Jeff Squyres	bc9e5652ff	whitespace: purge whitespace at end of lines Generated by running "./contrib/whitespace-purge.sh".	2015-09-08 09:47:17 -07:00
Edgar Gabriel	c83e6ad0c8	fix Coverity warnings 1322865 and 72136	2015-09-08 09:15:57 -05:00
Ralph Castain	e6add86e4f	Deal with connect/accept between two jobs from different mpirun's. Somewhat optimize connect/accept by using MPI bcast to distribute the participants instead of another PMIx lookup. Cleanup some Coverity issues.	2015-09-07 09:19:24 -07:00
Gilles Gouaillardet	c404e98dce	coll/ml: silence warnings (incorrect callback prototype)	2015-09-07 14:56:49 +09:00
Gilles Gouaillardet	56f8a7b840	coll/ml: declare a global variable as static to avoid an uninitialized common symbol.	2015-09-07 14:56:03 +09:00
Ralph Castain	37c3ed68e7	Cleanup connect/disconnect and bring comm_spawn back online!	2015-09-06 10:27:39 -07:00
Jeff Squyres	794ee4a604	treematch: remove stale test This test was accidentally left over from open-mpi/ompi@d97bc29102 that prevented the treematch component from building.	2015-09-05 05:02:30 -07:00
rhc54	665b30376a	Merge pull request #868 from rhc54/topic/hwloc Remove OPAL_HAVE_HWLOC qualifier and error out if --without-hwloc is given	2015-09-04 17:58:07 -07:00
Ralph Castain	d97bc29102	Remove OPAL_HAVE_HWLOC qualifier and error out if --without-hwloc is given	2015-09-04 16:54:40 -07:00
rhc54	d45ccda813	Merge pull request #866 from rhc54/topic/updatepmix Update PMIx support	2015-09-04 11:09:36 -07:00
Ralph Castain	f6948c2bb4	Sync with PMIx master 43e45c3. Get multi-node publish/lookup/unpublish working	2015-09-04 10:07:17 -07:00
Pavel Shamis / Pasha	c3446f363b	Merge pull request #859 from shamisp/topic/ml_soft_disable ML: Replace opal ignore with a zero priority	2015-09-04 12:37:37 -04:00
Pavel Shamis (Pasha)	32c69630ad	ML: Replace opal ignore with a zero priority The priority is set to 0 by default. As a result component open reports an error and the component is not loaded (no resources allocated).	2015-09-04 11:28:47 -04:00
yohann	404393b9d7	mtl/ofi: Minor code cleanup.	2015-09-03 15:04:55 -07:00
yohann	a8cac09769	mtl/ofi: Renamed macro to prevent clash with FI_ namespace.	2015-09-03 14:42:45 -07:00
yohann	7adb9b7ab4	mtl/ofi: Handle -FI_EAGAIN on send and recv operations.	2015-09-03 10:47:00 -07:00
Edgar Gabriel	c9710660af	Merge pull request #863 from edgargabriel/topic/fcoll-static-cleanup Topic/fcoll static cleanup	2015-09-03 11:21:02 -05:00
Edgar Gabriel	a96a15a83c	re-enable the contiguous buffer optimization similarly to the dynamic component. Passes all hdf5 tests and our own test suite.	2015-09-03 10:13:03 -05:00
Edgar Gabriel	8007effc93	code cleanup for static component, similarly to the dynamic one	2015-09-03 10:12:45 -05:00
Jeff Squyres	6d9faf07e5	Merge pull request #858 from jsquyres/pr/fortran-use-only fortran configury: test for USE...ONLY support	2015-09-03 10:19:48 -04:00
Edgar Gabriel	ac3a01c39c	Silence Coverity warnings 1321702, 1321701, 1321700, 72331, 72330, 72327, 72326, 72325	2015-09-03 09:10:25 -05:00
Jeff Squyres	66dda00f06	fortran configury: test for USE...ONLY support As of v15.7, the PGI Fortran compiler does not properly support how Open MPI uses the "USE ... ONLY" Fortran syntax to include modules with conflicting symbol definitions (interestingly, pgfortran only has a problem with this when compiling with -g). In short, OMPI uses "USE :: module_aaa, ONLY: foo" and "USE :: module_bbb, ONLY: bar" to use modules aaa and bbb, even though they contain conflicting definitions for some symbols. However, the use of the ONLY clause should preclude the inclusion of the conflicting symbols -- as the word implies, it should direct the compiler to only use the symbols identified by the clause (i.e., foo and bar, in this example). This commit adds a configure test for this capability. If the compiler fails to build a simple test that mimics this behavior, then disable the mpi_f08 bindings. Fixes open-mpi/ompi#857	2015-09-02 15:55:24 -07:00
Ralph Castain	a772b46c15	Bring the MPI_Publish and friends online	2015-09-02 12:04:07 -07:00
Edgar Gabriel	e95d01be97	Merge pull request #847 from edgargabriel/topic/fcoll-dynamic-cleanup Topic/fcoll dynamic cleanup	2015-09-01 16:10:55 -05:00
Nathan Hjelm	2a8cc5e637	osc/pt2pt: remove outstanding lock only after lock/flush ack received fixes #840 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-09-01 10:54:47 -06:00
Edgar Gabriel	82efc23e8d	clean up indenting and tabs/spaces of fcoll_static_file_read/write_all	2015-09-01 09:39:33 -05:00
Edgar Gabriel	a1778406d6	Re-enable the contiguous buffer optimization to the read_all and the write_all routines. After long debugging, I found last week the reason this optimization originally broke some hdf5 tests. We now pass the hdf5 test suite with the optimization being actively used.	2015-09-01 09:29:07 -05:00
Edgar Gabriel	c2c44b11dc	Code cleanup for dynamic read_all and write_all Specifically: - reduce the number of realloc's and malloc's by moving some arrays out of the cycle loop, if we know that their size is not changing - store the rank of the aggregator in a separate variable to avoid continuous dereferencing - change the wait_all logic in write_all to use a fixed number of requests (even if they are MPI_REQUEST_NULL) - fix the timing to consider the two initial allgather and the one allgatherv operation to be a part of it - add more comments.	2015-09-01 09:29:07 -05:00
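
The sentinel idea described in commit 0bf06de3f1 (store the negated process name in a group slot that would otherwise hold an ompi_proc_t pointer, and tell the two apart by the top bit) can be illustrated with a minimal sketch. This is not the Open MPI code: the demo_name_t type and every demo_* function are hypothetical stand-ins, and the sketch assumes a 64-bit intptr_t, a job ID below 2^31, and a name that is not all zero (so the sentinel can never be confused with NULL).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for opal_process_name_t: two 32-bit fields. */
typedef struct {
    uint32_t jobid;
    uint32_t vpid;
} demo_name_t;

/* The scheme below needs a pointer-sized integer of 64 bits (assumption). */
_Static_assert(sizeof(intptr_t) == 8, "sketch assumes 64-bit pointers");

/* Pack the name into a strictly positive 64-bit integer. */
static int64_t demo_name_pack(demo_name_t name)
{
    /* Assumptions of this sketch: top bit clear and value non-zero. */
    assert(name.jobid < 0x80000000u);
    assert(name.jobid != 0u || name.vpid != 0u);
    return (int64_t)(((uint64_t)name.jobid << 32) | name.vpid);
}

/* A sentinel is the negated packed name, so its top bit is always set. */
static intptr_t demo_name_to_sentinel(demo_name_t name)
{
    return (intptr_t)(-demo_name_pack(name));
}

/* A slot holding a real user-space pointer is assumed to be non-negative. */
static int demo_is_sentinel(intptr_t slot)
{
    return slot < 0;
}

/* Recover the process name from a sentinel slot. */
static demo_name_t demo_sentinel_to_name(intptr_t slot)
{
    uint64_t packed = (uint64_t)(-(int64_t)slot);
    demo_name_t name;
    name.jobid = (uint32_t)(packed >> 32);
    name.vpid = (uint32_t)(packed & 0xffffffffu);
    return name;
}
```

A caller would check demo_is_sentinel() on a group slot and, only when it returns true, decode the name and look up or create the process structure on demand, which is the allocation-avoiding behavior the add_procs commits above describe.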

8404 commits