openmpi

Author	SHA1	Message	Date
Nadezhda Kogteva	b2a93943dc	oshmem mmap: set lvl4 for sshmem_mmap_anonymous and sshmem_mmap_fixed variables, define MAP_ANONYMOUS returned.	2014-10-14 08:54:44 +03:00
Gilles Gouaillardet	0f983d5a4f	add a disable function for coll module	2014-10-14 14:46:36 +09:00
Devendar Bureddy	7a6b4c36b0	HCOLL: Update the proc structure dereference Update the proc structure dereference to reflect the new opal_proc_t super field	2014-10-13 20:49:19 +03:00
Devendar Bureddy	b8d2a15be9	HCOLL: by default off	2014-10-13 20:49:09 +03:00
Mike Dubman	ec1f761d8e	OSHMEM: add missing help file, got lost during merge. Thanks to Yossi/Igor for finding it. Change-Id: I466e40a3fea70e8045dd1e897edcc50ccf0451a3 Conflicts: oshmem/mca/sshmem/base/Makefile.am oshmem/mca/sshmem/base/help-oshmem-sshmem.txt	2014-10-13 16:58:35 +03:00
Alex Mikheev	8fcbcba516	Merge branch 'topic/oshmem_shared_mr_fix'	2014-10-13 15:24:12 +03:00
Alex Mikheev	cd67642183	OSHMEM: sshmem verbs: workaround shared_mr procfs bug dereg shared_mr before doing dereg on its mr.	2014-10-13 15:14:34 +03:00
Nadezhda Kogteva	c68c4b45b5	Merge remote-tracking branch 'upstream/master'	2014-10-13 15:12:39 +03:00
Mike Dubman	a1db93077d	Merge pull request #230 from nkogteva/oshmem_refactor_macro_style oshmem: refactor of oshmem/mca/sshmem/*.[ch] files to use #if MACRO style	2014-10-13 13:33:32 +03:00
Nadezhda Kogteva	de68d58a9e	oshmem: refactor of oshmem/mca/sshmem/*.[ch] files to use #if MACRO style	2014-10-13 13:12:16 +03:00
Nadezhda Kogteva	3e7002e8aa	oshmem mmap: copyrights for memheap_base_alloc.c files updated	2014-10-13 11:41:35 +03:00
Nadezhda Kogteva	ce4ee2aa8d	oshmem mmap: new mca parameters were introduced - sshmem_mmap_anonymous, sshmem_mmap_fixed and sshmem_base_backing_file_dir - for runtime mmap management. (cherry picked up from Mellanox-v1.8 repo commit 4c391a)	2014-10-13 11:39:26 +03:00
Mike Dubman	6372ac926c	tools: fix cli args parsing No need to "shift" if argument does not expect parameter on the command line.	2014-10-13 11:33:26 +03:00
Vasily Filipov	a215a4831d	MTL/MXM: disable "bulk_connect" by default.	2014-10-13 09:47:56 +03:00
Ralph Castain	3ef94a0675	Per email thread on devel list: Revert "OPAL: drop dead with core on bad flow. rarely happens with helloworld on large scale." This reverts commit `86f1d5af3e`. Will be reconsidered via RFC as it represents a significant change in behavior	2014-10-12 21:13:42 -07:00
Mike Dubman	113f40b0ec	OSHMEM: sshmem verbs: allocate memory at fixed address Use experimental verbs to allocate memory at fixed base virtual address. verbs will disqualify itself if shared_mr is disabled or not supported and it is impossible to allocate memory starting at fixed base virtual address. verbs contig pages allocator did not guarantee fixed va, now it does. (cherry picked from commit `fd77ebd452`) Apply Jeff`s comments Update with Jeff commits (cherry picked from commit open-mpi/ompi-release@4dc487fc3d)	2014-10-12 09:53:48 +03:00
Ralph Castain	4d27eb70f2	Extend the dstore framework to include a new "update_handle" API so the attributes of an existing handle can be changed. We can't just open a new handle as the upper layers won't know where to find the info. :-(	2014-10-10 12:40:32 -07:00
Ralph Castain	1ae34da5e5	Add an attributes parameter to the dstore.open function so we can pass directives to the active storage component. This can, for example, include the backing file info for a new shared memory segment.	2014-10-10 12:13:25 -07:00
Ralph Castain	63f619f871	Provide a mechanism by which an upstream project can rename the OPAL and ORTE libraries. This is required by projects such as ORCM that have their own ORTE and OPAL libraries in order to avoid library confusion. By renaming their version of the libraries, the OMPI applications can correctly dynamically load the correct one for their build.	2014-10-10 11:39:08 -07:00
Ralph Castain	1be1654e5f	Correctly identify the synonym for orte_direct_modex_cutoff as ompi_hostname_cutoff	2014-10-10 06:05:06 -07:00
Gilles Gouaillardet	8eb2d62919	coll/sm: fix another memory leak	2014-10-10 19:54:45 +09:00
Gilles Gouaillardet	5d44a30111	coll/sm: fix minor memory leaks port 4488.1.patch attached in #196 to master	2014-10-10 14:21:34 +09:00
Ralph Castain	4fc4a8346b	Fix a couple of minor issues. Ensure usock isn't used if the session dirs aren't setup. Protect an oddball case where orte_xml_fp is NULL.	2014-10-09 20:58:46 -07:00
Nathan Hjelm	169a1866b8	Modify Cray PMI check to detect PMI on older systems	2014-10-09 17:01:31 -06:00
Ralph Castain	b1a58726ac	Cleanup the PMI m4 syntax with respect to -a, and look for libpmi* so we can pickup both .a, .la, and whatever other extensions that particular system might use.	2014-10-09 14:04:43 -07:00
Nathan Hjelm	a31cf3b740	btl/vader: missing include	2014-10-09 13:57:21 -06:00
Nathan Hjelm	9e0c07e4ce	btl/ugni: improve the handling of eager get fragments when the btl runs out of preregistered buffers Before this change eager gets were retried on each progress loop. This commit modifies the protocol to only retry eager gets when another eager get has completed. This commit also cleans up some callback code that is no longer needed.	2014-10-09 13:57:21 -06:00
Howard Pritchard	ebc368d26b	remove GNI_RDMAMODE_FENCE bit in GNI_PostRdma The GNI_RDMAMODE_FENCE bit was a left over from async progress work that is not needed at this point in the gni BTL. Removing the bit also allows for the removal of the GNI_CDM_MODE_BTE_SINGLE_CHANNEL bit from the GNI_CdmCreate call.	2014-10-09 12:41:19 -06:00
Ralph Castain	ce8e33447f	Silence warning	2014-10-09 10:45:25 -07:00
Joshua Ladd	1cabd73522	Adding a new OPAL hash table routine. Please read the algorithm description in opal/class/opal_hash_table.c for more precise details on the design and implementation. This algorithm was contributed by David Linden of H.P. in partnership with Mellanox Technologies. This contribution achieves two objectives: 1. It's actually hashing now, whereas the old OPAL hash table was not. Thus, it is a bug fix for and, as such, should be included in the 1.8 series. 2. It is dynamic and can grow and shrink the number of buckets in accordance with job size, whereas the old OPAL hash table had a fixed number of buckets which resulted in poor retrieval performance at large scale. This scheme has been deployed in the field on very large H.P./Mellanox systems and has been demonstrated to significantly decrease job start-up time (~ 20% improvement) when launching applications directly with srun in SLURM environments. However, neither SLURM nor direct launch are prerequisites to take advantage of this change as any entity that utilizes OPAL hash table objects can benefit (at least partially) from this contribution.	2014-10-09 17:24:23 +02:00
Nadezhda Kogteva	ffa8674e01	Fix bugs in PMI configure: set correct include path, fix test command with multiple conditions.	2014-10-09 17:23:56 +03:00
Elena	b937b31693	fix for multiple spawn test	2014-10-09 06:18:16 +02:00
Elena	3d65799236	pmix: fixed ugly bug which caused many strange hangs	2014-10-09 06:17:03 +02:00
Elena	c905fe9b78	pmix: removed pmix_base_direct modex mca parameter, renamed orte_full_modex_cutoff and ompi_hostname_cutoff to direct_modex_cutoff	2014-10-09 06:15:31 +02:00
Elena	e319c95267	fixes for grpcomm rcd/brucks algorithms	2014-10-09 06:12:26 +02:00
Howard Pritchard	9947758d98	initial thread safety for ugni btl This commit adds initial ugni thread safety support. With this commit, sun thread tests (excepting MPI-2 RMA) pass with various process counts and threads/process. Also osu_latency_mt passes.	2014-10-08 10:13:22 -06:00
Mike Dubman	81917412a8	tools: add flag to avoid git pull during tarball create it breaks jenkins scripts because jenkins fetches the branch and disconnects it from the repo. hence git pull fails	2014-10-08 12:16:08 +03:00
Nathan Hjelm	23cb00d7d5	Merge pull request #225 from hjelmn/master osc/rdma: fix issue identified by Berk Hess	2014-10-07 12:04:40 -06:00
Nathan Hjelm	eed7b45db5	osc/rdma: fix issue identified by Berk Hess osc/rdma uses counters to determine if all messages have been received before exiting synchronization calls. The problem is that the active target counter is always increasing (never zeroed). If over 2^31-1 messages are sent this causes the counter to overflow (in itself this isn't an error). This causes test/wait to return before the communication is complete. There is an additional error in the use of the fragment flush function. If PSCW synchronization is in use this function CAN NOT be called unless a post message has arrived. Relevant mailing list thread: http://www.open-mpi.org/community/lists/devel/2014/10/16016.php This commit fixes both issues. Tested against MTT and issue reproducer. Closes #224.	2014-10-07 11:45:22 -06:00
Ralph Castain	9c027e6def	Update the PMI configure logic to handle the oddball case where both lib and lib64 may exist, and the required files may be in one or the other of them.	2014-10-07 10:20:46 -07:00
Jeff Squyres	a422d893b8	memchecker: per RFC, use calloc for OBJ_NEW With --enable-memchecker builds, use calloc(3) for OBJ_NEW instead of malloc(3). This cuts down on a lot of valgrind/memory checker false positive output. Also make a minor change in the valgrind configure.m4; have it assign 0xf to a char. The prior assignment (of 0xff) was warning about an overflow. This didn't really matter, but we might as well make the test not have a gratuitous warning in it.	2014-10-07 09:55:54 -07:00
Mike Dubman	86f1d5af3e	OPAL: drop dead with core on bad flow. rarely happens with helloworld on large scale.	2014-10-07 14:07:41 +03:00
Jeff Squyres	cd48fbeec6	Merge pull request #221 from opoplawski/master Fix typo in liboshmem name	2014-10-06 09:17:44 -04:00
Alex Mikheev	89535a3272	OSHMEM: sshmem mmap: use MAP_PRIVATE instead of MAP_SHARED It looks like using MAP_PRIVATE instead of MAP_SHARED greatly speeds up infiniband memory registration. Change-Id: Id7089f58458ef8fff4034a2c4707d31f7e8b6694	2014-10-06 11:41:06 +03:00
Gilles Gouaillardet	399fc1bb3e	configury: remove unneeded assignments	2014-10-06 16:36:03 +09:00
Mike Dubman	fd77ebd452	OSHMEM: sshmem verbs: allocate memory at fixed address Use experimental verbs to allocate memory at fixed base virtual address. verbs will disqualify itself if shared_mr is disabled or not supported and it is impossible to allocate memory starting at fixed base virtual address. verbs contig pages allocator did not guarantee fixed va, now it does.	2014-10-05 14:33:56 +03:00
Alex Mikheev	4ac5936257	OSHMEM: sshmem verbs: improve hca name parsing If user gives hca port ignore port, use only hca name. Ex: mlx4_0:1 -> mlx4_0 fixed by @alex-mikheev reviewed by @miked-mellanox	2014-10-05 14:29:11 +03:00
Igor Ivanov	d82dc7f67f	OSHMEM: Add two new mca variables Added use_hp flag in sshmem/sysv variable to control huge page usage; Added shared_mr sshmem/verbs; Both parameters are set in auto. Fix help messages fixed by Igor, reviewed by @miked-mellanox and @alex-mikheev	2014-10-05 14:25:39 +03:00
Alex Mikheev	067fa05209	OSHMEM: fixes bug in shmem_lock Lock server pe computation was incorrect in cases when: lock virtual address is signed long. In this case negative pe value was returned. In case when lock has different virtual addresses on different pes. It can happen when memheap or static segment have different base addresses. Use offset instead of absolute virtual address to compute server pe Fixed by @alex-mikheev, reviewed by @miked-mellanox	2014-10-05 09:31:03 +03:00
Howard Pritchard	93eba3ac70	Merge branch 'master' of https://github.com/open-mpi/ompi	2014-10-03 16:08:11 -06:00

21228 Commits