openmpi

Author	SHA1	Message	Date
Ralph Castain	3ef94a0675	Per email thread on devel list: Revert "OPAL: drop dead with core on bad flow. rarely happens with helloworld on large scale." This reverts commit 86f1d5af3ee484f34092ad3f7a645d9a5ccbcb6c. Will be reconsidered via RFC as it represents a significant change in behavior	2014-10-12 21:13:42 -07:00
Mike Dubman	113f40b0ec	OSHMEM: sshmem verbs: allocate memory at fixed address Use experimental verbs to allocate memory at fixed base virtual address. verbs will disqualify itself if shared_mr is disabled or not supported and it is impossible to allocate memory starting at fixed base virtual address. verbs contig pages allocator did not guarantee fixed va, now it does. (cherry picked from commit fd77ebd4525e9e0c1a3ab1c4966bf31aa45251b4) Apply Jeff`s comments Update with Jeff commits (cherry picked from commit open-mpi/ompi-release@4dc487fc3d)	2014-10-12 09:53:48 +03:00
Ralph Castain	4d27eb70f2	Extend the dstore framework to include a new "update_handle" API so the attributes of an existing handle can be changed. We can't just open a new handle as the upper layers won't know where to find the info. :-(	2014-10-10 12:40:32 -07:00
Ralph Castain	1ae34da5e5	Add an attributes parameter to the dstore.open function so we can pass directives to the active storage component. This can, for example, include the backing file info for a new shared memory segment.	2014-10-10 12:13:25 -07:00
Ralph Castain	63f619f871	Provide a mechanism by which an upstream project can rename the OPAL and ORTE libraries. This is required by projects such as ORCM that have their own ORTE and OPAL libraries in order to avoid library confusion. By renaming their version of the libraries, the OMPI applications can correctly dynamically load the correct one for their build.	2014-10-10 11:39:08 -07:00
Ralph Castain	1be1654e5f	Correctly identify the synonym for orte_direct_modex_cutoff as ompi_hostname_cutoff	2014-10-10 06:05:06 -07:00
Gilles Gouaillardet	8eb2d62919	coll/sm: fix another memory leak	2014-10-10 19:54:45 +09:00
Gilles Gouaillardet	5d44a30111	coll/sm: fix minor memory leaks port 4488.1.patch attached in #196 to master	2014-10-10 14:21:34 +09:00
Ralph Castain	4fc4a8346b	Fix a couple of minor issues. Ensure usock isn't used if the session dirs aren't setup. Protect an oddball case where orte_xml_fp is NULL.	2014-10-09 20:58:46 -07:00
Nathan Hjelm	169a1866b8	Modify Cray PMI check to detect PMI on older systems	2014-10-09 17:01:31 -06:00
Ralph Castain	b1a58726ac	Cleanup the PMI m4 syntax with respect to -a, and look for libpmi* so we can pickup both .a, .la, and whatever other extensions that particular system might use.	2014-10-09 14:04:43 -07:00
Nathan Hjelm	a31cf3b740	btl/vader: missing include	2014-10-09 13:57:21 -06:00
Nathan Hjelm	9e0c07e4ce	btl/ugni: improve the handling of eager get fragments when the btl runs out of preregistered buffers Before this change eager gets were retried on each progress loop. This commit modifies the protocol to only retry eager gets when another eager get has completed. This commit also cleans up some callback code that is no longer needed.	2014-10-09 13:57:21 -06:00
Howard Pritchard	ebc368d26b	remove GNI_RDMAMODE_FENCE bit in GNI_PostRdma The GNI_RDMAMODE_FENCE bit was a left over from async progress work that is not needed at this point in the gni BTL. Removing the bit also allows for the removal of the GNI_CDM_MODE_BTE_SINGLE_CHANNEL bit from the GNI_CdmCreate call.	2014-10-09 12:41:19 -06:00
Ralph Castain	ce8e33447f	Silence warning	2014-10-09 10:45:25 -07:00
Joshua Ladd	1cabd73522	Adding a new OPAL hash table routine. Please read the algorithm description in opal/class/opal_hash_table.c for more precise details on the design and implementation. This algorithm was contributed by David Linden of H.P. in partnership with Mellanox Technologies. This contribution achieves two objectives: 1. It's actually hashing now, whereas the old OPAL hash table was not. Thus, it is a bug fix for and, as such, should be included in the 1.8 series. 2. It is dynamic and can grow and shrink the number of buckets in accordance with job size, whereas the old OPAL hash table had a fixed number of buckets which resulted in poor retrieval performance at large scale. This scheme has been deployed in the field on very large H.P./Mellanox systems and has been demonstrated to significantly decrease job start-up time (~ 20% improvement) when launching applications directly with srun in SLURM environments. However, neither SLURM nor direct launch are prerequisites to take advantage of this change as any entity that utilizes OPAL hash table objects can benefit (at least partially) from this contribution.	2014-10-09 17:24:23 +02:00
Nadezhda Kogteva	ffa8674e01	Fix bugs in PMI configure: set correct include path, fix test command with multiple conditions.	2014-10-09 17:23:56 +03:00
Elena	b937b31693	fix for multiple spawn test	2014-10-09 06:18:16 +02:00
Elena	3d65799236	pmix: fixed ugly bug which caused many strange hangs	2014-10-09 06:17:03 +02:00
Elena	c905fe9b78	pmix: removed pmix_base_direct modex mca parameter, renamed orte_full_modex_cutoff and ompi_hostname_cutoff to direct_modex_cutoff	2014-10-09 06:15:31 +02:00
Elena	e319c95267	fixes for grpcomm rcd/brucks algorithms	2014-10-09 06:12:26 +02:00
Howard Pritchard	9947758d98	initial thread safety for ugni btl This commit adds initial ugni thread safety support. With this commit, sun thread tests (excepting MPI-2 RMA) pass with various process counts and threads/process. Also osu_latency_mt passes.	2014-10-08 10:13:22 -06:00
Mike Dubman	81917412a8	tools: add flag to avoid git pull during tarball create it breaks jenkins scripts because jenkins fetches the branch and disconnects it from the repo. hence git pull fails	2014-10-08 12:16:08 +03:00
Nathan Hjelm	23cb00d7d5	Merge pull request #225 from hjelmn/master osc/rdma: fix issue identified by Berk Hess	2014-10-07 12:04:40 -06:00
Nathan Hjelm	eed7b45db5	osc/rdma: fix issue identified by Berk Hess osc/rdma uses counters to determine if all messages have been received before exiting synchronization calls. The problem is that the active target counter is always increasing (never zeroed). If over 2^31-1 messages are sent this causes the counter to overflow (in itself this isn't an error). This causes test/wait to return before the communication is complete. There is an additional error in the use of the fragment flush function. If PSCW synchronization is in use this function CAN NOT be called unless a post message has arrived. Relevant mailing list thread: http://www.open-mpi.org/community/lists/devel/2014/10/16016.php This commit fixes both issues. Tested against MTT and issue reproducer. Closes #224.	2014-10-07 11:45:22 -06:00
Ralph Castain	9c027e6def	Update the PMI configure logic to handle the oddball case where both lib and lib64 may exist, and the required files may be in one or the other of them.	2014-10-07 10:20:46 -07:00
Jeff Squyres	a422d893b8	memchecker: per RFC, use calloc for OBJ_NEW With --enable-memchecker builds, use calloc(3) for OBJ_NEW instead of malloc(3). This cuts down on a lot of valgrind/memory checker false positive output. Also make a minor change in the valgrind configure.m4; have it assign 0xf to a char. The prior assignment (of 0xff) was warning about an overflow. This didn't really matter, but we might as well make the test not have a gratuitous warning in it.	2014-10-07 09:55:54 -07:00
Mike Dubman	86f1d5af3e	OPAL: drop dead with core on bad flow. rarely happens with helloworld on large scale.	2014-10-07 14:07:41 +03:00
Jeff Squyres	cd48fbeec6	Merge pull request #221 from opoplawski/master Fix typo in liboshmem name	2014-10-06 09:17:44 -04:00
Alex Mikheev	89535a3272	OSHMEM: sshmem mmap: use MAP_PRIVATE instead of MAP_SHARED It looks like using MAP_PRIVATE instead of MAP_SHARED greatly speeds up infiniband memory registration. Change-Id: Id7089f58458ef8fff4034a2c4707d31f7e8b6694	2014-10-06 11:41:06 +03:00
Gilles Gouaillardet	399fc1bb3e	configury: remove unneeded assignments	2014-10-06 16:36:03 +09:00
Mike Dubman	fd77ebd452	OSHMEM: sshmem verbs: allocate memory at fixed address Use experimental verbs to allocate memory at fixed base virtual address. verbs will disqualify itself if shared_mr is disabled or not supported and it is impossible to allocate memory starting at fixed base virtual address. verbs contig pages allocator did not guarantee fixed va, now it does.	2014-10-05 14:33:56 +03:00
Alex Mikheev	4ac5936257	OSHMEM: sshmem verbs: improve hca name parsing If user gives hca port ignore port, use only hca name. Ex: mlx4_0:1 -> mlx4_0 fixed by @alex-mikheev reviewed by @miked-mellanox	2014-10-05 14:29:11 +03:00
Igor Ivanov	d82dc7f67f	OSHMEM: Add two new mca variables Added use_hp flag in sshmem/sysv variable to control huge page usage; Added shared_mr sshmem/verbs; Both parameters are set to auto. Fix help messages fixed by Igor, reviewed by @miked-mellanox and @alex-mikheev	2014-10-05 14:25:39 +03:00
Alex Mikheev	067fa05209	OSHMEM: fixes bug in shmem_lock Lock server pe computation was incorrect in cases when: lock virtual address is signed long. In this case negative pe value was returned. In case when lock has different virtual addresses on different pes. It can happen when memheap or static segment have different base addresses. Use offset instead of absolute virtual address to compute server pe Fixed by @alex-mikheev, reviewed by @miked-mellanox	2014-10-05 09:31:03 +03:00
Howard Pritchard	93eba3ac70	Merge branch 'master' of https://github.com/open-mpi/ompi	2014-10-03 16:08:11 -06:00
Ralph Castain	fd6a044b7f	Cleanup some cruft resulting from the move of the btl's to opal. We had created the ability to delay modex operations, which included a need to delay retrieving hostname info for remote procs. This allowed us to not retrieve the modex info until first message unless required - the hostname is generally only required for debug and error messages. Properly setup the opal_process_info structure early in the initialization procedure. Define the local hostname right at the beginning of opal_init so all parts of opal can use it. Overlay that during orte_init as the user may choose to remove fqdn and strip prefixes during that time. Setup the job_session_dir and other such info immediately when it becomes available during orte_init.	2014-10-03 16:02:57 -06:00
Jeff Squyres	b44a244fbc	openmpi-release.sh: update for git Also add consistent indenting to make the loop easier to read.	2014-10-03 16:02:57 -06:00
Orion Poplawski	2d5832ccc4	Fix typo in liboshmem name	2014-10-03 15:36:37 -06:00
Ralph Castain	bd2974f239	Merge branch 'master' of ssh://github.com/open-mpi/ompi	2014-10-03 14:23:25 -07:00
Ralph Castain	fb1f487d85	Cleanup some cruft resulting from the move of the btl's to opal. We had created the ability to delay modex operations, which included a need to delay retrieving hostname info for remote procs. This allowed us to not retrieve the modex info until first message unless required - the hostname is generally only required for debug and error messages. Properly setup the opal_process_info structure early in the initialization procedure. Define the local hostname right at the beginning of opal_init so all parts of opal can use it. Overlay that during orte_init as the user may choose to remove fqdn and strip prefixes during that time. Setup the job_session_dir and other such info immediately when it becomes available during orte_init.	2014-10-03 14:19:48 -07:00
Jeff Squyres	0997c91a6a	openmpi-release.sh: update for git Also add consistent indenting to make the loop easier to read.	2014-10-03 17:04:18 -04:00
Howard Pritchard	5428301c81	Remove catamount timer support With the 1.9 release, support for catamount is being dropped. Hence, removing catamount timer support.	2014-10-03 14:53:09 -06:00
Howard Pritchard	d2bb8d8829	remove alps ess component The alps ess component is obsolete. It relies on header files only present in very old CLE (Cray Linux) 3.X for the Cray XT series. As support for these systems is being dropped starting with release 1.9, this code is being removed.	2014-10-03 13:17:33 -06:00
Jeff Squyres	d0336745f4	openmpi-nightly-tarball.sh: don't even check v1.6 any more The create_tarball.sh in v1.6 assumes SVN. So let's not check that any more...	2014-10-03 11:16:30 -07:00
Jeff Squyres	534d773a9a	openmpi-nightly-tarball.sh: fix typo in ompi-release URLs Fix copy-n-paste error: had the ompi URLs instead of the ompi-release URLs.	2014-10-03 10:20:58 -07:00
Jeff Squyres	0e21c66fd4	openmpi-nightly-tarball.sh: fix typo	2014-10-03 09:01:02 -07:00
Jeff Squyres	f72bf3b3c3	gkcommit.pl: so long gkcommit; you served us well in SVN days...	2014-10-03 08:53:23 -07:00
Jeff Squyres	a12eef6ecf	find-copyrights.pl: updates for git And minor whitespace cleanup.	2014-10-03 08:52:48 -07:00
Jeff Squyres	58e6213d2f	make_dist_tarball: remove debug statement Remove the default to only build a "no OMPI" tarball when invoked via "make_tarball".	2014-10-03 08:32:29 -07:00

21064 Commits