openmpi

Author	SHA1	Message	Date
Jeff Squyres	13707ec0af	Remove this comment: it turns out that the benefit was to make multiple SM ''modules'', not multiple SM ''mpools''. This commit was SVN r26584.	2012-06-08 22:37:26 +00:00
Jeff Squyres	5451ee46bd	Per r26575, the sync coll module is no longer necessary! (the crowd goes wild) This commit was SVN r26583. The following SVN revision numbers were found above: r26575 --> open-mpi/ompi@59e529cf1d	2012-06-08 19:19:19 +00:00
Nathan Hjelm	59e529cf1d	ob1: as per developer discussion disable rdma retries. the failure path currently suffers from live-lock This commit was SVN r26575.	2012-06-07 23:31:20 +00:00
Nathan Hjelm	4c6be00de2	fix erroneous commit in rdma mpool This commit was SVN r26572.	2012-06-07 20:01:44 +00:00
Nathan Hjelm	ceee4bcb0d	libevent2019: libevent_pthreads.la is never built. don't include it This commit was SVN r26570.	2012-06-07 19:22:45 +00:00
Jeff Squyres	56a537a5f5	This component wasn't even in 1.5.0; no one has had a GM network in forever. There is no point in carrying this component forward. This commit was SVN r26563.	2012-06-06 21:43:54 +00:00
Mike Dubman	10831e111a	detect num of local procs This commit was SVN r26555.	2012-06-05 09:13:16 +00:00
Mike Dubman	e9c274f3b9	raise cm prio for mxm as well, somehow was removed from 1.7, exists in 1.6 This commit was SVN r26554.	2012-06-05 09:02:03 +00:00
Yevgeny Kliteynik	1cbce83ece	Fixed wording of MXM parameters as suggested by Jeff. This commit was SVN r26545.	2012-06-03 21:48:42 +00:00
Yevgeny Kliteynik	f02bf707a4	Added MXM parameter "np" that controls the minimal number of processes that allow MXM to run. Default: 128. MXM advantages kick in with a large number of processes. This commit was SVN r26544.	2012-06-02 11:07:20 +00:00
Nathan Hjelm	71bffa5158	ugni: update to latest btl code. bug fixes and cleanup This commit was SVN r26529.	2012-05-31 20:02:41 +00:00
Edgar Gabriel	3ccd286de1	silence a compiler warning for optimized builds. This commit was SVN r26528.	2012-05-31 13:32:10 +00:00
Vishwanath Venkatesan	86a57c7b66	Initializing sorted_file_offsets to NULL This commit was SVN r26526.	2012-05-30 06:56:40 +00:00
Jeff Squyres	99c5afb397	Remove clang compiler warnings. This commit was SVN r26523.	2012-05-29 23:36:06 +00:00
Edgar Gabriel	d1e91e9372	make the file compile properly. This commit was SVN r26497.	2012-05-26 01:06:36 +00:00
Brian Barrett	2effbb1ba6	fix copy/paste typo This commit was SVN r26492.	2012-05-24 16:06:20 +00:00
Ralph Castain	c0304eb23a	Fix copy/paste typo This commit was SVN r26491.	2012-05-24 15:47:20 +00:00
Nathan Hjelm	cdc3c87ba6	move pmi init/finalize into a common component This commit was SVN r26470.	2012-05-22 15:15:39 +00:00
George Bosilca	e890a8379b	Various minor cleanups. This commit was SVN r26461.	2012-05-21 13:15:24 +00:00
Brian Barrett	25693363e9	* Fix internal accounting error regarding number of available credits * Use a single MD covering all of address space for put transfers, rather than a per-send MD. This commit was SVN r26458.	2012-05-20 23:42:26 +00:00
Vishwanath Venkatesan	8d4bb65bd4	Modifying the explicit operations to make it absolute This commit was SVN r26451.	2012-05-18 21:43:34 +00:00
Vishwanath Venkatesan	cbad31cc88	1. Freeing the displs array after allgatherv to avoid segmentation faults in dynamic segmentation. 2. Checking for 0-byte datatypes and sending only when data is available, to avoid 0-byte messages being sent and received. 3. Changing timing extraction to support calculating min, max and avg communication costs + min and avg write costs. This commit was SVN r26450.	2012-05-18 21:39:58 +00:00
Rolf vandeVaart	f8ace21366	Rename a few things for clarity. Add a stream. This commit was SVN r26447.	2012-05-17 18:10:59 +00:00
Rolf vandeVaart	c228bd2311	Fix broken compile. Keep in sync with sm btl. This commit was SVN r26440.	2012-05-15 15:32:33 +00:00
Yevgeny Kliteynik	d59b8d5dc4	Fixing malformed error message This commit was SVN r26434.	2012-05-12 21:13:42 +00:00
Mike Dubman	98c2c749fb	fix define name to BTL_OPENIB_MALLOC_HOOKS_ENABLED Thanks to Ludovic.Hablot@ext.bull.net for pointing this out This commit was SVN r26432.	2012-05-11 18:30:45 +00:00
Brian Barrett	2e52374847	* Split send and receive eq sizes * Need to look at slot count before flowcontrol for sending to prevent race in restart * Need to free pending request fragments when done with the request * A number of branch prediction optimizations for error conditions This commit was SVN r26430.	2012-05-10 21:43:48 +00:00
Yevgeny Kliteynik	244d66d95b	Fixed FDR link speed details, added EDR. This commit was SVN r26423.	2012-05-10 13:44:18 +00:00
Nathan Hjelm	91d99c6fef	ugni: reserve memory domain descriptors (MDDs) for mailbox registration This commit was SVN r26419.	2012-05-10 00:24:42 +00:00
Jeff Squyres	de4bbacd13	It turns out that we can't always include the hwloc OpenFabrics verbs helper file, even if we find that the system has <infiniband/verbs.h>. The reason is because there are some inline functions in that verbs helper file that invoke ibv_* functions. Some linkers (e.g., Solaris Studio Compilers) will instantiate those static inline functions -- even if we don't use them -- and therefore we need to be able to resolve the ibv_* symbols at link time. But since -libverbs is only specified in places where we use other ibv_* functions (e.g., the OpenFabrics-based BTLs), that means that linking random executables can/will fail (e.g., orterun). So instead, introduce a new #define: OPAL_HWLOC_WANT_VERBS_HELPER. If this macro is set to 1 before including opal/mca/hwloc/hwloc.h, then you'll also get the hwloc OpenFabrics verbs helper header file (if hwloc found <infiniband/verbs.h> -- otherwise, it'll #error). This commit was SVN r26417.	2012-05-09 20:18:31 +00:00
Mike Dubman	cd17fee9a8	performance fix: openib use memalign for malloc This commit was SVN r26409.	2012-05-08 20:42:09 +00:00
Nathan Hjelm	903f9fac09	ugni: fixed buffered sends and code cleanup This commit was SVN r26401.	2012-05-07 17:23:06 +00:00
Nathan Hjelm	49eda71ca0	ugni: fix invalid parameter with opal_pointer_array_init This commit was SVN r26400.	2012-05-07 17:22:55 +00:00
Nathan Hjelm	584c457352	ugni: update smsg defaults and add parameter to control local completion queue size This commit was SVN r26399.	2012-05-07 17:22:49 +00:00
Nathan Hjelm	bfcf67391a	ugni: set fragment id from opal_pointer_array_add This commit was SVN r26398.	2012-05-07 17:22:42 +00:00
Nathan Hjelm	b3dc726e9d	ugni: don't create completion queues until add_procs This commit was SVN r26397.	2012-05-07 17:22:35 +00:00
Nathan Hjelm	0e48ea1f65	vader: remove #include of headers that no longer exist This commit was SVN r26396.	2012-05-07 17:22:28 +00:00
Nathan Hjelm	a32d4c648d	ob1: rewind convertor after failed send This commit was SVN r26395.	2012-05-07 17:22:22 +00:00
Jeff Squyres	2ba10c37fe	Per RFC, bring in the following changes: * Remove paffinity, maffinity, and carto frameworks -- they've been wholly replaced by hwloc. * Move ompi_mpi_init() affinity-setting/checking code down to ORTE. * Update sm, smcuda, wv, and openib components to no longer use carto. Instead, use hwloc data. There are still optimizations possible in the sm/smcuda BTLs (i.e., making multiple mpools). Also, the old carto-based code found out how many NUMA nodes were ''available'' -- not how many were used ''in this job''. The new hwloc-using code computes the same value -- it was not updated to calculate how many NUMA nodes are used ''by this job.'' * Note that I cannot compile the smcuda and wv BTLs -- I ''think'' they're right, but they need to be verified by their owners. * The openib component now does a bunch of stuff to figure out where "near" OpenFabrics devices are. '''THIS IS A CHANGE IN DEFAULT BEHAVIOR!!''' and still needs to be verified by OpenFabrics vendors (I do not have a NUMA machine with an OpenFabrics device that is a non-uniform distance from multiple different NUMA nodes). * Completely rewrite the OMPI_Affinity_str() routine from the "affinity" mpiext extension. This extension now understands hyperthreads; the output format of it has changed a bit to reflect this new information. * Bunches of minor changes around the code base to update names/types from maffinity/paffinity-based names to hwloc-based names. * Add some helper functions into the hwloc base, mainly having to do with the fact that we have the hwloc data reporting ''all'' topology information, but sometimes you really only want the (online | available) data. This commit was SVN r26391.	2012-05-07 14:52:54 +00:00
Mike Dubman	1b475523de	add support for FDR speed This commit was SVN r26385.	2012-05-06 05:53:05 +00:00
Nathan Hjelm	b6ae288a59	fix segfault when pml direct enabled This commit was SVN r26371.	2012-05-01 23:12:41 +00:00
Brian Barrett	0ae2277796	Add a backoff mechanism for re-establishing communication This commit was SVN r26366.	2012-05-01 15:53:00 +00:00
Brian Barrett	74ade8b181	need to order the pending list before we restart This commit was SVN r26365.	2012-04-30 23:06:00 +00:00
Brian Barrett	5dec52af8d	remove some now unneeded debugging This commit was SVN r26364.	2012-04-30 22:50:52 +00:00
Brian Barrett	c654ee6afc	* Use triggered operations for restart barrier as well This commit was SVN r26363.	2012-04-30 22:48:10 +00:00
Brian Barrett	91a9973bde	* Make flow control on by default * Move alarm code back into a triggered operation This commit was SVN r26362.	2012-04-30 22:25:40 +00:00
Brian Barrett	e6a0a1cf8a	* Make sure to release all resources on failed send * Avoid triggered ops until we get everything debugged * Simplify flowctl interface a bit This commit was SVN r26356.	2012-04-27 21:11:01 +00:00
Nathan Hjelm	c36ab84116	ugni: missed a couple of lines in the last commit This commit was SVN r26340.	2012-04-25 14:24:48 +00:00
Nathan Hjelm	a753fe91f7	fix merge This commit was SVN r26332.	2012-04-24 21:16:51 +00:00
Nathan Hjelm	0eb18b9699	ob1: update copyrights This commit was SVN r26331.	2012-04-24 20:19:15 +00:00
Nathan Hjelm	0a0e487d9c	ob1: add emacs mode/indentation defaults This commit was SVN r26330.	2012-04-24 20:19:06 +00:00
Nathan Hjelm	9a35f96bda	ob1: add support for get fallback on put/send This commit was SVN r26329.	2012-04-24 20:18:56 +00:00
Nathan Hjelm	93780c63be	replace tabs w/ spaces This commit was SVN r26328.	2012-04-24 20:18:45 +00:00
Nathan Hjelm	0f60858a01	ugni: improve handling of smsg completions This commit was SVN r26327.	2012-04-24 20:18:35 +00:00
Nathan Hjelm	e3b9040e69	vader: remove maffinity code This commit was SVN r26321.	2012-04-24 15:38:03 +00:00
Nathan Hjelm	363bd184e7	ugni: re-disable uGNI for local procs This commit was SVN r26318.	2012-04-23 21:12:12 +00:00
Nathan Hjelm	ca3ceb840c	ugni: add mca parameter to control the number of smsg retries This commit was SVN r26317.	2012-04-23 21:12:05 +00:00
Nathan Hjelm	95b12f140a	ugni: cleanup frag setup code This commit was SVN r26316.	2012-04-23 21:11:57 +00:00
Nathan Hjelm	37ca31b295	ugni: remove unused completion queue This commit was SVN r26315.	2012-04-23 21:11:39 +00:00
Nathan Hjelm	1340f9c65a	ugni update: - Move endpoint code back up to BTL - Use opal_pointer_array_t for bounce buffer to identify local smsg completions. - Update and reenable sendi - Create a new endpoint for FMA/BTE transactions (keep local smsg/fma transactions separate) - Move reverse get code into btl_ugni_put.c - Move eager get code into btl_ugni_get.c - Handle remote SMSG overruns correctly - Added support for inplace sends - etc This commit was SVN r26307.	2012-04-19 21:51:55 +00:00
Nathan Hjelm	2b9827f45c	ugni: restrict number of memory registrations per process This commit was SVN r26306.	2012-04-19 21:51:44 +00:00
Jeff Squyres	253444c6d0	== Highlights == 1. New mpifort wrapper compiler: you can utilize mpif.h, use mpi, and use mpi_f08 through this one wrapper compiler 1. mpif77 and mpif90 still exist, but are sym links to mpifort and may be removed in a future release 1. The mpi module has been re-implemented and is significantly "mo' bettah" 1. The mpi_f08 module offers many, many improvements over mpif.h and the mpi module This stuff is coming from a VERY long-lived mercurial branch (3 years!); it'll almost certainly take a few SVN commits and a bunch of testing before I get it correctly committed to the SVN trunk. == More details == Craig Rasmussen and I have been working with the MPI-3 Fortran WG and Fortran J3 committees for a long, long time to make a prototype MPI-3 Fortran bindings implementation. We think we're at a stable enough state to bring this stuff back to the trunk, with the goal of including it in OMPI v1.7. Special thanks go out to everyone who has been incredibly patient and helpful to us in this journey: * Rolf Rabenseifner/HLRS (mastermind/genius behind the entire MPI-3 Fortran effort) * The Fortran J3 committee * Tobias Burnus/gfortran * Tony Goetz/Absoft * Terry Donte/Oracle * ...and probably others whom I'm forgetting :-( There are still opportunities for optimization in the mpi_f08 implementation, but by and large, it is as far along as it can be until Fortran compilers start implementing the new F08 dimension(..) syntax. Note that gfortran is currently unsupported for the mpi_f08 module and the new mpi module. gfortran users will a) fall back to the same mpi module implementation that is in OMPI v1.5.x, and b) not get the new mpi_f08 module. The gfortran maintainers are actively working hard to add the necessary features to support both the new mpi_f08 module and the new mpi module implementations. This will take some time. As mentioned above, ompi/mpi/f77 and ompi/mpi/f90 no longer exist. All the fortran bindings implementations have been collated under ompi/mpi/fortran; each implementation has its own subdirectory: {{{ ompi/mpi/fortran/ base/ - glue code mpif-h/ - what used to be ompi/mpi/f77 use-mpi-tkr/ - what used to be ompi/mpi/f90 use-mpi-ignore-tkr/ - new mpi module implementation use-mpi-f08/ - new mpi_f08 module implementation }}} There's also a prototype 6-function-MPI implementation under use-mpi-f08-desc that emulates the new F08 dimension(..) syntax that isn't fully available in Fortran compilers yet. We did that to prove it to ourselves that it could be done once the compilers fully support it. This directory/implementation will likely eventually replace the use-mpi-f08 version. Other things that were done: * ompi_info grew a few new output fields to describe what level of Fortran support is included * Existing Fortran examples in examples/ were renamed; new mpi_f08 examples were added * The old Fortran MPI libraries were renamed: * libmpi_f77 -> libmpi_mpifh * libmpi_f90 -> libmpi_usempi * The configury for Fortran was consolidated and significantly slimmed down. Note that the F77 env variable is now IGNORED for configure; you should only use FC. Example: {{{ shell$ ./configure CC=icc CXX=icpc FC=ifort ... }}} All of this work was done in a Mercurial branch off the SVN trunk, and hosted at Bitbucket. This branch has got to be one of OMPI's longest-running branches. Its first commit was Tue Apr 07 23:01:46 2009 -0400 -- it's over 3 years old! :-) We think we've pulled in all relevant changes from the OMPI trunk (e.g., Fortran implementations of the new MPI-3 MPROBE stuff for mpif.h, use mpi, and use mpi_f08, and the recent Fujitsu Fortran patches). I anticipate some instability when we bring this stuff into the trunk, simply because it touches a LOT of code in the MPI layer in the OMPI code base. We'll try our best to make it as pain-free as possible, but please bear with us when it is committed. This commit was SVN r26283.	2012-04-18 15:57:29 +00:00
Brian Barrett	8a70747da2	Fix some naming that doesn't make a ton of sense This commit was SVN r26277.	2012-04-18 01:05:18 +00:00
Brian Barrett	f4d4e87176	add some flow control debugging output This commit was SVN r26276.	2012-04-17 23:14:05 +00:00
Brian Barrett	fe0dfc8e26	First take at flow control protocol This commit was SVN r26274.	2012-04-17 21:46:21 +00:00
Brian Barrett	dde6f094eb	In preparation for flow control changes coming, always utilize ACKs for message completion. This commit was SVN r26272.	2012-04-16 17:25:27 +00:00
Terry Dontje	81d7fcaf82	back out r26255 to avoid cross component linkage so Solaris can build a usable openib btl This commit was SVN r26269. The following SVN revision numbers were found above: r26255 --> open-mpi/ompi@fe25b8704b	2012-04-13 18:08:54 +00:00
Nathan Hjelm	f88babfb92	ugni: minor updates This commit was SVN r26262.	2012-04-10 19:56:19 +00:00
Mike Dubman	34acf769d4	mtl_mxm: support canceling messages This commit was SVN r26256.	2012-04-09 16:02:05 +00:00
Mike Dubman	fe25b8704b	performance fix: set alignment for openib internal buffers Thanks to Jeff/Pasha for valuable comments Thanks to Valentin Petrov for implementation This commit was SVN r26255.	2012-04-09 08:06:15 +00:00
George Bosilca	f09e3ce5a4	Spring cleanup. Nothing important. This commit was SVN r26247.	2012-04-06 15:48:07 +00:00
George Bosilca	654c75ff24	As suggested on the mailing list a while back, switch the default alltoallv algorithm to pairwise exchange instead of the default one. This might improve the scheduling and relax the pressure on the network. This commit was SVN r26246.	2012-04-06 15:47:29 +00:00
Ralph Castain	bd8b4f7f1e	Sorry for mid-day commit, but I had promised on the call to do this upon my return. Roll in the ORTE state machine. Remove last traces of opal_sos. Remove UTK epoch code. Please see the various emails about the state machine change for details. I'll send something out later with more info on the new arch. This commit was SVN r26242.	2012-04-06 14:23:13 +00:00
Brian Barrett	d46d55ee9b	If we're locking the local window, need to wait until the lock returns. This commit was SVN r26234.	2012-04-04 16:27:24 +00:00
Josh Hursey	d1571b027a	Fix a few error return paths This commit was SVN r26233.	2012-04-04 15:11:03 +00:00
Nathan Hjelm	b0c3c18e02	Initial upload of grdma mpool This commit was SVN r26232.	2012-04-03 23:03:03 +00:00
Mike Dubman	ff1c84c53f	revert previous commit This commit was SVN r26206.	2012-03-29 14:07:13 +00:00
Mike Dubman	43a5775e8a	performance fix: set alignment for openib internal buffers This commit was SVN r26205.	2012-03-29 14:00:08 +00:00
Nathan Hjelm	d62c0f1872	ugni: handle smsg failure in mca_btl_ugni_ep_connect_finish This commit was SVN r26202.	2012-03-28 05:40:16 +00:00
Brian Barrett	451af0e832	Ensure async progress for long unexpected messages by waiting for an event on the ME. The events we're likely to see are LINK (the ME was added to the match list), PUT (weird to see first, but means that the ME was linked to the match list and then matched), or PUT_OVERFLOW, meaning the message was unexpected. This commit was SVN r26199.	2012-03-26 22:54:35 +00:00
Brian Barrett	2a26d0f9a2	Forgot to add new file in the last commit. Mark ME as invalid once we see a completion event, and look for events before trying to unlink. This commit was SVN r26198.	2012-03-26 22:39:05 +00:00
Brian Barrett	0e91084385	* Add type field to the request structure to deal with random user requests (ie, cancel) * Implement cancel for receives. Sends are slightly more complicated... This commit was SVN r26197.	2012-03-26 22:32:36 +00:00
Brian Barrett	61a090e0d1	Checking for NULL function pointers and direct-call semantics can't work together, so implement all functions in the MTL interface for all MTLs. The only places NULL was still being set were for add_comm/del_comm, and matched probe, both of which are straightforward to implement (or return ERROR_NOT_IMPLEMENTED, since the PML can't emulate matched probe). This commit was SVN r26194.	2012-03-26 19:27:03 +00:00
Brian Barrett	cdaf110c0f	* Implement mtl_send in addition to mtl_sendi This commit was SVN r26193.	2012-03-26 19:19:11 +00:00
Brian Barrett	27c8f71773	Start of the flow control implementation. #defined out for now. This commit was SVN r26192.	2012-03-26 01:31:58 +00:00
Jeff Squyres	fa8980157a	Fix typo. This commit was SVN r26183.	2012-03-23 00:12:32 +00:00
Brian Barrett	cce936b94c	* Implement matched probe for the CM PML. Required adding a peer field to the ompi_message_t structure to properly initialize convertor (the peer is available in the request in OB1, and wasn't needed when I did the original implementation). * Implement matched probe for the Portals4 MTL and add NULL function pointers for the other MTLs. * Add add_comm and del_comm functions to portals4 MTL so that direct call almost works again. * Add NEWS item that we've implemented matched probe This commit was SVN r26180.	2012-03-22 22:55:59 +00:00
Brian Barrett	4d12616b64	Frank pointed out that PTL_OK is zero and PtlHandleIsEqual either returns PTL_OK or PTL_FAIL and that I had these backwards. This commit was SVN r26179.	2012-03-22 15:58:00 +00:00
Brian Barrett	1c6b5a1358	* Set all appropriate flags for portal table entries * split eq into send and receive eqs so that we can control the number of outstanding events in send eq and ensure we never lose an ack * Shouldn't ever truncate on short unexpected receive blocks, so don't set the truncate bit * Track active vs. waiting for free short unexpected receive blocks to ensure an active short unexpected receive block is posted coming out of flow control. Also allow creation of "temporary" blocks which should be released once FREE event is received. * Slight reorganization of some code in preparation for more flow control work. This commit was SVN r26174.	2012-03-21 22:20:55 +00:00
Mike Dubman	a45898ea9c	fix support for fca 2.2, warning fixes on rhel 6.x This commit was SVN r26166.	2012-03-20 10:00:52 +00:00
Nathan Hjelm	135ac32b64	ugni: use hash table to keep track of smsg frag completion This commit was SVN r26154.	2012-03-15 20:15:59 +00:00
Nathan Hjelm	fca42347e3	ugni: use hash table to keep track of smsg frag completion This commit was SVN r26153.	2012-03-15 20:13:32 +00:00
Nathan Hjelm	deddf0b33e	ugni: fix frag leak in sendi This commit was SVN r26152.	2012-03-15 20:13:20 +00:00
Nathan Hjelm	99f05d56e3	ugni: updated parameters and code cleanup This commit was SVN r26151.	2012-03-15 20:13:11 +00:00
Nathan Hjelm	921176745d	vader: remove lock based fifos This commit was SVN r26150.	2012-03-15 20:12:59 +00:00
Nathan Hjelm	4e01440b05	vader: clean frag alloc/return This commit was SVN r26149.	2012-03-15 20:12:46 +00:00
Terry Dontje	e73df369e4	Update bfo pml with code from ob1 to support mprobe, improbe, mrecv, imrecv and cuda. This commit was SVN r26145.	2012-03-15 10:20:46 +00:00
Christopher Yeoh	524de80eaa	Adds support for Cross Memory Attach in the sm btl. This feature can be enabled at compile time with --with-cma passed to configure. At runtime it is also necessary to add "--mca btl_sm_use_cma 1" to the mpirun command. If both CMA and KNEM are compiled in and enabled at runtime then KNEM will take precedence and CMA will disable itself. This commit was SVN r26134.	2012-03-14 06:29:09 +00:00
Mike Dubman	bd7abd72a9	in mca_mtl_mxm, don't allow negative tags for MPI_ANY_TAG This commit was SVN r26128.	2012-03-09 22:11:14 +00:00
Rolf vandeVaart	41870ce6ee	Mostly fix some of the verbose output. Also fix issue where memory handle was blocking other registration. This commit was SVN r26124.	2012-03-09 21:28:56 +00:00
George Bosilca	de1078a71b	Thanks to Alex Margolin for pointing out this relic. This commit was SVN r26121.	2012-03-09 14:01:45 +00:00
Pavel Shamis	102da281c4	OPENIB BTL - use orte_show_help instead of BTL_ERROR print in case ibv_reg_mr failed. This commit was SVN r26111.	2012-03-08 09:04:03 +00:00
Mike Dubman	4e7e7d7c3f	print error which is ignored on upper layer This commit was SVN r26106.	2012-03-06 14:25:56 +00:00
George Bosilca	a78a7bd8e8	The tuned collectives can now deal with more than 2GB of data. This commit was SVN r26103.	2012-03-05 22:23:44 +00:00
George Bosilca	762b3e13a9	Use the correct name for the datatype destruction function. This commit was SVN r26100.	2012-03-05 15:54:53 +00:00
George Bosilca	7d523a8852	Avoid calling the bcast with counts larger than INT_MAX. This commit was SVN r26098.	2012-03-05 14:30:30 +00:00
George Bosilca	e8c358c188	Allow Open MPI to deal with size_t internally. This commit was SVN r26097.	2012-03-05 14:10:26 +00:00
Abhishek Kulkarni	08ca0f80bc	Fix a C/R bug where the restart hung due to dangling fds in the openib btl. This commit was SVN r26094.	2012-03-04 06:57:33 +00:00
George Bosilca	f83670211e	Allow the user to define dynamic rules for messages larger than 2GB. This commit was SVN r26084.	2012-03-02 21:16:23 +00:00
George Bosilca	8791ade293	Help the selection of the right algorithm for large data (> 2GB). Thanks to Fujitsu for the patch. This commit was SVN r26080.	2012-03-02 19:12:22 +00:00
Terry Dontje	3e70cad203	Correct a few alignment problems to address the issue brought up in ticket #2964 This commit was SVN r26078.	2012-03-01 17:29:40 +00:00
Nathan Hjelm	f1525bdbff	ob1: fix two fragment leaks - MAJOR! get src descriptor leaks if mca_bml_base_send fails - minor. descriptor leaked in mca_pml_send_request_start_copy if the btl returns OMPI_ERR_RESOURCE_BUSY. This commit was SVN r26077.	2012-03-01 15:53:39 +00:00
Mike Dubman	540b3c0c25	update mxm mtl to changes in mxm api This commit was SVN r26073.	2012-02-29 22:02:34 +00:00
Nathan Hjelm	a7209e309a	ugni: oops, sendi was missing from Makefile.am This commit was SVN r26067.	2012-02-28 16:10:35 +00:00
Edgar Gabriel	636cf786be	local_heap_sort should really be a static function. This commit was SVN r26065.	2012-02-28 14:42:56 +00:00
Vishwanath Venkatesan	7c9c3ede61	Modified implementation for the static segmentation read algorithm with improved performance and bug fixes. This commit was SVN r26056.	2012-02-24 20:55:33 +00:00
Vishwanath Venkatesan	d5a9223a9a	Removed a variable which was allocated but never used. This commit was SVN r26055.	2012-02-24 20:48:52 +00:00
Vishwanath Venkatesan	326bc69df4	Modified implementation for static file write all algorithm which fixes all the previous bugs and provides improved performance. This commit was SVN r26054.	2012-02-24 20:47:09 +00:00
Rolf vandeVaart	b0a84b0a7d	New btl that extends sm btl to support GPU transfers within a node. Uses new CUDA IPC support. Also, a few minor changes in PML to take advantage of it. This code has no effect unless user asks for it explicitly via configure arguments. Otherwise, it is either #ifdef'ed out or not compiled. This commit was SVN r26039.	2012-02-24 02:13:33 +00:00
Nathan Hjelm	8217c46666	ompi_free_list: allocate payload if payload size > 0 in the fl_mpool = NULL case This commit was SVN r26027.	2012-02-23 16:47:28 +00:00
Nathan Hjelm	9843cd0466	ugni: missed one more merge typo This commit was SVN r26026.	2012-02-23 16:39:15 +00:00
Nathan Hjelm	d7cd95c802	vader: fix typo This commit was SVN r26025.	2012-02-23 16:29:45 +00:00
Nathan Hjelm	c83fe003a0	ugni: fix branch merge typo This commit was SVN r26024.	2012-02-23 16:16:21 +00:00
Rolf vandeVaart	c7a0ce2755	Two new mpools. They are not used now (and by default, not compiled) but they will be soon. Provide support for GPU buffer transfers within a node. This commit was SVN r26008.	2012-02-22 23:32:36 +00:00
Nathan Hjelm	dc94b6a3fb	vader: minor fast box optimization This commit was SVN r26002.	2012-02-22 20:53:49 +00:00
Nathan Hjelm	4c7b7c675a	vader: minor code cleanup. move xpmem_create to component_init This commit was SVN r25999.	2012-02-22 18:32:40 +00:00
Mike Dubman	81bd5eee8d	in mxm, use sender_len field and not actual_len when returning result from probe This commit was SVN r25993.	2012-02-21 19:55:16 +00:00
Jeff Squyres	b6a90434e4	Fix some include file header ordering issues for some BSDs, suggested by Paul Hargrove. This commit was SVN r25984.	2012-02-21 13:32:14 +00:00
Jeff Squyres	7fb1144d27	Add missing AC_CONFIG_FILES This commit was SVN r25974.	2012-02-20 22:59:03 +00:00
Jeff Squyres	6a109c8e15	Ensure that we have aio.h before trying to compile this component. This commit was SVN r25966.	2012-02-20 15:53:20 +00:00
George Bosilca	bd9402fd8d	Fix call to error_cb. This commit was SVN r25946.	2012-02-17 03:18:51 +00:00
Jeff Squyres	4c0d24ff9a	Improve the help messages for tcp_btl_if_[in|ex]clude This commit was SVN r25940.	2012-02-16 15:59:18 +00:00
Nathan Hjelm	1ee6d5d21a	ugni: fix typos in mca_btl_ugni_put This commit was SVN r25937.	2012-02-15 23:26:03 +00:00
Mike Dubman	6ec768f0c6	fix #2971 This commit was SVN r25908.	2012-02-12 09:28:42 +00:00
Nathan Hjelm	8c9dc990a9	ugni: fixed typo This commit was SVN r25899.	2012-02-10 01:27:31 +00:00
Nathan Hjelm	994127cb6b	ugni: move mca_btl_ugni_smsg_mbox_t back to btl_ugni_endpoint.h This commit was SVN r25898.	2012-02-10 01:11:54 +00:00
Nathan Hjelm	0ccfd3e6db	ugni btl update. changes: - re-enable sendi - move smsg common code into btl_ugni_smsg.h - added new parameters for smsg/eager frags - use get for frags larger than the smsg_limit - bug fixes - code cleanup This commit was SVN r25897.	2012-02-10 00:47:29 +00:00
Mike Dubman	b18a1611c3	- if everything is ok set return value to OMPI_SUCCESS in mtl/mxm This commit was SVN r25879.	2012-02-08 14:19:58 +00:00
Christopher Yeoh	bc26adcc32	Fixes trac:2998 Adds a lock to protect the sm pending_sends list from concurrent access Fixes bug where btl_sm_process_pending_sends would return an item to the free list and then continue to use it for a little while cmr:v1.6 This commit was SVN r25878. The following Trac tickets were found above: Ticket 2998 --> https://svn.open-mpi.org/trac/ompi/ticket/2998	2012-02-08 01:32:36 +00:00
Brian Barrett	25d48e22fa	Implementation of the MPI-3 Matched Probe functionality. Currently only implemented in the OB1 PML, will return NOT_SUPPORTED in other PMLs. This commit was SVN r25865.	2012-02-06 17:35:21 +00:00
Mike Dubman	6188ab7317	* ep init refactoring * split ep_info into fragments to fit PMI limit This commit was SVN r25857.	2012-02-02 15:00:47 +00:00
Vishwanath Venkatesan	15ebe838e9	Modified implementation of two_phase read all similar to the changes for the write all incorporating romio style partitioning. This commit was SVN r25853.	2012-02-01 18:30:13 +00:00
Vishwanath Venkatesan	158374bdd0	Dynamic file write all algorithm optimized by using derived datatype for receiving actual data thereby avoiding the merging step in the fbtl. This commit was SVN r25852.	2012-02-01 18:20:44 +00:00
Vishwanath Venkatesan	b9026ccbd0	Fix for two-phase generating flattened datatype using decoded iovec for handling non-contiguous memory and contiguous file cases. This commit was SVN r25850.	2012-02-01 17:23:51 +00:00
Jeff Squyres	4f35b62154	Components should not be linking to top-level libraries -- and definitely should not be linking to more than libmpi.la! (remember that libmpi.la now wholly contains libopen-rte.la, which wholly contains libopen-pal.la). This commit was SVN r25843.	2012-01-31 20:43:27 +00:00
Mike Dubman	92873872f5	revert r25813 This commit was SVN r25816. The following SVN revision numbers were found above: r25813 --> open-mpi/ompi@8ed781d7e9	2012-01-30 13:22:38 +00:00
Mike Dubman	8ed781d7e9	add mca param to enable/disable mxm This commit was SVN r25813.	2012-01-30 11:14:20 +00:00
Mike Dubman	9f0ca9dfc0	fix: extract source from imm request fields instead of from deprecated api This commit was SVN r25812.	2012-01-30 10:37:37 +00:00
Ralph Castain	a0edae52f2	Ensure the wrapper flags get entered in the right order, with -lpmi coming before the alps util libs This commit was SVN r25809.	2012-01-27 20:56:21 +00:00
Nathan Hjelm	97dad0ac49	ugni: don't release eager fragments until we get local smsg completion This commit was SVN r25796.	2012-01-27 00:32:43 +00:00
Nathan Hjelm	669f0afd14	ugni: poll smsg mailbox until it is empty This commit was SVN r25794.	2012-01-26 20:50:09 +00:00
Nathan Hjelm	2a83297f96	silence vader warnings This commit was SVN r25793.	2012-01-26 20:07:33 +00:00
Mike Dubman	6c954ad43f	set mxm to call opal_progress in tight loops This commit was SVN r25788.	2012-01-26 18:33:43 +00:00
Nathan Hjelm	3d9bc68435	reorder udcm init/finalize code. fixes trac:2973 This commit was SVN r25787. The following Trac tickets were found above: Ticket 2973 --> https://svn.open-mpi.org/trac/ompi/ticket/2973	2012-01-26 16:28:55 +00:00
Shiqing Fan	2c9a4beffd	Add and remove a few components for windows build. This commit was SVN r25775.	2012-01-25 09:01:27 +00:00
Nathan Hjelm	7b9bf6fe41	ugni: remove another erroneous error message This commit was SVN r25768.	2012-01-23 21:23:01 +00:00
Nathan Hjelm	f3b60062cb	ugni: remove erroneous error message This commit was SVN r25767.	2012-01-23 21:05:24 +00:00
Nathan Hjelm	521546aaa3	bug fix: ugni: pack only as many bytes as the pml requested This commit was SVN r25766.	2012-01-23 17:21:45 +00:00
Ralph Castain	f03b82ab0a	Don't fiddle with the port_name memory as, per standard, this is an input-only parameter This commit was SVN r25756.	2012-01-20 13:15:41 +00:00
Ralph Castain	be3dfb6a1a	Ensure that we only add -lpmi once to the wrapper compilers, no matter how many components might use it. This commit was SVN r25753.	2012-01-20 04:56:38 +00:00
Ralph Castain	f5c43e8d60	Get the header files into the tarball This commit was SVN r25746.	2012-01-19 21:02:20 +00:00
Nathan Hjelm	804a494036	zero out ugni fragments in constructor This commit was SVN r25731.	2012-01-17 19:52:26 +00:00
Vishwanath Venkatesan	1e95d8b1e2	replace the MPI functions used in these files with the corresponding OMPI internal functionality and also add error checking for the functions which did not have it. This commit was SVN r25723.	2012-01-13 17:21:51 +00:00
Rolf vandeVaart	16d676aa5b	Fix minor issue with CUDA. Cannot register overlapping regions. This commit was SVN r25716.	2012-01-12 13:00:42 +00:00
Samuel Gutierrez	63869c431b	init seg_num_procs_inited to zero before the atomic add. This commit was SVN r25710.	2012-01-11 03:37:23 +00:00
Nathan Hjelm	96c1df8d90	clean up vader registration code This commit was SVN r25704.	2012-01-10 22:33:22 +00:00
Edgar Gabriel	fb4d1a7099	replace the MPI functions used in this file with the corresponding OMPI internal functionality. This commit was SVN r25703.	2012-01-10 19:55:05 +00:00
Nathan Hjelm	f65f6f5c39	bugfix: ugni: increase smsg mailbox size to a multiple of 4096 This commit was SVN r25702.	2012-01-10 19:50:25 +00:00
Mike Dubman	37dc53bbc9	mxm: return the MXM_REQ_SEND_SYNC flag to mxm_req_send This commit was SVN r25694.	2012-01-06 18:56:28 +00:00
Mike Dubman	3b97d609a8	mtl_mxm: fix double free This commit was SVN r25693.	2012-01-06 16:22:58 +00:00
Samuel Gutierrez	d1a44ecd34	send packed buffers instead of using iovecs in common sm rml. this commit will hopefully resolve the periodic bus errors that some mtt tests have been encountering. This commit was SVN r25692.	2012-01-05 00:11:59 +00:00
Rolf vandeVaart	9441f33981	Improve an error message. Replace tabs with spaces. This commit was SVN r25688.	2012-01-03 15:19:01 +00:00
Rolf vandeVaart	8073f5002a	Some additional CUDA specific code. Adding a few more support functions that will be used in future development. This commit was SVN r25684.	2011-12-29 12:31:54 +00:00
Edgar Gabriel	e0139a2d7e	provide descriptions about the functionality of these frameworks. This commit was SVN r25682.	2011-12-22 19:42:00 +00:00
Vishwanath Venkatesan	0f928be1d5	Modifying selection logic back to select two-phase at the cases it should. This commit was SVN r25681.	2011-12-22 01:01:32 +00:00
Vishwanath Venkatesan	37c8470e3d	modified implementation for two-phase write_all incorporating romio style domain partitioning This commit was SVN r25680.	2011-12-22 00:16:29 +00:00
Vishwanath Venkatesan	738a67b704	Removing duplicate code while setting default file view and using internal file-set-view for setting the default file view This commit was SVN r25679.	2011-12-21 21:50:47 +00:00
Rolf vandeVaart	6ca186fb64	Delay some initialization until needed. This eliminates some warnings and removes need for CUDA init before MPI_Init. This commit was SVN r25678.	2011-12-21 15:21:57 +00:00
Samuel Gutierrez	519f71ab7e	silences valgrind warning in common sm (Syscall param writev(vector[...]) points to uninitialised byte(s)). probably also silences a large stack allocation warning in coverity. This commit was SVN r25666.	2011-12-16 23:17:48 +00:00
Samuel Gutierrez	0ca6603fa0	remove some unused cruft in shmem. minor common sm cleanup. This commit was SVN r25665.	2011-12-16 22:43:55 +00:00
Nathan Hjelm	71527c8058	minor ugni btl code cleanup This commit was SVN r25618.	2011-12-10 08:20:46 +00:00
Nathan Hjelm	c8a4687402	don't set SIGSEGV to default This commit was SVN r25610.	2011-12-09 21:54:05 +00:00
Nathan Hjelm	e03d23d96e	Initial support for Cray's uGNI interface (XE-6/XK-6) This commit was SVN r25608.	2011-12-09 21:24:07 +00:00
Nathan Hjelm	87b7e85d53	rfc timeout. retry registration after removing old registration from lru This commit was SVN r25587.	2011-12-07 18:20:44 +00:00
Josh Hursey	e56b4de2c9	Fixes trac:2550 : Cleanup comment in crcp_bkmrk_pml.h This commit was SVN r25585. The following Trac tickets were found above: Ticket 2550 --> https://svn.open-mpi.org/trac/ompi/ticket/2550	2011-12-07 14:50:04 +00:00
Jeff Squyres	c10f41c87e	Do not build these frameworks when --disable-mpi-io is specified. Fixes some Cisco MTT MPI install errors. This commit was SVN r25566.	2011-12-02 22:11:23 +00:00
Ralph Castain	07655e2945	Handle the case where the allocator "fibs" to us about the node names. In some cases (ahem...you know who you are!), the allocator will tell us a node number (e.g., "16"). However, the daemon will return a node name (e.g., "nid0016") - leaving us not recognizing its location. So provide a new parameter (can't have too many!) that handles this situation by stripping the prefix from the returned node name. Also do a little cleanup to ensure we cleanly exit from errors, without generating too many annoying messages. This commit was SVN r25562.	2011-12-02 14:10:08 +00:00
Ralph Castain	357ac14530	Can't return a numerical value here This commit was SVN r25559.	2011-12-02 10:36:57 +00:00
Nathan Hjelm	bb1fec0407	added put/get btl descriptor flags This commit was SVN r25553.	2011-11-30 21:37:23 +00:00
Ralph Castain	c56acf60ca	Although we never really thought about it, we made an unconscious assumption in the mapper system - we assumed that the daemons would be placed on nodes in the order that the nodes appear in the allocation. In other words, we assumed that the launch environment would map processes in node order. Turns out, this isn't necessarily true. The Cray, for example, launches processes in a toroidal pattern, thus causing the daemons to wind up somewhere other than what we thought. Other environments (e.g., slurm) are also capable of such behavior, depending upon the default mapping algorithm they are told to use. Resolve this problem by making the daemon-to-node assignment in the affected environments when the daemon calls back and tells us what node it is on. Order the nodes in the mapping list so they are in daemon-vpid order as opposed to the order in which they show in the allocation. For environments that don't exhibit this mapping behavior (e.g., rsh), this won't have any impact. Also, clean up the vm launch procedure a little bit so it more closely aligns with the state machine implementation that is coming, and remove some lingering "slave" code. This commit was SVN r25551.	2011-11-30 19:58:24 +00:00
Jeff Squyres	6fbbfd0f7a	Gah! r25545 accidentally included ''waaaay'' more stuff than it was supposed to. I.e., half-baked/not complete stuff. This commit backs out all of r25545. Sorry folks! This commit was SVN r25546. The following SVN revision numbers were found above: r25545 --> open-mpi/ompi@7f9ae11faf	2011-11-29 23:24:52 +00:00
Jeff Squyres	7f9ae11faf	Per http://www.open-mpi.org/community/lists/users/2011/11/17862.php , to make MPI_IN_PLACE (and other sentinel Fortran constants) work on OS X, we need to use the following compiler (linker) flag: -Wl,-commons,use_dylibs So if we're compiling on OS X, test to see if that flag works with the compiler. If so, add it to the wrapper FFLAGS and FCFLAGS (note that per a future update, we'll only have one Fortran compiler anyway). Fixes trac:1982. This commit was SVN r25545. The following Trac tickets were found above: Ticket 1982 --> https://svn.open-mpi.org/trac/ompi/ticket/1982	2011-11-29 23:05:54 +00:00
Terry Dontje	5209de048c	add code to service_thread_start to handle EBADF returns from select. This commit fixes trac:2922. This commit was SVN r25520. The following Trac tickets were found above: Ticket 2922 --> https://svn.open-mpi.org/trac/ompi/ticket/2922	2011-11-29 16:49:59 +00:00
Samuel Gutierrez	375162c693	this commit fixes a few things. 1. silence warning in common sm. 2. remove unneeded config code in common sm. 3. move opal_shmem_base_close to a better place in opal_finalize. 4. fix opal_path_nfs output. This commit was SVN r25518.	2011-11-28 23:41:19 +00:00
George Bosilca	0bd2bf9aae	The number of segments accepted should be bounded by MCA_BTL_DES_MAX_SEGMENTS and not by 2. This commit was SVN r25515.	2011-11-28 17:19:12 +00:00
Nathan Hjelm	f8c8c641f1	added asserts to warn developers that ob1/csum match fragments do not support more than 2 segments This commit was SVN r25514.	2011-11-28 16:12:25 +00:00
Samuel Gutierrez	b4edf0ff5c	getting ready for 1.5 port of the shared memory enhancements. remove some unused/unneeded stuff and minor style update. This commit was SVN r25513.	2011-11-28 16:08:32 +00:00
Ralph Castain	9b59d8de6f	This is actually a much smaller commit than it appears at first glance - it just touches a lot of files. The --without-rte-support configuration option has never really been implemented completely. The option caused various objects not to be defined and conditionally compiled some base functions, but did nothing to prevent build of the component libraries. Unfortunately, since many of those components use objects covered by the option, it caused builds to break if those components were allowed to build. Brian dealt with this in the past by creating platform files and using "no-build" to block the components. This was clunky, but acceptable when only one organization was using that option. However, that number has now expanded to at least two more locations. Accordingly, make --without-rte-support actually work by adding appropriate configury to prevent components from building when they shouldn't. While doing so, remove two frameworks (db and rmcast) that are no longer used as ORCM comes to a close (besides, they belonged in ORCM now anyway). Do some minor cleanups along the way. This commit was SVN r25497.	2011-11-22 21:24:35 +00:00
Ralph Castain	6310361532	At long last, the fabled revision to the affinity system has arrived. A more detailed explanation of how this all works will be presented here: https://svn.open-mpi.org/trac/ompi/wiki/ProcessPlacement The wiki page is incomplete at the moment, but I hope to complete it over the next few days. I will provide updates on the devel list. As the wiki page states, the default and most commonly used options remain unchanged (except as noted below). New, esoteric and complex options have been added, but unless you are a true masochist, you are unlikely to use many of them beyond perhaps an initial curiosity-motivated experimentation. In a nutshell, this commit revamps the map/rank/bind procedure to take into account topology info on the compute nodes. I have, for the most part, preserved the default behaviors, with three notable exceptions: 1. I have at long last bowed my head in submission to the system admin's of managed clusters. For years, they have complained about our default of allowing users to oversubscribe nodes - i.e., to run more processes on a node than allocated slots. Accordingly, I have modified the default behavior: if you are running off of hostfile/dash-host allocated nodes, then the default is to allow oversubscription. If you are running off of RM-allocated nodes, then the default is to NOT allow oversubscription. Flags to override these behaviors are provided, so this only affects the default behavior. 2. both cpus/rank and stride have been removed. The latter was demanded by those who didn't understand the purpose behind it - and I agreed as the users who requested it are no longer using it. The former was removed temporarily pending implementation. 3. vm launch is now the sole method for starting OMPI. It was just too darned hard to maintain multiple launch procedures - maybe someday, provided someone can demonstrate a reason to do so. As Jeff stated, it is impossible to fully test a change of this size. 
I have tested it on Linux and Mac, covering all the default and simple options, singletons, and comm_spawn. That said, I'm sure others will find problems, so I'll be watching MTT results until this stabilizes. This commit was SVN r25476.	2011-11-15 03:40:11 +00:00
Jeff Squyres	e8dcad6017	This typo has been here since August 2005. :-) This commit was SVN r25468.	2011-11-11 03:01:52 +00:00


3940 commits