openmpi

Author	SHA1	Message	Date
Jeff Squyres	5ec6a65a72	After I spent a while looking in libibverbs for ibv_get_device_list_compat() and not finding it, I finally realized that it was a function in OMPI. So let's name it with a proper ompi_ prefix, not an ibv_ prefix. This commit was SVN r26867.	2012-07-25 16:32:51 +00:00
Nathan Hjelm	cd2cbdca09	btl/openib: limit each process to a ppn fraction of the available registered memory when using mellanox hardware (mlx4 and mthca). fixed This commit was SVN r26811.	2012-07-19 17:52:21 +00:00
Ralph Castain	66fe57f746	Revert r26804 so openib can build again This commit was SVN r26810. The following SVN revision numbers were found above: r26804 --> open-mpi/ompi@610be870f9	2012-07-19 16:16:38 +00:00
Nathan Hjelm	610be870f9	btl/openib: limit each process to a ppn fraction of the available registered memory when using mellanox hardware (mlx4 and mthca) This commit was SVN r26804.	2012-07-18 17:29:48 +00:00
Nathan Hjelm	4a97ecbdd2	btl/openib: remove tab characters This commit was SVN r26803.	2012-07-18 17:29:37 +00:00
Nathan Hjelm	771b427027	udcm: unmonitor the fd BEFORE tearing down the listen qp This commit was SVN r26800.	2012-07-18 14:22:45 +00:00
Nathan Hjelm	fc1b295606	udcm: evict from the lru of the openib device's grdma mpool if a qp can not be created. Note: there doesn't appear to be a standard way to differentiate between ibv_create_qp failing because the node is out of registered memory and failing because no more qps are available This commit was SVN r26797.	2012-07-14 01:58:29 +00:00
Nathan Hjelm	3798f38386	do not print out an error message if ibv_reg_mr fails This commit was SVN r26796.	2012-07-14 01:35:45 +00:00
Nathan Hjelm	4d1920ee87	Fix a bug on 32-bit systems introduced by r26626. This fix ensures that all supported btls (with exception of wv-- shiqing will need to help bring that one up to date with r26626) set the lval in prepare_src/dst when preparing a put or get segment. This fix also ensures a consistent use of lval in put and get for both local and remote segments. This commit was SVN r26793. The following SVN revision numbers were found above: r26626 --> open-mpi/ompi@249066e06d	2012-07-13 21:19:16 +00:00
Nathan Hjelm	344fe61616	remove assertion in udcm This commit was SVN r26790.	2012-07-13 15:14:48 +00:00
Terry Dontje	6f3195faca	add some missing casts This commit was SVN r26779.	2012-07-10 18:03:29 +00:00
Nathan Hjelm	05c5c1f412	remove unused i_initiate function from udcm This commit was SVN r26778.	2012-07-10 17:22:19 +00:00
Jeff Squyres	bb13e21538	Roll back r26730, but bump the default CQ length base up to 1500, not 1000. Refs trac:3154. IB/iWarp vendors need to get together to figure out a real fix. This commit was SVN r26777. The following SVN revision numbers were found above: r26730 --> open-mpi/ompi@5315c91baf The following Trac tickets were found above: Ticket 3154 --> https://svn.open-mpi.org/trac/ompi/ticket/3154	2012-07-10 16:53:27 +00:00
Terry Dontje	95a3b4a423	corrected the change of pval to lval introduced in r26626 This commit was SVN r26732. The following SVN revision numbers were found above: r26626 --> open-mpi/ompi@249066e06d	2012-07-03 18:52:18 +00:00
Jeff Squyres	5315c91baf	Fixes trac:3152: slightly more advanced than the patch on the ticket: * If the MCA param btl_openib_cq_size is set to 0 (which is the default), use the device CQ max size. Otherwise, use the MCA param value (and never adjust it again). * Remove the CQ size adjustment code. Since we default to max CQ size, there really isn't much point in having it any more. I think people setting an absolute CQ size is going to be rare, so let's not do anything fancy with it. * If the MCA param value is larger than what the device supports, print a warning (only once per process) and default to using the device max * Add a BTL_VERBOSE displaying which CQ size we used This commit was SVN r26730. The following Trac tickets were found above: Ticket 3152 --> https://svn.open-mpi.org/trac/ompi/ticket/3152	2012-07-03 16:49:59 +00:00
Nathan Hjelm	9f3717959e	remove sync step from udcm as it really isn't necessary This commit was SVN r26724.	2012-07-02 22:54:44 +00:00
Pavel Shamis	f7664b3814	1. Adding 2 new components: ofacm - generic connection manager for IB interconnects. ofautils - IB common utilities and compatibility code 2. Updating OpenIB configure code - ORNL & Mellanox Teams This commit was SVN r26707.	2012-07-02 15:20:12 +00:00
Jeff Squyres	b936229b54	Refs trac:3130: fix the openib BTL to properly set the memalign malloc hook early in the setup, but ''not'' during the component register function. And then properly unset it if was set. This commit was SVN r26697. The following Trac tickets were found above: Ticket 3130 --> https://svn.open-mpi.org/trac/ompi/ticket/3130	2012-06-29 13:51:36 +00:00
Josh Hursey	28681deffa	Backout the ORCA commit. :( There is a linking issue on Mac OSX that needs to be addressed before this is able to come back into the trunk. This commit was SVN r26676.	2012-06-27 01:28:28 +00:00
Josh Hursey	542330e3a7	Commit of ORCA: Open MPI Runtime Collaborative Abstraction This is a runtime interposition project that sits between the OMPI and ORTE layers in Open MPI. The project is described on the wiki: https://svn.open-mpi.org/trac/ompi/wiki/Runtime_Interposition And on this email thread: http://www.open-mpi.org/community/lists/devel/2012/06/11109.php This commit was SVN r26670.	2012-06-26 21:42:16 +00:00
Nathan Hjelm	37c624ee43	prepare to delete mpool/rdma This commit was SVN r26664.	2012-06-26 15:55:23 +00:00
Ralph Castain	e6f3586415	Remove the orte notifier framework, per discussion at the devel meeting and follow-up with Jeff (who took the action item) This commit was SVN r26637.	2012-06-22 18:09:23 +00:00
Nathan Hjelm	249066e06d	Timeout! Per RFC update the BTL interface to hide segment keys. All BTLs (with the exception of wv), all relevant PMLs, and osc/rdma have been updated for the new interface. This commit was SVN r26626.	2012-06-21 17:09:12 +00:00
Yevgeny Kliteynik	df783c0472	Precise speed of FDR and EDR This commit was SVN r26614.	2012-06-17 07:06:37 +00:00
Yevgeny Kliteynik	d59b8d5dc4	Fixing malformed error message This commit was SVN r26434.	2012-05-12 21:13:42 +00:00
Mike Dubman	98c2c749fb	fix define name to BTL_OPENIB_MALLOC_HOOKS_ENABLED Thanks to Ludovic.Hablot@ext.bull.net for pointing this out This commit was SVN r26432.	2012-05-11 18:30:45 +00:00
Yevgeny Kliteynik	244d66d95b	Fixed FDR link speed details, added EDR. This commit was SVN r26423.	2012-05-10 13:44:18 +00:00
Jeff Squyres	de4bbacd13	It turns out that we can't always include the hwloc OpenFabrics verbs helper file, even if we find that the system has <infiniband/verbs.h>. The reason is because there are some inline functions in that verbs helper file that invoke ibv_* functions. Some linkers (e.g., Solaris Studio Compilers) will instantiate those static inline functions -- even if we don't use them -- and therefore we need to be able to resolve the ibv_* symbols at link time. But since -libverbs is only specified in places where we use other ibv_* functions (e.g., the OpenFabrics-based BTLs), that means that linking random executables can/will fail (e.g., orterun). So instead, introduce a new #define: OPAL_HWLOC_WANT_VERBS_HELPER. If this macro is set to 1 before including opal/mca/hwloc/hwloc.h, then you'll also get the hwloc OpenFabrics verbs helper header file (if hwloc found <infiniband/verbs.h> -- otherwise, it'll #error). This commit was SVN r26417.	2012-05-09 20:18:31 +00:00
Mike Dubman	cd17fee9a8	performance fix: openib use memalign for malloc This commit was SVN r26409.	2012-05-08 20:42:09 +00:00
Jeff Squyres	2ba10c37fe	Per RFC, bring in the following changes: * Remove paffinity, maffinity, and carto frameworks -- they've been wholly replaced by hwloc. * Move ompi_mpi_init() affinity-setting/checking code down to ORTE. * Update sm, smcuda, wv, and openib components to no longer use carto. Instead, use hwloc data. There are still optimizations possible in the sm/smcuda BTLs (i.e., making multiple mpools). Also, the old carto-based code found out how many NUMA nodes were ''available'' -- not how many were used ''in this job''. The new hwloc-using code computes the same value -- it was not updated to calculate how many NUMA nodes are used ''by this job.'' * Note that I cannot compile the smcuda and wv BTLs -- I ''think'' they're right, but they need to be verified by their owners. * The openib component now does a bunch of stuff to figure out where "near" OpenFabrics devices are. '''THIS IS A CHANGE IN DEFAULT BEHAVIOR!!''' and still needs to be verified by OpenFabrics vendors (I do not have a NUMA machine with an OpenFabrics device that is a non-uniform distance from multiple different NUMA nodes). * Completely rewrite the OMPI_Affinity_str() routine from the "affinity" mpiext extension. This extension now understands hyperthreads; the output format of it has changed a bit to reflect this new information. * Bunches of minor changes around the code base to update names/types from maffinity/paffinity-based names to hwloc-based names. * Add some helper functions into the hwloc base, mainly having to do with the fact that we have the hwloc data reporting ''all'' topology information, but sometimes you really only want the (online \| available) data. This commit was SVN r26391.	2012-05-07 14:52:54 +00:00
Mike Dubman	1b475523de	add support for FDR speed This commit was SVN r26385.	2012-05-06 05:53:05 +00:00
Terry Dontje	81d7fcaf82	back out r26255 to avoid cross component linkage so Solaris can build a usable openib btl This commit was SVN r26269. The following SVN revision numbers were found above: r26255 --> open-mpi/ompi@fe25b8704b	2012-04-13 18:08:54 +00:00
Mike Dubman	fe25b8704b	performance fix: set alignment for openib internal buffers Thanks to Jeff/Pasha for valuable comments Thanks to Valentin Petrov for implementation This commit was SVN r26255.	2012-04-09 08:06:15 +00:00
Ralph Castain	bd8b4f7f1e	Sorry for mid-day commit, but I had promised on the call to do this upon my return. Roll in the ORTE state machine. Remove last traces of opal_sos. Remove UTK epoch code. Please see the various emails about the state machine change for details. I'll send something out later with more info on the new arch. This commit was SVN r26242.	2012-04-06 14:23:13 +00:00
Mike Dubman	ff1c84c53f	revert previous commit This commit was SVN r26206.	2012-03-29 14:07:13 +00:00
Mike Dubman	43a5775e8a	performance fix: set alignment for openib internal buffers This commit was SVN r26205.	2012-03-29 14:00:08 +00:00
Pavel Shamis	102da281c4	OPENIB BTL - use orte_show_help instead of BTL_ERROR print in case ibv_reg_mr failed. This commit was SVN r26111.	2012-03-08 09:04:03 +00:00
Mike Dubman	4e7e7d7c3f	print error which is ignored on upper layer This commit was SVN r26106.	2012-03-06 14:25:56 +00:00
Abhishek Kulkarni	08ca0f80bc	Fix a C/R bug where the restart hung due to dangling fds in the openib btl. This commit was SVN r26094.	2012-03-04 06:57:33 +00:00
Terry Dontje	3e70cad203	Correct a few alignment problems to address the issue brought up in ticket #2964 This commit was SVN r26078.	2012-03-01 17:29:40 +00:00
Nathan Hjelm	3d9bc68435	reorder udcm init/finalize code. fixes trac:2973 This commit was SVN r25787. The following Trac tickets were found above: Ticket 2973 --> https://svn.open-mpi.org/trac/ompi/ticket/2973	2012-01-26 16:28:55 +00:00
Terry Dontje	5209de048c	add code to service_thread_start to handle EBADF returns from select. This commit fixes trac:2922. This commit was SVN r25520. The following Trac tickets were found above: Ticket 2922 --> https://svn.open-mpi.org/trac/ompi/ticket/2922	2011-11-29 16:49:59 +00:00
Brad Benton	96395c916e	de-tab'd This commit was SVN r25465.	2011-11-09 19:45:12 +00:00
Brad Benton	0712b911a5	Updated IBM copyright This commit was SVN r25464.	2011-11-09 19:38:53 +00:00
Christopher Yeoh	fb57a74a40	Removes pointless memmove which because of a previous memcpy will always have identical source and destination pointers. See #2871 Plugs a couple of minor memory leaks related to remote qp info This commit was SVN r25431.	2011-11-04 00:15:08 +00:00
George Bosilca	9d8e84142f	Survivor!!! This commit was SVN r25371.	2011-10-26 00:58:55 +00:00
Ralph Castain	e7f6be5385	Unused variable This commit was SVN r25301.	2011-10-17 18:59:22 +00:00
Jeff Squyres	2c6254b70d	Second change from Intel. This commit was SVN r25279.	2011-10-12 23:26:34 +00:00
Jeff Squyres	28118d0611	Update the parameters for the Intel iWARP devices, per request from Faisal Latif <faisal.latif@intel.com>. This commit was SVN r25278.	2011-10-12 22:58:30 +00:00
Rainer Keller	4e6a6fc146	- Check, whether the compiler supports __builtin_clz (count leading zeroes); if so, use it for bit-operations like opal_cube_dim and opal_hibit. Implement two versions of power-of-two. In case of opal_next_poweroftwo, this reduces the average execution time from 83 cycles to 4 cycles (Intel Nehalem, icc, -O2, inlining, measured rdtsc, with loop over 2^27 values). Numbers for other functions are similar (but of course heavily depend on the usage, e.g. opal_hibit() with a start of 4 does not save much). The bsr instruction on AMD Opteron is also not as fast. - Replace various places where the next power-of-two is computed. Tested on Intel Nehalem Cluster with openib, compilers GNU-4.6.1 and Intel-12.0.4 using mpi_testsuite -t "Collective" with 128 processes. This commit was SVN r25270.	2011-10-11 22:49:01 +00:00
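
The r25270 entry above describes using __builtin_clz (count leading zeroes) to compute a power of two in place of a shift loop. A minimal sketch of that idea follows. The function names are hypothetical and this is not the OPAL code; in particular, opal_next_poweroftwo may round differently than the "smallest power of two >= value" convention assumed here. The compiler check is approximated with a preprocessor test rather than a configure test.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical fallback: shift left until the result reaches `value`.
 * Returns 0 when the answer does not fit in an unsigned int. */
static unsigned next_pow2_loop(unsigned value)
{
    const unsigned top = 1u << (sizeof(unsigned) * CHAR_BIT - 1);
    unsigned p = 1;

    if (value > top) {
        return 0; /* result would overflow */
    }
    while (p < value) {
        p <<= 1;
    }
    return p;
}

#if defined(__GNUC__) || defined(__clang__)
/* Hypothetical fast path: the number of leading zero bits of (value - 1)
 * gives the position of the next power of two directly. */
static unsigned next_pow2_clz(unsigned value)
{
    const unsigned bits = sizeof(unsigned) * CHAR_BIT;
    unsigned lz;

    if (value <= 1) {
        return 1; /* __builtin_clz(0) is undefined, so handle small inputs first */
    }
    lz = (unsigned)__builtin_clz(value - 1);
    if (lz == 0) {
        return 0; /* result would overflow */
    }
    return 1u << (bits - lz);
}
#else
/* Without the builtin, fall back to the loop version. */
static unsigned next_pow2_clz(unsigned value)
{
    return next_pow2_loop(value);
}
#endif

/* Cross-check the two versions on a small range of inputs. */
static void next_pow2_selfcheck(void)
{
    unsigned v;

    for (v = 0; v < 100000u; v++) {
        assert(next_pow2_clz(v) == next_pow2_loop(v));
    }
}
```

Calling next_pow2_selfcheck() once at startup confirms that both versions agree on the tested range. The cycle counts quoted in the commit message apply to the original code and compiler, not to this sketch.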


953 commits