openmpi

Автор	SHA1	Сообщение	Дата
Ralph Castain	554da83865	Set the locality for remote procs even after a comm_spawn. Ensure we store our own local cpuset upon launch so it will be shared during comm_join. This provides full locality - i.e., not just node-level, but all the way down to whatever common binding level exists between the procs. cmr=v1.7.5:reviewer=jsquyres This commit was SVN r31106.	2014-03-18 14:51:07 +00:00
Ralph Castain	9c66c4f439	Correctly implement --disable-oshmem and --without-orte so we don't build the disabled section of code. Fix a bunch of code rot in the PMI rte component, and add several missing headers when building --without-orte. NOTE: I transferred the oshmem-disabled-by-default from the 1.7 branch to the trunk to minimize future disruption if/when we change that option. cmr=v1.8:reviewer=jsquyres This commit was SVN r31006.	2014-03-11 22:02:40 +00:00
George Bosilca	b324884375	This might explain the current difficulties with the mapping... This commit was SVN r30047.	2013-12-21 23:26:13 +00:00
Ralph Castain	0995a6f3b9	Revert r29917 and replace it with a fix that resolves the thread deadlock while retaining the desired debug info. In an earlier commit, we had changed the modex accordingly: * automatically retrieve the hostname (and all RTE info) for all procs during MPI_Init if nprocs < cutoff * if nprocs > cutoff, retrieve the hostname (and all RTE info) for a proc upon the first call to modex_recv for that proc. This would provide the hostname for debugging purposes as we only report errors on messages, and so we must have called modex_recv to get the endpoint info * BTLs are not to call modex_recv until they need the endpoint info for first message - i.e., not during add_procs so we don't call it for every process in the job, but only those with whom we communicate My understanding is that only some BTLs have been modified to meet that third requirement, but those include the Cray ones where jobs are big enough that launch times were becoming an issue. Other BTLs would hopefully be modified as time went on and interest in using them at scale arose. Meantime, those BTLs would call modex_recv on every proc, and we would therefore be no worse than the prior behavior. This commit revises the MPI-RTE interface to pass the ompi_proc_t instead of the ompi_process_name_t for the proc so that the hostname can be easily inserted. I have advised the ORNL folks of the change. cmr=v1.7.4:reviewer=jsquyres:subject=Fix thread deadlock This commit was SVN r29931. The following SVN revision numbers were found above: r29917 --> open-mpi/ompi@1a972e2c9d	2013-12-17 03:26:00 +00:00
Nathan Hjelm	5bf6555604	Fix locality when in the case where the OMPI_RTE_HOST_ID is not found. cmr=v1.7.4:ticket=3847 This commit was SVN r29475. The following Trac tickets were found above: Ticket 3847 --> https://svn.open-mpi.org/trac/ompi/ticket/3847	2013-10-22 19:07:03 +00:00
Ralph Castain	24c811805f	************************************************************** This change contains a non-mandatory modification of the MPI-RTE interface. Anyone wishing to support coprocessors such as the Xeon Phi may wish to add the required definition and underlying support ************************************************************** Add locality support for coprocessors such as the Intel Xeon Phi. Detecting that we are on a coprocessor inside of a host node isn't straightforward. There are no good "hooks" provided for programmatically detecting that "we are on a coprocessor running its own OS", and the ORTE daemon just thinks it is on another node. However, in order to properly use the Phi's public interface for MPI transport, it is necessary that the daemon detect that it is colocated with procs on the host. So we have to split the locality to separately record "on the same host" vs "on the same board". We already have the board-level locality flag, but not quite enough flexibility to handle this use-case. Thus, do the following: 1. add OPAL_PROC_ON_HOST flag to indicate we share a host, but not necessarily the same board 2. modify OPAL_PROC_ON_NODE to indicate we share both a host AND the same board. Note that we have to modify the OPAL_PROC_ON_LOCAL_NODE macro to explicitly check both conditions 3. add support in opal/mca/hwloc/base/hwloc_base_util.c for the host to check for coprocessors, and for daemons to check to see if they are on a coprocessor. The former is done via hwloc, but support for the latter is not yet provided by hwloc. So the code for detecting we are on a coprocessor currently is Xeon Phi specific - hopefully, we will find more generic methods in the future. 4. modify the orted and the hnp startup so they check for coprocessors and to see if they are on a coprocessor, and have the orteds pass that info back in their callback message. Automatically detect that coprocessors have been found and identify which coprocessors are on which hosts. Note that this algo isn't scalable at the moment - this will hopefully be improved over time. 5. modify the ompi proc locality detection function to look for coprocessor host info IF the OMPI_RTE_HOST_ID database key has been defined. RTE's that choose not to provide this support do not have to do anything - the associated code will simply be ignored. 6. include some cleanup of the hwloc open/close code so it conforms to how we did things in other frameworks (e.g., having a single "frame" file instead of open/close). Also, fix the locality flags - e.g., being on the same node means you must also be on the same cluster/cu, so ensure those flags are also set. cmr:v1.7.4:reviewer=hjelmn This commit was SVN r29435.	2013-10-14 16:52:58 +00:00
Jeff Squyres	886e2cbf0f	Remove and eliminate this extra redundant phrase. This commit was SVN r29381.	2013-10-04 22:12:04 +00:00
Ralph Castain	5bd0b42356	Store the proc's locality on the ompi_proc_t Refs trac:3766 This commit was SVN r29325. The following Trac tickets were found above: Ticket 3766 --> https://svn.open-mpi.org/trac/ompi/ticket/3766	2013-10-02 01:36:52 +00:00
George Bosilca	d0ad20aacb	Don't use ORTE specifics in the OMPI layer. Instead use the RTE equivalents. Patch submitted by Geoffroy Vallee. This commit was SVN r29303.	2013-09-30 23:31:04 +00:00
Ralph Castain	d565a76814	Do some cleanup of the way we handle modex data. Identify data that needs to be shared with peers in my job vs data that needs to be shared with non-peers - no point in sharing extra data. When we share data with some process(es) from another job, we cannot know in advance what info they have or lack, so we have to share everything just in case. This limits the optimization we can do for things like comm_spawn. Create a new required key in the OMPI layer for retrieving a "node id" from the database. ALL RTE'S MUST DEFINE THIS KEY. This allows us to compute locality in the MPI layer, which is necessary when we do things like intercomm_create. cmr:v1.7.4:reviewer=rhc:subject=Cleanup handling of modex data This commit was SVN r29274.	2013-09-27 00:37:49 +00:00
Ralph Castain	dee8336f68	Do not use modex recv to fetch the locality as this will automatically force retrieval of hostnames, which we are trying to avoid. Instead, use the database API to fetch that info. cmr:v1.7.3:reviewer=hjelmn This commit was SVN r29248.	2013-09-25 21:36:25 +00:00
George Bosilca	85db48df0e	Identification, tab vs. space. This commit was SVN r29206.	2013-09-18 16:45:00 +00:00
Ralph Castain	f051500166	Sadly, there is no RTE-agnostic way to prune the modex entries, so we must send them all. Refs trac:3766 This commit was SVN r29204. The following Trac tickets were found above: Ticket 3766 --> https://svn.open-mpi.org/trac/ompi/ticket/3766	2013-09-18 14:09:23 +00:00
Ralph Castain	865a7028f8	Per patch from George, with a few minor cleanups. Correctly address the complete exchange of required wireup information in Intercomm_create so all procs in the resulting communicator know how to talk to each other. Refs trac:29166 This commit was SVN r29200. The following Trac tickets were found above: Ticket 29166 --> https://svn.open-mpi.org/trac/ompi/ticket/29166	2013-09-18 02:01:30 +00:00
Brian Barrett	16a1166884	Remove the proc_pml and proc_bml fields from ompi_proc_t and replace with a configure-time dynamic allocation of flags. The net result for platforms which only support BTL-based communication is a reduction of 8*nprocs bytes per process. Platforms which support both MTLs and BTLs will not see a space reduction, but will now be able to safely run both the MTL and BTL side-by-side, which will prove useful. This commit was SVN r29100.	2013-08-30 16:54:55 +00:00
George Bosilca	cf09fe7c99	It wasn't even compiling when heterogeneous support was on. This commit was SVN r29067.	2013-08-27 16:53:33 +00:00
Ralph Castain	16c5b30a1f	Since the calls to "PMI get" scale by number of procs (not nodes), it makes more sense to have the MCA param be the cutoff based on number of procs. Also, it occurred to me that this shouldn't impact the nidmap process as that is built and circulated when we launch via mpirun, not during direct launch. So shift the cutoff param to the MPI layer, and have it solely determine whether or not we call modex_recv on the hostname. If comm_world is of size greater than the cutoff, then we don't automatically retrieve the hostname when we build the ompi_proc_t for a process - instead, we fill the hostname entry on first call to modex_recv for that process. The param is now "ompi_hostname_cutoff=N", where N=number of procs for cutoff. Refs trac:3729 This commit was SVN r29056. The following Trac tickets were found above: Ticket 3729 --> https://svn.open-mpi.org/trac/ompi/ticket/3729	2013-08-22 03:40:26 +00:00
Ralph Castain	45e695928f	As per the email discussion, revise the sparse handling of hostnames so that we avoid potential infinite loops while allowing large-scale users to improve their startup time: * add a new MCA param orte_hostname_cutoff to specify the number of nodes at which we stop including hostnames. This defaults to INT_MAX => always include hostnames. If a value is given, then we will include hostnames for any allocation smaller than the given limit. * remove ompi_proc_get_hostname. Replace all occurrences with a direct link to ompi_proc_t's proc_hostname, protected by appropriate "if NULL" * modify the OMPI-ORTE integration component so that any call to modex_recv automatically loads the ompi_proc_t->proc_hostname field as well as returning the requested info. Thus, any process whose modex info you retrieve will automatically receive the hostname. Note that on-demand retrieval is still enabled - i.e., if we are running under direct launch with PMI, the hostname will be fetched upon first call to modex_recv, and then the ompi_proc_t->proc_hostname field will be loaded * removed a stale MCA param "mpi_keep_peer_hostnames" that was no longer used anywhere in the code base * added an envar lookup in ess/pmi for the number of nodes in the allocation. Sadly, PMI itself doesn't provide that info, so we have to get it a different way. Currently, we support PBS-based systems and SLURM - for any other, rank0 will emit a warning and we assume max number of daemons so we will always retain hostnames This commit was SVN r29052.	2013-08-20 18:59:36 +00:00
Ralph Castain	611d7f9f6b	When we direct launch an application, we rely on PMI for wireup support. In doing so, we lose the de facto data compression we get from the ORTE modex since we no longer get all the wireup info from every proc in a single blob. Instead, we have to iterate over all the procs, calling PMI_KVS_get for every value we require. This creates a really bad scaling behavior. Users have found a nearly 20% launch time differential between mpirun and PMI, with PMI being the slower method. Some of the problem is attributable to poor exchange algorithms in RM's like Slurm and Alps, but we make things worse by calling "get" so many times. Nathan (with a tad advice from me) has attempted to alleviate this problem by reducing the number of "get" calls. This required the following changes: * upon first request for data, have the OPAL db pmi component fetch and decode all the info from a given remote proc. It turned out we weren't caching the info, so we would continually request it and only decode the piece we needed for the immediate request. We now decode all the info and push it into the db hash component for local storage - and then all subsequent retrievals are fulfilled locally * reduced the amount of data by eliminating the exchange of the OMPI_ARCH value if heterogeneity is not enabled. This was used solely as a check so we would error out if the system wasn't actually homogeneous, which was fine when we thought there was no cost in doing the check. Unfortunately, at large scale and with direct launch, there is a non-zero cost of making this test. We are open to finding a compromise (perhaps turning the test off if requested?), if people feel strongly about performing the test * reduced the amount of RTE data being automatically fetched, and fetched the rest only upon request. In particular, we no longer immediately fetch the hostname (which is only used for error reporting), but instead get it when needed. Likewise for the RML uri as that info is only required for some (not all) environments. In addition, we no longer fetch the locality unless required, relying instead on the PMI clique info to tell us who is on our local node (if additional info is required, the fetch is performed when a modex_recv is issued). Again, all this only impacts direct launch - all the info is provided when launched via mpirun as there is no added cost to getting it Barring objections, we may move this (plus any required other pieces) to the 1.7 branch once it soaks for an appropriate time. This commit was SVN r29040.	2013-08-17 00:49:18 +00:00
Brian Barrett	312f37706e	In talking about this with Jeff and Ralph, we don't actually need ompi_show_help, because opal_show_help is replaced with an aggregating version when using ORTE, so there's no reason to directly call orte_show_help. This commit was SVN r28051.	2013-02-12 21:10:11 +00:00
Brian Barrett	f42783ae1a	Move the RTE framework change into the trunk. With this change, all non-CR runtime code goes through one of the rte, dpm, or pubsub frameworks. This commit was SVN r27934.	2013-01-27 23:25:10 +00:00
Shiqing Fan	4bbddf6875	Fix another typecast This commit was SVN r26817.	2012-07-20 12:11:47 +00:00
Ralph Castain	0dfe29b1a6	Roll in the rest of the modex change. Eliminate all non-modex API access of RTE info from the MPI layer - in some cases, the info was already present (either in the ompi_proc_t or in the orte_process_info struct) and no call was necessary. This removes all calls to orte_ess from the MPI layer. Calls to orte_grpcomm remain required. Update all the orte ess components to remove their associated APIs for retrieving proc data. Update the grpcomm API to reflect transfer of set/get modex info to the db framework. Note that this doesn't recreate the old GPR. This is strictly a local db storage that may (at some point) obtain any missing data from the local daemon as part of an async methodology. The framework allows us to experiment with such methods without perturbing the default one. This commit was SVN r26678.	2012-06-27 14:53:55 +00:00
Josh Hursey	28681deffa	Backout the ORCA commit. :( There is a linking issue on Mac OSX that needs to be addressed before this is able to come back into the trunk. This commit was SVN r26676.	2012-06-27 01:28:28 +00:00
Josh Hursey	542330e3a7	Commit of ORCA: Open MPI Runtime Collaborative Abstraction This is a runtime interposition project that sits between the OMPI and ORTE layers in Open MPI. The project is described on the wiki: https://svn.open-mpi.org/trac/ompi/wiki/Runtime_Interposition And on this email thread: http://www.open-mpi.org/community/lists/devel/2012/06/11109.php This commit was SVN r26670.	2012-06-26 21:42:16 +00:00
Jeff Squyres	2ba10c37fe	Per RFC, bring in the following changes: * Remove paffinity, maffinity, and carto frameworks -- they've been wholly replaced by hwloc. * Move ompi_mpi_init() affinity-setting/checking code down to ORTE. * Update sm, smcuda, wv, and openib components to no longer use carto. Instead, use hwloc data. There are still optimizations possible in the sm/smcuda BTLs (i.e., making multiple mpools). Also, the old carto-based code found out how many NUMA nodes were ''available'' -- not how many were used ''in this job''. The new hwloc-using code computes the same value -- it was not updated to calculate how many NUMA nodes are used ''by this job.'' * Note that I cannot compile the smcuda and wv BTLs -- I ''think'' they're right, but they need to be verified by their owners. * The openib component now does a bunch of stuff to figure out where "near" OpenFabrics devices are. '''THIS IS A CHANGE IN DEFAULT BEHAVIOR!!''' and still needs to be verified by OpenFabrics vendors (I do not have a NUMA machine with an OpenFabrics device that is a non-uniform distance from multiple different NUMA nodes). * Completely rewrite the OMPI_Affinity_str() routine from the "affinity" mpiext extension. This extension now understands hyperthreads; the output format of it has changed a bit to reflect this new information. * Bunches of minor changes around the code base to update names/types from maffinity/paffinity-based names to hwloc-based names. * Add some helper functions into the hwloc base, mainly having to do with the fact that we have the hwloc data reporting ''all'' topology information, but sometimes you really only want the (online \| available) data. This commit was SVN r26391.	2012-05-07 14:52:54 +00:00
Ralph Castain	bd8b4f7f1e	Sorry for mid-day commit, but I had promised on the call to do this upon my return. Roll in the ORTE state machine. Remove last traces of opal_sos. Remove UTK epoch code. Please see the various emails about the state machine change for details. I'll send something out later with more info on the new arch. This commit was SVN r26242.	2012-04-06 14:23:13 +00:00
Ralph Castain	43e35486a4	Correct flag type - thanks George! This commit was SVN r25335.	2011-10-20 04:00:13 +00:00
Ralph Castain	b44f8d4b28	Complete implementation of the ess.proc_get_locality API. Up to this point, the API was only capable of telling if the specified proc was sharing a node with you. However, the returned value was capable of telling you much more detailed info - e.g., if the proc shares a socket, a cache, or numa node. We just didn't have the data to provide that detail. Use hwloc to obtain the cpuset for each process during mpi_init, and share that info in the modex. As it arrives, use a new opal_hwloc_base utility function to parse the value against the local proc's cpuset and determine where they overlap. Cache the value in the pmap object as it may be referenced multiple times. Thus, the return value from orte_ess.proc_get_locality is a 16-bit bitmask that describes the resources being shared with you. This bitmask can be tested using the macros in opal/mca/paffinity/paffinity.h Locality is available for all procs, whether launched via mpirun or directly with an external launcher such as slurm or aprun. This commit was SVN r25331.	2011-10-19 20:18:14 +00:00
George Bosilca	c453614f8b	A more meaningful name for this function (mpi_proc_complete_init instead of ompi_proc_set_arch). Change the comment to reflect the real behavior of the function. This commit was SVN r25312.	2011-10-18 02:54:38 +00:00
Ralph Castain	8f0ef54130	Complete implementation of pmi support. Ensure we support both mpirun and direct launch within same configuration to avoid requiring separate builds. Add support for generic pmi, not just under slurm. Add publish/subscribe support, although slurm's pmi implementation will just return an error as it hasn't been done yet. This commit was SVN r25303.	2011-10-17 20:51:22 +00:00
George Bosilca	80c02647c8	Each level (OPAL/ORTE/OMPI) should only return it's own constants, instead of the current mismatch. This commit was SVN r25230.	2011-10-04 14:50:31 +00:00
Wesley Bland	4e7ff0bd5e	By popular demand the epoch code is now disabled by default. To enable the epochs and the resilient orte code, use the configure flag: --enable-resilient-orte This will define both: ORTE_ENABLE_EPOCH ORTE_RESIL_ORTE This commit was SVN r25093.	2011-08-26 22:16:14 +00:00
Wesley Bland	09274cd047	Make sure that the epoch is initialized everywhere so we don't get weird output during valgrind. This shouldn't have caused any problems with any actual execution. Just extra warnings in valgrind. This commit was SVN r25015.	2011-08-08 15:11:55 +00:00
Wesley Bland	e1ba09ad51	Add a resilience to ORTE. Allows the runtime to continue after a process (or ORTED) failure. Note that more work will be necessary to allow the MPI layer to take advantage of this. Per RFC: http://www.open-mpi.org/community/lists/devel/2011/06/9299.php This commit was SVN r24815.	2011-06-23 20:38:02 +00:00
Abhishek Kulkarni	afbe3e99c6	* Wrap all the direct error-code checks of the form (OMPI_ERR_* == ret) with (OMPI_ERR_* = OPAL_SOS_GET_ERR_CODE(ret)), since the return value could be a SOS-encoded error. The OPAL_SOS_GET_ERR_CODE() takes in a SOS error and returns back the native error code. * Since OPAL_SUCCESS is preserved by SOS, also change all calls of the form (OPAL_ERROR == ret) to (OPAL_SUCCESS != ret). We thus avoid having to decode 'ret' to get the native error code. This commit was SVN r23162.	2010-05-17 23:08:56 +00:00
Brian Barrett	8b4825ff37	Updates to make trunk run on Catamount again: * Don't build the pstat component if all defines needed aren't there. * Update platform file to work better * Work around two places that depended on modex being operational This commit was SVN r22536.	2010-02-03 05:07:40 +00:00
George Bosilca	76222eb869	Get rid of the useless mca_pml_base_endpoint_t and replace it by [the well known and widely used!] mca_pml_endpoint_t. This commit was SVN r22277.	2009-12-08 17:29:54 +00:00
Ralph Castain	2a1cf2ae81	Create two new modex functions for send/recv of specific OPAL data types - basically, ensure that the values are packed and unpacked appropriately. Provides a convenience function as otherwise anyone wanting to send a value had to do this themselves. Use it to send the arch. This commit was SVN r21677.	2009-07-15 04:54:53 +00:00
Ralph Castain	60edbc7220	Fix hetero operations and comm_spawn (to a point). Remove all architecture references from ORTE and put them back in the modex using modex_send/recv calls. Hetero operations are now fully supported again. Comm_spawn now works up to the point where it segfaults due to an error in the CID code - which now allows Edgar to dig further! :-) This commit was SVN r21655.	2009-07-13 20:03:41 +00:00
Rainer Keller	6c5532072a	- Split the datatype engine into two parts: an MPI specific part in OMPI and a language agnostic part in OPAL. The convertor is completely moved into OPAL. This offers several benefits as described in RFC http://www.open-mpi.org/community/lists/devel/2009/07/6387.php namely: - Fewer basic types (int* and float* types, boolean and wchar - Fixing naming scheme to ompi-nomenclature. - Usability outside of the ompi-layer. - Due to the fixed nature of simple opal types, their information is completely known at compile time and therefore constified - With fewer datatypes (22), the actual sizes of bit-field types may be reduced from 64 to 32 bits, allowing reorganizing the opal_datatype structure, eliminating holes and keeping data required in convertor (upon send/recv) in one cacheline... This has implications to the convertor-datastructure and other parts of the code. - Several performance tests have been run, the netpipe latency does not change with this patch on Linux/x86-64 on the smoky cluster. - Extensive tests have been done to verify correctness (no new regressions) using: 1. mpi_test_suite on linux/x86-64 using clean ompi-trunk and ompi-ddt: a. running both trunk and ompi-ddt resulted in no differences (except for MPI_SHORT_INT and MPI_TYPE_MIX_LB_UB do now run correctly). b. with --enable-memchecker and running under valgrind (one buglet when run with static found in test-suite, commited) 2. ibm testsuite on linux/x86-64 using clean ompi-trunk and ompi-ddt: all passed (except for the dynamic/ tests failed!! as trunk/MTT) 3. compilation and usage of HDF5 tests on Jaguar using PGI and PathScale compilers. 4. compilation and usage on Scicortex. - Please note, that for the heterogeneous case, (-m32 compiled binaries/ompi), neither ompi-trunk, nor ompi-ddt branch would successfully launch. This commit was SVN r21641.	2009-07-13 04:56:31 +00:00
Greg Koenig	60485ff95f	This is a very large change to rename several #define values from OMPI_* to OPAL_*. This allows opal layer to be used more independent from the whole of ompi. NOTE: 9 "svn mv" operations immediately follow this commit. This commit was SVN r21180.	2009-05-06 20:11:28 +00:00
Rainer Keller	221fb9dbca	... Delayed due to notifier commits earlier this day ... - Delete unnecessary header files using contrib/check_unnecessary_headers.sh after applying patches, that include headers, being "lost" due to inclusion in one of the now deleted headers... In total 817 files are touched. In ompi/mpi/c/ header files are moved up into the actual c-file, where necessary (these are the only additional #include), otherwise it is only deletions of #include (apart from the above additions required due to notifier...) - To get different MCAs (OpenIB, TM, ALPS), an earlier version was successfully compiled (yesterday) on: Linux locally using intel-11, gcc-4.3.2 and gcc-SVN + warnings enabled Smoky cluster (x86-64 running Linux) using PGI-8.0.2 + warnings enabled Lens cluster (x86-64 running Linux) using Pathscale-3.2 + warnings enabled This commit was SVN r21096.	2009-04-29 01:32:14 +00:00
Rainer Keller	d8cf4c0fec	- Get pgcc on XT to complain less: In case we use memcmp, strlen, strup and friends include <string.h> Also several constants.h are not included directly - Let's have mca_topo_base_cart_create return ompi-errors in ompi/mca/topo/base/topo_base_cart_create.c This commit was SVN r20773.	2009-03-13 02:10:32 +00:00
Rainer Keller	ec0ed48718	- Revert r20739 This commit was SVN r20742. The following SVN revision numbers were found above: r20739 --> open-mpi/ompi@781caee0b6	2009-03-05 21:56:03 +00:00
Rainer Keller	781caee0b6	- First of two or three patches, in orte/util/proc_info.h: Adapt orte_process_info to orte_proc_info, and change orte_proc_info() to orte_proc_info_init(). - Compiled on linux-x86-64 - Discussed with Ralph This commit was SVN r20739.	2009-03-05 20:36:44 +00:00
Rainer Keller	9dea63d63a	- Last of intrusive commits (promised)... err for now. Anyway, this is blocking the move: do not include pml.h if not really needed, aka none of the following used: mca_pml MCA_PML_CALL OMPI_ANY_TAG OMPI_ANY_SOURCE OMPI_PROC_NULL - Notable exceptions (deleting in one header->adding): - ompi/mca/mtl/psm/ - ompi/mca/osc/rdma/ - ompi/mca/btl/openib/btl_openib_endpoint.c depended on pml_base_sendreq.h - Tested on Linux/x86-64, this time including make check (thanks Jeff and Ralph) This commit was SVN r20725.	2009-03-04 17:06:51 +00:00
Rainer Keller	fd28b392bf	- An intrusive commit yet again (sorry): with the separation we get bitten by header depending on having already included the corresponding [opal\|orte\|ompi]_config.h header. When separating, things like [OPAL\|ORTE\|OMPI]_DECLSPEC are missed. Script to add the corresponding header in front of all following (taking care of possible #ifdef HAVE_...) - Including some minor cleanups to - ompi/group/group.h -- include _after_ #ifndef OMPI_GROUP_H - ompi/mca/btl/btl.h -- nclude _after_ #ifndef MCA_BTL_H - ompi/mca/crcp/bkmrk/crcp_bkmrk_btl.c -- still no need for orte/util/output.h - ompi/mca/pml/dr/pml_dr_recvreq.c -- no need for mpool.h - ompi/mca/btl/btl.h -- reorder to fit - ompi/mca/bml/bml.h -- reorder to fit - ompi/runtime/ompi_mpi_finalize.c -- reorder to fit - ompi/request/request.h -- additionally need ompi/constants.h - Tested on linux/x86-64 This commit was SVN r20720.	2009-03-04 15:35:54 +00:00
Ralph Castain	4cdf91a8d4	Per the RFC, extend the current use of the ompi_proc_t flags field (without changing the field itself). The prior ompi_proc_t structure had a uint8_t flag field in it, where only one bit was used to flag that a proc was "local". In that context, "local" was constrained to mean "local to this node". This commit provides a greater degree of granularity on the term "local", to include tests to see if the proc is on the same socket, PC board, node, switch, CU (computing unit), and cluster. Add #define's to designate which bits stand for which local condition. This was added to the OPAL layer to avoid conflicting with the proposed movement of the BTLs. To make it easier to use, a set of macros have been defined - e.g., OPAL_PROC_ON_LOCAL_SOCKET - that test the specific bit. These can be used in the code base to clearly indicate which sense of locality is being considered. All locations in the code base that looked at the current proc_t field have been changed to use the new macros. Also modify the orte_ess modules so that each returns a uint8_t (to match the ompi_proc_t field) that contains a complete description of the locality of this proc. Obviously, not all environments will be capable of providing such detailed info. Thus, getting a "false" from a test for "on_local_socket" may simply indicate a lack of knowledge. This commit was SVN r20496.	2009-02-10 02:20:16 +00:00
George Bosilca	4d5fbc5955	Remove unused lock from the ompi_proc_t. This reduce the size of the ompi_proc_t by 64 bytes. Remove the useless pml_proc from the PML layer. This commit was SVN r20157.	2008-12-19 19:56:27 +00:00

1 2 3

116 Коммитов