Commit graph

471 commits

Author SHA1 Message Date
Jeff Squyres
558fc2836d Bump PLPA version to 1.3rc2, which should fix the "make dist" error
from last night's nightly tarball.

This commit was SVN r20576.
2009-02-17 12:54:57 +00:00
Jeff Squyres
4590582807 Bump PLPA to v1.3rc1, which includes a valgrind API fix for a
known-bad memory access pattern.  Specifically, a NULL pointer is
passed in a system call as part of a probe to figure out which
affinity API this system has.  We know it's a NULL and we did it on
purpose, so don't have Valgrind yell about it.

This commit was SVN r20572.
2009-02-17 02:01:30 +00:00
Jeff Squyres
17d9c2c240 Important clarification about the ownership of strings returned by the
current_value parameter to mca_base_param_reg_string() function.

This commit was SVN r20535.
2009-02-12 22:54:29 +00:00
Jeff Squyres
e3ae1468d3 Don't strdup here; there's already a strdup down in
param_set_override().

This commit was SVN r20533.
2009-02-12 22:36:45 +00:00
Ralph Castain
883a0972bc Add missing include file
This commit was SVN r20529.
2009-02-12 16:41:48 +00:00
Ralph Castain
4cdf91a8d4 Per the RFC, extend the current use of the ompi_proc_t flags field (without changing the field itself).
The prior ompi_proc_t structure had a uint8_t flag field in it, where only one
bit was used to flag that a proc was "local". In that context, "local" was
constrained to mean "local to this node".

This commit provides a greater degree of granularity on the term "local", to include tests
to see if the proc is on the same socket, PC board, node, switch, CU (computing
unit), and cluster.

Add #define's to designate which bits stand for which local condition. This
was added to the OPAL layer to avoid conflicting with the proposed movement of
the BTLs. To make it easier to use, a set of macros has been defined - e.g.,
OPAL_PROC_ON_LOCAL_SOCKET - that test the specific bit. These can be used in
the code base to clearly indicate which sense of locality is being considered.

All locations in the code base that looked at the current proc_t field have
been changed to use the new macros.

Also modify the orte_ess modules so that each returns a uint8_t (to match the
ompi_proc_t field) that contains a complete description of the locality of this
proc. Obviously, not all environments will be capable of providing such detailed
info. Thus, getting a "false" from a test for "on_local_socket" may simply
indicate a lack of knowledge.

This commit was SVN r20496.
2009-02-10 02:20:16 +00:00
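The locality bits described in commit 4cdf91a8d4 can be sketched as follows. This is a minimal illustration in Python, not the project's C code: the bit values and every name except the idea behind OPAL_PROC_ON_LOCAL_SOCKET are assumptions made for the example. It shows how several senses of "local" fit into one uint8_t-sized field, and why a false result may only mean that the information is unknown.

```python
# Hypothetical bit assignments for a uint8_t-sized locality field.
# The real values are defined in the OPAL headers and may differ.
ON_CLUSTER = 0x01
ON_CU = 0x02      # computing unit
ON_SWITCH = 0x04
ON_NODE = 0x08
ON_BOARD = 0x10
ON_SOCKET = 0x20


def on_local_node(flags: int) -> bool:
    """Test the 'same node' bit, like a per-bit test macro would."""
    return bool(flags & ON_NODE)


def on_local_socket(flags: int) -> bool:
    """Analogue of a macro such as OPAL_PROC_ON_LOCAL_SOCKET."""
    return bool(flags & ON_SOCKET)


def fits_in_uint8(flags: int) -> bool:
    """The field is a uint8_t, so only eight bits are available."""
    return 0 <= flags <= 0xFF


# An environment that knows the node but not the socket sets only the
# bits it can vouch for; the socket test then returns False.
peer = ON_CLUSTER | ON_SWITCH | ON_NODE
```

With `peer` as above, `on_local_node(peer)` is true while `on_local_socket(peer)` is false, which matches the commit's caveat that a "false" can simply indicate a lack of knowledge.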
Shiqing Fan
7d2d6b16b1 A fix for windows mainly, adding BEGIN/END_C_DECLS pairs.
This commit was SVN r20448.
2009-02-05 16:35:58 +00:00
Jeff Squyres
9c2a6da128 Remove errant '>'. How on earth did that work at all?
This commit was SVN r20416.
2009-02-03 23:21:34 +00:00
Jeff Squyres
35c5e28a8e Up to SVN r20383
This commit was SVN r20384.

The following SVN revision numbers were found above:
  r20383 --> open-mpi/ompi@e0638c84c8
2009-01-29 17:59:04 +00:00
Jeff Squyres
f3b1432260 Fixes trac:1618: ensure to check to see if the symbol RTLD_NEXT exists
before trying to use it (e.g., it doesn't seem to exist on Cygwin).

This commit was SVN r20343.

The following Trac tickets were found above:
  Ticket 1618 --> https://svn.open-mpi.org/trac/ompi/ticket/1618
2009-01-25 16:38:00 +00:00
Jeff Squyres
6d805eb0dd Ensure not to do the found_files stuff if --disable-dlopen is selected.
This commit was SVN r20320.
2009-01-22 16:46:02 +00:00
Jeff Squyres
58a25cae69 Fixes trac:1271: make the OPAL MCA base read the list of MCA DSO filenames
''once'' and keep the names in an argv-style array.  Each time we go
to open a framework, we just scan that array rather than re-reading
all the filenames from the filesystem.

This commit was SVN r20309.

The following Trac tickets were found above:
  Ticket 1271 --> https://svn.open-mpi.org/trac/ompi/ticket/1271
2009-01-21 22:27:05 +00:00
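The caching approach from commit 58a25cae69 (read the DSO filenames once, then scan the stored list for each framework) might look roughly like this. The sketch is in Python rather than the project's C; the class name, the `scans` counter, the demo helper, and the `mca_<framework>_` filename prefix are assumptions chosen for illustration.

```python
import os
import tempfile


class ComponentFileCache:
    """Read a directory listing once and reuse it for every framework."""

    def __init__(self, directory):
        self.directory = directory
        self._names = None   # filled on first use
        self.scans = 0       # how many times the filesystem was read

    def _load(self):
        if self._names is None:
            self._names = sorted(os.listdir(self.directory))
            self.scans += 1
        return self._names

    def files_for(self, framework):
        """Scan the cached list instead of re-reading the directory."""
        prefix = f"mca_{framework}_"   # assumed naming convention
        return [name for name in self._load() if name.startswith(prefix)]


def make_demo_dir():
    """Create a temporary directory with a few empty, made-up DSO names."""
    directory = tempfile.mkdtemp()
    for name in ("mca_btl_tcp.so", "mca_btl_sm.so", "mca_pml_ob1.so"):
        open(os.path.join(directory, name), "w").close()
    return directory
```

Opening two frameworks through the same cache calls `os.listdir` only once, which is the saving the commit describes.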
Josh Hursey
fca3c6e571 Fix the BLCR configuration when explicitly disabling it.
It happened that if we supplied:
 --with-ft=cr --without-blcr
then BLCR would be loaded, due to a logic break in the old m4.

Now this works appropriately. This should be moved to v1.3.1.

This commit was SVN r20296.
2009-01-19 20:21:58 +00:00
Jeff Squyres
4520b00547 Fixes trac:1587: also check the mca component struct framework and
component name against the filename and ensure that they match.
Ignore the component if they do not.

This commit was SVN r20291.

The following Trac tickets were found above:
  Ticket 1587 --> https://svn.open-mpi.org/trac/ompi/ticket/1587
2009-01-17 12:53:21 +00:00
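The check added in commit 4520b00547 compares the framework and component names stored in the component struct with the names implied by the filename. A rough Python sketch of that comparison follows; the `mca_<framework>_<component>` filename layout and both function names are assumptions for the example, not the project's actual C code.

```python
def parse_component_filename(filename):
    """Split 'mca_<framework>_<component>.<ext>' into its two names.

    Returns None when the filename does not follow the assumed layout.
    """
    stem = filename.split(".", 1)[0]
    parts = stem.split("_", 2)   # component names may contain '_'
    if len(parts) != 3 or parts[0] != "mca" or not parts[1] or not parts[2]:
        return None
    return parts[1], parts[2]


def component_matches_file(filename, struct_framework, struct_component):
    """Accept the component only if the struct agrees with the filename."""
    parsed = parse_component_filename(filename)
    return parsed == (struct_framework, struct_component)
```

A component for which `component_matches_file` returns false would be ignored, as the commit message states.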
Jeff Squyres
d1c6f3f89a * Fix a truckload of Cisco copyrights to be the same as the rest of
   the code base.
 * Fix a few misspellings in other copyrights.

This commit was SVN r20241.
2009-01-11 02:30:00 +00:00
Ralph Castain
17e1911afa Remove unneeded include file
This commit was SVN r20204.
2009-01-05 19:20:02 +00:00
Jeff Squyres
df3a304447 Fix CID 1182: ensure to check return of read() for failure.
This commit was SVN r20191.
2009-01-03 15:30:56 +00:00
Tim Mattox
f911b1a63d Fix a few code comments in the new ompi-top functionality.
This commit was SVN r20166.
2008-12-22 22:36:38 +00:00
Ralph Castain
7787f84540 Per the earlier RFC and some discussion at the Dec ORTE design meeting, add the ompi-top tool and all its supporting infrastructure. This includes a new OPAL pstat framework and data type, currently with rather weak support for Mac OSX and pretty complete support for Linux. The Sun team promised to add Solaris support as well.
Also, per chat with Jeff, modified the Makefile.am's of a few orte tools so that they were consistent in the way we generate the ompi-equivalent cmds.

This commit was SVN r20165.
2008-12-22 20:23:05 +00:00
Josh Hursey
ce8d18bfda This commit changes the use of the deprecated cr_request_file() to use the cr_request_checkpoint() interface to BLCR. Additional configure checks are added to use the best available checkpointing interface available for the BLCR installed on the system (default: cr_request_checkpoint()).
This commit fixes trac:1691

Thanks to Matthias Hovestadt for identifying this issue.

This commit was SVN r20114.

The following Trac tickets were found above:
  Ticket 1691 --> https://svn.open-mpi.org/trac/ompi/ticket/1691
2008-12-11 00:08:34 +00:00
Shiqing Fan
5ae5f0e173 - 4/4 commit for Windows Visual Studio and CCP support:
unnecessary clean up to non windows related files (within ifdef __WINDOWS__).

This commit was SVN r20111.
2008-12-10 21:13:27 +00:00
Shiqing Fan
8673f19f50 - 2/4 commit for Windows Visual Studio and CCP support:
changes to the already existing ccp components
  event/win32.c: merge old FD handling into new
  opal_installdirs_windows.c: fix the registry handling

This commit was SVN r20109.
2008-12-10 21:01:54 +00:00
Shiqing Fan
a5281f0434 - 1/4 commit for Windows Visual Studio and CCP support:
CMakeLists and .windows files.
  In contribs preconfigured and precompiled parts.

This commit was SVN r20108.
2008-12-10 20:59:20 +00:00
Ralph Castain
1ace83c470 Enable modex-less launch. Consists of:
1. minor modification to include two new opal MCA params:
   (a) opal_profile: outputs what components were selected by each framework
       currently enabled for most, but not all, frameworks
   (b) opal_profile_file: name of file that contains profile info required
       for modex

2. introduction of two new tools:
   (a) ompi-probe: MPI process that simply calls MPI_Init/Finalize with
       opal_profile set. Also reports back the rml IP address for all
       interfaces on the node
   (b) ompi-profiler: uses ompi-probe to create the profile_file, also
       reports out a summary of what framework components are actually
       being used to help with configuration options

3. modification of the grpcomm basic component to utilize the
   profile file in place of the modex where possible

4. modification of orterun so it properly sees opal mca params and
   handles opal_profile correctly to ensure we don't get its profile

5. similar mod to orted as for orterun

6. addition of new test that calls orte_init followed by calls to
   grpcomm.barrier

This is all completely benign unless actively selected. At the moment, it only supports modex-less launch for openib-based systems. Minor mod to the TCP btl would be required to enable it as well, if people are interested. Similarly, anyone interested in enabling other BTL's for modex-less operation should let me know and I'll give you the magic details.

This seems to significantly improve scalability provided the file can be locally located on the nodes. I'm looking at an alternative means of disseminating the info (perhaps in launch message) as an option for removing that constraint.

This commit was SVN r20098.
2008-12-09 23:49:02 +00:00
Brian Barrett
8a8cf96b6c Provide configure parameter to allow the disabling of reading parameters
and components from the home directory for platforms that are bad at
reading in files from home directory at scale (like Red Storm)

This commit was SVN r20069.
2008-12-04 01:51:44 +00:00
Jeff Squyres
06097db928 Fixes trac:1667. Ensure to fill in the source_file if it was requested.
This commit was SVN r20067.

The following Trac tickets were found above:
  Ticket 1667 --> https://svn.open-mpi.org/trac/ompi/ticket/1667
2008-12-03 22:17:50 +00:00
Shiqing Fan
abd21b6d17 - An update for memchecker :
1. fix a bug in pml_ob1_recvreq/sendreq.c, buffer was made defined where the request has already been released.
2. complete memchecker support for collective functions.
3. change the wrongly spelled function name of memchecker, i.e. '*_isaddressible' should be '*_isaddressable'

This commit was SVN r20043.
2008-11-27 16:34:02 +00:00
Jeff Squyres
7b32402959 Fixes from Brian for OS X 10.4.
This commit was SVN r19953.
2008-11-07 22:13:43 +00:00
Jeff Squyres
357e9ef070 Move AM_CONDITIONAL to its own POST_CONFIG, as it needs to be. Fixes
#1622.

This commit was SVN r19908.
2008-11-03 22:34:38 +00:00
Jeff Squyres
f2a7993aa5 Refs trac:1578: Shiqing-suggested changes for valgrind configure.m4 support.
This commit was SVN r19776.

The following Trac tickets were found above:
  Ticket 1578 --> https://svn.open-mpi.org/trac/ompi/ticket/1578
2008-10-21 03:27:43 +00:00
Jeff Squyres
e42139710b A typo prevented the valgrind memchecker component finding the
Valgrind header files if they weren't already in the compiler's
default header file search path.  This commit fixes that typo and adds
a little more infrastructure (via an AC_SUBST) to pass in the relevant
CPPFLAGS to the build system for the valgrind memchecker component.

This commit was SVN r19764.
2008-10-17 23:04:39 +00:00
Josh Hursey
88aa45dd52 Commit to bring online OpenIB, MX, and shared memory support for Open MPI's checkpoint/restart functionality. Some tuning is still needed, but basic functionality is in place.
There is still a problem with OpenIB and threads (external to C/R functionality). It has been reported in Ticket #1539.

Additionally:
* Fix a file cleanup bug in CRS Base.
* Fix a possible deadlock in the TCP ft_event function
* Add a mca_base_param_deregister() function to MCA base
* Add whole process checkpoint timers
* Add support for BTL: OpenIB, MX, Shared Memory
* Add support for Mpool: rdma, sm
* Sundry bounds checking and cleanup in some scattered functions

This commit was SVN r19756.
2008-10-16 15:09:00 +00:00
Aurelien Bouteiller
4be474f727 CRS is now an opal framework. It should use OPAL version defines.
This commit was SVN r19643.
2008-09-25 21:01:04 +00:00
Josh Hursey
90c936b292 Cleanup BLCR configure logic. Add a '--with-blcr-libdir' option to allow a user to specify a library directory outside of the '--with-blcr' option.
Needs to be moved to v1.3

This commit was SVN r19607.
2008-09-22 19:48:47 +00:00
Josh Hursey
0cd65bfaa8 Fix a SIGPIPE that may occur when checkpointing a restarted process. This was a result of calling system() in the BLCR CRS. After inspection and testing it was determined that the operation was no longer necessary. So the call was removed thus fixing the bug.
This commit was SVN r19601.
2008-09-22 16:49:56 +00:00
Jeff Squyres
53967f2b4e Merge in PLPA v1.2rc2 (README fixes, new version of Autotools, and
have PLPA report its version correctly).

This commit was SVN r19590.
2008-09-19 15:05:03 +00:00
Jeff Squyres
b1ff61b19e Update to PLPA v1.2rc1
This commit was SVN r19589.
2008-09-19 14:49:53 +00:00
Lenny Verkhovsky
ca0a5ea60b Fixed the warnings on the crays.
base/paffinity_base_service.c:153: warning: 'phys_core' may be used uninitialized in this function
base/paffinity_base_service.c:153: note: 'phys_core' was declared here

This commit was SVN r19580.
2008-09-18 11:31:12 +00:00
Shiqing Fan
68f6fdf111 - a small fix for windows, use different environment separators based on the system type.
This commit was SVN r19554.
2008-09-15 15:05:47 +00:00
Jeff Squyres
4b5de753d4 Bring in new PLPA v1.2b5 to fix a typo found by Lenny.
This commit was SVN r19526.
2008-09-09 12:29:31 +00:00
George Bosilca
579d70edad We should use #ifdef and not #if
This commit was SVN r19504.
2008-09-05 12:44:19 +00:00
Josh Hursey
21910d29ff Really clean up the Configure for BLCR. There were two leftover bugs that further testing highlighted.
Thanks to Ralph for helping me find these bugs on Odin.

This commit was SVN r19491.
2008-09-03 20:10:29 +00:00
Josh Hursey
78c35f6d93 Fix a configure problem on Odin. If --without-blcr option is given and BLCR exists in a default path, then it was trying to build anyway. I ran a couple test builds and this patch seems to fix the problem.
This commit was SVN r19486.
2008-09-02 20:15:57 +00:00
Shiqing Fan
c5845708cb - more *_C_DECLS.
This commit was SVN r19469.
2008-09-01 16:26:10 +00:00
Ralph Castain
274d912fe1 Silence warnings in paffinity_base_service
This commit was SVN r19453.
2008-08-28 22:11:49 +00:00
Ralph Castain
4ef9d15d97 Revamp the opal mca paffinity interface. We ran into a problem when we encountered machines that had "holes" in their physical processor layout - e.g., machines that supported "hotplugging", or that had unpopulated sockets. To solve that problem, we had to clarify at the API level where we were describing physical vs logical processor info, and then translate accordingly in the underlying implementation.
See opal/mca/paffinity/paffinity.h for explanation as to the physical vs logical nature of the params used in the API.

Fixes trac:1435

This commit was SVN r19391.

The following Trac tickets were found above:
  Ticket 1435 --> https://svn.open-mpi.org/trac/ompi/ticket/1435
2008-08-21 19:21:28 +00:00
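The physical versus logical distinction in commit 4ef9d15d97 can be illustrated with a small translation table. This is a hedged Python sketch, not the paffinity API: the function names are hypothetical, and it assumes that logical indices simply number the available physical IDs in ascending order.

```python
def build_logical_to_physical(available_physical_ids):
    """Logical index N maps to the Nth available physical ID."""
    return sorted(set(available_physical_ids))


def physical_id(table, logical):
    """Translate a logical index into the physical ID the OS uses."""
    if not 0 <= logical < len(table):
        raise IndexError(f"no logical processor {logical}")
    return table[logical]


def logical_id(table, physical):
    """Translate a physical ID back; raises ValueError for a 'hole'."""
    return table.index(physical)


# Physical IDs 2 and 3 are missing, e.g. an unpopulated socket or a
# hot-unplugged processor, so the physical numbering has a hole.
table = build_logical_to_physical([0, 1, 4, 5])
```

Here logical processor 2 corresponds to physical ID 4. Treating the two numberings as interchangeable breaks exactly on machines with such holes.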
Jeff Squyres
80d11dba8f Bring in PLPA v1.2b4
This commit was SVN r19299.
2008-08-14 21:04:28 +00:00
Jeff Squyres
bb585922fd This is fixed a different way now; no need to be different than stock
PLPA.

This commit was SVN r19293.
2008-08-14 18:54:34 +00:00
Jeff Squyres
a6e0589f01 Update to PLPA v1.2b3. Sorry again for the mid-day configure change...
This commit was SVN r19292.
2008-08-14 14:26:26 +00:00
Jeff Squyres
a19cf02c2b Refs trac:1435
Bring in a new version of PLPA (v1.2b2) with some new capabilities for
offline processors and mapping of the Nth processor/socket/core to its
corresponding Linux processor/socket/core ID.

(Sorry for the configure change in the middle of the day, folks -- I
need it to be able to continue to integrate paffinity changes for
#1435...)

This commit was SVN r19282.

The following Trac tickets were found above:
  Ticket 1435 --> https://svn.open-mpi.org/trac/ompi/ticket/1435
2008-08-13 20:18:37 +00:00