Commit graph

14032 commits

Author SHA1 Message Date
Ralph Castain
a0d5c80ce0 Add a new framework for discovering local resource information such as cpu type/model, #cpus, available physical memory, etc. Two initial components (darwin and linux) are provided. This is needed to support bootstrap operations where daemons are started at node boot, and applications where initial knowledge of cpu identification is needed to guide framework component selection.
Add orte configuration option to control the use of the framework in the system. Although the code will build, it will not be active unless configured with --enable-bootstrap.

If bootstrap is enabled and the new opal_sysinfo framework can successfully determine the cpu model, pass that info to the application as an MCA param to support some work at Sun.

Also, have daemons report back the resources they find to guide process mapping in bootstrap operations (i.e., where the daemon starts at node boot as opposed to being launched at application start).

Adjust some platform files to enable these capabilities.

This commit was SVN r22244.
2009-11-30 23:11:25 +00:00
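The resource discovery described in this commit can be illustrated with a small sketch. It is not the opal_sysinfo code (which is C and is not shown in this log); the function names, the dictionary keys, and the `COMPONENTS` table are all hypothetical and only mirror the idea of one probe per platform (linux and darwin) reporting CPU model, CPU count, and physical memory.

```python
import os
import platform
import sys


def _physical_memory_bytes():
    """Return physical memory in bytes, or None if the platform does not expose it."""
    try:
        return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return None


def _common_info():
    # Values available from the standard library on any platform.
    return {
        "cpu_arch": platform.machine() or None,
        "num_cpus": os.cpu_count(),
        "mem_bytes": _physical_memory_bytes(),
        "cpu_model": None,
    }


def probe_linux():
    """Hypothetical 'linux' component: read the model name from /proc/cpuinfo."""
    info = _common_info()
    try:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.lower().startswith("model name"):
                    info["cpu_model"] = line.split(":", 1)[1].strip()
                    break
    except OSError:
        pass  # file missing or unreadable: leave cpu_model as None
    return info


def probe_darwin():
    """Hypothetical 'darwin' component: fall back to platform.processor()."""
    info = _common_info()
    info["cpu_model"] = platform.processor() or None
    return info


# Assumed component table, keyed by sys.platform prefix.
COMPONENTS = {"linux": probe_linux, "darwin": probe_darwin}


def discover(platform_name=sys.platform):
    """Run the first matching component; return None if no component applies."""
    for prefix, probe in COMPONENTS.items():
        if platform_name.startswith(prefix):
            return probe()
    return None


if __name__ == "__main__":
    print(discover())
```

Returning None when no component matches loosely mirrors a framework that stays inactive when it cannot determine anything useful.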
Brian Barrett
fd39f466ce Remove elements previously removed from the real structures...
This commit was SVN r22241.
2009-11-30 00:36:26 +00:00
Ralph Castain
e38a0eab9f Remove the fddp and sensor frameworks - relocated to new cluster mgr project
This commit was SVN r22240.
2009-11-27 22:14:47 +00:00
Shiqing Fan
7cf427c39b Include the missing thread header, which is needed when building with --enable-progress-thread.
This commit was SVN r22239.
2009-11-27 14:49:24 +00:00
Matthias Jurenz
f1c55df65c - create a process group with the real node name on CrayXT platforms
- updated version number of integrated VT to 5.4.11

This commit was SVN r22238.
2009-11-26 07:50:07 +00:00
Brian Barrett
b57b8c5b3f Clean up request handling in the I/O framework to be more consistent with
other request-using frameworks.

 - Rather than having mpi/c/* functions allocate requests explicitly,
   pass the MPI_Request* down to the I/O component and have it 
   perform the allocation.
 - While the I/O base provides a base request which can be used,
   it is not required and all request management occurs within
   the component.
 - Push progress management into the component, rather than having it
   happen in the base.  Progress functions are now easily registered,
   and not all (ie, the one existing) components use progress functions
   in any rational way.

ROMIO switched to generalized requests instead of MPIO_Requests many
moons ago, and Open MPI now uses ROMIO's generalized requests, so there
is no reason to wrap those requests (which are OMPI requests) in another
level of request.

Now the file function passes the MPI_Request* to the ROMIO component,
which passes it to the underlying ROMIO function, which calls 
MPI_Grequest_start to create an OMPI request, which is what gets set
as the request to the user.  Much cleaner.

This patch has two motivations.  One, a whole heck of a lot of code
just got removed, and request handling is now much cleaner for I/O
components.  Two, by adding support for Argonne's proposed generalized
request extensions, we can allow ROMIO to provide async I/O through
generalized requests, which we couldn't rationally do in the old
setup due to the crazy request completion rules.

This commit was SVN r22235.
2009-11-26 05:13:43 +00:00
Rainer Keller
70a69e796f - Get rid of a small nuisance: after installation of the
alps-resid script, set it to exec, to allow:

   export OMPI_ALPS_RESID=`$OMPI/share/openmpi/ras-alps-command.sh`

This commit was SVN r22234.
2009-11-25 19:01:33 +00:00
Brian Barrett
8075640ef1 The tests are MPI programs and are built using mpicc, so including
OMPI headers won't work

This commit was SVN r22233.
2009-11-25 18:06:15 +00:00
Ralph Castain
9a6d5697a8 Protect against NULL input - I'm -sure- no one will do it, but...well, actually, they did. :-/
This commit was SVN r22232.
2009-11-25 15:13:21 +00:00
Ralph Castain
c1206139dd Ensure the thread-safe data buffers are initialized prior to use
This commit was SVN r22231.
2009-11-25 15:12:45 +00:00
Jeff Squyres
978fb43a26 Add a Big Hairy Warning if you --enable-progress-threads
This commit was SVN r22230.
2009-11-24 23:20:37 +00:00
George Bosilca
87fd85b17a Detach the user buffer prior to the orte_barrier in MPI_Finalize.
This patch fixes trac:2112.

This commit was SVN r22229.

The following Trac tickets were found above:
  Ticket 2112 --> https://svn.open-mpi.org/trac/ompi/ticket/2112
2009-11-24 02:33:13 +00:00
Ralph Castain
92733b13d9 Add a couple of new tests to the orte system.
Modify the job_complete check so we don't kill jobs when a single proc was terminated by ORTE command via plm.terminate_procs

Still dies gracefully with a ctrl-c, and behaves as before when using plm.terminate_job

This commit was SVN r22227.
2009-11-20 01:47:49 +00:00
Ralph Castain
5e031d9ded Let a restarted process have access to all known nodes instead of only those already in its prior job map
This commit was SVN r22225.
2009-11-19 19:45:11 +00:00
Ralph Castain
852e5d9ee0 Add some diag output
This commit was SVN r22224.
2009-11-19 19:43:36 +00:00
Ralph Castain
a401f05ea3 Add some diagnostics to chase down forced termination of procs. Ensure that procs are removed from the local data list upon termination
This commit was SVN r22223.
2009-11-19 19:43:10 +00:00
Ralph Castain
3921069230 Ensure we completely cleanout the old nidmap info
This commit was SVN r22222.
2009-11-19 19:42:15 +00:00
Ralph Castain
8dc08e304f No longer require name passed separately
This commit was SVN r22221.
2009-11-19 19:41:41 +00:00
Ralph Castain
1a44b84b25 If a process is in certain states (e.g., polling for messages in the event lib), then it can blissfully ignore SIGTERM when we try to order it to die. Unfortunately, the OS thinks the process actually did die, leading us to leave orphaned procs around.
The only sure way to kill the thing is with SIGKILL. After hours spent trying to debug this bizarre situation with a reliable reproducer, I finally tracked it down and fixed it.

Go figure...I sure can't.

This commit was SVN r22220.
2009-11-19 17:25:15 +00:00
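The SIGTERM-then-SIGKILL escalation that this commit message describes can be sketched as follows. This is a generic POSIX illustration, not the ORTE fix itself: the helper name, the grace period, and the test child that ignores SIGTERM are all assumptions made for the example.

```python
import subprocess
import sys


def terminate_with_escalation(proc, grace_seconds=2.0):
    """Ask a child to exit with SIGTERM; escalate to SIGKILL if it does not."""
    proc.terminate()                  # sends SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()                   # sends SIGKILL on POSIX, cannot be ignored
        proc.wait()                   # reap the child so no zombie is left behind
        return "killed"


# Hypothetical child that ignores SIGTERM, standing in for a process
# that is stuck polling and never acts on the signal.
CHILD = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def demo():
    proc = subprocess.Popen([sys.executable, "-c", CHILD], stdout=subprocess.PIPE)
    proc.stdout.readline()            # wait until the child has installed its handler
    outcome = terminate_with_escalation(proc, grace_seconds=0.5)
    proc.stdout.close()
    return outcome, proc.returncode


if __name__ == "__main__":
    print(demo())
```

Waiting on the child after each signal, rather than assuming the signal worked, is what keeps orphaned processes from being left around.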
Shiqing Fan
11ad25fa77 A few Windows fixes:
Add a missing value for the configure file.
Fix the bug that generated a wrong svn version number.
Correct the wrong string length of the headnode name.

cmr:v1.5
cmr:v1.3.4

This commit was SVN r22219.
2009-11-18 09:43:47 +00:00
Ralph Castain
840766a894 Update the rmcast APIs to include tag params and reorder them to look like their rml cousins
This commit was SVN r22218.
2009-11-17 15:58:59 +00:00
Jeff Squyres
766d56dc0a Minor typo fixes from Jed Brown.
This commit was SVN r22217.
2009-11-12 15:00:08 +00:00
Ralph Castain
aea1ab3bd6 Remove diagnostic
This commit was SVN r22216.
2009-11-11 22:16:15 +00:00
Ralph Castain
f5a152bf84 Update a platform file
This commit was SVN r22215.
2009-11-11 22:11:45 +00:00
Ralph Castain
a2f3a47b92 Update the orte_mcast test
This commit was SVN r22214.
2009-11-11 22:11:19 +00:00
Ralph Castain
6496ce7212 Expand the reliable multicast APIs to support sending/recving of iovecs
This commit was SVN r22213.
2009-11-11 22:10:35 +00:00
Samuel Gutierrez
ce3c4426d7 Revert LANL Mac OS X platform file changes. See: r22200
This commit was SVN r22209.

The following SVN revision numbers were found above:
  r22200 --> open-mpi/ompi@92d4a14881
2009-11-10 01:03:31 +00:00
Samuel Gutierrez
63e7bf2783 Revert back to not building carto. Also see: r22204.
This commit was SVN r22208.

The following SVN revision numbers were found above:
  r22204 --> open-mpi/ompi@460f64a39e
2009-11-10 00:34:38 +00:00
Rainer Keller
09cd970ec7 - With ompi r22205 it's not necessary to include xt-catamount module
allowing further cleanup.

This commit was SVN r22207.

The following SVN revision numbers were found above:
  r22205 --> open-mpi/ompi@366bd96c88
2009-11-09 14:32:54 +00:00
Rainer Keller
276b813f48 - Output according to their type.
This commit was SVN r22206.
2009-11-09 14:28:15 +00:00
Rainer Keller
366bd96c88 - Allow working without the xt-catamount module on Jaguar,
reducing the number of components that until now needed to be
   deselected.

This commit was SVN r22205.
2009-11-09 14:26:24 +00:00
Rainer Keller
99c7cc53ae - Grr, shut up the compilation of the ompi_debugger_canary.
This commit was SVN r22203.
2009-11-06 19:58:11 +00:00
Rainer Keller
7dd6df8307 - Remove 2 warnings from tonight's MTT runs:
Everywhere, our offsets within structs are int-sized (and
   compared to <0).

This commit was SVN r22199.
2009-11-06 13:53:01 +00:00
Samuel Gutierrez
8956d9c66b Remove carto from enable_mca_no_build. Not building carto seems to make the LANL machines unhappy.
This commit was SVN r22198.
2009-11-05 23:37:40 +00:00
Eugene Loh
88c0921c5e Corrected the usage of "rc" in mca_btl_sm_component_progress.
The return code for this function should be the number of events
received.

This commit was SVN r22191.
2009-11-04 03:10:35 +00:00
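The convention named in this commit, a progress function whose return value is the number of events received rather than a status code, can be sketched like this. The function names and the deque-based queue are hypothetical; the real mca_btl_sm_component_progress is C code that is not shown in this log.

```python
from collections import deque


def component_progress(fifo, handle, max_events=None):
    """Drain pending events and return how many were handled (not a status code)."""
    handled = 0
    while fifo and (max_events is None or handled < max_events):
        handle(fifo.popleft())
        handled += 1
    return handled


def progress_all(progress_functions):
    """Sum the event counts reported by every registered progress function."""
    return sum(fn() for fn in progress_functions)


if __name__ == "__main__":
    queue = deque(["frag-0", "frag-1", "frag-2"])
    received = []
    print(component_progress(queue, received.append))
```

A caller that sums these counts can tell whether any progress was made, which would not be possible if the function returned an error code in the same variable.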
Jeff Squyres
ab00aea1ff Per http://www.open-mpi.org/community/lists/devel/2009/10/7025.php,
use the new Automake "silent rules" if available.

If you are using an Automake prior to v1.11, you won't see the new 
silent rules -- it will automatically default back to the "verbose" 
rules.

Note, too, that even with these changes, you can enable the verbose 
"make all" output in one of two ways:

1. Add "V=1" to your "make" command line

{{{
shell$ make all V=1
}}}

2. Add "--disable-silent-rules" to your "configure" command line:

{{{
shell$ ./configure --disable-silent-rules ...
}}}

The one down side of using the silent rules by default is that we'll 
get less diagnostic information when users send their build logs.  I 
think we should update the web page to request that users send build 
logs of "make V=1", but I'm guessing that not everyone will do it.

Note that I did ''not'' silent-ize the libltdl build (which is a dozen
or so files in the beginning of the build) because we wholly import
libltdl at autogen time.  I therefore didn't want to patch libltdl
(further) after importing it a) to remain as forward-compatible as
possible, and b) patching the imported libltdl build system might be
tricky in terms of timestamps / dependencies.  So those dozen-or-so
files will still be "verbose", but the rest of the files in OMPI will
be "silent".

This commit was SVN r22189.
2009-11-04 02:07:02 +00:00
Shiqing Fan
6f8d0a1ab8 Update a few CMake scripts.
Add Program Database (pdb) files for installation for debug build.

This commit was SVN r22188.
2009-11-03 10:40:58 +00:00
Terry Dontje
f70af6a81e move inclusion of ompi_datatype.h from ompi_common_dll.c to ompi_debugger_canary.c to get rid of unresolved symbols
This commit was SVN r22179.
2009-11-01 11:06:20 +00:00
Rainer Keller
f121e46db1 - Finalize ornl_configure
This commit was SVN r22178.
2009-11-01 03:25:57 +00:00
Jeff Squyres
5e6c494269 Remove the mistaken line (confirmed by Shiqing).
This commit was SVN r22175.
2009-10-30 12:45:05 +00:00
Jeff Squyres
be8a09dc1f Fix spelling error noted by Eugene.
This commit was SVN r22173.
2009-10-30 12:42:58 +00:00
Eugene Loh
1a44fc478d In sm_btl_first_time_init(), when we figure the size of the shared
area, we cap the size at LONG_MAX.  But we are figuring out how much
we need.  So, if that amount exceeds LONG_MAX, we should return an
"out of resource" error code.

This commit was SVN r22172.
2009-10-29 23:06:32 +00:00
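The reasoning in this commit (a computed requirement must not be silently capped, because a capped value is smaller than what is actually needed) can be sketched as below. The size formula, parameter names, and exception type are assumptions for illustration; only the comparison against LONG_MAX and the "out of resource" outcome come from the commit message.

```python
import ctypes

# LONG_MAX for the C `long` type on the machine running this sketch.
LONG_MAX = 2 ** (8 * ctypes.sizeof(ctypes.c_long) - 1) - 1


class OutOfResource(Exception):
    """Stand-in for an 'out of resource' error code."""


def shared_area_size(n_procs, per_proc_bytes, header_bytes=0):
    """Return the bytes needed for the shared area, or fail if it cannot fit in a long.

    Capping the result at LONG_MAX would hand back a region smaller than
    required, so an oversized requirement is reported as an error instead.
    """
    needed = header_bytes + n_procs * per_proc_bytes
    if needed > LONG_MAX:
        raise OutOfResource(f"need {needed} bytes, limit is {LONG_MAX}")
    return needed


if __name__ == "__main__":
    print(shared_area_size(4, 1024, header_bytes=64))
```

Python integers do not overflow, which makes the check easy to express here; in C the same test has to be written so that the intermediate arithmetic itself cannot wrap.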
Jeff Squyres
16f42c45a6 Ensure to have a PARAM_CONFIG_FILES (I don't know if
PARAM_WINDOWS_FILES is a mistake or not).  Fixes trac:2079.

This commit was SVN r22171.

The following Trac tickets were found above:
  Ticket 2079 --> https://svn.open-mpi.org/trac/ompi/ticket/2079
2009-10-29 22:05:26 +00:00
Rainer Keller
0758127b2c - On Jaguar using the Cray compiler, choose the gettimeofday() timer,
as craycc doesn't cope with asm inline too well.

   This should be cmr:v1.5 

This commit was SVN r22170.
2009-10-29 21:31:10 +00:00
Rainer Keller
7dfe709ac1 - Initialize n before usage.
This commit was SVN r22169.
2009-10-29 15:52:53 +00:00
Rainer Keller
954d43a5dd - Finalize the compilation script for Jaguar.
Cray compiler seems to work (with a VT patchlet).
   In case ADD_* are not defined, don't have a "space" at the beginning of strings

   Fits into sw_install_new_version.sh and NCCS swtools (rebuild,retest)

This commit was SVN r22168.
2009-10-29 15:02:18 +00:00
Rainer Keller
4c437d6586 - OPAL function returns OPAL error codes...
This commit was SVN r22167.
2009-10-29 14:49:01 +00:00
Jeff Squyres
ba46010a56 Fix the nightly build.
This commit was SVN r22166.
2009-10-29 01:54:11 +00:00
Rainer Keller
507e22a7d3 - As promised in
http://www.open-mpi.org/faq/?category=debugging#valgrind_clean
   provide openmpi-valgrind.supp suppression file

This commit was SVN r22164.
2009-10-28 23:33:16 +00:00
Rainer Keller
63e540366b - Include the datatype tests again
make distcheck works
   contrib/dist/make_tarball succeeds too
   make check shows all 5 tests passing.

This commit was SVN r22163.
2009-10-28 23:19:04 +00:00