557 commits

Author SHA1 Message Date
Shiqing Fan
f5792bbda5 merging the memchecker into trunk.
This commit was SVN r17424.
2008-02-12 08:46:27 +00:00
Sharon Melamed
98e8de264d Wrapped the carto API in carto_base_wrapers.c
This commit was SVN r17380.
2008-02-05 19:29:16 +00:00
Sharon Melamed
025b68becf Move the carto framework to the trunk.
This commit was SVN r17177.
2008-01-23 09:20:34 +00:00
George Bosilca
7eca186568 Fix a typo related to the conversion from ompi_pointer_array
to opal_pointer_array.

This commit was SVN r17023.
2007-12-22 05:32:40 +00:00
George Bosilca
906e8bf1d1 Replace the ompi_pointer_array with opal_pointer_array. In the next step
(sometime after the merge with the ORTE branch), the opal_pointer_array
will become the only pointer_array implementation (the orte_pointer_array
will be removed).

This commit was SVN r17007.
2007-12-21 06:02:00 +00:00
Rich Graham
27a748e7eb change all instances of ompi_free_list_init to ompi_free_list_init_new. Header
and payload data are specified separately at this stage.

This commit was SVN r16633.
2007-11-01 23:38:50 +00:00
Ralph Castain
54b2cf747e These changes were mostly captured in a prior RFC (except for #2 below) and are aimed specifically at improving startup performance and setting up the remaining modifications described in that RFC.
The commit has been tested for C/R and Cray operations, and on Odin (SLURM, rsh) and RoadRunner (TM). I tried to update all environments, but obviously could not test them. I know that Windows needs some work, and have highlighted what is known to be needed in the odls process component.

This represents a lot of work by Brian, Tim P, Josh, and myself, with much advice from Jeff and others. For posterity, I have appended a copy of the email describing the work that was done:

As we have repeatedly noted, the modex operation in MPI_Init is the single greatest consumer of time during startup. To-date, we have executed that operation as an ORTE stage gate that held the process until a startup message containing all required modex (and OOB contact info - see #3 below) info could be sent to it. Each process would send its data to the HNP's registry, which assembled and sent the message when all processes had reported in.

In addition, ORTE had taken responsibility for monitoring process status as it progressed through a series of "stage gates". The process reported its status at each gate, and ORTE would then send a "release" message once all procs had reported in.

The incoming changes revamp these procedures in three ways:

1. eliminating the ORTE stage gate system and cleanly delineating responsibility between the OMPI and ORTE layers for MPI init/finalize. The modex stage gate (STG1) has been replaced by a collective operation in the modex itself that performs an allgather on the required modex info. The allgather is implemented using the orte_grpcomm framework since the BTL's are not active at that point. At the moment, the grpcomm framework only has a "basic" component analogous to OMPI's "basic" coll framework - I would recommend that the MPI team create additional, more advanced components to improve performance of this step.

The other stage gates have been replaced by orte_grpcomm barrier functions. We tried to use MPI barriers instead (since the BTL's are active at that point), but - as we discussed on the telecon - these are not currently true barriers so the job would hang when we fell through while messages were still in process. Note that the grpcomm barrier doesn't actually resolve that problem, but Brian has pointed out that we are unlikely to ever see it violated. Again, you might want to spend a little time on an advanced barrier algorithm as the one in "basic" is very simplistic.

Summarizing this change: ORTE no longer tracks process state nor has direct responsibility for synchronizing jobs. This is now done via collective operations within the MPI layer, albeit using ORTE collective communication services. I -strongly- urge the MPI team to implement advanced collective algorithms to improve the performance of this critical procedure.


2. reducing the volume of data exchanged during modex. Data in the modex consisted of the process name, the name of the node where that process is located (expressed as a string), plus a string representation of all contact info. The nodename was required in order for the modex to determine if the process was local or not - in addition, some people like to have it to print pretty error messages when a connection failed.

The size of this data has been reduced in three ways:

(a) reducing the size of the process name itself. The process name consisted of two 32-bit fields for the jobid and vpid. This is far larger than any current system, or system likely to exist in the near future, can support. Accordingly, the default size of these fields has been reduced to 16-bits, which means you can have 32k procs in each of 32k jobs. Since the daemons must have a vpid, and we require one daemon/node, this also restricts the default configuration to 32k nodes.

To support any future "mega-clusters", a configuration option --enable-jumbo-apps has been added. This option increases the jobid and vpid field sizes to 32-bits. Someday, if necessary, someone can add yet another option to increase them to 64-bits, I suppose.

(b) replacing the string nodename with an integer nodeid. Since we have one daemon/node, the nodeid corresponds to the local daemon's vpid. This replaces an often lengthy string with only 2 (or at most 4) bytes, a substantial reduction.

(c) when the mca param requesting that nodenames be sent to support pretty error messages is set, a second mca param is now used to request FQDN - otherwise, the domain name is stripped (by default) from the message to save space. If someone wants to combine those into a single param somehow (perhaps with an argument?), they are welcome to do so - I didn't want to alter what people are already using.

While these may seem like small savings, they actually amount to a significant impact when aggregated across the entire modex operation. Since every proc must receive the modex data regardless of the collective used to send it, just reducing the size of the process name removes nearly 400MBytes of communication from a 32k proc job (admittedly, much of this comm may occur in parallel). So it does add up pretty quickly.


3. routing RML messages to reduce connections. The default messaging system remains point-to-point - i.e., each proc opens a socket to every proc it communicates with and sends its messages directly. A new option uses the orteds as routers - i.e., each proc only opens a single socket to its local orted. All messages are sent from the proc to the orted, which forwards the message to the orted on the node where the intended recipient proc is located - that orted then forwards the message to its local proc (the recipient). This greatly reduces the connection storm we have encountered during startup.

It also has the benefit of removing the sharing of every proc's OOB contact with every other proc. The orted routing tables are populated during launch since every orted gets a map of where every proc is being placed. Each proc, therefore, only needs to know the contact info for its local daemon, which is passed in via the environment when the proc is fork/exec'd by the daemon. This alone removes ~50 bytes/process of communication that was in the current STG1 startup message - so for our 32k proc job, this saves us roughly 32k*50 = 1.6MBytes sent to 32k procs = 51GBytes of messaging.

Note that you can use the new routing method by specifying -mca routed tree - if you so desire. This mode will become the default at some point in the future.


There are a few minor additional changes in the commit that I'll just note in passing:

* propagation of command line mca params to the orteds - fixes ticket #1073. See note there for details.

* requiring of "finalize" prior to "exit" for MPI procs - fixes ticket #1144. See note there for details.

* cleanup of some stale header files

This commit was SVN r16364.
2007-10-05 19:48:23 +00:00
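
Some of the figures in the r16364 message above can be reproduced with simple arithmetic. The sketch below is illustrative only and is not ORTE code. It assumes that "32k" means 32,000, that the 16-bit jobid/vpid fields are signed (which would match the stated 32k limit), and that a message between two procs on the same node passes through a single orted. The helper names are hypothetical.

```python
# Illustrative arithmetic for the r16364 commit message; not ORTE code.
from typing import List, Tuple

PROCS = 32_000  # assumption: "32k" read as 32,000 processes


def name_bytes(field_bits: int) -> int:
    """Size in bytes of a process name made of a jobid and a vpid field."""
    return 2 * field_bits // 8


def max_signed(field_bits: int) -> int:
    """Largest value of a signed field (assumes the fields are signed)."""
    return 2 ** (field_bits - 1) - 1


def oob_contact_savings(procs: int, bytes_per_proc: int = 50) -> Tuple[int, int]:
    """Return (bytes per startup message, total bytes over all recipients)."""
    per_message = procs * bytes_per_proc
    return per_message, per_message * procs


def route_hops(src_node: int, dst_node: int) -> List[str]:
    """Hops of a message when the orteds act as routers (hypothetical helper)."""
    hops = ["sender proc", f"orted on node {src_node}"]
    if dst_node != src_node:
        # Assumption: only a cross-node message needs a second orted.
        hops.append(f"orted on node {dst_node}")
    hops.append("recipient proc")
    return hops


if __name__ == "__main__":
    print(name_bytes(32), "->", name_bytes(16), "bytes per process name")
    print("largest signed 16-bit id:", max_signed(16))
    per_message, total = oob_contact_savings(PROCS)
    print(f"{per_message / 1e6:.1f} MB per message, {total / 1e9:.1f} GB in total")
    print(" -> ".join(route_hops(0, 3)))
```

With these assumptions the OOB contact figures come out at 1.6 MB per startup message and about 51 GB across all recipients, in line with the numbers quoted in the message.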
Shiqing Fan
0f468f3668 - Remove the solution and project files, will commit them later.
This commit was SVN r15705.
2007-07-31 17:07:02 +00:00
Shiqing Fan
4d7b349cdb - Add VC8 solution and project files.
- If one wants to use this solution, remember to unload the project 'orte-restart' which is currently not working for Windows.

This commit was SVN r15680.
2007-07-30 11:05:34 +00:00
Tim Prins
7445a11f61 Remove duplicate tests. The current version of the dss tests are in orte/test/unit/dss
Remove defunct testing matrix

This commit was SVN r15535.
2007-07-20 13:37:44 +00:00
Ralph Castain
511457feb5 Remove stale test code. At least we were wise enough to have eliminated this code from the "make check" tree, but almost none of it compiles and of what does compile, nothing seems to really work.
This commit was SVN r15446.
2007-07-16 16:34:14 +00:00
Jeff Squyres
f72b52bb1d s/ifdef/if/ for OMPI_C_HAVE_VISIBILITY to enable static builds.
This commit was SVN r14985.
2007-06-11 13:20:56 +00:00
George Bosilca
29dd535c01 Remove all references to the orte_bitmap as well as the files.
This commit was SVN r14928.
2007-06-06 20:24:07 +00:00
Brian Barrett
42c74b2cf7 fix test case so that condition variables work right, at least on PTHREADS.
I'm pretty sure condition variables are wrong for Solaris threads.

This commit was SVN r14877.
2007-06-05 19:24:17 +00:00
Brian Barrett
60571567a4 Better fix than r14831 -- ddt_pack was removed from TESTS because it
calls MPI_INIT and that causes problems during make distcheck.  Instead
put it in check_PROGRAMS which lets it get built, but doesn't run it.

This commit was SVN r14832.

The following SVN revision numbers were found above:
  r14831 --> open-mpi/ompi@9258c5200a
2007-06-01 14:34:06 +00:00
Rainer Keller
9258c5200a - As we need to reconfigure anyhow, get rid of autogen warning.
This commit was SVN r14831.
2007-06-01 08:20:38 +00:00
Brian Barrett
c7937ec02e until I figure out why MPI_INIT failed during make distcheck
This commit was SVN r14816.
2007-05-31 02:31:12 +00:00
George Bosilca
07f51ae5dc Make the test a little bit more difficult.
This commit was SVN r14814.
2007-05-30 22:40:16 +00:00
Brian Barrett
f02a9525dc add pack / unpack test
This commit was SVN r14801.
2007-05-30 17:41:15 +00:00
Sven Stork
fc932f1fb4 - changes to get the tests running with visibility enabled
This commit was SVN r14730.
2007-05-23 15:02:36 +00:00
Brian Barrett
21e00f6f0c Clean up a couple of configure things:
  * Require Autoconf 2.60 or higher and remove some cruft
    required for AC 2.59 or the AC 2.59 / AC 2.60 mix
  * Remove a bunch of now unnecessary AC_SUBST calls
  * Use the libtool-provided variables for the -I and
    library to use when compiling against ltdl

Fixes trac:1000

This commit was SVN r14652.

The following Trac tickets were found above:
  Ticket 1000 --> https://svn.open-mpi.org/trac/ompi/ticket/1000
2007-05-15 04:23:48 +00:00
George Bosilca
f2a6b9394f Deal with the include spree. Protect "environ" on Windows.
Some other minor modifications in order to make it
compile [again] on Windows.

This commit was SVN r14188.
2007-04-01 16:16:54 +00:00
Jeff Squyres
4d8ee3d1e1 Add missing #include; fix the build for some picky compilers.
This commit was SVN r13696.
2007-02-17 11:54:40 +00:00
George Bosilca
beb9be3fe4 Don't import the datatype debug output if we're not in debug mode.
This commit was SVN r13650.
2007-02-14 16:47:12 +00:00
George Bosilca
06044db69a Add another test for the data-type engine. This test packs and unpacks
the data in a way similar to the multi-network OB1 PML.

This commit was SVN r13632.
2007-02-13 09:30:19 +00:00
Jeff Squyres
9fb004ab8e remove the legal_numbits tests
This commit was SVN r13575.
2007-02-09 03:18:33 +00:00
Brian Barrett
6f8b366acb Rename liborte to libopen-rte and libopal to libopen-pal per telecon today
and bug #632.

Refs trac:632

This commit was SVN r12762.

The following Trac tickets were found above:
  Ticket 632 --> https://svn.open-mpi.org/trac/ompi/ticket/632
2006-12-05 18:27:24 +00:00
George Bosilca
56748d5f57 Correctly initialize the unpack buffer.
This commit was SVN r12529.
2006-11-10 05:11:02 +00:00
Sven Stork
9cf5b3709c - Add comment for volatile.
This commit was SVN r12436.
2006-11-06 14:00:43 +00:00
Sven Stork
27420fbda3 - Make counter volatile to prevent the compiler from performing optimisations.
  Without this, a compiler could assume that the counter is not updated
  by the malloc call and remove the test in the assert and always trigger
  the assertion.

This commit was SVN r12419.
2006-11-03 10:46:18 +00:00
George Bosilca
14c49b226a The data-type tests have to be updated too.
This commit was SVN r12334.
2006-10-27 05:34:26 +00:00
George Bosilca
a1c9a374eb Remove all the warnings from the data-type engine testing.
This commit was SVN r12167.
2006-10-18 17:00:43 +00:00
Jeff Squyres
5662122885 Fix "make dist". Temporarily snip some tests from the tarball until
they can be repaired (more changes coming in from the mad branch that
will break them).

This commit was SVN r11560.
2006-09-08 00:09:37 +00:00
Ralph Castain
7088c1a8a1 Remove stale tests from the "make check" routine
This commit was SVN r11525.
2006-09-05 13:05:03 +00:00
Jeff Squyres
d4a2a51921 Remove a bunch of compiler warnings and make the test a little more
correct.

This commit was SVN r11521.
2006-09-03 14:23:02 +00:00
George Bosilca
76d2a0bb74 Remove the reference to the path_sep field from the test.
This commit was SVN r11396.
2006-08-24 16:17:33 +00:00
Ralph Castain
8c7f0ed9ae Change the SOH to the new State Monitoring and Reporting (SMR) framework. New API's will be appearing in the new framework shortly - this just gets the name change into the system.
Other changes:

1. Remove the old xcpu components as they are not functional.

2. Fix a "bug" in orterun whereby we called dump_aborted_procs even when we normally terminated. There is still some kind of bug in this procedure, however, as we appear to be calling the orterun job_state_callback function every time a process terminates (instead of only once when they have all terminated). I'll continue digging into that one.

This will require an autogen/configure, I'm afraid.

This commit was SVN r11228.
2006-08-16 16:35:09 +00:00
Terry Dontje
67980d7f97 Removed include of stdbool.h since it was not being used and was causing the
Sun compilers to abort when make check was done.

This commit was SVN r11145.
2006-08-10 14:25:45 +00:00
Josh Hursey
d082a63734 Add some new OPAL functionality.
After seeing the ugliness that is removing directories in the
codebase I decided to push down this to the OPAL by extending the
opal/os_create_dirpath.(c|h) to contain some more functionality.

In this process I renamed 'os_create_dirpath' to 'os_dirpath' since it
is a bit more general now.

Added a few functions to:
 - check if a directory is empty
 - check to see if the access permissions are set correctly
 - destroy the directory at the end of the dirpath
   - By using a caller callback function (a la Perl, I believe)
     for every file, the caller can have fine grained control over
     whether a specific file is deleted or not.

This simplifies things a bit for orte_session_dir_(finalize|cleanup)
as it should no longer contain any of this functionality, but uses
these functions to do the work.

From the external perspective nothing has changed, from the 
developer point of view we have some cleaner, more generic code.

This commit was SVN r10640.
2006-07-03 22:23:07 +00:00
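
The callback-driven directory removal described in the r10640 message above can be sketched as follows. This is a minimal illustration of the idea, not the OPAL API: the function names, the callback signature, and the choice to recurse into subdirectories are all assumptions.

```python
# Sketch of the directory helpers described above; not the OPAL API.
import os
import tempfile
from typing import Callable


def dir_is_empty(path: str) -> bool:
    """True if the directory has no entries."""
    return not os.listdir(path)


def dir_destroy(path: str, may_delete: Callable[[str, str], bool]) -> bool:
    """Delete the files under `path` that the caller's callback approves.

    `may_delete(directory, filename)` is called for every file, so the caller
    decides file by file. A directory is removed only once it is empty.
    Returns True if `path` itself was removed.
    """
    for entry in os.listdir(path):
        full = os.path.join(path, entry)
        if os.path.isdir(full) and not os.path.islink(full):
            dir_destroy(full, may_delete)  # assumption: recurse into subdirs
        elif may_delete(path, entry):
            os.remove(full)
    if dir_is_empty(path):
        os.rmdir(path)
        return True
    return False


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    for name in ("scratch.tmp", "keep.txt"):
        open(os.path.join(root, name), "w").close()
    # Only *.tmp files may go, so the directory survives with keep.txt in it.
    print(dir_destroy(root, lambda d, f: f.endswith(".tmp")), os.listdir(root))
    # Approving everything removes the directory itself.
    print(dir_destroy(root, lambda d, f: True))
```

Passing the decision to a callback keeps the traversal generic: session-directory cleanup can protect specific files without the helper knowing anything about them.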
Brian Barrett
0482bbd94b * add event into dist_subdirs so that the Makefile.in gets created
This commit was SVN r10559.
2006-06-28 22:53:21 +00:00
Brian Barrett
b34768962c * put event library tests into the testing infrastructure so that they can
be built without heroic effort

This commit was SVN r10517.
2006-06-26 22:28:59 +00:00
George Bosilca
379b170a29 Update the datatype tests.
This commit was SVN r10511.
2006-06-26 20:10:27 +00:00
George Bosilca
1ab7dcc632 Cleanups.
This commit was SVN r10509.
2006-06-26 20:09:04 +00:00
Jeff Squyres
f08e54029c - Update svn:ignore
- Build to_self, but don't run it during "make check" (because it
  calls MPI_INIT, which requires a functional install)

This commit was SVN r10491.
2006-06-23 02:14:27 +00:00
Jeff Squyres
67b07ba4fc AM complains if we define names with specific suffixes and the
executable name is not listed anywhere -- so just comment them out for
now.

This commit was SVN r10472.
2006-06-22 11:56:18 +00:00
Jeff Squyres
fa6b6c6098 This test calls MPI_INIT -- can't do that in the unit tests because
that assumes that OMPI has been fully installed (e.g., that may not be
valid during "make distcheck")

This commit was SVN r10470.
2006-06-22 11:47:31 +00:00
George Bosilca
efb987f156 Output the right message.
This commit was SVN r10457.
2006-06-21 16:25:02 +00:00
George Bosilca
9436729bee Improving the checksum test.
This commit was SVN r10435.
2006-06-20 15:55:54 +00:00
George Bosilca
1e4199ee61 Add more datatype tests. One to check all communications to self, it is used
to compute the overhead of the convertor (and all convertor related operations).
The second will check the position setting on the convertor. Not yet completed ...

This commit was SVN r10432.
2006-06-20 14:19:44 +00:00
George Bosilca
338ef1dc96 The convertor should be prepared before calling personalize. Otherwise, no
datatype is attached to it.

This commit was SVN r10419.
2006-06-19 15:57:33 +00:00
George Bosilca
ba914bfb52 Don't use srandomdev (it's BSD specific). Instead use srandom with the time ...
This commit was SVN r10391.
2006-06-16 06:47:35 +00:00
George Bosilca
ad1065d572 Even more complex. Now we do the unpacking using 2 iovecs. And still working ...
This commit was SVN r10374.
2006-06-15 06:21:16 +00:00
George Bosilca
d55783643d An updated version with a behavior closer to the buffered send.
This commit was SVN r10373.
2006-06-15 06:07:11 +00:00
George Bosilca
cb2f35b875 Add a checksum test. It checks whether the same operation (pack/unpack)
done with the same values on 2 different types returns the same value. The 2
types belong to 2 different classes: contiguous and sparse. With this test
I simulate the behavior of the buffered send, where the sender sends the
data from the user attached buffer (which is contiguous) and the receiver
receives it in a sparse type.

This commit was SVN r10372.
2006-06-15 05:28:17 +00:00
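
The idea behind the checksum test in r10372 above can be sketched in a few lines. This is not the Open MPI test itself: the strided buffer standing in for a sparse datatype, the position-weighted checksum, and all function names are assumptions chosen for illustration.

```python
# Sketch of the idea behind the checksum test; not the Open MPI test itself.
from typing import List, Optional, Sequence


def pack_contiguous(values: Sequence[int]) -> List[int]:
    """Packing a contiguous layout copies the elements in order."""
    return list(values)


def unpack_sparse(packed: Sequence[int], stride: int) -> List[Optional[int]]:
    """Scatter packed elements into a buffer with gaps (None marks a hole)."""
    sparse: List[Optional[int]] = [None] * (len(packed) * stride)
    for i, value in enumerate(packed):
        sparse[i * stride] = value
    return sparse


def checksum(buffer: Sequence[Optional[int]]) -> int:
    """Position-weighted sum over the data elements, skipping holes."""
    total = 0
    index = 0
    for value in buffer:
        if value is not None:
            index += 1
            total = (total + index * value) % 65521
    return total


if __name__ == "__main__":
    data = [3, 1, 4, 1, 5, 9, 2, 6]
    received = unpack_sparse(pack_contiguous(data), stride=3)
    # The sender's contiguous view and the receiver's sparse view must agree.
    print(checksum(data) == checksum(received))
```

Weighting each element by its position means the checksum also changes if elements arrive in the wrong order, which a plain sum would miss.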
Brian Barrett
2e864470d4 * add include now needed with rearranging of includes in ompi class code
This commit was SVN r10361.
2006-06-14 21:30:17 +00:00
Galen Shipman
18dda70fd0 make ompi_free_list_item_t a class..
This will go to the 1.1 branch but will probably require a few changes as
ompi_free_list_t is different in the branch.. 

This commit was SVN r10306.
2006-06-12 16:44:00 +00:00
George Bosilca
9da7af4c96 Remove all warnings except the missing prototypes.
This commit was SVN r10108.
2006-05-26 20:53:35 +00:00
George Bosilca
6df7bf1a0f Remove one useless test.
This commit was SVN r10004.
2006-05-22 06:13:49 +00:00
George Bosilca
eb149cb9c8 Move the datatype tests into their own directory.
This commit was SVN r10003.
2006-05-22 06:12:43 +00:00
Rainer Keller
0f9b10ff8e - Update test dup MPI_COMM_WORLD -- so that we may
have additional Barriers for output.

This commit was SVN r9831.
2006-05-05 07:42:33 +00:00
George Bosilca
4f465967c7 Don't run if there are not at least 2 processes.
This commit was SVN r9668.
2006-04-20 19:12:09 +00:00
Brian Barrett
29c70291a9 * properly distribute the peruse test directory so that nightly builds
happen

This commit was SVN r9412.
2006-03-24 13:42:01 +00:00
George Bosilca
1439fb6e33 Looks like we finally managed to do it correctly. Thanks Jeff.
This commit was SVN r9376.
2006-03-23 06:58:49 +00:00
George Bosilca
51df8175d7 Allow conditional compilation of the PERUSE test while still adding PERUSE to
the make dist target.

This commit was SVN r9375.
2006-03-23 05:53:38 +00:00
George Bosilca
57f0eeccff Activate the PERUSE test if we compile with PERUSE support.
This commit was SVN r9374.
2006-03-23 05:10:08 +00:00
George Bosilca
aef1358808 First import of PERUSE. As it looks like SVN doesn't like to import simultaneously
2 directories having the same name I have to split the import in 2. I start with
the test and the configure.

This commit was SVN r9372.
2006-03-23 04:54:10 +00:00
Brian Barrett
89a22615ce * the .c files created by symlinking get shipped in the tarball as
actual files, so we should not have a clean rule for them - instead,
  make it maintainer-clean.  Neither clean nor distclean should
  remove files that were in the tarball...

This commit was SVN r9351.
2006-03-21 03:14:49 +00:00
Brian Barrett
1398169700 * forgot to fix up includes in the test directory with yesterday's commit.
This commit was SVN r8996.
2006-02-12 19:51:24 +00:00
Brian Barrett
566a050c23 Next step in the project split, mainly source code re-arranging
- move files out of toplevel include/ and etc/, moving it into the
    sub-projects
  - rather than including config headers with <project>/include, 
    have them as <project>
  - require all headers to be included with a project prefix, with
    the exception of the config headers ({opal,orte,ompi}_config.h
    mpi.h, and mpif.h)

This commit was SVN r8985.
2006-02-12 01:33:29 +00:00
Brian Barrett
39aadc3d9c * fix makefile so that make dist works again
This commit was SVN r8919.
2006-02-07 13:20:07 +00:00
Ralph Castain
4b9f015c0b Merge in the new data support subsystem for ORTE. MPI folks should not notice a difference. Longer explanation will be sent to developers mailing list.
This commit was SVN r8912.
2006-02-07 03:32:36 +00:00
Brian Barrett
f23356aa02 * if we don't have mmap hook support, don't expect to get a callback from
mmap().

This commit was SVN r8609.
2005-12-24 05:34:51 +00:00
Brian Barrett
a5af07cd6b fixes suggested by Ralf for supporting both Libtool 1 and 2 in Open MPI...
This commit was SVN r8538.
2005-12-19 03:10:23 +00:00
Brian Barrett
902122e248 * update tests to match memory api changes
This commit was SVN r8531.
2005-12-16 21:26:13 +00:00
George Bosilca
e77f60abfe Use __WINDOWS__ instead of _WIN32.
This commit was SVN r8479.
2005-12-13 06:05:37 +00:00
Brian Barrett
f44bd9e067 * Intercept both allocations and deallocations from ptmalloc2 (including
both mmap and munmap), adjusting the configure script so that the
  component will only be activated on systems that use ptmalloc2 in the
  first place -- ie, Linux
* Remove the malloc_hooks component - it became an unworkable solution
  once threads and such were considered.
* Remove malloc_interpose component - it never worked quite right and
  was not going to be able to intercept malloc, so it wasn't going to
  be useful for OMPI's purposes.
* Update tests a little bit to match recent memory hooks api
  issues - still needs a bit of work.

This commit was SVN r8381.
2005-12-06 00:44:50 +00:00
Brian Barrett
f301d06fd4 * clean up the memory check test a little bit - still needs some work
This commit was SVN r8343.
2005-11-30 22:32:18 +00:00
Brian Barrett
79bf8843d2 * update memory hooks interface to allow for callbacks on both allocations
and deallocations, per request from Galen and Tim

This commit was SVN r8303.
2005-11-29 04:46:14 +00:00
Brian Barrett
c6fb3217f8 always include ltdl.h with the full opal path, so that we always grab our
version instead of the (possible) system installed version.

This commit was SVN r8248.
2005-11-23 19:30:12 +00:00
Jeff Squyres
4a208939f3 Don't run ompi_fifo and ompi_circular_buffer tests; the interfaces
have changed and the tests have not changed with them.

This commit was SVN r8137.
2005-11-13 11:33:23 +00:00
Brian Barrett
878676218e Rename opal/memory to opal/memoryhooks because XLC++ on Mac OS X is broken.
When compiling C++ code that includes something that looks for the C++
header file "memory" (stupid C++ headers not having .h extensions), it
goes through the header file search path, which includes $(topsrcdir)/opal,
so it finds the directory $(topsrcdir)/opal/memory/ and tries to load
that as the memory header file and all goes downhill.

This commit was SVN r8111.
2005-11-11 00:26:27 +00:00
Brian Barrett
586a9be557 * make it easier to compare free timings with / without memory hooks enabled
This commit was SVN r8059.
2005-11-09 18:20:08 +00:00
Jeff Squyres
42ec26e640 Update the copyright notices for IU and UTK.
This commit was SVN r7999.
2005-11-05 19:57:48 +00:00
Ralph Castain
eebda71a0b Add a new API to the registry for conditional data retrievals. The new API allows you to retrieve data from registry containers that have key-value pairs where the value matches the specified one. The requested keys are then retrieved from that container.
This commit was SVN r7907.
2005-10-28 00:30:58 +00:00
Jeff Squyres
94d38b1812 Add header file for non-g++ compilers
This commit was SVN r7875.
2005-10-25 21:59:01 +00:00
Brian Barrett
7c924dc221 * don't try to fire up orte - nothing good comes from trying to open all those
components...

This commit was SVN r7517.
2005-09-27 14:48:41 +00:00
Galen Shipman
51f1c7a8e4 bring ompi_fifo up to new mpool interface, looks like this has been stale for
some time. Comment out an incorrect test in ompi_rb_tree.c 

This commit was SVN r7516.
2005-09-27 14:36:53 +00:00
Galen Shipman
c728c0acc5 more changes to test..
seems to have errors inserting some items, these are apparently duplicates.. 

This commit was SVN r7505.
2005-09-24 17:07:08 +00:00
Galen Shipman
ac935f3a51 a very basic test for using the tree with base and bound segments..
- FAILS..

This commit was SVN r7502.
2005-09-24 16:19:44 +00:00
Brian Barrett
4efd0405c3 * fixes to make tests compile again - dunno when these broke :/
This commit was SVN r7500.
2005-09-24 01:17:32 +00:00
Brian Barrett
ed56e743b7 * update configure.ac to use the modern version of AC_INIT and
AM_INIT_AUTOMAKE, instead of the deprecated version.
* Work around dumbness in modern AC_INIT that requires the version
  number to be set at autoconf time (instead of at configure time, as
  it was before).  Set the version number, minus the subversion r number,
  at autoconf time.  Override the internal variables to include the r
  number (if needed) at configure time.  Basically, the right thing
  should always happen.  The only place it might not is the version
  reported as part of configure --help will not have an r number.
* Since AM_INIT_AUTOMAKE takes a list of options, no need to specify
  them in all the Makefile.am files.
* Adds support for subdir-objects, meaning that object files are put
  in the directory containing source files, even if the Makefile.am is
  in another directory.  This should start making it feasible to
  reduce the number of Makefile.am files we have in the tree, which
  will greatly reduce the time to run autogen and configure.

This commit was SVN r7211.
2005-09-07 05:54:53 +00:00
Jeff Squyres
dc7326a5e5 Minor fix to find header files.
This commit was SVN r7191.
2005-09-06 10:35:17 +00:00
Brian Barrett
827122fc62 * order libraries in the correct order so that opal_error test works
properly

This commit was SVN r7141.
2005-09-02 03:28:38 +00:00
Ralph Castain
66a215eae1 More memory cleanup...
1. Valgrind is good for something - chasing down memory leaks in registry led me to re-visit the dictionary functions and discover that I wasn't keeping track of the number of dictionary entries on each segment! Resulted in wasted time searching blank entries as well as leaked memory. This has now been fixed.

2. Fixed the orte_bitmap test. The init function for that class has been eliminated and the constructor adjusted to provide that functionality.

This commit was SVN r7136.
2005-09-02 00:26:58 +00:00
Ralph Castain
76e622a552 Clean up a few memory leaks - more to go...
This commit was SVN r7134.
2005-09-01 17:38:04 +00:00
Jeff Squyres
3962c53e2e - Add to AM_CPPFLAGS $(OPAL_LTDL_CPPFLAGS) where necessary in order to
add a -I to find the included ltdl.h (vs. a system-installed ltdl.h)
- Clean up cruft in a bunch of Makefile.am's to remove now-unnecessary
  AM_CPPFLAGS settings to get static-components.h for each framework
- Move the component_repository API functions out of opal/mca/base/base.h
  and into opal/mca/base/mca_base_component_repository.h in order to
  decrease unnecessary dependencies (e.g., before this, almost
  everything in the tree depended on ltdl.h, which is unnecessary --
  only a small number of files really need ltdl.h)

This commit was SVN r7127.
2005-09-01 12:16:36 +00:00
Brian Barrett
77ebdf1c6f * Add some debugging output Ralph asked for when an unknown error code is
passed to opal_error

This commit was SVN r7087.
2005-08-29 23:36:53 +00:00
Brian Barrett
07b589100e * add test for init_finalize of orte (useful for memory leak checks)
* update ORTE tests to cope with change in prototype for orte_init()

This commit was SVN r7081.
2005-08-29 19:32:46 +00:00
Brian Barrett
660ce0a486 * update tests to reflect moving path_sep out of orte_sys_info and moving
orte_sys_info out of OPAL and into ORTE

This commit was SVN r7080.
2005-08-29 19:23:25 +00:00
Brian Barrett
2143ed4c81 * move error -> string converter registration from orte_init to
orte_init_stage1(), since not all ORTE processes call orte_init().
* Expand opal_error test case to make sure ORTE error codes print
  properly
* Make project error codes start at easy values (OPAL is -1 to -100,
  ORTE is -101 to -200, OMPI is less than -201) to make it easier
  to figure out what an error code as an integer means.  Also has
  the nice property of not changing the values of error codes every
  time a new error code is added.

This commit was SVN r7061.
2005-08-26 23:36:57 +00:00
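
The per-project error code ranges named in the r7061 message above make it possible to tell at a glance which layer an integer code came from. A minimal sketch of that lookup follows. The function name is hypothetical, and treating everything from -201 downward as OMPI is an assumption, since the message only says "less than -201".

```python
# Sketch of the error-code ranges named in the message; not Open MPI code.

def error_code_project(code: int) -> str:
    """Map a negative error code to the project whose range contains it."""
    if code >= 0:
        return "none"  # assumption: zero and positive values are not errors
    if code >= -100:
        return "OPAL"  # -1 to -100
    if code >= -200:
        return "ORTE"  # -101 to -200
    return "OMPI"      # assumption: -201 and below


if __name__ == "__main__":
    for code in (-1, -100, -101, -200, -201):
        print(code, error_code_project(code))
```

Because each project owns a fixed block, adding a new OPAL code leaves every ORTE and OMPI value unchanged, which is the property the message highlights.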
Jeff Squyres
c9cdb36b0b Finally get this right: move orte_sys_info.[ch] back into the orte
tree.
- fix up #include's throughout the tree (yay contrib/search_replace.pl!)
- remove a few extraneous #include's
- remove orte_sys_info*() from opal_init()/opal_finalize() (it's
  already in orte_init_stage1() and orte_system_finalize())
- remove dependencies in opal on orte_system_info -- util/os_path.c
  and util/os_create_dirpath.c (they only used path_sep, anyway --
  easily changed to #defines)

This commit was SVN r7059.
2005-08-26 21:03:41 +00:00
Brian Barrett
17c1bb355e * more memory leak fixes - mainly string params not being freed at end of
time
* Added code to free dps structures at shutdown

This commit was SVN r7043.
2005-08-26 02:08:23 +00:00
Brian Barrett
7a8e646fff * add missing OBJ_RELEASE in memory manager code
* fix off-by-one error in os_path code
* Bunch of memory fixes for OPAL unit tests

This commit was SVN r7033.
2005-08-25 16:28:41 +00:00
Brian Barrett
e2e18d49a3 * add trivial opal init/finalize app
This commit was SVN r7011.
2005-08-24 20:20:21 +00:00
Brian Barrett
1454e40f0e * for some reason, was having issues with C bool vs C++ bool on odin.
cast the return to an int in the C++ test case, just in case.
* C++ sucks.  If compiling with C++ on some GNU compiler/linker
  combos, the initialize hook isn't automagically fired for the
  malloc code.  Add a backup setting during opal_init, which is
  early enough not to cause any damage. 

This commit was SVN r6983.
2005-08-23 16:03:16 +00:00
Brian Barrett
f48968d8f4 clean up the error code situation - ensure that OMPI_ERROR == ORTE_ERROR ==
OPAL_ERROR, same for all the other error codes.  Also, make sure that there
are never conflicts between OPAL and ORTE error codes (for example).
Finally, provide opal_perror(), opal_strerror(), and opal_strerror_r() to
give stringified error messages for the different error codes

This commit was SVN r6969.
2005-08-22 03:05:39 +00:00
Brian Barrett
8d15ee8b2f * remove pml direct call header file as part of make distclean
* remove output files for tests as part of make clean

This commit was SVN r6966.
2005-08-21 23:48:12 +00:00
Brian Barrett
948d8963e2 * must add SUBDIRS to DIST_SUBDIRS explicitly. duh.
This commit was SVN r6965.
2005-08-21 23:37:20 +00:00
Brian Barrett
b70470d693 * oops - meant to move runtime to DIST_SUBDIRS, not remove it completely.
This commit was SVN r6964.
2005-08-21 23:02:39 +00:00
Brian Barrett
2191d4a6b9 * don't do "make check" into runtime/ automatically because the start/stop
test requires MCA be setup properly and the sigchld test really isn't that
  important at this point

This commit was SVN r6962.
2005-08-21 22:40:11 +00:00
Brian Barrett
85beffc8f9 * don't try to test functionality the system told you wasn't there ;)
This commit was SVN r6959.
2005-08-21 21:37:28 +00:00
Brian Barrett
25701c739f * remove debugging #if 0 that accidentally sneaked into the test
This commit was SVN r6958.
2005-08-21 21:31:17 +00:00
Brian Barrett
eeeaa7f162 * bunch of header file and constant changes to make test tree run
"make check" after the opal/orte/ompi constants renames

This commit was SVN r6951.
2005-08-21 19:21:09 +00:00
Brian Barrett
4da11a4851 * add simple C++ test to make sure that new/delete trigger the correct
callbacks

This commit was SVN r6924.
2005-08-18 15:45:09 +00:00
Brian Barrett
dfdb5dc12a * high resolution, low latency timers for a number of platforms, plus mods
to opal_progress() to use the timers instead of a tick count for deciding
  whether to call the event loop or not.  Currently supported platforms are:

     - Solaris (x86 / SPARC)
     - Linux (x86 / x86_64 / IA64)
     - Mac OS X (x86 / PowerPC)

This commit was SVN r6922.
2005-08-18 05:34:22 +00:00
Brian Barrett
1134d9b7d7 * remove old, broken, horrible hack for doing memory intercepts on Darwin
* Add memory intercept routines for Darwin using the official Darwin
  API (thanks to Drew Gallatin from Myricom for pointing me to some
  information from Apple engineers about how to make this work)
* add debugging output to functionality test

This commit was SVN r6920.
2005-08-18 02:59:02 +00:00
Brian Barrett
4b3969d2ab * add header file so that this compiles again
This commit was SVN r6917.
2005-08-17 22:44:13 +00:00
Jeff Squyres
c465eb8567 Rename opal/threads/thread.h -> opal/threads/threads.h to avoid a
naming conflict with Solaris' <thread.h>

This commit was SVN r6879.
2005-08-15 11:02:01 +00:00
Brian Barrett
b514fc10da * add WRAPPER_LDFLAGS to LDFLAGS, since we might be adding random hooks in
there

This commit was SVN r6871.
2005-08-14 17:22:54 +00:00
Brian Barrett
683e80c48b * fix basic test to work better when using memory intercept methods that
only catch actual releases back to the OS

This commit was SVN r6866.
2005-08-14 03:10:51 +00:00
Brian Barrett
1b830beddb * move over changes from the /tmp/bwb-memory-hooks/copy-1 into the trunk.
This includes updates to the malloc_hook method and making everything
  components.

This commit was SVN r6852.
2005-08-13 01:08:34 +00:00
Brian Barrett
c9f5e591b1 * make sure to try munmap when testing the hooks
* add check to see impact of our hooks with malloc/free timings

This commit was SVN r6817.
2005-08-12 13:29:26 +00:00
Brian Barrett
f707ba2dd3 * Add memory dispatching code for OPAL. This allows anyone to register
callbacks to be triggered when memory is about to leave the current
  process.  The system is designed to allow a variety of interfaces,
  hopefully including whole-sale replacement of the memory manager,
  ld preload tricks, and hooks into the system memory manager.  Since
  some of these may or may not be available at runtime and we won't know
  until runtime, there is a query function to look for availability of
  such a setup.
* Added ptmalloc2 memory manager replacement code.  Not turned on by
  default, can be enabled with --with-memory-manager=ptmalloc2.
  Only tested on Linux, not even compiled elsewhere.  Do not use
  on OS X, or you will never see your process again.
* Added AM_CONDITIONAL for threads test to support ptmalloc2's build
  system

This commit was SVN r6790.
2005-08-09 22:40:42 +00:00
Ralph Castain
5208f9001d Update the gpr unit tests
This commit was SVN r6758.
2005-08-07 13:09:34 +00:00
Ralph Castain
4e1837687b Finish simplified interfaces for put and subscribe - more details to come.
This commit was SVN r6713.
2005-08-02 19:43:29 +00:00
Ralph Castain
ed1022afd3 Update the unit test for the new put functions.
I'll send out a general note about this in the morning, but for now I'll just notify people through this note that the new simplified "put" commands have been debugged and work just fine. I'll add documentation to the gpr.h file later - the only thing to really be aware of is that the tokens array must be NULL terminated. Other than that, things work pretty much as you'd expect.

This commit was SVN r6700.
2005-08-02 02:31:53 +00:00
Ralph Castain
63cef99bcd Add unit test for quick put function - not fully ready yet
This commit was SVN r6693.
2005-08-01 21:45:39 +00:00
Jeff Squyres
28d6651350 Oops -- that should not have been committed.
This commit was SVN r6599.
2005-07-24 11:27:56 +00:00
Jeff Squyres
9dab81d86b A bunch of updates to the unit tests
- Update svn:ignore's to match new executable names
- Consolidate the unit test Makefile.am flags into a testing
  Makefile.options 
- Remove a bunch of SUBDIRS from test/mca/Makefile so that they don't
  run by default, but can be invoked manually (they're still in
  DIST_SUBDIRS) 

This commit was SVN r6598.
2005-07-23 11:11:19 +00:00
Ralph Castain
13fdcff66b Fix a bug Greg was seeing on subscription returns - problem in pointer arithmetic
This commit was SVN r6594.
2005-07-22 20:46:07 +00:00
Ralph Castain
daf3ee8172 fix the dps tests to support new notify_data type definition
This commit was SVN r6568.
2005-07-20 19:00:54 +00:00
Brian Barrett
b04c726ad1 Fix up tests so that they all compile and (mostly) run
This commit was SVN r6338.
2005-07-04 14:53:10 +00:00
Brian Barrett
46245aaac1 * rename orte_os_create_dirpath to opal_os_create_dirpath
* rename orte_os_path to opal_os_path
* rename ompi_path_find to opal_path_find
* rename ompi_pow2 to opal_pow2

This commit was SVN r6334.
2005-07-04 01:59:52 +00:00
Brian Barrett
e55f99d23a * rename ompi_if to opal_if
* rename ompi_malloc to opal_malloc
* rename ompi_numtostr to opal_numtostr
* start of rename of ompi_environ to opal_environ

This commit was SVN r6332.
2005-07-04 01:36:20 +00:00
Brian Barrett
9f44b80291 * rename ompi_argv to opal_argv
* rename ompi_basename to opal_basename
* rename ompi bitop functions to opal
* rename ompi_cmd_line to opal_cmd_line
* rename ompi_sizet2int to opal_sizet2int
* rename orte_daemon_init to opal_daemon_init
* rename ompi_few to opal_few

This commit was SVN r6330.
2005-07-04 00:13:44 +00:00
Brian Barrett
a13166b500 * rename ompi_output to opal_output
This commit was SVN r6329.
2005-07-03 23:31:27 +00:00
Brian Barrett
23b687b0f4 * rename ompi_event to opal_event
This commit was SVN r6328.
2005-07-03 23:09:55 +00:00
Brian Barrett
39dbeeedfb * rename locking code from ompi to opal
This commit was SVN r6327.
2005-07-03 22:45:48 +00:00
Brian Barrett
ccd2624e3f * rename ompi_progress to opal_progress
This commit was SVN r6326.
2005-07-03 21:57:43 +00:00
Brian Barrett
9da0b4fe1d * rename all the atomic functions from ompi to opal
This commit was SVN r6325.
2005-07-03 21:38:51 +00:00
Brian Barrett
9f0c969bb4 * rename ompi_hash_table to opal_hash_table
This commit was SVN r6324.
2005-07-03 16:52:32 +00:00
Brian Barrett
764a9314db * rename ompi_value_array to opal_value_array
This commit was SVN r6323.
2005-07-03 16:38:52 +00:00
Brian Barrett
761402f95f * rename ompi_list to opal_list
This commit was SVN r6322.
2005-07-03 16:22:16 +00:00
Brian Barrett
499e4de1e7 * rename ompi_object and ompi_class to opal_object and opal_class
This commit was SVN r6321.
2005-07-03 16:06:07 +00:00
Jeff Squyres
76ba66734d I gave bad advice to Ralph yesterday; he asked how to disable the gpr
unit tests without screwing up the nightly builds.

These changes fix the problem of not including the test/mca/gpr
directory in the nightly tarball, prevent the tests from being
compiled, but leave the door open for manual compilation when the time
comes to start the work to re-enable them (e.g., uncomment a few
lines in gpr/Makefile.am).

This commit was SVN r6175.
2005-06-25 11:21:59 +00:00
Ralph Castain
8271d3f30e Okay, here is the massive checkin that restructures the registry trigger system for scalability. Actually, it isn't "quite" as large as it looks - it just touches a bunch of files.
Also included is a fix to the attribute problem for singletons.

Short explanation:
The prior system placed triggers and subscriptions on the registry for each process - approximately eight/process. Each of these had to be checked every time there was a registry operation such as a "put" or "increment-value". For large numbers of processes, this repetitive checking consumed some significant time.

The new system allows processes to "attach" to existing triggers and subscriptions, without creating a new one. Thus, there are now only eight triggers and five subscriptions on a job - *regardless of how many processes are being run*. This means that the registry now takes the same amount of time (which is pretty darn short) to process an operation regardless of how many processes are in a job.

I'll provide some startup times from scalability tests shortly - need to complete the commit so I can move the system to an appropriate cluster.

This commit was SVN r6164.
2005-06-24 16:59:37 +00:00
Jeff Squyres
d9b0aa9654 Temporarily comment out the test_rds2 test because all it does is test
the RDS selection logic, which is, unfortunately, not yet well
supported by the testing infrastructure (it causes false failures in
the nightly build).

This commit was SVN r6073.
2005-06-16 11:25:27 +00:00
Ralph Castain
83cba7f7cf Checkpoint. Fixed a logic problem that removed one-shot subscriptions even though the notifiers were supposed to stay.
This commit was SVN r6052.
2005-06-13 20:43:05 +00:00
Ralph Castain
098cc8cf3a Bring the rest of the notification modes online. Update the unit test to cover notify-on-change.
This commit was SVN r6043.
2005-06-13 14:37:02 +00:00
Ralph Castain
1c57ae20b0 Checkpoint the notifier work - notify when something is added now works, need to simply turn on the other checks.
Existing code shouldn't see any impacts. Tested on up to 125 processes.

This commit was SVN r6020.
2005-06-09 20:37:25 +00:00