
312 Commits

Author SHA1 Message Date
Ralph Castain
8271d3f30e Okay, here is the massive checkin that restructures the registry trigger system for scalability. Actually, it isn't "quite" as large as it looks - it just touches a bunch of files.
Also included is a fix to the attribute problem for singletons.

Short explanation:
The prior system placed triggers and subscriptions on the registry for each process - approximately eight per process. Each of these had to be checked every time there was a registry operation such as a "put" or "increment-value". For large numbers of processes, this repetitive checking consumed significant time.

The new system allows processes to "attach" to existing triggers and subscriptions, without creating a new one. Thus, there are now only eight triggers and five subscriptions on a job - *regardless of how many processes are being run*. This means that the registry now takes the same amount of time (which is pretty darn short) to process an operation regardless of how many processes are in a job.
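A minimal sketch of the "attach" idea described above, assuming a trigger that keeps one list of attached processes. All names here (`shared_trigger_t`, `trigger_attach`, `trigger_fire`, `count_notification`) are hypothetical and are not the ORTE registry API; the point is only that the registry checks one trigger per operation and fans out to the attached processes when it fires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical types for illustration only; not the ORTE registry API. */
typedef void (*notify_fn)(int proc_id, const char *key);

typedef struct {
    int proc_id;         /* process that attached */
    notify_fn callback;  /* called when the trigger fires */
} attachment_t;

typedef struct {
    const char *key;         /* registry key the trigger watches */
    attachment_t *attached;  /* processes attached to this one trigger */
    size_t count;
    size_t capacity;
} shared_trigger_t;

/* Example callback: just counts how many notifications were delivered. */
static int notified_total = 0;

static void count_notification(int proc_id, const char *key)
{
    (void) proc_id;
    (void) key;
    notified_total++;
}

/* Attach a process to an existing trigger instead of creating a new one.
 * Returns 0 on success, -1 if memory could not be allocated. */
static int trigger_attach(shared_trigger_t *t, int proc_id, notify_fn cb)
{
    assert(t != NULL && cb != NULL);
    if (t->count == t->capacity) {
        size_t new_cap = t->capacity ? t->capacity * 2 : 4;
        attachment_t *p = realloc(t->attached, new_cap * sizeof *p);
        if (p == NULL) {
            return -1;
        }
        t->attached = p;
        t->capacity = new_cap;
    }
    t->attached[t->count].proc_id = proc_id;
    t->attached[t->count].callback = cb;
    t->count++;
    return 0;
}

/* Notify every attached process; returns how many were notified.
 * The per-process cost is paid only when the trigger actually fires. */
static size_t trigger_fire(const shared_trigger_t *t)
{
    size_t i;
    for (i = 0; i < t->count; i++) {
        t->attached[i].callback(t->attached[i].proc_id, t->key);
    }
    return t->count;
}
```

In this sketch the number of trigger objects stays fixed no matter how many processes attach; only the attachment list grows.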

I'll provide some startup times from scalability tests shortly - need to complete the commit so I can move the system to an appropriate cluster.

This commit was SVN r6164.
2005-06-24 16:59:37 +00:00
Jeff Squyres
d9b0aa9654 Temporarily comment out the test_rds2 test because all it does is test
the RDS selection logic, which is, unfortunately, not yet well
supported by the testing infrastructure (it causes false failures in
the nightly build).

This commit was SVN r6073.
2005-06-16 11:25:27 +00:00
Ralph Castain
83cba7f7cf Checkpoint. Fixed a logic problem that removed one-shot subscriptions even though the notifiers were supposed to stay.
This commit was SVN r6052.
2005-06-13 20:43:05 +00:00
Ralph Castain
098cc8cf3a Bring the rest of the notification modes online. Update the unit test to cover notify-on-change.
This commit was SVN r6043.
2005-06-13 14:37:02 +00:00
Ralph Castain
1c57ae20b0 Checkpoint the notifier work - notify when something is added now works, need to simply turn on the other checks.
Existing code shouldn't see any impacts. Tested on up to 125 processes.

This commit was SVN r6020.
2005-06-09 20:37:25 +00:00
Ralph Castain
51380eba13 Checkpoint the continuing re-enablement of the notifiers.
Also added a check to protect the callback system from an error being seen by Tim P. - should help with debugging.

This commit was SVN r6010.
2005-06-09 13:35:35 +00:00
Ralph Castain
7306b9d7b9 Fix the registry search routine to remove a buffer that wasn't expanding as it should - cause of recent problems observed when spawning larger numbers of processes.
For anyone interested, the problem stemmed from two things:

1. a bug in the ompi_bitmap utility (which I copied to orte_bitmap to avoid unintentionally disturbing something else) that causes the bitmap NOT to expand unless the caller asks for a bit that is more than one byte outside the current array size. The unit test didn't pick it up because it doesn't check that close to the boundary.

2. a "feature" in the ompi_bitmap utility that only expands the array if you try to SET a bit outside the current boundary, but NOT if you try to CLEAR a bit outside the array limit. This appears intentional as the unit test checks for this behavior, but I hadn't been expecting the asymmetry.

The orte_bitmap utility now appropriately expands in both circumstances. I also added a function to expand the array so it "covers" a bit location without setting or clearing it. The function allows you to ensure the array is big enough to handle the specified bit, but leave the bit alone if it already is there (the other functions would set/clear it if it was).
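A minimal sketch of the behavior described above, assuming a byte-array bitmap. The names (`sketch_bitmap_t`, `sketch_bitmap_cover`, and so on) are hypothetical and do not reproduce the actual orte_bitmap code; the sketch only shows set and clear both growing the array, plus a "cover" call that grows it without touching existing bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bitmap for illustration; not the orte_bitmap implementation. */
typedef struct {
    unsigned char *bytes;
    size_t size;  /* number of bytes currently allocated */
} sketch_bitmap_t;

/* Grow the array so that `bit` is addressable. New bytes are zeroed and
 * existing bits are left untouched. Returns 0 on success, -1 on failure. */
static int sketch_bitmap_cover(sketch_bitmap_t *bm, size_t bit)
{
    size_t needed = bit / 8 + 1;  /* exact byte count, so no off-by-one at the boundary */
    unsigned char *p;

    assert(bm != NULL);
    if (needed <= bm->size) {
        return 0;
    }
    p = realloc(bm->bytes, needed);
    if (p == NULL) {
        return -1;
    }
    memset(p + bm->size, 0, needed - bm->size);
    bm->bytes = p;
    bm->size = needed;
    return 0;
}

/* Set a bit, expanding the array first if needed. */
static int sketch_bitmap_set(sketch_bitmap_t *bm, size_t bit)
{
    if (sketch_bitmap_cover(bm, bit) != 0) {
        return -1;
    }
    bm->bytes[bit / 8] |= (unsigned char) (1u << (bit % 8));
    return 0;
}

/* Clear a bit, expanding the array first if needed (symmetric with set). */
static int sketch_bitmap_clear(sketch_bitmap_t *bm, size_t bit)
{
    if (sketch_bitmap_cover(bm, bit) != 0) {
        return -1;
    }
    bm->bytes[bit / 8] &= (unsigned char) ~(1u << (bit % 8));
    return 0;
}

/* Report whether a bit is set; bits outside the array read as 0. */
static int sketch_bitmap_is_set(const sketch_bitmap_t *bm, size_t bit)
{
    if (bit / 8 >= bm->size) {
        return 0;
    }
    return (bm->bytes[bit / 8] >> (bit % 8)) & 1;
}
```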

I've tested it with up to 100 processes without problem.

This commit was SVN r5980.
2005-06-08 15:48:38 +00:00
Galen Shipman
aaa236052d changed function signatures to match the changes in mpool
This commit was SVN r5911.
2005-06-01 15:25:17 +00:00
Tim Prins
75b0b519d8 - Added functionality to MPI_Alloc_mem and MPI_Free_mem so that they
call the memory pool to do special memory allocations, and extended 
the mpool so that it will do the allocations and keep track of them in
a tree. Currently, if you pass MPI_INFO_NULL to MPI_Alloc_mem, we will 
try to allocate the memory and register it with as many mpools as 
possible. Alternatively, one can pass an info object with the names of 
the mpools as keys, and from these we decide which mpools to register 
the new memory with.

- fixed some comments in the allocator and fixed a minor bug

- extended the red black tree test and made a minor correction

This commit was SVN r5902.
2005-05-31 19:07:27 +00:00
Ralph Castain
93eb0d4b40 Checkpoint
This commit was SVN r5814.
2005-05-23 14:22:35 +00:00
Ralph Castain
689a290711 Add one further degree of separation between opal and orte - allow separate init of the two systems. This allows the restart capability to avoid hitting opal utilities (e.g., mca_base_open, ompi_output_init) repeatedly.
Clean up the ignores as well.

This commit was SVN r5811.
2005-05-22 18:40:03 +00:00
Ralph Castain
7b6db8a18f Can now start/finalize/restart the run-time without crashing.
Add a unit test for that functionality - will test more fully next week.

This commit was SVN r5806.
2005-05-22 03:11:33 +00:00
Ralph Castain
54a481cc14 Fix an incorrect free...
This commit was SVN r5724.
2005-05-16 21:06:09 +00:00
Ralph Castain
89b6a97f0f Bring the resource discovery system's resource file component online so I can find the node I need to launch upon. I removed all references to the xml library that was causing trouble, and wrote my own limited xml parser instead, so this will now compile just fine anywhere.
Need to do some refining of the component, but it meets basic requirements right now. Nobody else should notice any change - system basically ignores it unless you tell it to do something.

This commit was SVN r5723.
2005-05-16 21:01:09 +00:00
Jeff Squyres
722ee2103b Fix to the fix -- Brian and I agree that this is a better fix.
This commit was SVN r5693.
2005-05-12 02:44:20 +00:00
Jeff Squyres
2b2f2f3c04 Fix a bunch of compiler warnings, mostly on 64 bit:
- some union { void*; int; } fixes for asm tests
- size_t / %lu fixes for a bunch of others

This commit was SVN r5677.
2005-05-10 23:28:31 +00:00
Jeff Squyres
a28b5ae43b Fix for a bunch of size_t issues; reviewed by George and Ralph.
- Change all uses of *printf'ing a size_t to use an explicit cast to
  (unsigned long) and the %lu escape
- change ORTE_GPR_REPLICA_MAX_SIZE to INT_MAX until bug 1345 is fixed
  (i.e., until we allow size_t in MCA params)
- ns_base_local_fns.c:orte_ns_base_get_proc_name_string(): changed
  from %0X -> %lu
- ORTE_NAME_ARGS added explicit (unsigned long) casts, and changed all
  usages of ORTE_NAME_ARGS to use %lu's
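A minimal sketch of the cast-and-%lu pattern from the list above. The helper name `format_size` is hypothetical and is not part of the Open MPI tree; it only illustrates printing a size_t portably with pre-C99 format specifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper for illustration; not a function from the Open MPI tree.
 * Formats a size_t using an explicit cast to unsigned long and the %lu
 * conversion, so the argument type always matches the format string. */
static int format_size(char *buf, size_t buflen, size_t value)
{
    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen, "%lu", (unsigned long) value);
}
```

On platforms where unsigned long is narrower than size_t, the cast truncates large values; C99's %zu avoids that where it is available.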

This commit was SVN r5644.
2005-05-08 13:22:55 +00:00
Ralph Castain
659d57f300 Several things in this commit - shouldn't impact any existing work:
1. Added pid_t to the dps

2. Processes now "register" their local pid and update their location (i.e., nodename) on the registry during mpi_init

3. Added a new error code for values that exceed maximum for their data type (useful when transitioning a value from one variable to another of different size)

4. Fixed a few places where size_t was being incorrectly handled

5. Updated dps_test to cover pid_t types

This should now provide support for TotalView connection - which David is pursuing.

This commit was SVN r5623.
2005-05-06 17:00:06 +00:00
Jeff Squyres
e1ab50d5e9 Add missing header files
This commit was SVN r5583.
2005-05-04 00:13:40 +00:00
Ralph Castain
44b83e73ef Fix the print warnings for the name services conversions on names from their binary value to a string.
HEADS UP: string versions of names are now presented in DECIMAL format - not HEX as they previously were. If you used the name services functions (as you were supposed to do) to access these names, you will not have any problems. If you did it yourself, then you need to fix it - my suggestion would be that you fix your code by using the name service functions to avoid future problems.

This commit was SVN r5571.
2005-05-02 15:06:13 +00:00
Ralph Castain
931924397c Fix several minor things:
1. *correctly* fix the printing of size_t variables. Need to do this through a #define, not just typecast things. Thanks to Jeff/Brian for suggesting a cleaner way to do it (as opposed to just doing the #define at the print location). Note that not ALL of the prints have been "fixed" yet - will continue to identify them.

2. Add int64 and size_t to the pack/unpack unit tests.

3. Fix a bug in the int64 pack/unpack system.

This commit was SVN r5570.
2005-05-02 14:48:57 +00:00
Jeff Squyres
bcd4797389 Commit 4 of 4 for bringing the changes over from the hetero branch.
Merged in from:

svn merge -r5506:5553 https://svn.open-mpi.org/svn/ompi/tmp/hetero .

This commit was SVN r5552.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r5506
  r5553
2005-05-01 00:58:06 +00:00
Jeff Squyres
aa70022dc2 Commit 2 of 4 for bringing the changes over from the hetero branch.
Merged in from:

svn merge -r5448:5496 https://svn.open-mpi.org/svn/ompi/tmp/hetero .

This commit was SVN r5550.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r5448
  r5496
2005-05-01 00:53:00 +00:00
Jeff Squyres
462adee81a Commit 1 of 4 to bring in the hetero branch to the trunk. Merged in
from:

svn merge -r5440:5448 https://svn.open-mpi.org/svn/ompi/tmp/hetero .

This commit was SVN r5549.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r5440
  r5448
2005-05-01 00:47:35 +00:00
Rainer Keller
ebfee139e0 Small updates to build_tarball:
- allow bz2 uncompression
- do not try to detect download-program, if file_arg

Allow VPATH-build in test of asm-check

This commit was SVN r5522.
2005-04-28 15:04:00 +00:00
Brian Barrett
de128a69fb Skip test when on old LinuxThreads machines and using progress threads
since you can't fork() in one thread and waitpid() on the child in another,
which is what this test expects you to do.  If Linux would just implement
the stupid POSIX standard already, this wouldn't be a problem.

This commit was SVN r5482.
2005-04-21 19:33:18 +00:00
Brian Barrett
0964152893 clean up the OMPI_BUILDING #define. Rather than being defined to 1 if
we are part of the source tree and not defined otherwise, we are going
with an "always defined if ompi_config.h is included" policy.  If
ompi_config.h is included before mpi.h or before OMPI_BUILDING is set,
it will set OMPI_BUILDING to 1 and enable all the internal code that
is in ompi_config_bottom.h.  Otherwise, it will only include the
system configuration data (enough for defining the C and C++ interfaces
to MPI, but not perturbing the user environment).

This should fix the problems with bool and the like that the Eclipse
folks were seeing.  It also cleans up some build system hacks that
we had along the way.

Also, don't use int64_t as the default size of MPI_Offset, because it
requires us including stdint.h in mpi.h, which is something we really
shouldn't be doing.

And finally, fix a ROMIO Makefile that didn't set -DOMPI_BUILDING=1,
as ROMIO includes mpi.h, but not ompi_config.h

This commit was SVN r5430.
2005-04-19 03:51:20 +00:00
Brian Barrett
5f4a433086 * remove unused file
This commit was SVN r5428.
2005-04-19 03:43:42 +00:00
Brian Barrett
cd76153a74 * dumb bug fixes
This commit was SVN r5422.
2005-04-18 19:52:39 +00:00
Brian Barrett
63bd314a0b * Update ASM tests to do more thread testing (which should help find bugs)
* Update cmpset test to call memory barrier when needed before checking the
  results
* remove unneeded sync from cmpset_32 on Power PC

This commit was SVN r5420.
2005-04-18 19:33:23 +00:00
Jeff Squyres
41db769781 Add a whole bunch of missing <util/output.h> files
This commit was SVN r5408.
2005-04-16 14:14:51 +00:00
Jeff Squyres
61f55f1011 Fix the problem with the nightly unit tests -- do *not* override
OMPI_ENABLE_DEBUG because that changes the size of struct's (e.g.,
ompi_object) in the unit tests as compared to what may have been
compiled in the library.

This commit was SVN r5373.
2005-04-15 13:32:18 +00:00
Jeff Squyres
0cdcba3403 Add new utility function ompi_basename() for a portable version of
basename with stronger guarantees on its semantics.  See the doxygen
comments for how to call and use it.

This commit was SVN r5329.
2005-04-14 14:04:41 +00:00
Jeff Squyres
2be1a1a2f3 Fixes provided by Ralph to help cleanup of the session directory,
especially upon abnormal termination of a process.  Not yet integrated
into the fork pls; pending more discussion with other developers.

This commit was SVN r5326.
2005-04-14 01:04:26 +00:00
Jeff Squyres
e15d0778ae Don't test for a bad condition
This commit was SVN r5323.
2005-04-13 21:25:25 +00:00
Jeff Squyres
2aea9cd484 Add some missing header files.
This commit was SVN r5242.
2005-04-09 14:29:29 +00:00
Jeff Squyres
f65cc5febf Fix ompi_list test to properly initialize its list items
This commit was SVN r5229.
2005-04-08 18:05:13 +00:00
Jeff Squyres
88b1326554 Double drat! Forgot to commit the support library updates. :-(
This commit was SVN r5228.
2005-04-08 17:56:16 +00:00
Jeff Squyres
2d802bafab Convert over the rest of the unit tests to dynamically open components
(oops -- didn't realize there were so many that needed it)

This commit was SVN r5223.
2005-04-08 11:21:43 +00:00
Ralph Castain
c52a21e1b3 Fix a couple of minor bugs that were preventing the session directory system from completely cleaning up. Unit test now shows that it will cleanup all session directory levels IF no files are present.
This commit was SVN r5210.
2005-04-07 19:19:48 +00:00
Jeff Squyres
2bd8347a7e Fix compiler warnings
This commit was SVN r5189.
2005-04-06 01:46:39 +00:00
Jeff Squyres
5946d303d8 Make this test compile again
This commit was SVN r5186.
2005-04-05 20:40:22 +00:00
Jeff Squyres
c36eab4749 32/64 fixes (ensure that our classes really are 64 bit clean -- and
they are!  Mostly the tests needed to be adjusted to run on 32 and 64
systems)

This commit was SVN r5185.
2005-04-05 20:34:56 +00:00
Jeff Squyres
c19182a0d7 Remove compiler warning
This commit was SVN r5184.
2005-04-05 19:54:56 +00:00
Jeff Squyres
066dd5cbac Fix some minor issues:
- (void*) <--> function pointer casting
- missing <string.h>

This commit was SVN r5183.
2005-04-05 19:53:56 +00:00
Jeff Squyres
b5459e9da6 Convert the rest of the GPR tests to use loading DSO's and function
pointers instead of direct function invocations.

This commit was SVN r5180.
2005-04-05 18:18:27 +00:00
Jeff Squyres
394be4e6cb Ensure that the output is printed
This commit was SVN r5173.
2005-04-05 02:32:57 +00:00
Jeff Squyres
205ed1d9d9 @#@#$ Accidentally reverse-backed out the patch in this file, causing
replicated code.  Doh!

This commit was SVN r5171.
2005-04-05 02:12:44 +00:00
Jeff Squyres
1701010301 Back out this patch; it appears to break in at least 64 bit
environments.  Working on the fix, but don't break everyone's unit
tests while I'm working on it -- will re-commit once ompi_setenv() and
ompi_unsetenv() are fixed.

This commit was SVN r5166.
2005-04-04 22:55:26 +00:00
Jeff Squyres
38b814b0cc Convert to use portable ompi_setenv() and ompi_unsetenv()
This commit was SVN r5164.
2005-04-04 22:17:48 +00:00