1
1
Граф коммитов

874 Коммитов

Автор SHA1 Сообщение Дата
Jeff Squyres
dd27622814 Fix fd leak noted by Paul Hargrove.
http://www.open-mpi.org/community/lists/devel/2007/10/2493.php

This commit was SVN r16564.
2007-10-25 16:03:21 +00:00
Josh Hursey
0bf61a1b84 Move in some accumulated small features and minor bug fixes for C/R support.
{{{
svn merge -r 16447:16475 https://svn.open-mpi.org/svn/ompi/tmp/jjh-fgs .
}}}

This commit was SVN r16478.
2007-10-17 13:47:36 +00:00
Tim Prins
12d3ad4c5c remove unused and outdated opal message buffer code
This commit was SVN r16436.
2007-10-11 22:09:01 +00:00
Josh Hursey
06a30e7f3a Add a quick check to make sure the BLCR being used has a working cr_request.
If it doesn (version < 0.6.0) then fallback to fork/exec of cr_checkpoint
command.

This commit was SVN r16400.
2007-10-09 13:51:28 +00:00
Josh Hursey
7437f37e96 This commit contains the following:
* Fix some missing includes in a few places.
 * Add the cr_request() functionality to the BLCR CRS component.
   We are now dependent upon the 0.6.* series of BLCR.
 * Made the CR notification mechanism a registered function.
   This way we can have an OPAL-only version and it can be replaced at
   runtime with the ORTE version.
 * Add a 'opal_cr_allow_opal_only' parameter that will enable OPAL-only
   CR functionality when the user wants it. Default: Disabled.
 * Fix the placement of a checkpoint request check in MPI_Init
 * Pull the OPAL notification mechanism into the SnapC framework.
   * We no longer fork/exec the 'opal-checkpoint' command for local
   checkpointing, the Local coordinator in the orted does this directly.
   * The Local and Application coordinator talk together bypassing the OPAL
   notifiation mechanism.
   * Optimized the Local <-> App Coordinator communication.
   * Improved the structure used to track vpid_snapshots in the local coord.
 * Fix a race condition in which an application under heavy communication load
   may produce an inconsistent global checkpoint.

This commit was SVN r16389.
2007-10-08 20:53:02 +00:00
Torsten Hoefler
e985812e1f fixing a comment to be more detailed about opal_output_open
functionality ...

This commit was SVN r16370.
2007-10-06 17:33:57 +00:00
Ralph Castain
54b2cf747e These changes were mostly captured in a prior RFC (except for #2 below) and are aimed specifically at improving startup performance and setting up the remaining modifications described in that RFC.
The commit has been tested for C/R and Cray operations, and on Odin (SLURM, rsh) and RoadRunner (TM). I tried to update all environments, but obviously could not test them. I know that Windows needs some work, and have highlighted what is know to be needed in the odls process component.

This represents a lot of work by Brian, Tim P, Josh, and myself, with much advice from Jeff and others. For posterity, I have appended a copy of the email describing the work that was done:

As we have repeatedly noted, the modex operation in MPI_Init is the single greatest consumer of time during startup. To-date, we have executed that operation as an ORTE stage gate that held the process until a startup message containing all required modex (and OOB contact info - see #3 below) info could be sent to it. Each process would send its data to the HNP's registry, which assembled and sent the message when all processes had reported in.

In addition, ORTE had taken responsibility for monitoring process status as it progressed through a series of "stage gates". The process reported its status at each gate, and ORTE would then send a "release" message once all procs had reported in.

The incoming changes revamp these procedures in three ways:

1. eliminating the ORTE stage gate system and cleanly delineating responsibility between the OMPI and ORTE layers for MPI init/finalize. The modex stage gate (STG1) has been replaced by a collective operation in the modex itself that performs an allgather on the required modex info. The allgather is implemented using the orte_grpcomm framework since the BTL's are not active at that point. At the moment, the grpcomm framework only has a "basic" component analogous to OMPI's "basic" coll framework - I would recommend that the MPI team create additional, more advanced components to improve performance of this step.

The other stage gates have been replaced by orte_grpcomm barrier functions. We tried to use MPI barriers instead (since the BTL's are active at that point), but - as we discussed on the telecon - these are not currently true barriers so the job would hang when we fell through while messages were still in process. Note that the grpcomm barrier doesn't actually resolve that problem, but Brian has pointed out that we are unlikely to ever see it violated. Again, you might want to spend a little time on an advanced barrier algorithm as the one in "basic" is very simplistic.

Summarizing this change: ORTE no longer tracks process state nor has direct responsibility for synchronizing jobs. This is now done via collective operations within the MPI layer, albeit using ORTE collective communication services. I -strongly- urge the MPI team to implement advanced collective algorithms to improve the performance of this critical procedure.


2. reducing the volume of data exchanged during modex. Data in the modex consisted of the process name, the name of the node where that process is located (expressed as a string), plus a string representation of all contact info. The nodename was required in order for the modex to determine if the process was local or not - in addition, some people like to have it to print pretty error messages when a connection failed.

The size of this data has been reduced in three ways:

(a) reducing the size of the process name itself. The process name consisted of two 32-bit fields for the jobid and vpid. This is far larger than any current system, or system likely to exist in the near future, can support. Accordingly, the default size of these fields has been reduced to 16-bits, which means you can have 32k procs in each of 32k jobs. Since the daemons must have a vpid, and we require one daemon/node, this also restricts the default configuration to 32k nodes.

To support any future "mega-clusters", a configuration option --enable-jumbo-apps has been added. This option increases the jobid and vpid field sizes to 32-bits. Someday, if necessary, someone can add yet another option to increase them to 64-bits, I suppose.

(b) replacing the string nodename with an integer nodeid. Since we have one daemon/node, the nodeid corresponds to the local daemon's vpid. This replaces an often lengthy string with only 2 (or at most 4) bytes, a substantial reduction.

(c) when the mca param requesting that nodenames be sent to support pretty error messages, a second mca param is now used to request FQDN - otherwise, the domain name is stripped (by default) from the message to save space. If someone wants to combine those into a single param somehow (perhaps with an argument?), they are welcome to do so - I didn't want to alter what people are already using.

While these may seem like small savings, they actually amount to a significant impact when aggregated across the entire modex operation. Since every proc must receive the modex data regardless of the collective used to send it, just reducing the size of the process name removes nearly 400MBytes of communication from a 32k proc job (admittedly, much of this comm may occur in parallel). So it does add up pretty quickly.


3. routing RML messages to reduce connections. The default messaging system remains point-to-point - i.e., each proc opens a socket to every proc it communicates with and sends its messages directly. A new option uses the orteds as routers - i.e., each proc only opens a single socket to its local orted. All messages are sent from the proc to the orted, which forwards the message to the orted on the node where the intended recipient proc is located - that orted then forwards the message to its local proc (the recipient). This greatly reduces the connection storm we have encountered during startup.

It also has the benefit of removing the sharing of every proc's OOB contact with every other proc. The orted routing tables are populated during launch since every orted gets a map of where every proc is being placed. Each proc, therefore, only needs to know the contact info for its local daemon, which is passed in via the environment when the proc is fork/exec'd by the daemon. This alone removes ~50 bytes/process of communication that was in the current STG1 startup message - so for our 32k proc job, this saves us roughly 32k*50 = 1.6MBytes sent to 32k procs = 51GBytes of messaging.

Note that you can use the new routing method by specifying -mca routed tree - if you so desire. This mode will become the default at some point in the future.


There are a few minor additional changes in the commit that I'll just note in passing:

* propagation of command line mca params to the orteds - fixes ticket #1073. See note there for details.

* requiring of "finalize" prior to "exit" for MPI procs - fixes ticket #1144. See note there for details.

* cleanup of some stale header files

This commit was SVN r16364.
2007-10-05 19:48:23 +00:00
Josh Hursey
e10f476c87 Bring over the jjh-filem branch which contains a non-blocking FileM interface
and implementation. This has shown drastic performance benefit when
transferring Many files at roughly the same time.

I tested this for many different filem operations and everything was working
fine. Let me know if you have any problems with this functionality.

Some Notes:
 - opal-checkpoint now has a 'quiet' flag to keep it from being too verbose.

 - FileM RSH component is fully non-blocking.

 - FileM RSH component has incomming connection throttling since by default
   ssh only allows 10 concurrent scp connections to any single host. This
   default can be adjusted via an MCA parameter.
    {{{-mca filem_rsh_max_incomming 10}}}

 - There is an MCA parameter for max outgoing connections, but it is currently
   not implemented. If someone needs it then it should not be hard to implement.
    {{{-mca filem_rsh_max_outgoing 10}}}

 - Changed the FileM request structure so that it is a bit more explicit and
   flexible.

 - Moved the 'preload-binary' and 'preload-files' functionality into odls/base
   allowing for code reuse in the 'process' and 'default' ODLS components.

 - Fixed a bug in the process name resolution which broke the 'preload-*'
   functionality due to GPR table structure changes.

 - The FileM RSH component might be able to see even more speedup from using a
   thread pool to operate on the work_pool structures, but that is for future
   work.

 - Added a 'opal-show-help' file to ODLS Base

This commit was SVN r16252.
2007-09-27 13:13:29 +00:00
Tim Prins
e25bb7f187 Some platforms (such as FreeBSD) need libutil.h included for openpty.
Thanks to Karol Mroz for pointing this out.

This commit was SVN r16163.
2007-09-19 21:59:22 +00:00
George Bosilca
d1364c53de Don't allocate the temporary buffer on the stack. It get way too much
space.

This commit was SVN r16127.
2007-09-14 02:09:38 +00:00
George Bosilca
2c8c75ef94 Coverty blame list:
- Remove memory leaks
 - uninitialized return

This commit was SVN r16126.
2007-09-14 02:08:37 +00:00
George Bosilca
921d79c2b8 Remove few memory leaks. Close the files where we're done with them.
This commit was SVN r16125.
2007-09-14 02:06:26 +00:00
George Bosilca
41ed50f901 Use secure version of strncpy and srtncat. Release the temporary
resources on error.

This commit was SVN r16124.
2007-09-14 02:04:34 +00:00
George Bosilca
61989cc4d4 Don't hardcode the length, there is an argument for that. Don't
do the NULL check as we already know thaty tmp cannot be NULL.

This commit was SVN r16123.
2007-09-14 02:02:03 +00:00
Josh Hursey
b4735c9719 Remove an old workaround in which we had to 'mv' the checkpoint file after it
was taken form the $CWD to the storage directory. Now we just store directly
to the storage directory which can reduce NFS traffic if working in that mode.

A slight performance boost, but at the point you are using NFS you are paying
a penalty anyway. Now you just don't have to pay it twice :)

This commit was SVN r16099.
2007-09-12 15:03:21 +00:00
Gleb Natapov
140dce7614 Fix ABA problem in atomic_lifo code. This is temporary solution for now. We
are looking for a better one.

This commit was SVN r16091.
2007-09-11 15:40:30 +00:00
Shiqing Fan
a389e61330 - Add some type casts, required by MS compiler.
This commit was SVN r16085.
2007-09-11 09:32:11 +00:00
Gleb Natapov
febdade113 Make non threaded OPAL_ATOMIC_CMPSET macros work correctly.
This commit was SVN r16071.
2007-09-09 08:00:16 +00:00
Jeff Squyres
3653bfcbe7 This function returns void.
This commit was SVN r15934.
2007-08-20 13:12:38 +00:00
Brian Barrett
2b8af283de Add ability to completely turn off MPI one-sided support, so that users
can experiment with using ROMIO directly.

This commit was SVN r15922.
2007-08-18 21:35:51 +00:00
Josh Hursey
729c63cf9d Fix invalid MCA 'base' names so they appear in ompi_info.
A subset of this patch needs to be applied to v1.2

Refs trac:928

This commit was SVN r15918.

The following Trac tickets were found above:
  Ticket 928 --> https://svn.open-mpi.org/trac/ompi/ticket/928
2007-08-18 03:05:45 +00:00
Brian Barrett
2d4918b09d Support versions of the Libtool 2.1a snapshots after the lt_dladvise code
was brought in.  This supercedes the GLOBL patch that we had been using
with Libtool 2.1a versions prior to the lt_dladvise code.  Autogen
tries to figure out which version you're on, so either will now work with
the trunk.

This commit was SVN r15903.
2007-08-17 04:08:23 +00:00
Brian Barrett
20fe0952f7 compare should compare the framework names as well. Fixes a potential bug in
the modex component compare code (thanks to Tim P. for finding the problem)

This commit was SVN r15885.
2007-08-16 16:51:41 +00:00
Adrian Knoth
3115816733 Poor off-by-one line error. This now really builds on kFreeBSD.
Re #1105

This commit was SVN r15842.
2007-08-13 19:00:18 +00:00
Tim Prins
188771901d Fix typo.
This commit was SVN r15802.
2007-08-08 14:37:50 +00:00
Sven Stork
f22ab47f84 - one more required symbol
This commit was SVN r15801.
2007-08-08 13:02:10 +00:00
Sven Stork
3c753a4cf7 - export required symbol
This commit was SVN r15800.
2007-08-08 12:57:53 +00:00
Brian Barrett
a48f07b1d9 If we don't have event ops, we don't have a current_base, so don't dereference
the pointer (fixes a segfault Josh was seeing).

This commit was SVN r15796.
2007-08-07 17:09:54 +00:00
Sven Stork
3a640603a4 - remove wrong va_end
This commit was SVN r15789.
2007-08-07 13:32:05 +00:00
Sven Stork
5e257fadbd - add missing va_end
This commit was SVN r15788.
2007-08-07 12:25:20 +00:00
George Bosilca
31dfa5592e Few clean-ups, few indentations. Nothing really important.
This commit was SVN r15767.
2007-08-04 00:44:23 +00:00
George Bosilca
629bacbb07 Don't include the atomic header file, if we're building a non threaded version.
This commit was SVN r15766.
2007-08-04 00:43:15 +00:00
George Bosilca
e2f6d69669 Only use one va_list, as it seems that only one is allowed.
This commit was SVN r15765.
2007-08-04 00:41:26 +00:00
Josh Hursey
6248b2bb51 Whoops. Make sure to include the opal_output header.
This commit was SVN r15755.
2007-08-03 19:20:23 +00:00
Josh Hursey
dc9644a2c2 Add a bit of error output so the user can figure out what
went wrong when we cannot create a directory.

This commit was SVN r15754.
2007-08-03 19:08:48 +00:00
Brian Barrett
951755f9fb no need to call gethostname twice to determine if a process is local
This commit was SVN r15742.
2007-08-02 16:25:25 +00:00
Gleb Natapov
dd8b0c925f Add OPAL_ATOMIC_CMPSET macros that became non atomic with only one threaded.
This commit was SVN r15720.
2007-08-01 12:13:34 +00:00
Gleb Natapov
072ebf0fb3 Add new opal_argv_split_with_empty() function. opal_argv_split() function
doesn't include empty string in the argv array if there are two delimiters
in a row in an input string.

This commit was SVN r15718.
2007-08-01 12:08:11 +00:00
George Bosilca
d52d21fae8 Don't forget to include the header file in the sources list.
This commit was SVN r15711.
2007-07-31 18:40:31 +00:00
Shiqing Fan
0f468f3668 - Remove the solution and project files, will commit them later.
This commit was SVN r15705.
2007-07-31 17:07:02 +00:00
Sven Stork
4c5836c2ee - add missing va_end found by coverity
This commit was SVN r15689.
2007-07-30 16:08:18 +00:00
Sven Stork
71915f269c - more coverity fixes
- use stncpy
  - comapring NULL against an array which is staically inside
    the structure will allways be true

This commit was SVN r15684.
2007-07-30 15:19:54 +00:00
Shiqing Fan
4d7b349cdb - Add VC8 solution and project files.
- If one wants to use this solution, remember to unload the project 'orte-restart' which is currently not working for Windows.

This commit was SVN r15680.
2007-07-30 11:05:34 +00:00
Josh Hursey
fb90a75fc9 A fix so that 'self' only compiles if --enable-dlopen (common case).
This is because internally 'self' uses dlopen to look at the application
running to determine if it can/should be used or not.

This commit was SVN r15673.
2007-07-29 17:40:17 +00:00
George Bosilca
8350ba19db Add protections against the shlwapi.h header file.
This commit was SVN r15597.
2007-07-25 03:49:31 +00:00
George Bosilca
0158806e4c Add the missing return.
This commit was SVN r15596.
2007-07-25 03:48:04 +00:00
Adrian Knoth
e6345aeac6 Fixes for building on kFreeBSD. Re #1105
This commit was SVN r15592.
2007-07-24 23:19:45 +00:00
Brian Barrett
de2c4deeda Fix deadlock in thread case exposed by ORTE message model -- if we are
in a callback from the event library and post an RML receive, we'll
deadlock because the event library wouldn't be entered until the
event library was not already entered.  Now just protect data structures
(which we were basically already doing) instead of code, like good
threading people ;).

This commit was SVN r15585.
2007-07-24 19:10:19 +00:00
Brian Barrett
c3be7376c5 * Mark some of the structures passed into the if and net code as const when
they actually are const.
  * Remove some dead code from the no IP support case
  * Add doxygen comment for opal_net_get_port()

This commit was SVN r15547.
2007-07-22 19:19:01 +00:00
Brian Barrett
39a6057fc6 A number of improvements / changes to the RML/OOB layers:
* General TCP cleanup for OPAL / ORTE
  * Simplifying the OOB by moving much of the logic into the RML
  * Allowing the OOB RML component to do routing of messages
  * Adding a component framework for handling routing tables
  * Moving the xcast functionality from the OOB base to its own framework

Includes merge from tmp/bwb-oob-rml-merge revisions:

    r15506, r15507, r15508, r15510, r15511, r15512, r15513

This commit was SVN r15528.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r15506
  r15507
  r15508
  r15510
  r15511
  r15512
  r15513
2007-07-20 01:34:02 +00:00
Josh Hursey
6026929490 Fix compiler error on Cray by adding in the std io/lib headers.
This commit was SVN r15515.
2007-07-19 18:26:10 +00:00
Jeff Squyres
7ae4a22ed5 Wow. How did that one get through?
This commit was SVN r15514.
2007-07-19 17:18:10 +00:00
Brian Barrett
6427c9f92a oops, need a return statement there...
This commit was SVN r15509.
2007-07-19 16:21:11 +00:00
Brian Barrett
52ee1cb5da fix missing ; in solaris functions
This commit was SVN r15505.
2007-07-19 15:15:41 +00:00
Ralph Castain
ccdb834574 Fix a couple of compile errors. Also, we need to ensure that we only attempt to call destructors on tsd keys that were defined.
This commit was SVN r15501.
2007-07-19 12:56:41 +00:00
Jeff Squyres
7c52a0ce17 Help track down when NULL is passed to %s for OPAL replacements of
asprintf and friends.  This is not a failsafe; there are many cases
where this check will not be used.  But at least it's something...

This commit was SVN r15500.
2007-07-19 12:28:43 +00:00
Brian Barrett
5c1c3cdf1c remove debugging output
This commit was SVN r15497.
2007-07-18 21:57:58 +00:00
Brian Barrett
8a7b6656b3 Reference count calls to the util access as well as the main initialized
code

This commit was SVN r15495.
2007-07-18 20:28:19 +00:00
Brian Barrett
916397f358 Use thread specific data and static buffers for the return type of
opal_net_get_hostname() rather than malloc, because no one was freeing
the buffer and the common use case was for printfs, where calling
free is a pain.

This commit was SVN r15494.
2007-07-18 20:25:01 +00:00
Brian Barrett
c5d0066c27 add ability to have thread-specific data on windows, pthreads, solaris threads,
and non-threaded builds

This commit was SVN r15492.
2007-07-18 20:23:45 +00:00
Sven Stork
a6d04c60b4 - the use_component function is always present independent of OMPI_WANT_LIBLTDL
This commit was SVN r15481.
2007-07-18 14:25:51 +00:00
Sven Stork
92fce998fe - the use_component function is always present independent of OMPI_WANT_LIBLTDL
This commit was SVN r15480.
2007-07-18 14:19:24 +00:00
Jeff Squyres
8ace07efed This commit brings in two major things:
1. Galen's fine-grain control of queue pair resources in the openib
   BTL.
1. Pasha's new implementation of asychronous HCA event handling.

Pasha's new implementation doesn't take much explanation, but the new
"multifrag" stuff does.  

Note that "svn merge" was not used to bring this new code from the
/tmp/ib_multifrag branch -- something Bad happened in the periodic
trunk pulls on that branch making an actual merge back to the trunk
effectively impossible (i.e., lots and lots of arbitrary conflicts and
artifical changes).  :-(

== Fine-grain control of queue pair resources ==

Galen's fine-grain control of queue pair resources to the OpenIB BTL
(thanks to Gleb for fixing broken code and providing additional
functionality, Pasha for finding broken code, and Jeff for doing all
the svn work and regression testing).

Prior to this commit, the OpenIB BTL created two queue pairs: one for
eager size fragments and one for max send size fragments.  When the
use of the shared receive queue (SRQ) was specified (via "-mca
btl_openib_use_srq 1"), these QPs would use a shared receive queue for
receive buffers instead of the default per-peer (PP) receive queues
and buffers.  One consequence of this design is that receive buffer
utilization (the size of the data received as a percentage of the
receive buffer used for the data) was quite poor for a number of
applications.

The new design allows multiple QPs to be specified at runtime.  Each
QP can be setup to use PP or SRQ receive buffers as well as giving
fine-grained control over receive buffer size, number of receive
buffers to post, when to replenish the receive queue (low water mark)
and for SRQ QPs, the number of outstanding sends can also be
specified.  The following is an example of the syntax to describe QPs
to the OpenIB BTL using the new MCA parameter btl_openib_receive_queues:

{{{
-mca btl_openib_receive_queues \
     "P,128,16,4;S,1024,256,128,32;S,4096,256,128,32;S,65536,256,128,32"
}}}

Each QP description is delimited by ";" (semicolon) with individual
fields of the QP description delimited by "," (comma).  The above
example therefore describes 4 QPs.

The first QP is:

    P,128,16,4

Meaning: per-peer receive buffer QPs are indicated by a starting field
of "P"; the first QP (shown above) is therefore a per-peer based QP.
The second field indicates the size of the receive buffer in bytes
(128 bytes).  The third field indicates the number of receive buffers
to allocate to the QP (16).  The fourth field indicates the low
watermark for receive buffers at which time the BTL will repost
receive buffers to the QP (4).

The second QP is:

    S,1024,256,128,32

Shared receive queue based QPs are indicated by a starting field of
"S"; the second QP (shown above) is therefore a shared receive queue
based QP.  The second, third and fourth fields are the same as in the
per-peer based QP.  The fifth field is the number of outstanding sends
that are allowed at a given time on the QP (32).  This provides a
"good enough" mechanism of flow control for some regular communication
patterns.

QPs MUST be specified in ascending receive buffer size order.  This
requirement may be removed prior to 1.3 release.

This commit was SVN r15474.
2007-07-18 01:15:59 +00:00
George Bosilca
e3ad495e7b Remove an unused variable.
This commit was SVN r15473.
2007-07-17 22:34:59 +00:00
Sven Stork
73f1d800cf - Make the component select with static build working.
Remove the matching logic out of dynamic path into an
  extra function. Add the corresponing check to the static
  component path.

This commit was SVN r15458.
2007-07-17 12:06:51 +00:00
Camille Coti
59dc6b95f3 Modifications in the way user-specified modules are loaded. Once we get the list of filenames from the libtool, we add the statically defined ones and then based on the include/exclude string we only keep the ones requested by the user. If no include/exclude was provided, we keep [as expected] everything. Once we have the list of whatever is requested we open them. Therefore, as an example, if the user specify "--mca pml ob1" we will never try to load/open/init DR nor CM.
There are several interesting things:
1. less NFS traffic [as we potentially access less files]
2. faster loading time [in case the user tune it's execution environment]
3. (1) + (2) -> faster startup time [at least everything which do not depend on the network]
4. MX bug will go away if the pml is specified.
5. No useless BTL will be opened, which will solve few others issues.

This commit was SVN r15402.
2007-07-13 14:54:01 +00:00
Josh Hursey
73273397f5 Quiet a warning on the Cray systems in which mkfifo does not work.
Thanks to Lisa Glendenning for mentioning this, and Brian for following
up with me on it.

This commit was SVN r15392.
2007-07-12 20:30:09 +00:00
Pak Lui
2aec8c527e fix a small leak. strdup used twice on retval.
This commit was SVN r15357.
2007-07-11 14:00:29 +00:00
Brian Barrett
1d02b9e7b5 Fix a bunch of issues exposed by Ken Cain in getting Open MPI to work with
VxWorks.  Still some issues remaining, I'm sure.

Refs trac:1010

This commit was SVN r15320.

The following Trac tickets were found above:
  Ticket 1010 --> https://svn.open-mpi.org/trac/ompi/ticket/1010
2007-07-10 03:46:57 +00:00
George Bosilca
1701013dd0 The real Windows solution. This generate one more dependence when
building on Windows (to the shlwapi.lib).

This commit was SVN r15267.
2007-07-02 18:11:14 +00:00
Jeff Squyres
c796d84d61 * Integrate man pages contributed by Dirk Eddelbuettel
* Make orted.1 man page be non-descriptive because it's really an
   internal command.
 * Re-work the opal_wrapper man page logic a bit so that we can have a
   real opal_wrapper.1 installed that says "don't look here -- look at
   mpicc (etc.)"

This commit was SVN r15264.
2007-07-02 15:27:39 +00:00
George Bosilca
610d7a0d2c A Windows specific solution around the real path function.
This commit was SVN r15263.
2007-07-02 14:55:10 +00:00
Adrian Knoth
83c7aab185 Removed annoying debug output
This commit was SVN r15262.
2007-07-02 13:17:27 +00:00
George Bosilca
0bf7463c6f This one was nice ... and it didn't even trigger a compile warning.
Update the copyright.

This commit was SVN r15256.
2007-07-01 17:59:46 +00:00
George Bosilca
b9251eb442 On Windows accept networked path (i.e. starting with \\). Add a new function
opal_find_absolute_path which take as argument a executable name and return
the absolute path if possible.

This commit was SVN r15255.
2007-07-01 17:53:19 +00:00
George Bosilca
003d1b8665 On windows make sure we add the initial path separator only if we are
creating a relative path. Use BEGIN_C_DECLS and END_C_DECLS on some
header files.

This commit was SVN r15254.
2007-07-01 17:51:34 +00:00
Brian Barrett
9687f70aea Add solaris condition variables
This commit was SVN r15225.
2007-06-27 16:48:30 +00:00
Brian Barrett
04c0fcccf0 fix silly warning with the Sun compilers
This commit was SVN r15224.
2007-06-27 15:47:47 +00:00
Brian Barrett
6b3cd84403 Fix issue Dan found -- we don't expand out the extra pre-processor, compiler
or linker flags.  Which makes the whole wrapper-extra-*flags thing slightly
less useful than perhaps it should be.

This commit was SVN r15222.
2007-06-27 14:17:57 +00:00
Josh Hursey
f88aa6c273 This commit cleans up the AMCA parameter implementation a bit.
* Remove the 'opal_mca_base_param_use_amca_sets' global variable
* Harness the fact that you can (read should) call the cmd_line functions
  before initializing opal_init_util(). This pushes the MCA/GMCA/AMCA
  command line options into the environment before OPAL inits and starts
  to use these values. By putting the cmd_line parse before opal_init_util
  in orterun and orted we only parse the *MCA parameter files once, and 
  correctly (alleviating the need to 'recache' the files on init.)
* Small bits of cleanup.

This commit was SVN r15219.
2007-06-27 01:03:31 +00:00
Josh Hursey
dada472c13 some additional debugging output
This commit was SVN r15218.
2007-06-27 00:54:18 +00:00
Sven Stork
813d4dc175 - let the mvapi btl and the rest of the world use our posix_memalign function.
This commit was SVN r15202.
2007-06-26 14:45:20 +00:00
Brian Barrett
249ddf8fff Only print warning message about condition variable useage if the MCA
param says we should  Also, check for != 0, rather than == 1, as there
are way too many double locks, but they'll get warned when we do the
double lock.  No need to warn again, in a meaningless way.

Originally part of r15167, reverted with r15172.

This commit was SVN r15173.

The following SVN revision numbers were found above:
  r15167 --> open-mpi/ompi@faa401dc47
  r15172 --> open-mpi/ompi@5f16251808
2007-06-22 15:28:12 +00:00
Brian Barrett
5f16251808 revert r15167. I don't know what I was thinking, but it was most definitely
"not right".

This commit was SVN r15172.

The following SVN revision numbers were found above:
  r15167 --> open-mpi/ompi@faa401dc47
2007-06-22 15:25:39 +00:00
Brian Barrett
faa401dc47 * Need to OBJ_RELEASE, not OBJ_DESTRUCT things that were created with
OBJ_NEW
  * Need to single when the passive unlock has left an expose epoch for
    the win_free case
  * Clean up some debugging output
  * fix missing variable initialization

This commit was SVN r15167.
2007-06-21 22:08:30 +00:00
Jeff Squyres
022bd30558 Back out r15158 because it apparently breaks with recent versions of
flex (which, incidentally, emit ''more'' warnings than earlier
versions).  Grumble.

This commit was SVN r15166.

The following SVN revision numbers were found above:
  r15158 --> open-mpi/ompi@57d09c10f7
2007-06-21 21:14:10 +00:00
Jeff Squyres
57d09c10f7 Avoid some compiler warnings that come up ''every day'' in MTT (and
have been for eons): make a symbol be used in a dumb but harmless way.

This commit was SVN r15158.
2007-06-21 15:42:06 +00:00
Shiqing Fan
66d9f8e516 Add braces.
This commit was SVN r15155.
2007-06-21 07:53:47 +00:00
Shiqing Fan
228d2b4996 Check if it's valid event handle. This will get rid of run time exception.
This commit was SVN r15154.
2007-06-21 07:49:11 +00:00
Josh Hursey
bfa8401c0c Fix some thread warnings that caught me being dumb with locks. :[
This commit was SVN r15146.
2007-06-20 14:18:33 +00:00
Jeff Squyres
57486f0b69 Re-enable threaded builds. Need to protect usage of mutex->m_*
variables to only use them a) when debugging, and b) when we don't
have real thread support.

This commit was SVN r15099.
2007-06-15 14:26:23 +00:00
Jeff Squyres
7e379aff10 Fixes trac:1057.
Ensure that the AM_CONDITIONALs are ''always'' run, even if we
--enable-mca-no-build the paffinity/linux component.

This commit was SVN r15095.

The following Trac tickets were found above:
  Ticket 1057 --> https://svn.open-mpi.org/trac/ompi/ticket/1057
2007-06-15 11:36:22 +00:00
George Bosilca
17f9893448 Add some attributes.
This commit was SVN r15094.
2007-06-15 05:07:49 +00:00
George Bosilca
8c5b63768c A working version of the mutex_trylock for Windows.
This commit was SVN r15084.
2007-06-14 22:28:28 +00:00
Ethan Mallove
a09aebc30b Fix typo. Change mutex_destory to mutex_destroy.
This commit was SVN r15083.
2007-06-14 19:02:37 +00:00
George Bosilca
b33086b941 Check for the right function. This is always supposed to
fails on anything than Windows.

This commit was SVN r15072.
2007-06-14 07:03:26 +00:00
George Bosilca
be179cea79 Windows component allowing to get the paths from the registry.
This commit was SVN r15071.
2007-06-14 06:47:03 +00:00
George Bosilca
76ff70e672 Allow a threaded build on Windows.
This commit was SVN r15070.
2007-06-14 04:52:37 +00:00
George Bosilca
c70566dfe7 Remove unused variables.
This commit was SVN r15029.
2007-06-12 22:50:43 +00:00
George Bosilca
286606d4c3 Allow access to the system wide registry (supposedly updated by
the administrator) as well as to the user registry.

This commit was SVN r15028.
2007-06-12 22:49:54 +00:00