Commit graph

1046 commits

Author SHA1 Message Date
Josh Hursey
a003fa7a50 C/R fix for broken CRS component selection resulting from r18707.
Make sure that if we ask for the 'none' component (which is not a 'real' component, but a component in crs/base) then we do not fail out of the box when using tools. We check for the {{{OPAL_ERR_NOT_FOUND}}} error.

Also make sure that component_open() returns {{{OPAL_ERR_NOT_FOUND}}} when it cannot find a value instead of {{{OPAL_ERROR}}} which means something quite a bit different.

C/R is working, but the tools still print the warning below every time they are run:
{{{
--------------------------------------------------------------------------
A requested component was not found, or was unable to be opened.  This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded).  Note that
Open MPI stopped checking at the first component that it did not find.

Host:      odin.cs.indiana.edu
Framework: crs
Component: none
--------------------------------------------------------------------------
}}}

I'll have to figure out a workaround for this warning (maybe work on the {{{MCA_NULL}}} Ticket #1291).

This commit was SVN r18739.

The following SVN revision numbers were found above:
  r18707 --> open-mpi/ompi@bdaaf01d8a
2008-06-25 14:55:09 +00:00
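The distinction that commit a003fa7a50 relies on, "not found" versus a generic error, can be sketched as below. This is only an illustration: the return-code names come from the commit message, but their numeric values, the component names in the table, and both function names are assumptions, not Open MPI's actual code.

```c
#include <string.h>

/* Stand-in return codes: the names come from the commit message, the
 * numeric values are placeholders and not Open MPI's real ones. */
enum { OPAL_SUCCESS = 0, OPAL_ERROR = -1, OPAL_ERR_NOT_FOUND = -2 };

/* Hypothetical lookup over an illustrative list of installed components. */
int find_component(const char *name)
{
    static const char *installed[] = { "blcr", "self" };
    size_t i;

    if (NULL == name) {
        return OPAL_ERROR;            /* genuinely invalid request */
    }
    for (i = 0; i < sizeof(installed) / sizeof(installed[0]); ++i) {
        if (0 == strcmp(name, installed[i])) {
            return OPAL_SUCCESS;
        }
    }
    return OPAL_ERR_NOT_FOUND;        /* valid request, nothing matched */
}

/* Hypothetical caller: tolerate NOT_FOUND only for the 'none' component,
 * which is not a real component and so never appears in the table. */
int select_component(const char *name)
{
    int rc = find_component(name);

    if (OPAL_ERR_NOT_FOUND == rc && NULL != name && 0 == strcmp(name, "none")) {
        return OPAL_SUCCESS;
    }
    return rc;
}
```

With this shape, a caller can treat a request for 'none' as harmless while still failing on any other missing component or on a real error.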
George Bosilca
2bc52a87d2 Related to my previous commit. The Sicortex is a MIPS machine, so
allow the assembly to understand this.

This commit was SVN r18732.
2008-06-25 03:09:02 +00:00
George Bosilca
872d957550 Allow Open MPI to configure correctly on the Sicortex machine.
This commit was SVN r18731.
2008-06-25 03:07:53 +00:00
Brian Barrett
e9c50a29ba Some (rare) platforms only have a #define for htonl and friends,
but not anything in libc.  Which causes an incorrect answer for
AC_CHECK_FUNCS.  Work around that by also checking for the
#define.

This commit was SVN r18730.
2008-06-24 23:20:25 +00:00
Brian Barrett
e7a299d046 Add timer support for Catamount
This commit was SVN r18729.
2008-06-24 22:13:34 +00:00
Rolf vandeVaart
95cd9758e5 Fix broken build on Solaris.
This commit was SVN r18719.
2008-06-24 14:57:12 +00:00
Ralph Castain
f70b7e51ce Fix a missing header file and ensure we use a portable name for a system limit
This commit was SVN r18712.
2008-06-23 22:32:26 +00:00
Jeff Squyres
bdaaf01d8a Fixes trac:1338: Have the MCA base specifically check for all requested
components.  If they are not found / able to be opened, a warning will
be printed and the mca_base_component_find() will return
OPAL_ERR_NOT_FOUND.  It is the upper-layer's responsibility to handle
this error appropriately.

This commit was SVN r18707.

The following Trac tickets were found above:
  Ticket 1338 --> https://svn.open-mpi.org/trac/ompi/ticket/1338
2008-06-23 16:14:05 +00:00
Ralph Castain
ccbf194e8f Visibility fix
This commit was SVN r18687.
2008-06-19 19:08:08 +00:00
Ralph Castain
26c9ad5799 Clean-up the DSS API to remove two functions that are supposed to be used solely internally to the DSS. These were likely exposed because we need to call them when packing/unpacking declared types, but this means that developers may accidentally use the wrong functions, causing the DSS buffer to get confused. Instead, return the system to the way it used to work and hide those functions.
This commit was SVN r18684.
2008-06-19 18:46:25 +00:00
Pak Lui
188c8bce5d Fix the SEGV when module_get finds that no proc is bound. Also make no-intr available for processor binding.
This commit was SVN r18671.
2008-06-18 16:03:08 +00:00
George Bosilca
f97a728dc6 Don't cast the int32_t pointer into a long pointer. This doesn't work on
64-bit architectures.

This commit was SVN r18667.
2008-06-18 08:33:58 +00:00
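The problem fixed in commit f97a728dc6 is easy to reproduce in isolation. The following is a minimal sketch, not the code that was changed; both function names are hypothetical. On LP64 platforms a long is 8 bytes, so writing through a long pointer that really points at an int32_t overruns the object.

```c
#include <stdint.h>

/* Hypothetical callee that stores its result through a long pointer. */
void get_value_as_long(long *out)
{
    *out = 42L;
}

/* Wrong on LP64: get_value_as_long((long *) &v32) would write 8 bytes
 * into a 4-byte object.  Instead, go through a real long and narrow it. */
int32_t get_value_as_int32(void)
{
    long tmp = 0;

    get_value_as_long(&tmp);
    return (int32_t) tmp;   /* explicit narrowing; fine for values that fit */
}
```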
Ralph Castain
0532d799d6 Complete implementation of the --without-rte-support configure option. Working with Brian, this has been tested on RedStorm.
Some minor changes to help facilitate debugger support so that both mpirun and yod can operate with it. Still to be completed.

This commit was SVN r18664.
2008-06-18 03:15:56 +00:00
Jeff Squyres
16b2a50543 Slight clarification of help message.
This commit was SVN r18661.
2008-06-17 11:25:32 +00:00
Jeff Squyres
c1d1ffbc56 Fix compile problems on systems with older versions of libnuma (that
don't have MPOL_MF_MOVE).  I know that this is a configure change in
the middle of the US workday, but this compile problem is preventing
work on several kinds of systems (e.g., RHEL4).

This commit was SVN r18659.
2008-06-16 17:26:42 +00:00
Lenny Verkhovsky
dee2f1d175 Adding new functionality to Maffinity component to support NUMA awareness
This commit was SVN r18657.
2008-06-15 07:27:29 +00:00
Brian Barrett
7712b07ac4 Add Perl-based wrapper compilers for cross-compile environments. The default
is still to use the C-based wrapper compilers (which have many more features
and are better tested).  The Perl compilers are enabled with the option
--enable-script-wrapper-compilers, which also ignores the option
--disable-binaries (ie --enable-script-wrapper-compilers --disable-binaries
will result in perl-based wrapper compilers being installed, but no other
binaries being installed).

This commit was SVN r18655.
2008-06-13 22:52:25 +00:00
Brian Barrett
79ad6d983e - The ptmalloc2 memory manager component is now by default built as
  a standalone library named libopenmpi-malloc.  Users wanting to
  use leave_pinned with ptmalloc2 will now need to link the library
  into their application explicitly.  All other users will use the
  libc-provided allocator instead of Open MPI's ptmalloc2.  This change
  may be overridden with the configure option enable-ptmalloc2-internal.
- The leave_pinned options will now default to using mallopt on
  Linux in the cases where ptmalloc2 was not linked in.  mallopt
  will also only be available if munmap can be intercepted (the
  default whenever Open MPI is not compiled with
  --without-memory-manager).
- Open MPI will now complain and refuse to use leave_pinned if
  no memory intercept / mallopt option is available.

This commit was SVN r18654.
2008-06-13 22:32:49 +00:00
Ralph Castain
9613b3176c Effectively revert the orte_output system and return to direct use of opal_output at all levels. Retain the orte_show_help subsystem to allow aggregation of show_help messages at the HNP.
After much work by Jeff and myself, and quite a lot of discussion, it has become clear that we simply cannot resolve the infinite loops caused by RML-involved subsystems calling orte_output. The original rationale for the change to orte_output has also been reduced by shifting the output of XML-formatted vs human readable messages to an alternative approach.

I have globally replaced the orte_output/ORTE_OUTPUT calls in the code base, as well as the corresponding .h file name. I have test compiled and run this on the various environments within my reach, so hopefully this will prove minimally disruptive.

This commit was SVN r18619.
2008-06-09 14:53:58 +00:00
Ralph Castain
e1e224b81a Silence a couple of minor compiler warnings
This commit was SVN r18617.
2008-06-09 12:57:41 +00:00
Ralph Castain
7bee71aa59 Fix a potential, albeit perhaps esoteric, race condition that can occur for fast HNP's, slow orteds, and fast apps. Under those conditions, it is possible for the orted to be caught in its original send of contact info back to the HNP, and thus for the progress stack never to recover back to a high level. In those circumstances, the orted can "hang" when trying to exit.
Add a new function to opal_progress that tells us our recursion depth to support that solution.

Yes, I know this sounds picky, but good ol' Jeff managed to make it happen by driving his cluster near to death...

Also ensure that we declare "failed" for the daemon job when daemons fail instead of the application job. This is important so that orte knows that it cannot use xcast to tell daemons to "exit", nor should it expect all daemons to respond. Otherwise, it is possible to hang.

After lots of testing, decide to default (again) to slurm detecting failed orteds. This proved necessary to avoid rather annoying hangs that were difficult to recover from. There are conditions where slurm will fail to launch all daemons (slurm folks are working on it), and yet again, good ol' Jeff managed to find both of them.

Thank you Jeff! :-/

This commit was SVN r18611.
2008-06-06 19:36:27 +00:00
George Bosilca
b2aa751c28 Remove a race condition in the threaded mode. As a callback is allowed
to modify the callback array (add or remove), make sure we don't call
the same callback twice if it gets removed in another thread.

This commit was SVN r18608.
2008-06-06 15:54:40 +00:00
Josh Hursey
1de50b523c Fix some Coverity 'Event set_but_not_used' highlights.
Thanks to Jeff for bringing them to my attention.

This commit was SVN r18606.
2008-06-06 14:38:41 +00:00
Jeff Squyres
12a3fe57e1 As pointed out by Ralf
W. (http://www.open-mpi.org/community/lists/devel/2008/06/4095.php),
these dependencies don't need to be here.

This commit was SVN r18603.
2008-06-06 01:20:47 +00:00
Jeff Squyres
b123629e6a Fix CIDs 458, 716, 717: ensure that strings are long enough to always
be properly \0 terminated.

This commit was SVN r18602.
2008-06-06 00:59:08 +00:00
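The class of defect addressed in commit b123629e6a (strings that may end up without a terminating \0) is commonly avoided with a bounded copy that terminates explicitly. A minimal sketch, with a hypothetical helper name; it is not the code from the commit:

```c
#include <string.h>

/* Copy src into dst (capacity dst_len bytes) and always terminate it.
 * strncpy() alone leaves dst unterminated when src fills the buffer. */
void copy_terminated(char *dst, size_t dst_len, const char *src)
{
    if (0 == dst_len) {
        return;                      /* no room even for the terminator */
    }
    strncpy(dst, src, dst_len - 1);  /* copy at most dst_len - 1 bytes */
    dst[dst_len - 1] = '\0';         /* terminate unconditionally */
}
```

Truncation is silent in this sketch; code that must detect it would compare strlen(src) against dst_len first.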
Jeff Squyres
e2b08aaca4 Fix bad free's found in CID 707 and CID 708.
This commit was SVN r18600.
2008-06-05 20:49:33 +00:00
Lenny Verkhovsky
a8b5dcb204 Added more output info about socket:core pair in paffinity / rankfile components
This commit was SVN r18589.
2008-06-05 10:28:44 +00:00
Ralph Castain
ca91ec525b Add a suffix to the opal_output stream descriptor object - we can now output both a prefix and a suffix for a given stream. Default the suffix to NULL.
Remove lingering references to a filtering system as this will no longer be implemented.

This commit was SVN r18586.
2008-06-04 20:52:20 +00:00
Josh Hursey
78f14b5255 Fix the none.checkpoint command.
orte-checkpoint/orte-restart do not seem to fully like orte_output, so revert them to opal_output for now. Since we have no need for the additional complexity of orte_output we can drop it for now and revisit this if anyone needs it later.

It seems that if you set the verbose level on an output handle and then try to call a normal orte_output() on it, the message will *not* be printed. This is the same for opal_output, and seems incorrect to me because it stops some error messages from being printed out if you do not directly specify opal_output(0, ...). Maybe someone should take a look at this.


orte-checkpoint would segv if passed an incorrect PID. Fixed the return code so it errors out properly.

Thanks to Eric Roman for bringing this to my attention.

This commit was SVN r18583.
2008-06-04 14:44:11 +00:00
Jeff Squyres
530a15baa4 Fix cross-compiling scenario with valgrind.m4.
This commit was SVN r18579.
2008-06-04 11:58:41 +00:00
Shiqing Fan
2dc812f720 Clean configure.m4 of memchecker/valgrind.
If Valgrind is requested but the wrong version is supplied, print error messages and stop.
Save the CPPFLAGS in opal_memchecker_valgrind_CPPFLAGS, which could be used in 
Makefile.am.

Many thanks to Jeff. 

This commit was SVN r18573.
2008-06-04 11:46:50 +00:00
Ralph Castain
9927b2445c Remove the filter framework - the xml support will have to be provided in a different manner that will be implemented shortly
This commit was SVN r18572.
2008-06-04 09:04:51 +00:00
Jeff Squyres
75a97ebbf0 Many thanks to Ralf W. for finding a subtle bug in these Makefile.am's
that can *sometimes* cause problems with "make -j [N>1] install".
Ensure to make the target directory before we copy stuff into it --
read the thread starting here for more details:

    http://www.open-mpi.org/community/lists/devel/2008/06/4080.php

This commit was SVN r18570.
2008-06-04 01:28:03 +00:00
Jeff Squyres
3b568d4b14 Remove an old attempt to understand the tradeoffs with using GNU libc's malloc_hooks functionality, which turned out to be totally unusable in practice. I think we just always forgot to remove them.
This commit was SVN r18547.
2008-05-30 00:11:12 +00:00
Shiqing Fan
b67a1244b6 Some small fixes.
This commit was SVN r18541.
2008-05-29 15:05:28 +00:00
Jeff Squyres
ed5bc2cd08 Per http://www.open-mpi.org/community/lists/devel/2008/05/4057.php, remove the darwin memory hooks component
This commit was SVN r18531.
2008-05-28 23:50:53 +00:00
Sharon Melamed
64fe554b8e Fix bug in carto component select. After the insertion of mca_base_select the carto file component was never selected.
This commit was SVN r18496.
2008-05-26 12:52:41 +00:00
Jeff Squyres
d45cb82ecc Fix two bugs in PLPA:
 1. If we don't have the topology information, don't bother trying to
    create cross-referencing information
 1. Ensure to only check for valid processor ID's

This commit was SVN r18462.
2008-05-20 12:57:12 +00:00
Terry Dontje
ef7ac86929 created opal_version_string and orte_version_string to match the ompi changes
made in r18345 for ompi_version_string.  This was done per request from Jeff 
Squyres to maintain consistency and to remove some warnings caused by the 
non-use of some static const char.

This commit was SVN r18461.

The following SVN revision numbers were found above:
  r18345 --> open-mpi/ompi@8dd0421015
2008-05-20 12:13:19 +00:00
Jeff Squyres
ea1582856f Clarify some messages, move AC_ARG_WITH outside of the conditional
This commit was SVN r18459.
2008-05-19 23:13:31 +00:00
Jeff Squyres
d12b21e21b Ensure that if an error occurs, we actually return that error rather
than an undefined value (which could be 0/OPAL_SUCCESS).

This commit was SVN r18452.
2008-05-19 11:57:44 +00:00
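Commit d12b21e21b describes a familiar C pitfall: a status variable that is only assigned on some paths. The sketch below shows one common way to avoid it. The function names are hypothetical and the numeric values of the return codes are placeholders.

```c
enum { OPAL_SUCCESS = 0, OPAL_ERROR = -1 };   /* placeholder values */

/* Hypothetical step that fails for negative input. */
int do_step(int input)
{
    return (input < 0) ? OPAL_ERROR : OPAL_SUCCESS;
}

int run_steps(const int *inputs, int count)
{
    int ret = OPAL_SUCCESS;   /* initialized, so an empty loop is well defined */
    int i;

    for (i = 0; i < count; ++i) {
        ret = do_step(inputs[i]);
        if (OPAL_SUCCESS != ret) {
            goto cleanup;     /* keep the failing code in ret */
        }
    }

cleanup:
    /* release resources here; do not overwrite ret */
    return ret;
}
```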
Terry Dontje
517abf9b09 This commit fixes trac:1288.
This commit was SVN r18441.

The following Trac tickets were found above:
  Ticket 1288 --> https://svn.open-mpi.org/trac/ompi/ticket/1288
2008-05-15 17:40:08 +00:00
Jeff Squyres
fb17097de4 Make ompi_info correctly display "filter" components
This commit was SVN r18435.
2008-05-13 20:56:20 +00:00
Jeff Squyres
e7ecd56bd2 This commit represents a bunch of work on a Mercurial side branch. As
such, the commit message back to the master SVN repository is fairly
long.

= ORTE Job-Level Output Messages =

Add two new interfaces that should be used for all new code throughout
the ORTE and OMPI layers (we have already done the search-and-replace on
the existing ORTE / OMPI layers):

 * orte_output(): (and corresponding friends ORTE_OUTPUT,
   orte_output_verbose, etc.)  This function sends the output directly
   to the HNP for processing as part of a job-specific output
   channel.  It supports all the same outputs as opal_output()
   (syslog, file, stdout, stderr), but for stdout/stderr, the output
   is sent to the HNP for processing and output.  More on this below.
 * orte_show_help(): This function is a drop-in-replacement for
   opal_show_help(), with two differences in functionality:
   1. the rendered text help message output is sent to the HNP for
      display (rather than outputting directly into the process' stderr
      stream)
   1. the HNP detects duplicate help messages and does not display them
      (so that you don't see the same error message N times, once from
      each of your N MPI processes); instead, it counts "new" instances
      of the help message and displays a message every ~5 seconds when
      there are new ones ("I got X new copies of the help message...")

opal_show_help and opal_output still exist, but they only output in
the current process.  The intent for the new orte_* functions is that
they can apply job-level intelligence to the output.  As such, we
recommend that all new ORTE and OMPI code use the new orte_*
functions, not their opal_* functions.

=== New code ===

For ORTE and OMPI programmers, here's what you need to do differently
in new code:

 * Do not include opal/util/show_help.h or opal/util/output.h.
   Instead, include orte/util/output.h (this one header file has
   declarations for both the orte_output() series of functions and
   orte_show_help()).
 * Effectively s/opal_output/orte_output/gi throughout your code.
   Note that orte_output_open() takes a slightly different argument
   list (as a way to pass data to the filtering stream -- see below),
   so if you explicitly call opal_output_open(), you'll need to
   slightly adapt to the new signature of orte_output_open().
 * Literally s/opal_show_help/orte_show_help/.  The function signature
   is identical.

=== Notes ===

 * orte_output'ing to stream 0 will do similar to what
   opal_output'ing did, so leaving a hard-coded "0" as the first
   argument is safe.
 * For systems that do not use ORTE's RML or the HNP, the effect of
   orte_output_* and orte_show_help will be identical to their opal
   counterparts (the additional information passed to
   orte_output_open() will be lost!).  Indeed, the orte_* functions
   simply become trivial wrappers to their opal_* counterparts.  Note
   that we have not tested this; the code is simple but it is quite
   possible that we mucked something up.

= Filter Framework =

Messages sent via the new orte_* functions described above and
messages output via the IOF on the HNP will now optionally be passed
through a new "filter" framework before being output to
stdout/stderr.  The "filter" OPAL MCA framework is intended to allow
preprocessing to messages before they are sent to their final
destinations.  The first component that was written in the filter
framework was to create an XML stream, segregating all the messages
into different XML tags, etc.  This will allow 3rd party tools to read
the stdout/stderr from the HNP and be able to know exactly what each
text message is (e.g., a help message, another OMPI infrastructure
message, stdout from the user process, stderr from the user process,
etc.).

Filtering is not active by default.  Filter components must be
specifically requested, such as:

{{{
$ mpirun --mca filter xml ...
}}}

There can only be one filter component active.

= New MCA Parameters =

The new functionality described above introduces two new MCA
parameters:

 * '''orte_base_help_aggregate''': Defaults to 1 (true), meaning that
   help messages will be aggregated, as described above.  If set to 0,
   all help messages will be displayed, even if they are duplicates
   (i.e., the original behavior).
 * '''orte_base_show_output_recursions''': An MCA parameter to help
   debug one of the known issues, described below.  It is likely that
   this MCA parameter will disappear before v1.3 final.

= Known Issues =

 * The XML filter component is not complete.  The current output from
   this component is preliminary and not real XML.  A bit more work
   needs to be done so that configure.m4 searches for an appropriate
   XML library, links it in, and uses it at run time.
 * There are possible recursion loops in the orte_output() and
   orte_show_help() functions -- e.g., if RML send calls orte_output()
   or orte_show_help().  We have some ideas how to fix these, but
   figured that it was ok to commit before feature freeze with known
   issues.  The code currently contains sub-optimal workarounds so
   that this will not be a problem, but it would be good to actually
   solve the problem rather than have hackish workarounds before v1.3 final.

This commit was SVN r18434.
2008-05-13 20:00:55 +00:00
Josh Hursey
c70ba283b8 Fix a warning, and some return codes.
Thanks to Jeff for pointing this out to me.

This commit was SVN r18430.
2008-05-13 13:10:16 +00:00
Josh Hursey
4236255700 Add the framework name to the verbose message for improved debugging.
Also set the 'best_priority' to the smallest 32-bit integer possible so that negative-priority components can be selected if they are the highest-ranking components available.

This commit was SVN r18427.
2008-05-12 14:07:37 +00:00
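The reasoning behind the best_priority change in commit 4236255700 can be illustrated with a small selection loop. This is a sketch under assumptions: the function name and the array-of-priorities interface are invented for illustration and do not reflect the mca_base_select signature.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the index of the highest-priority entry, or -1 if there is none.
 * Starting from INT32_MIN (rather than 0 or -1) lets a component with a
 * negative priority win when nothing better is available. */
int select_best(const int32_t *priorities, size_t count)
{
    int32_t best_priority = INT32_MIN;
    int best_index = -1;
    size_t i;

    for (i = 0; i < count; ++i) {
        /* The first entry always wins, even at INT32_MIN itself. */
        if (best_index < 0 || priorities[i] > best_priority) {
            best_priority = priorities[i];
            best_index = (int) i;
        }
    }
    return best_index;
}
```

Had best_priority started at 0, a set of components with priorities -50, -10 and -30 would have produced no selection at all.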
Rainer Keller
b0cbeb0b41 - Add detection of __attribute__((hot)) and __attribute__((cold))
   to allow explicit grouping of hot functions into similar code
   sections upon link-time. Should decrease TLB misses (iff the code-
   section is really too large)...
   Candidates for __opal_attribute_hot__ are MPI_Isend MPI_Irecv,
   MPI_Wait, MPI_Waitall
   Candidates for __opal_attribute_cold__ are MPI_Init, MPI_Finalize and
   MPI_Abort...

This commit was SVN r18421.
2008-05-10 10:38:51 +00:00
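How the macros named in commit b0cbeb0b41 might be defined and applied is sketched below. The macro names come from the commit message. The compile-time `__has_attribute` test is a substitute for the configure-time detection the commit describes, and the two functions are hypothetical stand-ins for a fast-path call and a setup call.

```c
/* Feature test at compile time; the commit does the detection in
 * configure instead.  The macro names come from the commit message. */
#if defined(__has_attribute)
#  if __has_attribute(hot)
#    define __opal_attribute_hot__ __attribute__((hot))
#  endif
#  if __has_attribute(cold)
#    define __opal_attribute_cold__ __attribute__((cold))
#  endif
#endif
#ifndef __opal_attribute_hot__
#  define __opal_attribute_hot__      /* expands to nothing if unsupported */
#endif
#ifndef __opal_attribute_cold__
#  define __opal_attribute_cold__     /* expands to nothing if unsupported */
#endif

/* Hypothetical function standing in for a frequently called routine. */
__opal_attribute_hot__ int fast_path_add(int a, int b)
{
    return a + b;
}

/* Hypothetical function standing in for a rarely called setup routine. */
__opal_attribute_cold__ int setup_once(void)
{
    return 0;
}
```

On compilers without these attributes the macros expand to nothing, so the code still builds and only the placement hint is lost.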
Josh Hursey
9b0cd5b02a Remove the 'include' check from mca_base_select. include/exclude is handled by the mca_base_open functionality and it is redundant (and wrong) to check this in the select function.
Thanks to Pak Lui for bringing this to my attention.

This commit was SVN r18418.
2008-05-08 23:41:07 +00:00
Josh Hursey
da2f1c58e2 Some checkpoint/restart cleanup.
* Remove the opal_only option. This was suffering from bit rot, and no one uses it. It can be added back fairly easily if wanted.
 * Cleanup metadata interactions at the local level.
 * Touch up some of the INC functionality (fix typos and a minor ordering issue)

This commit was SVN r18416.
2008-05-08 18:47:47 +00:00
Josh Hursey
8739edc580 Fix a couple of OPAL_DECLSPEC missing from r18407
This commit was SVN r18415.

The following SVN revision numbers were found above:
  r18407 --> open-mpi/ompi@7c7b9b0486
2008-05-08 18:44:23 +00:00
George Bosilca
fe495e429a Completely remove the kqueue support on Mac OS X. Remove the test
from kqueue that tries to detect whether kqueue might work with ptys.

This commit was SVN r18411.
2008-05-08 02:33:23 +00:00
Ralph Castain
7c7b9b0486 Do a little cleanup on the opal graph class and opal carto framework to conform to OMPI naming conventions and avoid potential conflict with user applications - no change in functionality, passes carto test program
This commit was SVN r18407.
2008-05-07 19:33:49 +00:00
Josh Hursey
9971bc9d95 Merge in the mca_base_select changes per RFC:
http://www.open-mpi.org/community/lists/devel/2008/04/3779.php

{{{
svn merge -r 18276:18380 https://svn.open-mpi.org/svn/ompi/tmp-public/jjh-mca-play .
}}}

Any components not in the trunk, but in one of the affected frameworks *must* be
updated. Contact the list, look at the RFC, or look at the diff for how to do this.

Sorry for the early commit of this, but I wanted to get it in today (per RFC) and
didn't know if I would have a chance later today.

This commit was SVN r18381.
2008-05-06 18:08:45 +00:00
Aurelien Bouteiller
c06620ad70 Add a const to the parameters of opal_dss_compare.
This commit was SVN r18374.
2008-05-05 19:12:01 +00:00
Brad Penoff
4f104ba5d1 Add header for FreeBSD.
This commit was SVN r18366.
2008-05-03 23:07:45 +00:00
George Bosilca
f5dfc005a4 Only check for /proc/cpuinfo if we are on a supported architecture.
This commit was SVN r18331.
2008-04-29 22:36:18 +00:00
George Bosilca
465f690f90 We need to force the compiler to preprocess these files as some of
them use #include. The standard way is to rename the files to .S instead
of .s.

This commit was SVN r18290.
2008-04-24 21:40:40 +00:00
Josh Hursey
2c736873bb Fix a checkpoint/restart bug that causes a restarted application to occasionally throw a SIGSEGV or SIGPIPE due to invalid socket descriptors.
The problem was caused by a bad ordering between the restart of the ORTE level tcp connections (in the OOB - out-of-band communication) and the Open MPI level tcp connections (BTLs). Before this commit ORTE would shutdown and restart the OOB completely before the OMPI level restarted its tcp connections. What would happen is that a socket descriptor used by the OMPI level on checkpoint was assigned to the ORTE level on restart. But the OMPI level had no knowledge that the socket descriptor it was previously using has been recycled so it closed it on restart. This caused the ORTE level to break as the newly created socket descriptor was closed without its knowledge.

The fix is to have the OMPI level shut down its tcp connections, allow the ORTE level to restart, and then allow the OMPI level to restart its connections. This seems obvious, and I'm surprised that this bug has not cropped up sooner. I'm confident that this specific problem has been fixed with this commit.

Thanks to Eric Roman and Tamer El Sayed for their help in identifying this problem, and patience while I was fixing it.

 * Add a new state {{{OPAL_CRS_RESTART_PRE}}}. This state identifies when we are on the down slope of the INC (finalize-like) which is useful when you want to close, but not reopen a component set for fear of interfering with a lower level.
 * Use this new state in OMPI level coordination. Here we want to make sure to play well with both the OMPI/BTL/TCP and ORTE/OOB/TCP components.
 * Update ft_event functions in PML and BML to handle the new restart state.
 * Add an additional flag to the error output in OOB/TCP so we can see what the socket descriptor was on failure as this can be helpful in debugging.

This commit was SVN r18276.
2008-04-24 17:54:22 +00:00
Shiqing Fan
4a9787979e When valgrind is not available or it is deselected (--without-valgrind, --with-valgrind=no), don't compile this component and continue without aborting.
This commit was SVN r18243.
2008-04-23 11:50:42 +00:00
Josh Hursey
cc83d41ad9 Merge in tmp/jjh-scratch
{{{
 svn merge -r 18218:18240 https://svn.open-mpi.org/svn/ompi/tmp/jjh-scratch .
}}}

Contains:
 * Primarily a fix for a user reported problem where a cached file descriptor is causing a SIGPIPE on restart.
 * Cleanup some small memory leaks from using mca_base_param_env_var() - Thanks Jeff
 * Cleanup ORTE FT tool compilation in non-FT builds - Thanks Tim P.
 * Cleanup mpi interface with misplaced {{{OPAL_CR_ENTER_LIBRARY}}} - Thanks Terry
 * Some other sundry cleanup items all dealing with C/R functionality in the trunk.

This commit was SVN r18241.
2008-04-23 00:17:12 +00:00
Jeff Squyres
db2695ccab Make the symbols be visible.
This commit was SVN r18201.
2008-04-18 00:26:17 +00:00
Ralph Castain
fa082cafa9 Shift the architecture calculation from the ompi/datatype engine to the opal/util area. This allows us to compute the architecture earlier in the launch and communicate it outside of the modex.
Note: this is an early preliminary step in the movement of portions of the datatype engine to the opal layer.

This commit was SVN r18198.
2008-04-17 20:43:56 +00:00
George Bosilca
01148b77dc Generate the help message for the available event ops. Now the list only
contains the ones that are compiled into the current ompi.

This commit was SVN r18196.
2008-04-17 18:16:54 +00:00
Ralph Castain
e7487ad533 Implement the seq rmaps module that sequentially maps process ranks to a list of hosts in a hostfile.
Restore the "do-not-launch" functionality so users can test a mapping without launching it.

Add a "do-not-resolve" cmd line flag to mpirun so the opal/util/if.c code does not attempt to resolve network addresses, thus enabling a user to test a hostfile mapping without hanging on network resolve requests.

Add a function to hostfile to generate an ordered list of host names from a hostfile

This commit was SVN r18190.
2008-04-17 13:50:59 +00:00
Shiqing Fan
49fbc4e795 These functions should always have a return value.
This commit was SVN r18174.
2008-04-16 13:54:15 +00:00
George Bosilca
b359d84661 Use the correct prefix.
This commit was SVN r18048.
2008-03-31 21:42:59 +00:00
George Bosilca
be2454e0c5 Default the temporary directory to /tmp if no special environment
variables are set.

This commit was SVN r18046.
2008-03-31 20:15:49 +00:00
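The fallback described in commit be2454e0c5 can be sketched as follows. The commit does not say which environment variables are consulted, so TMPDIR, TMP and TEMP are an assumption here, as are both function names; this is not the implementation of opal_tmp_directory.

```c
#include <stdlib.h>

/* A value counts as usable only if it is set and non-empty. */
int is_usable(const char *value)
{
    return NULL != value && '\0' != value[0];
}

/* Return the first usable candidate variable, else "/tmp".
 * The variable list (TMPDIR, TMP, TEMP) is an assumption. */
const char *tmp_directory(void)
{
    static const char *vars[] = { "TMPDIR", "TMP", "TEMP" };
    size_t i;

    for (i = 0; i < sizeof(vars) / sizeof(vars[0]); ++i) {
        const char *value = getenv(vars[i]);
        if (is_usable(value)) {
            return value;
        }
    }
    return "/tmp";   /* default when nothing is set */
}
```

The returned pointer refers either to the environment or to a string literal, so callers should treat it as read-only.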
George Bosilca
ee784b601e For consistency reasons always use opal_home_directory and
opal_tmp_directory.

This commit was SVN r18043.
2008-03-31 18:13:41 +00:00
George Bosilca
60111ce66d A few fewer warnings.
This commit was SVN r18025.
2008-03-30 19:06:49 +00:00
Lenny Verkhovsky
fa6a084d33 added opal/mca/paffinity/base/paffinity_base_service.c with paffinity functions
This commit was SVN r18020.
2008-03-30 12:01:02 +00:00
Lenny Verkhovsky
7e45d7e134 A few updates due to RMAPS rank_file component changes
1. applied prefix rule to functions and variables of RMAPS rank_file component
2. cleaned ompi_mpi_init.c from paffinity code
3. paffinity code moved to new opal/mca/paffinity/base/paffinity_base_service.c file
4. added opal_paffinity_slot_list mca parameter

This commit was SVN r18019.
2008-03-30 11:52:11 +00:00
Shiqing Fan
f82092566f We don't have inttypes.h on Windows, and some types are redefined.
This commit was SVN r18010.
2008-03-28 17:33:54 +00:00
Shiqing Fan
aaf2730fab Winsock2.h also defines timeval and so on, which conflicts with our own definitions.
This commit was SVN r18009.
2008-03-28 17:30:33 +00:00
Jeff Squyres
6ea36061cf Fix typo found by Pak.
This commit was SVN r18000.
2008-03-27 23:04:17 +00:00
Jeff Squyres
c06f7c3992 Fixes trac:1254: ensure that evport.c is in the distribution tarball.
This commit was SVN r17989.

The following Trac tickets were found above:
  Ticket 1254 --> https://svn.open-mpi.org/trac/ompi/ticket/1254
2008-03-27 16:40:55 +00:00
Sharon Melamed
afa98f92e8 Changed the for loop to a while loop so I could
release the edge without conflicting with get next.

This commit was SVN r17979.
2008-03-26 14:45:45 +00:00
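One plausible reading of commit afa98f92e8 is the classic "fetch the successor before releasing the current element" pattern. The sketch below illustrates that reading only; the singly linked edge list and both functions are hypothetical and are not the opal graph class.

```c
#include <stdlib.h>

/* Hypothetical singly linked edge list, not the opal graph class. */
struct edge {
    int weight;
    struct edge *next;
};

/* Release every edge and return how many were released.  A for loop whose
 * step is "e = e->next" would read e->next after e was freed; the while
 * loop fetches the successor first. */
int release_edges(struct edge *head)
{
    int released = 0;
    struct edge *e = head;

    while (NULL != e) {
        struct edge *next = e->next;   /* "get next" before releasing */
        free(e);
        e = next;
        ++released;
    }
    return released;
}

/* Helper for building a list: push a new edge at the front. */
struct edge *push_edge(struct edge *head, int weight)
{
    struct edge *e = malloc(sizeof(*e));

    if (NULL == e) {
        return head;                   /* allocation failed; list unchanged */
    }
    e->weight = weight;
    e->next = head;
    return e;
}
```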
Jeff Squyres
33c09b30c2 Patch from George: ensure that we don't overwrite timer_linux_happy
improperly when checking the host type.

This commit was SVN r17975.
2008-03-26 11:22:57 +00:00
George Bosilca
4a5431ef11 Remove the event-config.h file, it is never used.
Correct the include logic that protects the headers. It's amazing
that this didn't bite us yet ...

This commit was SVN r17971.
2008-03-26 03:33:43 +00:00
George Bosilca
64bc580c78 Use evutil_timercmp instead of timercmp to take advantage of the
fallback installed in evutil.h.

This commit was SVN r17968.
2008-03-25 23:54:30 +00:00
George Bosilca
2e46a53b0a Avoid strcpy if it's not really required.
This commit was SVN r17962.
2008-03-25 22:40:20 +00:00
George Bosilca
028c7391d3 Coverity fix: Replace strcpy with strncpy.
This commit was SVN r17961.
2008-03-25 22:39:24 +00:00
George Bosilca
6717b2dc75 Add the Solaris evport to the list of available event subsystems.
This commit was SVN r17958.
2008-03-25 18:00:40 +00:00
Jeff Squyres
763218e754 Fix #1253: default libevent to use select/poll and only use the other
mechanisms (such as epoll) if someone (ompi_mpi_init()) requests
otherwise.  See big comment in opal/event/event.c for a full
explanation.

This commit was SVN r17956.
2008-03-25 17:18:17 +00:00
George Bosilca
03c10e2a85 Add the Solaris evport support.
This commit was SVN r17954.
2008-03-25 16:44:27 +00:00
George Bosilca
9222ea0d0a Cast the uintptr_t to int when playing with fds.
This commit was SVN r17925.
2008-03-23 18:16:29 +00:00
Jeff Squyres
8239e40607 Add header for OS X.
This commit was SVN r17924.
2008-03-23 12:57:57 +00:00
Jeff Squyres
314ab2c6e7 Update internal libevent to upstream (v1.4.2-rc + OMPI changes).
Greatly reduce the number of "foo" -> "opal_foo" symbol renames in the
libevent source, and instead greatly expand the event_rename.h file
that uses preprocessor macros to make all public symbols be
"opal_foo".

This commit was SVN r17923.
2008-03-23 12:33:04 +00:00
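The renaming approach described in commit 314ab2c6e7, leaving the upstream source untouched and mapping public symbols to "opal_foo" with preprocessor macros, can be sketched in a few lines. The symbol names below are invented for illustration and are not taken from event_rename.h.

```c
/* --- what a rename header could contain (names are illustrative) --- */
#define event_counter_add  opal_event_counter_add
#define event_counter_get  opal_event_counter_get

/* --- library source, still written with the upstream names --- */
static int counter = 0;

void event_counter_add(int n)   /* compiled as opal_event_counter_add */
{
    counter += n;
}

int event_counter_get(void)     /* compiled as opal_event_counter_get */
{
    return counter;
}
```

Because the macros are applied before compilation, the object file exports only the prefixed names, while code that includes the header can keep using either spelling. Keeping the renames in one header also makes later merges from upstream smaller.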
Jeff Squyres
dee561d29e Per recent off-list discussions about the build system, I have done
some cleanups and standardizations in the various */tools/*/ 
Makefile.am files.  This commit:

 * Somewhat simplify the tool Makefile.am's 
 * Makes the tool Makefile.am's consistent with each other (do similar
   actions in similar ways)
 * Update the tool Makefile.am's to remove old cruft that was required
   by older versions of AM (trunk requires AM >=1.10)

This commit was SVN r17921.
2008-03-22 02:04:05 +00:00
Jeff Squyres
05a7b1ed55 Remove svn:executable from these files.
This commit was SVN r17918.
2008-03-21 21:16:11 +00:00
Jeff Squyres
a4ec8a9d53 Spring cleaning -- no one is using this stuff; remove it from the tree.
This commit was SVN r17913.
2008-03-21 17:14:42 +00:00
Jeff Squyres
e0fb3957cb Patch from Brian:
 * The opal_sys_timer_get_cycles() call was implemented for
   Sparc v9 using inline assembly, but not in the assembly files.
   This would only currently matter on Linux Sparc systems using
   a compiler that didn't support inline assembly (not many of
   those), but it should be there for completion.
 * The linux timer component would always build on non-Alpha
   platforms, rather than only building on platforms where
   opal_sys_timer_get_cycles() was implemented.  This would
   only matter on a very narrow set of platforms that we don't
   really support, but still, it could be more right.  We now
   only build the component on platforms where we have the
   assembly call to get the cycle counter.
 * Added a comment to opal/sys/timer.h to note that the linux
   timer component needed to be updated if another platform was
   added.

This should be harmless to commit.  It will only really change
behaviors on platforms we don't have assembly support for, which
currently won't make it through configure.  It really only matters
when (if?) we support atomic operations through libatomic_ops.

This commit was SVN r17887.
2008-03-20 00:29:36 +00:00
George Bosilca
3997639ec6 Hide what should be hidden, and expose the others. Plus some indentation.
This commit was SVN r17856.
2008-03-18 03:00:08 +00:00
Jeff Squyres
f443644bfe From Brian B.:
This commit lowers the priority of the darwin backtrace component
below that of the ''execinfo'' and ''stackprint'' components, which
will cause OS X Leopard to use the ''execinfo'' component.  execinfo
utilizes a public API for printing the stacktrace.  The ''darwin''
component uses some evil hacks and a not-so supported package from
Apple to print the stack trace.  

This commit was SVN r17840.
2008-03-17 13:39:25 +00:00
Jeff Squyres
9b18b0e9c6 Fix visibility symbols on OS X
This commit was SVN r17838.
2008-03-17 13:18:12 +00:00
George Bosilca
210631962c Add two convenience functions in order to make sure we get these
environment variables in a consistent manner. These functions
retrieve the user and the temporary directories (based on the
system).

This commit was SVN r17815.
2008-03-13 17:56:44 +00:00
Jon Mason
2e8a316ae6 opal_evtimer_initialized is missing the opening '('
This commit was SVN r17814.
2008-03-12 20:33:22 +00:00
Sharon Melamed
4a8e2a2648 Remove status check from carto initiation.
This commit was SVN r17812.
2008-03-12 08:55:28 +00:00
George Bosilca
4267f2b967 This symbol has to be visible.
This commit was SVN r17793.
2008-03-08 23:53:17 +00:00
Rainer Keller
32dcd9e551 - Adding #include <stdbool.h> with protection in r17488 and r17504
   seemed to be the right thing(tm), but broke the Sun Studio C++
   compiler under Linux (ticket 747).

   This patch should allow inclusion into C and C++ from other header
   files without problems.

This commit was SVN r17792.

The following SVN revision numbers were found above:
  r17488 --> open-mpi/ompi@d53131f261
  r17504 --> open-mpi/ompi@b22e8e7567
2008-03-08 12:53:10 +00:00
Josh Hursey
aaff245271 A couple verbose additions. Poll the event engine while waiting for the
named pipe.

This commit was SVN r17787.
2008-03-07 21:10:14 +00:00