Several updates, including:
* Remove -single dash options
* Don't chmod the whole tree; just chmod the files we're trying to remove
* No more support for SVN or HG; 100% git
* Strengthen the dirty repo checks
* Use git describe for the repo version
* Set tarball_version to "" (i.e., empty) in VERSION
Removed a redundant copy of the scripts running on the build server
and moved the remaining copy out to a top-level directory in contrib
(i.e., contrib/build-server vs. contrib/dist/build-server, where I
never could remember where to find them).
Update the VERSION file scheme:
* Remove "want_repo_rev".
* Add "tarball_version".
All values are now always included (major, minor, release, greek,
repo_rev). However, configure.ac now runs "opal_get_version.sh
... --tarball", which will return the value of tarball_version (if it
is non-empty) or the "full" version string (i.e.,
"major.minor.releasegreek").
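A minimal sketch of the selection rule described above, in Python for
illustration only. The real logic lives in opal_get_version.sh, which is
not reproduced here; the function and parameter names below are
hypothetical, and only the rule itself comes from this commit message.

```python
def tarball_or_full_version(major, minor, release, greek="", tarball_version=""):
    """Hypothetical helper mirroring the "--tarball" rule described above."""
    # A non-empty tarball_version takes precedence over everything else.
    if tarball_version:
        return tarball_version
    # Otherwise fall back to the "full" string: major.minor.releasegreek
    return f"{major}.{minor}.{release}{greek}"


# Empty tarball_version: the full version string is used.
print(tarball_or_full_version(1, 9, 0, "a1"))
# Non-empty tarball_version (an made-up value): it is returned unchanged.
print(tarball_or_full_version(1, 9, 0, "a1", tarball_version="example-tarball"))
```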
Remove configure.params support: configure.params hasn't been used in
years.
Also remove autogen.subdirs support; those should really be handled by
their respective Makefile.am's.
A problem was found with the libnbc MPI_Iallgather
routine when using intercommunicators. Special
thanks to Takahiro Kawashima (Fujitsu) for the patch
and a test case. Verified master fails without the
patch and the test passes with the patch applied.
Fixes #219
It turns out that the alps plm code was developed only
on Cray systems that were running batch schedulers.
However, for bring-up and development systems, it's not
at all uncommon for there to be no batch scheduler, and
thus to orte it appears that orte_num_allocated_nodes
is always zero. This forces a user using mpirun on such
a system to always specify a host list:
    mpirun -n 4 -N 1 -host 32,45,68 ....
just to get the job to run, but then since the -L argument for aprun
is never built, the app always runs on the first batch of nodes that
aprun finds available.
Version numbering and "make dist" are quite complicated/subtle; I'm
not going to get this finished tonight. So revert VERSION to enable
other people to build.
More fixes coming soon...
This is a first cut at updating various infrastructure for git. There
will definitely be more commits; some of the scripts require
committed/pushed code (e.g., the various make-tarball scripts). So
it's not possible to know if we got it right without committing/pushing.
We don't use this script any more (we use gitdub now), but it took a
long time to figure this out. So I'm putting this script in git just
so that it's in history if we ever need it again.
Remove sections of README concerning Cray that are obsolete.
Minor grammar fixes.
Clarification of the --with-pmi section, as this works
differently for Cray systems.
This commit was SVN r32819.
In all previous releases, the version number would be "A.B.C" unless C
was 0, in which case it would be "A.B". This commit changes that
scheme to always be "A.B.C", even if C==0.
Hence, v1.9.0 will be the first release where this new scheme is evident.
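To make the difference concrete, here is a small sketch contrasting the
two schemes. It is illustrative Python, not the project's actual version
code; both function names are hypothetical.

```python
def old_scheme(a, b, c):
    """Previous releases: drop the third component when it is 0."""
    return f"{a}.{b}" if c == 0 else f"{a}.{b}.{c}"


def new_scheme(a, b, c):
    """This commit: always emit all three components."""
    return f"{a}.{b}.{c}"


# The two schemes only differ when C == 0.
for version in [(1, 8, 0), (1, 8, 2), (1, 9, 0)]:
    print(version, old_scheme(*version), new_scheme(*version))
```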
This commit was SVN r32816.
Initialize the blocking_fence flag to false as the code logic indicates that it should only be set if someone provides that flag.
Thanks to Lisandro Dalcin for reporting it.
cmr=v1.8.4:reviewer=hjelmn
This commit was SVN r32812.
Per discussions with pmix folks, it was determined that
the way the cray pmi pmix component was computing the
PMIX_NODE_RANK attribute for a process was incorrect.
This commit fixes the problem.
This commit was SVN r32810.
When using the native aprun launcher, it was observed that
there were frequent memory corruption errors occurring either
during a PMI kvs-fence operation, or at MPI termination during
opal cleanup of allocated objects. This was especially bad
when using
    aprun --c none
In some cases, the application would even just hang in finalize
if using ptmalloc, owing to some kind of infinite loop in
cleanup of small blocks, etc.
It turns out that the problem was in orte_ess_base_proc_binding's
improper use of opal_hwloc_base_get_available_cpus. The cpuset
(bitmap) returned from that function is not meant to be freed
by the caller.
This problem is likely never observed when using the mpirun launcher
as there's an early exit if the OMPI_MCA_orte_bound_at_launch
environment variable is set.
This commit was SVN r32809.
It's not enough to AC_COMPILE_IFELSE; do AC_LINK_IFELSE to really make
sure the compiler suite supports it.
Refs trac:4917
This commit was SVN r32802.
The following Trac tickets were found above:
Ticket 4917 --> https://svn.open-mpi.org/trac/ompi/ticket/4917
Mimic the btl/tcp protocol to solve the race condition that happens
when two peers try to connect to each other at the same time.
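For readers unfamiliar with this kind of race, the sketch below shows one
generic way to resolve simultaneous connects: both peers apply the same
deterministic rule, so exactly one of the two crossing connections
survives. This is an assumption for illustration only. The rule used here
(the peer with the lower identifier keeps its outgoing connection) and
the function name are hypothetical, and this commit message does not
state the exact rule that btl/tcp uses.

```python
def keep_outgoing_connection(local_id, remote_id):
    """Hypothetical tie-break for two peers that connect at the same time.

    Returns True if the local peer should keep its own outgoing connection
    (and reject the incoming one), or False if it should drop its outgoing
    connection and accept the incoming one instead.
    """
    # Assumed rule: the peer with the lower identifier wins.
    return local_id < remote_id


# Both sides evaluate the same rule and reach complementary decisions,
# so one connection is kept and the other is dropped.
print(keep_outgoing_connection(1, 2))
print(keep_outgoing_connection(2, 1))
```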
cmr=v1.8.4:reviewer=rhc
This commit was SVN r32799.
MTT found that the addition of the MPI_SIZEOF interfaces to mpif.h was
causing a linker error with the Absoft compiler. Absoft is working on
a fix, but we can work around the issue for now. See comment in
Makefile.am in this commit for a lengthy explanation.
Refs trac:4917
This commit was SVN r32797.
The following Trac tickets were found above:
Ticket 4917 --> https://svn.open-mpi.org/trac/ompi/ticket/4917
of the topology is higher than the communicator size
It is possible to have a topology degree higher than the size of the communicator.
For example, consider a periodic Cartesian communicator on MPI_COMM_SELF. This will leave
the neighborhood collectives with a request buffer that is too small. This commit
adds a call that will dynamically increase the size of the request buffer if it
is too small.
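The idea can be sketched as follows. This is illustrative Python, not the
actual C code from this commit; the class and function names are
hypothetical, and the sketch assumes the buffer was originally sized from
the communicator size. A periodic Cartesian topology with ndims
dimensions gives each process 2*ndims neighbors, so a 1-D periodic
topology on MPI_COMM_SELF (size 1) already needs 2 slots.

```python
def cart_neighbor_count(ndims):
    """Number of neighbors per process in a periodic Cartesian topology."""
    return 2 * ndims


class RequestBuffer:
    """Hypothetical stand-in for a per-communicator request array."""

    def __init__(self, capacity):
        self.slots = [None] * capacity

    def ensure_capacity(self, needed):
        """Grow the buffer if it is too small; never shrink it."""
        if needed > len(self.slots):
            self.slots.extend([None] * (needed - len(self.slots)))
        return len(self.slots)


comm_size = 1                       # e.g., MPI_COMM_SELF
buf = RequestBuffer(comm_size)      # assumed: sized from the communicator
buf.ensure_capacity(cart_neighbor_count(1))
print(len(buf.slots))
```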
A better fix would be to create the topology *before* calling the coll_select
routine on a communicator. This will take some discussion and the solution will
not likely be ready anytime soon.
Thanks to Lisandro Dalcin for reporting this.
Original thread: http://www.open-mpi.org/community/lists/devel/2014/08/15713.php
cmr=v1.8.3:reviewer=jsquyres
This commit was SVN r32796.
* remove config/ompi_config_solaris_threads.m4 which was dead code
* check if pthreads work "as is" on all platforms including Solaris
(FWIW, the test should have been skipped if Solaris threads are used
and not if configure is run on a Solaris box)
Refs trac:4911
This commit was SVN r32792.
The following Trac tickets were found above:
Ticket 4911 --> https://svn.open-mpi.org/trac/ompi/ticket/4911