onto the backend daemons. By default, have mpirun pack only the app_context
info and send that to the backend daemons, where the mapping will
be done. This significantly reduces the computational time on mpirun, as it isn't
running up/down the topology tree computing thousands of binding
locations, and it reduces the launch message to a very small number of
bytes.
When running with -novm, fall back to the old way of doing things,
where mpirun computes the entire map and binding and then sends
the full info to the backend daemons.
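As a rough illustration, the two modes would be selected like this (the
hostfile and application names here are placeholders, not taken from the
actual change):

$ mpirun -np 256 --hostfile hosts ./a.out        # default: daemons compute the map/binding
$ mpirun -novm -np 256 --hostfile hosts ./a.out  # -novm: mpirun computes and ships the full map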
Add a new cmd line option/MCA param, --fwd-mpirun-port, that allows
mpirun to dynamically select a port and then pass it back to
all the other daemons so they will use that port as a static port
for their own wireup. In this mode, we no longer "phone home" directly
to mpirun, but instead use the static port to wire up at daemon
start. We then use the routing tree to roll up the initial
launch report, which limits the number of open sockets on mpirun's node.
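A hypothetical invocation (the process count, hostfile, and application name
are placeholders):

$ mpirun --fwd-mpirun-port -np 1024 --hostfile hosts ./a.out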
Update ras simulator to track the new nidmap code
Clean up some bugs in the nidmap regex code, and enhance the "not enough slots" error message to include the host on which the problem was found.
Update gadget platform file
Initialize the range count when starting a new range
Fix the no-np case in managed allocation
Ensure DVM node usage gets cleaned up after each job
Update the scaling.pl script to use --fwd-mpirun-port. Pre-connect each daemon to its parent during launch, while we are otherwise waiting for the daemon's children to send their "phone home" rollup messages
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Check the exit status of major commands, and optionally output the pwd
and the command being executed (when debugging). Also, read the $debug
variable from the environment; if it's set, go into debugging mode
(vs. requiring a modification to the script to enable debugging mode).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
If $debug is set in the environment, use that. This allows enabling
debug mode without requiring an edit to the script.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
The filenames contain date/timestamps; if you compare those, the
tarball generated every night will *always* look new. Instead, separate
out the git hash from the old and new tarball names, and compare those.
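A rough shell sketch of the idea (the filenames are hypothetical examples
following the naming scheme described later in this log; the actual script
may differ):

old=openmpi-v2.x-201612240300-1a2b3c4.tar.bz2
new=openmpi-v2.x-201612250300-1a2b3c4.tar.bz2
# Strip everything except the short git hash from each filename
old_hash=`echo $old | sed -e 's/.*-\([0-9a-f]*\)\.tar.*/\1/'`
new_hash=`echo $new | sed -e 's/.*-\([0-9a-f]*\)\.tar.*/\1/'`
if test "$old_hash" = "$new_hash"; then
    echo "No new commits since the last snapshot"
fi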
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
- Ensure that $to_delete is always defined
- Re-indent to 4 spaces for readability
- Don't only delete files -- it's ok to delete directories, too
- Print the directory from which we are deleting
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Just to make the scripts a little less error-prone. Also split up the
ssh/scp lines for readability.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
- Don't use "-i" CLI option to perl; it's unnecessary here and causes
a warning
- Branch names may not consist entirely of letters (e.g., "v1.11"), so
  allow any character in the regexp that matches the branch name
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Samples are taken after MPI_Init, and then again after MPI_Barrier. This allows the user to see memory consumption caused by add_procs, as well as any modex contribution from forming connections if pmix_base_async_modex is given.
Using the probe simply involves executing it via mpirun, with however many copies you want per node. Example:
$ mpirun -npernode 2 ./mpi_memprobe
Sampling memory usage after MPI_Init
Data for node rhc001
    Daemon: 12.483398
    Client: 6.514648
Data for node rhc002
    Daemon: 11.865234
    Client: 4.643555

Sampling memory usage after MPI_Barrier
Data for node rhc001
    Daemon: 12.520508
    Client: 6.576660
Data for node rhc002
    Daemon: 11.879883
    Client: 4.703125
Note that the client value on node rhc001 is larger: this is where rank 0 is housed, and it apparently gets a larger footprint for some reason.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Revamp the event notification integration to rely on the PMIx event chaining and remove the duplicate chaining in OPAL. This ensures we get system-level events that target non-default handlers.
Restore the hostname entries for MPI-level error messages, but provide an MCA param (orte_hostname_cutoff) to remove them for large clusters where the memory footprint is problematic. Set the default to 1000 nodes in the job (not the allocation).
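For example, to lower the cutoff from the default of 1000 nodes to 100 (a hypothetical invocation; the process count and application name are placeholders):

$ mpirun --mca orte_hostname_cutoff 100 -np 4096 ./a.out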
Begin first cut at memory profiler
Some minor cleanups of memprobe
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
This is news to me: I didn't know that some distros do not set $HOME.
So use "~" instead, and only try to grep ~/.rpmmacros if it exists.
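A minimal sketch of the guard being described (the %_topdir lookup and the
rpmtopdir variable are illustrative assumptions, not necessarily what the
script actually greps for):

if test -f ~/.rpmmacros; then
    rpmtopdir=`grep %_topdir ~/.rpmmacros | awk '{ print $2 }'`
fi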
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
* Point to local libfabric v1.4 install
* Add MPI C++ bindings
* Remove PSM support (if someone can install PSM/PSM2 libraries on the
build server, let's re-enable this)
Also change from -j8 to -j4 (the new AWS build instance only has 1
core / 2 hyperthreads).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Nightly snapshots will now be named:
openmpi-${BRANCHNAME}-${YYYYMMDDHHMM}-${SHORTHASH}.tar.${COMPRESSION}.
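For example, with hypothetical values filled in:

openmpi-v2.x-201703010241-1a2b3c4.tar.bz2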
Fixes #2337
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Still not completely done, as we need a better way of tracking the routed module being used down in the OOB. For example, when a peer drops its connection, we want to remove that route from all conduits that (a) use the OOB and (b) are routed, but we don't want to remove it from an OFI conduit.