b1da6f8bc4
* num_children should really be an int instead of a size_t: size_t is unsigned, and num_children can (in rare cases) drop below 0. We don't want it to wrap around to a huge value (SIZE_MAX).
* I figured out that this problem only happened to me because I set the pls_fork_reap_timeout MCA parameter: the waitpid code in pls_fork_module.c is only executed when that parameter is not 0 (I had it set to 1 to give my procs time to exit). Changed the loop from while{...} to do{...}while; so that it is executed at least once, for consistency.
* De-register the SIGCHLD callback for the pid before we attempt to kill it, so that the two waitpids (the one in the callback and the one in this function) cannot race to see which one waits on the child.
* Move the 'thread release' outside the for loop as a small optimization, and always set the value to 0 since we want to finish after this function.
* Added a help message for the case where we can't send a kill() signal to the process. Should never happen, but anything is possible in the wild wild west of HPC.

This commit was SVN r10666.
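
For illustration only, the first two points can be sketched in standalone C. This is not the code in pls_fork_module.c; the function names below are hypothetical and the loops are simplified stand-ins for the reap loop. The sketch shows what happens when an unsigned counter is decremented past 0, and how a do{...}while loop differs from a while loop when the timeout is 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decrementing an unsigned counter at 0 wraps
 * around to SIZE_MAX instead of going negative. */
static size_t decrement_unsigned(size_t n)
{
    return n - 1;
}

/* Hypothetical helper: a signed counter simply becomes -1, which a
 * caller can detect with a "<= 0" check. */
static int decrement_signed(int n)
{
    return n - 1;
}

/* Simplified stand-in for a while-style reap loop: with a timeout
 * of 0 the body never runs. Returns how many times it ran. */
static int polls_with_while(int timeout)
{
    assert(timeout >= 0);
    int polls = 0;
    int remaining = timeout;
    while (remaining > 0) {
        polls++;        /* the real code would call waitpid() here */
        remaining--;
    }
    return polls;
}

/* Same loop as do{...}while: the body runs at least once, even when
 * the timeout is 0. */
static int polls_with_do_while(int timeout)
{
    assert(timeout >= 0);
    int polls = 0;
    int remaining = timeout;
    do {
        polls++;        /* the real code would call waitpid() here */
        remaining--;
    } while (remaining > 0);
    return polls;
}
```

With a timeout of 0, the while version performs no polls and the do{...}while version performs exactly one; for any positive timeout the two behave the same.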