Keep track of the size of the blocklen_per_process and displs_per_process arrays in the aggregator data structure to minimize the number of realloc calls required in the shuffle_init operation.

Signed-off-by: raafatfeki <fekiraafat@gmail.com>
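
For illustration only, a minimal sketch of the general idea of tracking the allocated size so that realloc is called only when the arrays actually need to grow. The struct, field, and function names below (proc_arrays_t, proc_arrays_reserve, proc_arrays_append) are hypothetical and are not taken from the patch. The doubling growth policy and the initial capacity of 4 are also assumptions, and the real change may size the arrays differently.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-process arrays kept by an aggregator. */
typedef struct {
    int  *blocklen;      /* block lengths for one process */
    long *displs;        /* displacements for one process */
    int   count;         /* entries currently in use */
    int   capacity;      /* entries currently allocated (the tracked size) */
    int   num_reallocs;  /* growth steps taken; kept for illustration only */
} proc_arrays_t;

/* Grow both arrays only when `needed` exceeds the tracked capacity.
 * Returns 0 on success, -1 on allocation failure. */
static int proc_arrays_reserve(proc_arrays_t *p, int needed)
{
    assert(needed >= 0);
    if (needed <= p->capacity) {
        return 0;  /* enough room already, so no realloc call is made */
    }

    /* Assumed policy: start at 4 entries and double until large enough. */
    int new_cap = (p->capacity > 0) ? p->capacity : 4;
    while (new_cap < needed) {
        new_cap *= 2;
    }

    int *bl = realloc(p->blocklen, (size_t)new_cap * sizeof(int));
    if (bl == NULL) {
        return -1;
    }
    p->blocklen = bl;

    long *dp = realloc(p->displs, (size_t)new_cap * sizeof(long));
    if (dp == NULL) {
        return -1;
    }
    p->displs = dp;

    p->capacity = new_cap;
    p->num_reallocs++;
    return 0;
}

/* Append one (block length, displacement) pair, growing if required. */
static int proc_arrays_append(proc_arrays_t *p, int blocklen, long displ)
{
    if (proc_arrays_reserve(p, p->count + 1) != 0) {
        return -1;
    }
    p->blocklen[p->count] = blocklen;
    p->displs[p->count] = displ;
    p->count++;
    return 0;
}
```

Under these assumptions, appending 100 entries to a zero-initialized proc_arrays_t takes 6 growth steps instead of one realloc per entry and per array.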