![]() |
GemaCoreLib
The GeMA Core library
|
Implementation of a set of functions for merging sorted vectors. More...
Functions | |
template<class DataT > | |
static void | mergeVectorPair (const DataT *v1, size_t n1, const DataT *v2, size_t n2, DataT *out) |
Basic algorithm to merge two vectors storing the result in a third vector with size n1 + n2. | |
template<class DataT > | |
static void | heapBasedMerge (const QVector< DataT * > &vectors, const QVector< size_t > &sizes, DataT *output) |
Merges a set of "k" ordered vectors whose pointers and sizes are given by vectors and sizes, storing the merged data in the given output vector with size equal to the sum of the input vector sizes. More... | |
template<class DataT > | |
void | GmMergeKVectors (const QVector< DataT * > &vectors, const QVector< size_t > &sizes, DataT *output) |
Merges a set of "k" ordered vectors whose pointers and sizes are given by vectors and sizes, storing the merged data in the given output vector with size equal to the sum of the input vector sizes. More... | |
Implementation of a set of functions for merging sorted vectors.
void GmMergeKVectors | ( | const QVector< DataT * > & | vectors, |
const QVector< size_t > & | sizes, | ||
DataT * | output | ||
) |
Merges a set of "k" ordered vectors whose pointers and sizes are given by vectors and sizes, storing the merged data in the given output vector with size equal to the sum of the input vector sizes.
Requires the DataT type to implement the < operator
|
static |
Merges a set of "k" ordered vectors whose pointers and sizes are given by vectors and sizes, storing the merged data in the given output vector with size equal to the sum of the input vector sizes.
The merge is done by creating a minimum Heap with the smallest value from all vectors. At each step the minimum value is stored in the output vector and the next value from the source vector is added to the heap. Works in O(n * log k), where n is the size of the output vector and k is the number of vectors.
The implementation uses a custom based heap, instead of the std heap implementation, since it can be 50% faster by changing the minimum heap value and calling heapify() when updating the heap, instead of the available push and pop operations from the standard. A loser tree (as described in https://en.wikipedia.org/wiki/K-way_merge_algorithm) was also tried but it ended-up beeing slower than our heap implementation.
Requires the DataT type to implement the < operator
Heap data storing a pointer to the src vector and also a pointer to one position past the last one
< Pointer to the current vector value
< Pointer to a position past the end of the vector
Our custom based min heap. See Cormen for an explanation