![]() |
GemaCoreLib
The GeMA Core library
|
Implementation of a set of functions for integer vector sorting of triplet data. More...
#include "gmIntegerSort.h"
#include "gmMemory.h"
#include <QVarLengthArray>
#include "ska_sort.hpp"
#include "gmOmp.h"
#include <assert.h>
#include "gmTrace.h"
Macros | |
#define | CS_MASK(v) (( (*(KeyT*)(v)) >> shift) & mask) |
Macro used by countSort to extract the sort key from the current value v. | |
Functions | |
template<class DataT , class KeyT , class BucketT > | |
static void | countSort (DataT *src, size_t n, DataT *dst, unsigned maskSize, unsigned shift, BucketT *bucketMem) |
A serial, stable, Count sort implementation based on the algorithm in "Introduction to Algorithms", Cormen et all. More... | |
template<class DataT , class KeyT , class BucketT > | |
static void | parallelCountSort (DataT *src, size_t n, DataT *dst, unsigned maskSize, unsigned shift, BucketT *bucketMem, int nt) |
Simmilar to countSort() but running in parallel with 'nt' threads. More... | |
template<class DataT , class KeyT , class BucketT > | |
DataT * | GmLsdRadixSort (DataT *src, size_t n, DataT *aux, bool auxCopy, unsigned digitBits, unsigned keyBits, unsigned keyOffset) |
A serial, stable, Radix sort implementation based on the algorithm in "Introduction to Algorithms", Cormen et all. More... | |
template<class DataT , class KeyT , class BucketT > | |
DataT * | GmPLsdRadixSort (DataT *src, size_t n, DataT *aux, bool auxCopy, int nt, unsigned digitBits, unsigned keyBits, unsigned keyOffset) |
Simmilar to GmLsdRadixSort() but executing with nt threads. More... | |
template<class DataT > | |
GMC_API_EXPORT void | GmSkaRadixSort (DataT *src, size_t n) |
Simple stub for calling ska_sort(). Used to include the external lib inside GemaCoreLib so that clients don't need to include its path. | |
template<class DataT , class ExtractKey > | |
GMC_API_EXPORT void | GmSkaRadixSort (DataT *src, size_t n, ExtractKey &&key) |
Simple stub for calling ska_sort(). Used to include the external lib inside GemaCoreLib so that clients don't need to include its path. | |
Implementation of a set of functions for integer vector sorting of triplet data.
|
static |
A serial, stable, Count sort implementation based on the algorithm in "Introduction to Algorithms", Cormen et all.
This is a generic implementation that can be used for any vector data type where the sorting key is an unsigned number and is present as the FIRST field in the data type structure. It also receives as parameters the number of "digit" bits that should be used for sorting (maskSize) and its position inside the key (shift).
This implementation is NOT in-place and needs a destination vector with the same size as the input one. It also uses stack space equal to 2^maskSize * sizeof(BucketT) bytes for the counting buckets, making a maskSize of 16 bits its "practical" limit (64K * sizeof(BucketT)). The bucket size must be provided through the bucketMem vector. It is not allocated inside the function since it can be reused along multiple calls when countSort() is called by
radix sort (it also can't be allocated on the stack since maskSize is
a parameter).
Keep in mind that a faster algorithm can be implemented if we restrict ourselves to a scenario where the keys are limited to 16 bits types and the digit is the full value. In that case the last loop can be replaced by just sequentially filling the vector with the bucket index value for the number of occurrences in each bucket (see reference (1)).
This implementation also does NOT handle signed numbers. If signed values are treated as unsigned, the resulting order will be inverted with negative values after positive ones. See reference (1) below for an idea of how to handle the sign bit by splitting the last loop in two.
Links: https://www.drdobbs.com/architecture-and-design/algorithm-improvement-through-performanc/224201198
DataT | The type of the vector data |
KeyT | The type of the key used to sort the vector. Must be unsigned. |
BucketT | The unsigned type used for the count array. Must be big enough to fit the value of n. |
src | The vector to be sorted |
n | The vector size |
dst | The destination vector that will be used to hold the sorted data |
maskSize | The number of bits in the sorting digits. |
shift | The right shift size needed to bring the "digit" part of the key to the least significant bits of the key. The digit is extracted by "(key >> shift) & mask" where mask is a value with maskSize 1's on the least significant bits. |
bucketMem | A vector with 2^maskSize entries |
DataT* GmLsdRadixSort | ( | DataT * | src, |
size_t | n, | ||
DataT * | aux, | ||
bool | auxCopy, | ||
unsigned | digitBits, | ||
unsigned | keyBits, | ||
unsigned | keyOffset | ||
) |
A serial, stable, Radix sort implementation based on the algorithm in "Introduction to Algorithms", Cormen et all.
This is a generic implementation that can be used for any vector data type where the sorting key is an unsigned number and is present as the FIRST field in the data type structure.
Sorting is done by running counting sort repeatedly for each key digit, starting from the least significant digit (LSD). Digit size is a function parameter. There is support for when the digit size does not evenly divides the key type size (ex: 32 bits key with 12 bits digit)
Another parameter specifies how many bits of the key must be sorted. This allows, for example, when sorting a vector with 32-bit keys and 8 bit digits, where the maximum key value is 16.000.000 (24 significant digits) to issue only 3 sorting passes instead of the usual 4.
This implementation is NOT in-place and needs a destination vector with the same size as the input one. It also uses stack space equal to 2^digitBits * sizeof(BucketT) for the counting buckets, making a digitBits of 16 bits its "practical" limit (64K * sizeof(BucketT))
This implementation also does NOT handle signed numbers. See comments in the countSort() function.
DataT | The type of the vector data |
KeyT | The type of the key used to sort the vector. Must be unsigned. |
BucketT | The unsigned type used for the count array. Must be big enough to fit the value of n. |
src | The vector to be sorted |
n | The vector size |
aux | Auxiliar vector with size n used by the sort routine. Can be NULL. In that case the vector will be allocated and freed internally. |
auxCopy | Flag used only if aux is different from NULL. If set to true, the sorted result will always end-up in src. If the number of sort passes is odd, a memcpy oeration will ensure that the result ends up in src. |
If set to false, no extra copy will be done and the result is either in src or in aux, as reported by the function return.
digitBits | The number of bits in the sorting digits. |
keyBits | The number of bits in the key that should be considered for sorting. |
keyOffset | The offset in bits for the start of the key. Usually 0. |
DataT* GmPLsdRadixSort | ( | DataT * | src, |
size_t | n, | ||
DataT * | aux, | ||
bool | auxCopy, | ||
int | nt, | ||
unsigned | digitBits, | ||
unsigned | keyBits, | ||
unsigned | keyOffset | ||
) |
Simmilar to GmLsdRadixSort() but executing with nt threads.
The extra parameter nt controls the number of requested threads. A value of 0 (or -1 for compatibility with GmThreadManager) means that omp_get_max_threads() will be used.
|
static |
Simmilar to countSort() but running in parallel with 'nt' threads.
The parallelization is done by creating a bucket vector per thread. In the count step each thread counts the occurrences in a block (partition) from the global vector.
The second step adjusts each bucket[i], from each thread, to contain number of value less than i from all threads up until that one. This gives the "position" of a value i for that thread.
With the bucket vector adjusted, the last step where each value is placed in its final position can also proceed with each thread taking care of its own block.
The nt parameter (> 0) contains the number of threads that should be used in the paralelization. The bucketMem vector points to an area with size equal to 2^maskSize * nt to be used as the per thread bucket vector.