Blender V4.3
BLI_set.hh File Reference
#include "BLI_array.hh"
#include "BLI_hash.hh"
#include "BLI_hash_tables.hh"
#include "BLI_probing_strategies.hh"
#include "BLI_set_slots.hh"

Go to the source code of this file.

Classes

class  blender::Set< Key, InlineBufferCapacity, ProbingStrategy, Hash, IsEqual, Slot, Allocator >
 
class  blender::Set< Key, InlineBufferCapacity, ProbingStrategy, Hash, IsEqual, Slot, Allocator >::Iterator
 

Namespaces

namespace  blender
 

Macros

#define LOAD_FACTOR   1, 2
 
#define SET_SLOT_PROBING_BEGIN(HASH, R_SLOT)
 
#define SET_SLOT_PROBING_END()   SLOT_PROBING_END()
 

Typedefs

template<typename Key , int64_t InlineBufferCapacity = default_inline_buffer_capacity(sizeof(Key)), typename ProbingStrategy = DefaultProbingStrategy, typename Hash = DefaultHash<Key>, typename IsEqual = DefaultEquality<Key>, typename Slot = typename DefaultSetSlot<Key>::type>
using blender::RawSet = Set<Key, InlineBufferCapacity, ProbingStrategy, Hash, IsEqual, Slot, RawAllocator>
 

Detailed Description

A blender::Set<Key> is an unordered container for unique elements of type Key. It is designed to be a more convenient and efficient replacement for std::unordered_set. All core operations (add, remove and contains) can be done in O(1) amortized expected time.

In most cases, your default choice for a hash set in Blender should be blender::Set.

blender::Set is implemented using open addressing in a slot array with a power-of-two size. Every slot is in one of three states: empty, occupied or removed. If a slot is occupied, it contains an instance of the key type.

Bench-marking and comparing hash tables is hard, because many factors influence the result. The performance of a hash table depends on the combination of the hash function, probing strategy, max load factor, key type, slot type and the data distribution. This implementation is designed to be relatively fast by default in all cases. However, it also offers many customization points that allow it to be optimized for a specific use case.

A rudimentary benchmark can be found in BLI_set_test.cc. The results of that benchmark are there as well. The numbers show that in this specific case blender::Set outperforms #std::unordered_set consistently by a good amount.

Some noteworthy information:

  • Key must be a movable type.
  • Pointers to keys might be invalidated when the set is changed or moved.
  • The hash function can be customized. See BLI_hash.hh for details.
  • The probing strategy can be customized. See BLI_probing_stragies.hh for details.
  • The slot type can be customized. See BLI_set_slots.hh for details.
  • Small buffer optimization is enabled by default, if the key is not too large.
  • The methods add_new and remove_contained should be used instead of add and remove whenever appropriate. Assumptions and intention are described better this way.
  • Lookups can be performed using types other than Key without conversion. For that use the methods ending with _as. The template parameters Hash and #IsEqual have to support the other key type. This can greatly improve performance when the set contains strings.
  • The default constructor is cheap, even when a large #InlineBufferCapacity is used. A large slot array will only be initialized when the first key is added.
  • The print_stats method can be used to get information about the distribution of keys and memory usage of the set.
  • The method names don't follow the std::unordered_set names in many cases. Searching for such names in this file will usually let you discover the new name.

Possible Improvements:

  • Use a branch-less loop over slots in grow function (measured ~10% performance improvement when the distribution of occupied slots is sufficiently random).
  • Support max load factor customization.
  • Improve performance with large data sets through software prefetching. I got fairly significant improvements in simple tests (~30% faster). It still needs to be investigated how to make a nice interface for this functionality.

Definition in file BLI_set.hh.

Macro Definition Documentation

◆ LOAD_FACTOR

#define LOAD_FACTOR   1, 2

The max load factor is 1/2 = 50% by default.

Definition at line 144 of file BLI_set.hh.

◆ SET_SLOT_PROBING_BEGIN

#define SET_SLOT_PROBING_BEGIN ( HASH,
R_SLOT )
Value:
SLOT_PROBING_BEGIN (ProbingStrategy, HASH, slot_mask_, SLOT_INDEX) \
auto &R_SLOT = slots_[SLOT_INDEX];
#define SLOT_PROBING_BEGIN(PROBING_STRATEGY, HASH, MASK, R_SLOT_INDEX)
#define HASH(i, j, k)

Iterate over a slot index sequence for a given hash.

Definition at line 157 of file BLI_set.hh.

◆ SET_SLOT_PROBING_END

#define SET_SLOT_PROBING_END ( )    SLOT_PROBING_END()

Definition at line 160 of file BLI_set.hh.