Difference between revisions of "Intermediate C++ Game Programming Tutorial 24"
From Chilipedia
(→Video Timestamp Index) |
(→Video Timestamp Index) |
||
Line 101: | Line 101: | ||
</div> | </div> | ||
* Using an unordered map [https://youtu.be/LsjFAx-dG5I?t=1m38s 1:38] | * Using an unordered map [https://youtu.be/LsjFAx-dG5I?t=1m38s 1:38] | ||
+ | <div class="mw-collapsible-content"> | ||
** The interface is pretty much the same as its ordered counterpart | ** The interface is pretty much the same as its ordered counterpart | ||
** Include <code><unordered_map></code>, declare using <code>std::unordered_map<KeyType,ValueType></code> | ** Include <code><unordered_map></code>, declare using <code>std::unordered_map<KeyType,ValueType></code> | ||
** You can initialize your map object with an initializer list if you wanted to using <code>({ {..,..},{..,..},... })</code> inside your declaration | ** You can initialize your map object with an initializer list if you wanted to using <code>({ {..,..},{..,..},... })</code> inside your declaration | ||
+ | </div> | ||
* The Hash Table data structure [https://youtu.be/LsjFAx-dG5I?t=3m20s 3:20] | * The Hash Table data structure [https://youtu.be/LsjFAx-dG5I?t=3m20s 3:20] | ||
+ | <div class="mw-collapsible-content"> | ||
** A hash table allows you to get the quick access to values, comparable to array access using the index, but with efficient memory usage | ** A hash table allows you to get the quick access to values, comparable to array access using the index, but with efficient memory usage | ||
** Buckets are used to group keys; this is done by mapping keys to buckets using a hash function (a.k.a. hashing) | ** Buckets are used to group keys; this is done by mapping keys to buckets using a hash function (a.k.a. hashing) | ||
Line 114: | Line 117: | ||
** The Standard Library provides general hashing functions for all the standard types | ** The Standard Library provides general hashing functions for all the standard types | ||
** For general use of unordered maps, we don't have to worry about the technical details of how the hash table works, the STL provides this | ** For general use of unordered maps, we don't have to worry about the technical details of how the hash table works, the STL provides this | ||
+ | </div> | ||
* Requirements for the KeyType of an <code>unordered_map</code> / a hash table [https://youtu.be/LsjFAx-dG5I?t=11m56s 11:56] | * Requirements for the KeyType of an <code>unordered_map</code> / a hash table [https://youtu.be/LsjFAx-dG5I?t=11m56s 11:56] | ||
+ | <div class="mw-collapsible-content"> | ||
** There needs to be a working hash function defined for the KeyType | ** There needs to be a working hash function defined for the KeyType | ||
** There need to be comparison and equality functor definitions for the KeyType | ** There need to be comparison and equality functor definitions for the KeyType | ||
+ | </div> | ||
* Example: map from <code>Vec2</code> class (2D coordinates) to a string [https://youtu.be/LsjFAx-dG5I?t=12m46s 12:46] | * Example: map from <code>Vec2</code> class (2D coordinates) to a string [https://youtu.be/LsjFAx-dG5I?t=12m46s 12:46] | ||
+ | <div class="mw-collapsible-content"><div class="mw-collapsible-content"> | ||
** In order to make this work, you need to define a hash function and the comparators for <code>Vei2</code> | ** In order to make this work, you need to define a hash function and the comparators for <code>Vei2</code> | ||
** You can implement a comparison/equality functor as a <code>struct</code> that defines a <code>bool operator()( const T& lhs,const T& rhs ) const</code> member function, templated on <code>T</code> | ** You can implement a comparison/equality functor as a <code>struct</code> that defines a <code>bool operator()( const T& lhs,const T& rhs ) const</code> member function, templated on <code>T</code> | ||
** Defining a custom hashing function is an art, it requires knowledge of cryptography, abstract algebra, discrete math, etc. | ** Defining a custom hashing function is an art, it requires knowledge of cryptography, abstract algebra, discrete math, etc. | ||
** Luckily, we don't need this; you can revert to the standard hashing functions for the basic types that make up any custom type | ** Luckily, we don't need this; you can revert to the standard hashing functions for the basic types that make up any custom type | ||
+ | </div> | ||
* Hash combining [https://youtu.be/LsjFAx-dG5I?t=14m25s 14:25] | * Hash combining [https://youtu.be/LsjFAx-dG5I?t=14m25s 14:25] | ||
+ | <div class="mw-collapsible-content"> | ||
** Combining hashes from basic types to create a hash over your custom object | ** Combining hashes from basic types to create a hash over your custom object | ||
** Stack Overflow question "How do I combine hash values in C++" give the example as used in the boost library | ** Stack Overflow question "How do I combine hash values in C++" give the example as used in the boost library | ||
Line 136: | Line 145: | ||
** You pass these functors when defining the map: <code>std::unordered_map<Vei2,std::string,HashVec2> map;</code> [https://youtu.be/LsjFAx-dG5I?t=17m15s 17:15]. | ** You pass these functors when defining the map: <code>std::unordered_map<Vei2,std::string,HashVec2> map;</code> [https://youtu.be/LsjFAx-dG5I?t=17m15s 17:15]. | ||
** Note that the comparison functor is not needed: we can revert back to the equality operator already defined in the <code>Vec2</code> class definition | ** Note that the comparison functor is not needed: we can revert back to the equality operator already defined in the <code>Vec2</code> class definition | ||
+ | </div> | ||
* Template Specialization [https://youtu.be/LsjFAx-dG5I?t=18m43s 18:43] | * Template Specialization [https://youtu.be/LsjFAx-dG5I?t=18m43s 18:43] | ||
+ | <div class="mw-collapsible-content"> | ||
** Unordered map uses <code>std::hash</code> by default. You can inject Template Specialization for <code>std::hash</code> into the <code>std</code> Namespace for your own custom types only | ** Unordered map uses <code>std::hash</code> by default. You can inject Template Specialization for <code>std::hash</code> into the <code>std</code> Namespace for your own custom types only | ||
*: <code>namespace std</code> | *: <code>namespace std</code> | ||
Line 147: | Line 158: | ||
*: <code>}</code> | *: <code>}</code> | ||
** Now you don't need to pass <code>HashVec2</code> in the map definition | ** Now you don't need to pass <code>HashVec2</code> in the map definition | ||
+ | </div> | ||
* The <code>std::unordered_map<></code> Bucket interface [https://youtu.be/LsjFAx-dG5I?t=20m00s 20:00] | * The <code>std::unordered_map<></code> Bucket interface [https://youtu.be/LsjFAx-dG5I?t=20m00s 20:00] | ||
+ | <div class="mw-collapsible-content"> | ||
** Allows you to get information about the buckets in the hash table and access nodes | ** Allows you to get information about the buckets in the hash table and access nodes | ||
** The bucket iterator takes an index of the bucket and allows you to iterate over all the elements in that specific bucket | ** The bucket iterator takes an index of the bucket and allows you to iterate over all the elements in that specific bucket | ||
+ | </div> | ||
* The <code>std::unordered_map<></code> Hash policy interface [https://youtu.be/LsjFAx-dG5I?t=21m47s 21:47] | * The <code>std::unordered_map<></code> Hash policy interface [https://youtu.be/LsjFAx-dG5I?t=21m47s 21:47] | ||
+ | <div class="mw-collapsible-content"> | ||
** Allows you to tune your hash table (and thus the growth behavior & performance of the map) | ** Allows you to tune your hash table (and thus the growth behavior & performance of the map) | ||
** Load Factor = average number of elements per bucket. For performance, you typically want to keep this below 1 | ** Load Factor = average number of elements per bucket. For performance, you typically want to keep this below 1 | ||
Line 157: | Line 172: | ||
** You can manually rehash to a number of buckets you define | ** You can manually rehash to a number of buckets you define | ||
** You can reserve space for max number of elements, is then derives (and manages) the required number of buckets | ** You can reserve space for max number of elements, is then derives (and manages) the required number of buckets | ||
+ | </div> | ||
* When to choose <code>std::map</code> over <code>std::unordered_map</code>? [https://youtu.be/LsjFAx-dG5I?t=24m15s 25:15] | * When to choose <code>std::map</code> over <code>std::unordered_map</code>? [https://youtu.be/LsjFAx-dG5I?t=24m15s 25:15] | ||
+ | <div class="mw-collapsible-content"> | ||
** For simplicity and when performance is not a critical issue, no need to define a hash function; | ** For simplicity and when performance is not a critical issue, no need to define a hash function; | ||
** If you want to iterate in order; | ** If you want to iterate in order; | ||
** When you want to be able to find keys that are close to a certain key (with <code>lower_bound</code> and <code>upper_bount</code> | ** When you want to be able to find keys that are close to a certain key (with <code>lower_bound</code> and <code>upper_bount</code> | ||
+ | </div> | ||
* Homework assignment [https://youtu.be/LsjFAx-dG5I?t=26m04s 26:04] | * Homework assignment [https://youtu.be/LsjFAx-dG5I?t=26m04s 26:04] | ||
</div> | </div> |
Revision as of 21:57, 2 February 2020
Associative containers are super useful, both as a convenient fast way to create dictionary or mapping for real-world problems like managing game resources, and as a data structure to help solve more abstract algorithmic computer science problems. And hash tables are fast as balls.
Contents
Topics Covered
Part 1: ordered associative containers
-
std::map
container interface - Binary tree data structure
-
std::map
key requirements (comparison) -
std::map
gotchas (std::remove_if
andconst
keys) -
std::set
-
std::multimap
andstd::multiset
Part 2: unordered associative containers
- Hash table performance vs. binary tree performance
- Hash table data structure
-
std::unordered_map
key requirements - Hash combining
-
std::unordered_map
bucket interface and hashing policy - When to choose
std::map
overstd::unordered_map
Video Timestamp Index
Tutorial 24.1: The ordered associative containers
- The
std::map<KeyType,ValueType>
class 0:46
- Maps consist of keys to lookup (associated) values
-
map.insert( {key,value} )
to insert (key,value) pairs -
map[key]
returns a reference to the ValueType for a KeyType
- A Binary Tree data structure is used to manage the order of map elements 2:46
-
std::map
performs lookup in O(log(n)), it uses a Binary tree data structure - Key properties of a Binary Tree (BT):
- - Nodes can have at most 2 children (hence: binary)
- - Each left child is smaller and each right child is larger than its parent
- - Insertion is done by navigating the tree along a route Left for smaller, Right for larger such that the order property always holds
- The big advantage of the BT properties is that retrieval is very fast
- The beauty of
std::map
is that we don't have to implement any of this; it's all there in the STL 7:00 - The STL implementation is further optimized, e.g. it uses a red-black tree for BT rebalancing
-
- A look at the
std::map
cppreference.com documentation: insert, lookup & find 7:35
-
map.insert()
takes a pair typestd::pair<KeyType,ValueType>
, the Map's elements - C++ can deduce the pair Type, so
map.insert({keyX,valueXYZ});
with curly braces will do the job - An even better way to insert is through
map.emplace()
operation; it will construct the pair in-place. - For lookup, you can use square braces,
map[x]
will return a reference to the corresponding value - Note: a lookup with a new key value will create that element in the map with the default constructed ValueType value
-
insert
oremplace
with a key that already exists will NOT override the existing value:std::map::emplace
returns astd::pair<iterator,bool>
where the bool inidicates whether an insertion took place -
map.find("xyz")
returns an iterator to the element if it exitst, and an iterator tomap.end()
if it doesn't exist (useful to check if a key already exists) -
std::map
comes with iterators and because it is a sorted map, when you iterate over its elements withfor (auto& el : map)
, it will be in order (of the keys)
-
- Requirements on KeyType 14:30
- The KeyType has to be comparable. The third template parameter is a functor for KeyType Comparison that defaults to
std::less<KeyType>
- So by default keys have to implement the "less than" comparison operator or provide your own comparison functor when defining the map
- The KeyType has to be comparable. The third template parameter is a functor for KeyType Comparison that defaults to
-
std::map
cppreference.com documentation continued: erase 15:28
-
std::map::erase
offers three basic ways to erase elements:
- - With an iterator; returns an iterator following the last removed element
- - With an iterator range, idem
- - By key through
map.erase(const KeyType& key)
; this operation returns the number of elements erased (insize_type
)
-
- Two important things to know when working with associative containers 16:04
-
std::remove_if
does not work with associative containers (will come with C++20).
- - You have to iterate over the elements with
for( auto i = map.begin(); i != map.end();)
- - And apply
i = map.erase(i);
in the body of yourif
logic, and++i
in theelse
block.
- Keys are
const
. You're not allowed to modify the keys 18:38
- - Makes sense: the keys define the structure of the binary tree.
- - If you modify the key you invalidate this structure (it would require a deletion and insertion to do it properly)
-
- The
std::set<KeyType>
class 20:00
- With a set, you only have keys, and a unique entry for each unique key
- Use case: ensure that there are no duplicates in a set
- The
std::multimap
andstd::multiset
classes 21:28
- Map has unique keys, with multimap you can insert multiple elements with the same key
- This enables operations like
std::multimap::equal_range
that returns a pair of iterators (begin and end) of the range where these elements have that same key -
std::multimap::count
will return the number of elements with specific key
- Practical example of a multimap use case 22:30
- Implementation example of a custom Comparison functor for the
Vei2
class (2D coordinate vector).
- - Chili's choice for ordering (used in the body of the functor):
- -
return (lhs.x == rhs.x) ? lhs.y < rhs.y : lhs.x < rhs.x;
- Example of how to find and print multiple elements in a multimap using
equal_range()
- Implementation example of a custom Comparison functor for the
- Lookup in multimaps 25:21
- Note: the multimap class does not have an index operator
[]
- When you do a lookup on a multimap, you should use
equal_range()
- The problem with
find()
on a multimap, is that if there are several elements with key in the ccontainer, any of them may be returned
- Note: the multimap class does not have an index operator
Tutorial 24.2: The unordered associative containers
- Main difference between ordered/unordered is performance 0:14
- Implication: if you iterative over an unordered container, keys will appear in (seemingly) random order
- Releasing the ordering requirement makes it possible to use a hash table with performance advantages: O(1) contant time insertion and lookup
- Using an unordered map 1:38
- The interface is pretty much the same as its ordered counterpart
- Include
<unordered_map>
, declare usingstd::unordered_map<KeyType,ValueType>
- You can initialize your map object with an initializer list if you wanted to using
({ {..,..},{..,..},... })
inside your declaration
- The Hash Table data structure 3:20
- A hash table allows you to get the quick access to values, comparable to array access using the index, but with efficient memory usage
- Buckets are used to group keys; this is done by mapping keys to buckets using a hash function (a.k.a. hashing)
- Multiple keys can map to the same bucket in a hash table ("collision"). We use a linked list to store multiple {key,value} pairs in a bucket
- Two ways to minimize hash collisions: i) more buckets, ii) smart hash function that distributes key values uniformly across your bucket space
- Hashing a a two step process 9:26:
- - A hash function takes in the KeyType input (typically a string or int) and outputs a size_t
- - the size_t output is reduced/ditributed to the size of the hash table (number of buckets)
- The Standard Library provides general hashing functions for all the standard types
- For general use of unordered maps, we don't have to worry about the technical details of how the hash table works, the STL provides this
- Requirements for the KeyType of an
unordered_map
/ a hash table 11:56
- There needs to be a working hash function defined for the KeyType
- There need to be comparison and equality functor definitions for the KeyType
- Example: map from
Vec2
class (2D coordinates) to a string 12:46
- In order to make this work, you need to define a hash function and the comparators for
Vei2
- You can implement a comparison/equality functor as a
struct
that defines abool operator()( const T& lhs,const T& rhs ) const
member function, templated onT
- Defining a custom hashing function is an art, it requires knowledge of cryptography, abstract algebra, discrete math, etc.
- Luckily, we don't need this; you can revert to the standard hashing functions for the basic types that make up any custom type
- In order to make this work, you need to define a hash function and the comparators for
- Hash combining 14:25
- Combining hashes from basic types to create a hash over your custom object
- Stack Overflow question "How do I combine hash values in C++" give the example as used in the boost library
- You can implement a hashing functor as a
struct
that defines a member function, templated onT
, the basic type of theVec2
coordinates:
-
size_t operator()( const _Vec2<T>& vec ) const
-
{
-
std::hash<T> hasher;
-
auto hashx = hasher ( vec.x );
-
auto hashy = hasher ( vec.y );
-
hashx ^= hashy + 0x9e3779b9 + (hashx << 6) + (hashx >> 2);
-
return hashx;
-
-
}
- You pass these functors when defining the map:
std::unordered_map<Vei2,std::string,HashVec2> map;
17:15. - Note that the comparison functor is not needed: we can revert back to the equality operator already defined in the
Vec2
class definition
- Template Specialization 18:43
- Unordered map uses
std::hash
by default. You can inject Template Specialization forstd::hash
into thestd
Namespace for your own custom types only
-
namespace std
-
{
-
template <> struct hash<Vei2>
-
{
-
size_t operator()( cont Vei2& vec ) const
-
{...}
-
-
};
-
-
}
- Now you don't need to pass
HashVec2
in the map definition
- Unordered map uses
- The
std::unordered_map<>
Bucket interface 20:00
- Allows you to get information about the buckets in the hash table and access nodes
- The bucket iterator takes an index of the bucket and allows you to iterate over all the elements in that specific bucket
- The
std::unordered_map<>
Hash policy interface 21:47
- Allows you to tune your hash table (and thus the growth behavior & performance of the map)
- Load Factor = average number of elements per bucket. For performance, you typically want to keep this below 1
- You can set the maximum load factor above which the table gets rehashed
- When the load factor becomes too high, it will automaticall rehash the table and increase the number of buckets
- You can manually rehash to a number of buckets you define
- You can reserve space for max number of elements, is then derives (and manages) the required number of buckets
- When to choose
std::map
overstd::unordered_map
? 25:15
- For simplicity and when performance is not a critical issue, no need to define a hash function;
- If you want to iterate in order;
- When you want to be able to find keys that are close to a certain key (with
lower_bound
andupper_bount
- Homework assignment 26:04
Homework Assignment
The homework for this video is to enable use of a custom datatype in unordered_map
hashing over multiple (4) members of that datatype. The solution video is here.