18548, "bmcdonald3", "Should a load factor setter for map be added?", "2021-10-11T22:09:09Z"
Summary
Exposing a maxLoadFactor
(name chosen to be consistent with other languages, open for discussion) to allow users to decide when is best for their map
to grow can help improve both performance and memory in specific cases. Choosing a higher load factor can help save memory on unnecessary map allocations if memory is a concern. Choosing a lower load factor can help improve performance because it can reduce the number of slots probed when there are collisions.
Proposed solutions
Proposal 1: Setter with fractional argument
var m = new map(int, int);
m.setMaxLoadCapacity(0.75);
Proposal 2: Setter with integer argument
var m = new map(int, int);
m.setMaxLoadCapacity(2);
Proposal 3: Keyword argument for map constructor
var m = new map(int, int, maxLoadCapacity=.75);
Pros/cons of proposals
- Setter with fractional argument
a. Accepting a fraction is easy to understand IMO (table can be this % full)
b. Setter method won't make constructors more complicated for something most users won't use
c. Not consistent with the C++ implementation - Setter with integer argument
a. Consistent with C++ implementation
b. Confusing to wrap your head around IMO (table can be 1/value full -- hard to intuit)
c. Limiting in what can be achieved - Keyword argument for map constructor
a. Doesn't require adding another method
b. Consistent with Java implementation
c. Fixed for life ofmap
Comparisons to other languages
C++ unordered_map
std::unordered_map<std::int, std::int> m;
m.max_load_factor(2); // map can now be up to 50% full (numslots/loadfactor)
Java Hashtable
Hashtable<int, int> m = new Hashtable<int, int>(10, 0.50); // initial capacity 10, table can be up to 50% full
Other languages
- Rust doesn't have a concept of a
loadFactor
, but instead allows setting of an initial capacity, which may have the same effect in many cases - Python, Julia, etc., do not have a concept of this, just like our
map
now
Links to related information
Motivated by: Change ChapelHashtable to use power-of-2 table size by bmcdonald3 · Pull Request #18427 · chapel-lang/chapel · GitHub and https://github.com/Cray/chapel-private/issues/2536
PR with proposal 1: Add a maximum capacity to Map to control when new memory is allocated by bmcdonald3 · Pull Request #18523 · chapel-lang/chapel · GitHub
C++ load factor: std::unordered_map<Key,T,Hash,KeyEqual,Allocator>::max_load_factor - cppreference.com
Java hashtable: Hashtable (Java Platform SE 8 )
Rust hashmap: HashMap in std::collections - Rust