New Issue: Should a load factor setter for map be added?

18548, "bmcdonald3", "Should a load factor setter for map be added?", "2021-10-11T22:09:09Z"

Summary

Exposing a maxLoadFactor (name chosen to be consistent with other languages, open for discussion) to allow users to decide when is best for their map to grow can help improve both performance and memory in specific cases. Choosing a higher load factor can help save memory on unnecessary map allocations if memory is a concern. Choosing a lower load factor can help improve performance because it can reduce the number of slots probed when there are collisions.

Proposed solutions

Proposal 1: Setter with fractional argument

var m = new map(int, int);
m.setMaxLoadCapacity(0.75);

Proposal 2: Setter with integer argument

var m = new map(int, int);
m.setMaxLoadCapacity(2);

Proposal 3: Keyword argument for map constructor

var m = new map(int, int, maxLoadCapacity=.75);

Pros/cons of proposals

  1. Setter with fractional argument
    a. Accepting a fraction is easy to understand IMO (table can be this % full)
    b. Setter method won't make constructors more complicated for something most users won't use
    c. Not consistent with the C++ implementation
  2. Setter with integer argument
    a. Consistent with C++ implementation
    b. Confusing to wrap your head around IMO (table can be 1/value full -- hard to intuit)
    c. Limiting in what can be achieved
  3. Keyword argument for map constructor
    a. Doesn't require adding another method
    b. Consistent with Java implementation
    c. Fixed for life of map

Comparisons to other languages

C++ unordered_map

std::unordered_map<std::int, std::int> m;
m.max_load_factor(2); // map can now be up to 50% full (numslots/loadfactor)

Java Hashtable

Hashtable<int, int> m = new Hashtable<int, int>(10, 0.50); // initial capacity 10, table can be up to 50% full

Other languages

  • Rust doesn't have a concept of a loadFactor, but instead allows setting of an initial capacity, which may have the same effect in many cases
  • Python, Julia, etc., do not have a concept of this, just like our map now

Links to related information

Motivated by: Change ChapelHashtable to use power-of-2 table size by bmcdonald3 · Pull Request #18427 · chapel-lang/chapel · GitHub and https://github.com/Cray/chapel-private/issues/2536
PR with proposal 1: Add a maximum capacity to Map to control when new memory is allocated by bmcdonald3 · Pull Request #18523 · chapel-lang/chapel · GitHub
C++ load factor: std::unordered_map<Key,T,Hash,KeyEqual,Allocator>::max_load_factor - cppreference.com
Java hashtable: Hashtable (Java Platform SE 8 )
Rust hashmap: HashMap in std::collections - Rust