How smart is a select statement

damianmoz · December 8, 2022, 2:11pm

If a select statement has (say) N branches, we it execute (on average) N/2 comparisons if the data fed to the select statement is random?

mppf · December 8, 2022, 4:07pm

When translating to C or LLVM IR, the Chapel compiler currently translates select statements into series of conditionals.

config const n = 7;

proc foo() {
  select n {
    when 1 do return 1*1;
    when 2 do return 2*2;
    when 3 do return 3*3;
    when 4 do return 4*4;
    when 5 do return 5*5;
    when 6 do return 6*6;
    when 7 do return 7*7;
    when 8 do return 8*8;
    when 9 do return 9*9;
    when 10 do return 10*10;
    otherwise do return 0;
  }
}

writeln(foo());

chpl bb.chpl  --target-compiler=gnu --savec=tmp

from tmp/bb.c:

static int64_t foo_chpl(void) {
  int64_t ret_chpl;
  int64_t tmp_chpl18;
  tmp_chpl18 = n_chpl;
  if (tmp_chpl18 == INT64(1)) {
    ret_chpl = INT64(1);
    goto _end_foo_chpl;
  } else if (tmp_chpl18 == INT64(2)) {
    ret_chpl = INT64(4);
    goto _end_foo_chpl;
  } else if (tmp_chpl18 == INT64(3)) {
    ret_chpl = INT64(9);
    goto _end_foo_chpl;
  } else if (tmp_chpl18 == INT64(4)) {
    ret_chpl = INT64(16);
    goto _end_foo_chpl;
  } else if (tmp_chpl18 == INT64(5)) {
    ret_chpl = INT64(25);
    goto _end_foo_chpl;
  } else if (tmp_chpl18 == INT64(6)) {
    ret_chpl = INT64(36);
    goto _end_foo_chpl;
  } else if (tmp_chpl18 == INT64(7)) {
    ret_chpl = INT64(49);
    goto _end_foo_chpl;
  } else if (tmp_chpl18 == INT64(8)) {
    ret_chpl = INT64(64);
    goto _end_foo_chpl;
  } else if (tmp_chpl18 == INT64(9)) {
    ret_chpl = INT64(81);
    goto _end_foo_chpl;
  } else if (tmp_chpl18 == INT64(10)) {
    ret_chpl = INT64(100);
    goto _end_foo_chpl;
  } else {
    ret_chpl = INT64(0);
    goto _end_foo_chpl;
  }
  _end_foo_chpl:;
  return ret_chpl;
}

However, the LLVM optimizer can change this into a switch LLVM IR instruction or to a lookup table which can be implemented more efficiently

chpl bb.chpl  --llvm-print-ir foo --llvm-print-ir-stage full --fast

define weak dso_local fastcc i64 @foo_chpl() unnamed_addr #0 {
  %1 = load i64, i64* @n_chpl, align 8, !tbaa !4
  %2 = add i64 %1, -1
  %3 = icmp ult i64 %2, 10
  br i1 %3, label %4, label %7

4:                                                ; preds = %0
  %5 = getelementptr inbounds [10 x i64], [10 x i64]* @switch.table.foo_chpl, i64 0, i64 %2
  %6 = load i64, i64* %5, align 8
  br label %7

7:                                                ; preds = %4, %0
  %8 = phi i64 [ %6, %4 ], [ 0, %0 ]
  ret i64 %8
}

So, if you are using select to implement a lookup table or something like that, it looks reasonably efficient to me. You can always check your specific case with --llvm-print-ir as I showed in the Details above. Please let us know if you are running into a specific case where you can't get it to optimize reasonably well.

damianmoz · December 10, 2022, 12:54am

Thanks. Very useful

Topic		Replies	Views
Vectorized and parallell loops. Can I have both? Developers	8	412	June 6, 2022
Branch prediction with foreknowledge Users	3	245	March 29, 2022
Nested foralls fail with llvm backend in over-relation soln of Laplace's equation Users	7	237	March 23, 2023
Use `param` to choose algorithm Users	3	205	July 25, 2022
Optimizer Arithmetic Operations Order Users	6	66	February 5, 2025

How smart is a select statement

Related topics