Wikinotes

Maintainer: admin

1 Graham Brent theorem (continued)
- 1.1 Proof
2 Work stealing
- 2.1 Cilk++ language

2 kinds of steps:
- complete steps: >= p tasks available
- incomplete steps: < p tasks available

$T_p <= T_1/p + T_\infty$

$T_1/p$: complete
$T_\infty$: incomplete

“an application with p processors at most would take time that is addition of sequential time / p and critical path time”

$T_p^* >= max(T_1/p, T_\infty)$ (1)
$and\ T_1/p <= max(T_1/p, T_\infty)$ (2)
$T_\infty <= max(T_1/p, t_\infty)$ (3)

$T_p <= T_1/p + t_\infty$ (0)
$<= max(T_1/p, t_\infty) + max(T_1/p, T_\infty)$ (3)
$<= 2max(T_1/p, T_\infty)$
$<= 2T_P^*$

No more than 2x the optimal time

Have a task queue
If threads have their own queue, they need a way to distribute the work

Queue and tasks for each processors

($T_1$ stole from $T_2$)
If empty, pop from somewhere else
If tasks live in stack frames (like C), then some code needs to grow stack and some doesn’t
e.g.

main() {
  A()
}
A() {
  spawn B()
  spawn C()
  -sync-
}
B() {
  spawn D()
  spawn E()
  -sync-
  foo() // regular function
}
C() {
  spawn F()
  spawn G()
  -sync-
}

This gives a cactus stack

In a concurrent program, there may have more than one active.

Wednesday, November 20, 2013
Transactional Memory

1Graham Brent theorem (continued)¶

1.1Proof¶

2Work stealing¶

2.1Cilk++ language¶

Wednesday, November 20, 2013 Transactional Memory

1Graham Brent theorem (continued)¶

1.1Proof¶

2Work stealing¶

2.1Cilk++ language¶

Wednesday, November 20, 2013
Transactional Memory