Wikinotes

Maintainer: admin

1 X10 continued
2 Generic condition synchronization
3 Places/locations
4 Work distribution
5 Fibonacci

async s; // mini thread, no handle, lightweight
finish {
  for (i in 1...100) {
    async {a[i]++}
  }
} // all fors done
finish s // all async done

finish {
  async {
    finish {
      async {x=2}
      // x E {0, 2}
    } // x E {2}
    x++
    // x E {3}
  } // x E {0, 2, 3}
} // x E {3}

when (c) -> S
// wait for C to be true -> atomically do S
// C should be pure (might be execed many times)

Constraints on C, S:

sequential (no async)
nonblocking (no finish)
local (this place)

There are many places. Places are (logical) locations (actual binding to machines is done by RT).

at(p) S // execute S at place p

This'll serialize the data and make a copy (expensive) on p.
N.B. not copied back: x=at(p) E

$C=PE$
$\sum\limits_{k=0}^{n-1} a_{ix}b_{kj} $
^ can be concurrently done!

class worker extends thread {
  int row, col
  worker (row, col) {}
  run() {
    dot = 0.0
    for (k from 0 to n - 1) {
      dot += A[row][k] * B[k][col]
    }
    C[row][col] = dot
  }
}
//
{
  for (i from 0 to n - 1) {
    for (j from 0 to n - 1) {
      n[i][j] = new worker(i, j)
      n[i][j].start()
    }
  }
}

lots of threads not doing much work. Not efficient, high overhead.

Partition the array + use a thread pool.

Java has thread pool (executor framework)

executor ex = {
  ex.execute(new Runnable) // may or may not create a new thread
}

class directEx implements Executor {
  execute(Runnable r) {
    r.run()
  }
} // no thread created
class asyncEx Executor{
  execute(Runnable r) {
    new Thread(r).start()
  }
}

Executor interface is limited. How to wait for thread? How to return value from execution?

Executor fence
Callable <V> interface
Like Runnable but can return something & throw exceptions

submit(callable) // get return value from callable
// returns a Future<V> (promise)
y = new Future(task) // (maybe) spawn a thread to execute task
// later...
x = y.get() // blocks an wait til task is done

class Future {
  callable c
  Future(callable c) {}
  get() {
    return c.call()
  }
}

ThreadPoolExecutor: a pool of threads (capped). Queue job of all threads currently busy (empty pool).

fib(n) {
  if (n < 2) return 1
  Future f = new Future(fib(n-1))
  return fib(n-2) + f.get()
}

(Node: computation. Edge: dependency)

This translates to:

With infinite processors, can execute nodes/computations in the dependency dag.
What’s the minimum time it’ll take?
The largest dependency chain is a lower bound. Critical path = 8 (ADGHKIFC).

Speedup: $T_1 = 17$. $T_\infty = 8$. $17/8 > 2$
Notation:
$T_n$: time taken for n processors
$T_1$: sequential

In practice, you have a fixed # of processors.
What’s the speedup for $T_2$, $T_4$? Need to know the schedule. How/when/where tasks are assigned.
We need a clever scheduler. Greedy scheduler works well.
But anything would give less than 2x speedup (Graham Brent Theorem).

Monday, November 18, 2013
PGAS languages and transactional programming

1X10 continued¶

2Generic condition synchronization¶

3Places/locations¶

4Work distribution¶

5Fibonacci¶

Monday, November 18, 2013 PGAS languages and transactional programming