Active Messages: A Mechanism for Integrated Communication and Computation
T. von Eicken, D.E. Culler, S.C. Goldstein, and
K.E. Schauser. Proceedings of the Nineteenth Annual International
Symposium on Computer Architecture. May 1992, pp. 256-266.
The Problem: Communication latency is high, and there is not
enough overlap between communication and computation.
The Solution: Embed the address of the receiving message
handler in the message itself. On arrival, the handler runs
immediately and either integrates the message into the computation
(think of the reply to a matrix row GET) or responds right away
(think of the handler for the GET request itself). This is
efficient on ordinary hardware.
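A minimal sketch of the idea, as a single-process simulation. The names Node, Network, get_request, and get_reply are hypothetical and not taken from the paper, and a Python function reference stands in for the handler address carried in the message:

```python
from collections import deque

# Hypothetical simulation; none of these names come from the paper.
class Network:
    def __init__(self):
        self.nodes = {}
        self.in_flight = deque()  # models the wire, not receiver-side buffering

    def send(self, src, dst, handler, *args):
        # The message names its own handler, so the receiver needs no
        # matching receive call and no lookup beyond following the reference.
        self.in_flight.append((src, dst, handler, args))

    def deliver_all(self):
        while self.in_flight:
            src, dst, handler, args = self.in_flight.popleft()
            handler(self.nodes[dst], src, *args)  # runs immediately on arrival


class Node:
    def __init__(self, node_id, network, rows=None):
        self.node_id = node_id
        self.network = network
        self.rows = rows or {}  # matrix rows owned by this node
        self.fetched = {}       # rows received from other nodes
        network.nodes[node_id] = self

    def get_request(self, src, row_index):
        # Request handler: responds right away and never blocks.
        self.network.send(self.node_id, src, Node.get_reply,
                          row_index, self.rows[row_index])

    def get_reply(self, src, row_index, row):
        # Reply handler: integrates the data into the local computation.
        self.fetched[row_index] = row


net = Network()
a = Node(0, net)
b = Node(1, net, rows={2: [1.0, 2.0, 3.0]})
net.send(0, 1, Node.get_request, 2)  # node 0 asks node 1 for row 2
net.deliver_all()
```

Because every node runs the same Node class, the reference Node.get_reply means the same thing at the sender and at the receiver, which is the role the shared code image plays on real hardware.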
The details:
- Eliminates the need for message buffering, since
messages are handled right away (no per-message memory allocation).
- Requires the same code image on all machines, so that a
handler address means the same thing on every node.
- Handlers are not allowed to block for "a long time".
- Overlap can be achieved with compiler support (e.g., prefetching).
- The approach does better even on message-passing-optimized
hardware (Monsoon, J-Machine).
- Some hardware support would still help: message registers,
composing and receiving multiple messages simultaneously, and
protection checks.
- Architectural changes would help too: polling instead of
interrupts, user-level handlers, separate message threads, and a
message-only CPU.
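The overlap and polling points above can be sketched as a split-phase GET: the requester issues its requests, keeps computing, and polls for replies between units of local work. This is an illustrative simulation only. Processor, issue_get, accumulate_row, poll, and run are hypothetical names, and the inbox deque stands in for the network interface:

```python
from collections import deque

# Hypothetical simulation; none of these names come from the paper.
class Processor:
    def __init__(self, remote_rows):
        self.remote_rows = remote_rows  # stands in for memory on another node
        self.inbox = deque()            # stands in for the network interface
        self.pending = 0                # replies not yet handled
        self.total = 0.0
        self.local_work_done = 0

    def issue_get(self, row_index):
        # Split-phase request: record it and return without waiting.
        self.pending += 1
        # Simulated remote side: the reply arrives tagged with its handler.
        self.inbox.append((self.accumulate_row, (self.remote_rows[row_index],)))

    def accumulate_row(self, row):
        # Reply handler: short, non-blocking, folds the data into the result.
        self.total += sum(row)
        self.pending -= 1

    def poll(self):
        # Polling instead of interrupts: handle at most one arrived message.
        if self.inbox:
            handler, args = self.inbox.popleft()
            handler(*args)


def run(proc, row_indices):
    for i in row_indices:
        proc.issue_get(i)           # prefetch everything up front
    while proc.pending:
        proc.local_work_done += 1   # one unit of independent local work
        proc.poll()                 # then check the network
    return proc.total


proc = Processor({0: [1, 2], 1: [3, 4], 2: [5, 6]})
result = run(proc, [0, 1, 2])
```

In a real system the local work and the message transit time would overlap; here the interleaving of work and poll calls only shows where that overlap would come from.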
A question: For large messages, memory still needs to be allocated. Is there enough time after the first (size-indicating) message arrives to do so before the rest come in? Those messages can't be allowed to queue up.
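To make the scenario in the question concrete, here is one possible arrangement, offered as an assumption rather than as the paper's method: a header handler allocates the buffer once per transfer, and each fragment handler copies its payload to a given offset. Receiver, header_handler, fragment_handler, and send_bulk are hypothetical names. The sketch assumes the header is fully handled before any fragment arrives, which is exactly the timing the question asks about, so it does not answer the question:

```python
# Hypothetical simulation; none of these names come from the paper.
class Receiver:
    def __init__(self):
        self.transfers = {}  # transfer id -> [buffer, bytes still expected]
        self.completed = {}  # transfer id -> assembled data

    def header_handler(self, transfer_id, total_size):
        # One allocation per transfer rather than one per message.
        self.transfers[transfer_id] = [bytearray(total_size), total_size]

    def fragment_handler(self, transfer_id, offset, payload):
        # Assumes the header was handled first; otherwise this lookup fails.
        state = self.transfers[transfer_id]
        state[0][offset:offset + len(payload)] = payload
        state[1] -= len(payload)
        if state[1] == 0:
            self.completed[transfer_id] = bytes(state[0])
            del self.transfers[transfer_id]


def send_bulk(receiver, transfer_id, data, fragment_size):
    receiver.header_handler(transfer_id, len(data))
    offsets = range(0, len(data), fragment_size)
    # Deliver fragments in reverse to show that explicit offsets make
    # arrival order among the fragments irrelevant.
    for off in reversed(offsets):
        receiver.fragment_handler(transfer_id, off, data[off:off + fragment_size])


receiver = Receiver()
send_bulk(receiver, 7, b"active messages", 4)
```

Each fragment is written straight to its final location, so nothing queues at the receiver as long as the allocation in header_handler finishes before the first fragment shows up.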
Umesh Shankar
Last modified: Tue Jul 3 17:18:20 PDT 2001