Remote Procedure Call
RPC is a communication mechanism used to call subroutines in a different address space. It is typically used to communicate over a network, but the programmer doesn’t need to write the code that deals with the communication itself. The programmer invokes the remote subroutine the same way a local subroutine would be invoked, and the RPC framework deals with the rest.
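To make the idea concrete, here is a minimal sketch of what a framework does behind a call that looks local. Everything in it is made up for illustration (the `Stub` class, `handle_request`, the JSON message shape), and the "network" is just a function call, so treat it as a toy rather than how any real framework is built.

```python
import json

# Hypothetical server-side procedure, registered by name.
def add(a, b):
    return a + b

PROCEDURES = {"add": add}

def handle_request(raw: bytes) -> bytes:
    """Server side: decode the request, run the procedure, encode the result."""
    request = json.loads(raw)
    result = PROCEDURES[request["method"]](*request["params"])
    return json.dumps({"result": result}).encode()

class Stub:
    """Client side: turns a method call into an encoded request."""

    def __init__(self, transport):
        self._transport = transport  # any callable taking bytes and returning bytes

    def __getattr__(self, name):
        def remote_call(*params):
            raw = json.dumps({"method": name, "params": params}).encode()
            return json.loads(self._transport(raw))["result"]
        return remote_call

# handle_request stands in for the network in this sketch.
server = Stub(handle_request)
total = server.add(2, 3)  # reads like a local call
```

The caller only sees `server.add(2, 3)`; the encoding, transport and decoding are hidden in the stub.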
Some problems with RPC
After making a call to a remote procedure, the client has to wait for the response, which means its thread is blocked until the response comes back. The server can’t return without providing a response either, so while the response is being built the server is also blocked inside the subroutine that was called.
This characteristic often forces programmers to make their programs multi-threaded to avoid performance issues when multiple calls arrive at the same time.
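A rough sketch of why threads end up being necessary: `slow_procedure` below is a hypothetical stand-in for a remote procedure that takes a while to build its response, and the sleep duration is arbitrary. Served from one thread the waits add up; with one worker per in-flight call they overlap.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_procedure(x):
    """Hypothetical procedure that blocks while its response is built."""
    time.sleep(0.1)
    return x * 2

requests = [1, 2, 3, 4]

# One thread: every call waits for all the calls ahead of it.
start = time.perf_counter()
sequential = [slow_procedure(x) for x in requests]
sequential_elapsed = time.perf_counter() - start

# One worker per in-flight call: the blocking waits overlap.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(slow_procedure, requests))
threaded_elapsed = time.perf_counter() - start
```

Both versions return the same results; only the total waiting time differs.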
Two Generals’ Problem
Since the calls are made over a wire, and it is well known that the network is not reliable, there is no clear way for the client to tell whether the server is still processing the request or has died. The server might never even have received the request in the first place. And from the server’s perspective the same thing can happen with the client: the server sends the response, but the client might be dead by that time.
An unsuccessful execution can be illustrated as:
   ┃                          ┃
   ┣─────┐                    ┃
   ┃     └──────┐             ┃
   ┃         request────┐     ┃
   ┃                    └────▶┃
   ┃                          ┃
client                     server
   ┃                          ┃
   ┃                          ┃
   ┃                    ┌─────┫
   ┃            response┘     ┃
   ┃     X◀────┘              ┃
   ┃                          ┃
   ▼                          ▼
The parties could start sending ACKs to let each other know they have received the previous message:
   ┃                          ┃
   ┣─────┐                    ┃
   ┃     └──────┐             ┃
   ┃         request────┐     ┃
   ┃                    └────▶┃
   ┃                 ┌────────┫
   ┃        ┌──ack───┘        ┃
   ┃◀───────┘                 ┃
   ┃                    ┌─────┫
client           ┌──────┘  server
   ┃      ┌──response         ┃
   ┃◀─────┘                   ┃
   ┣──────┐                   ┃
   ┃      └────ack            ┃
   ┃             └──────┐     ┃
   ┃                    └────▶┃
   ┃                          ┃
   ▼                          ▼
What happens if an ACK disappears? How would the other party know that the message got lost, rather than assuming it is still on its way?
One of the most obvious options is to set a timeout when calling the server, so that if no response arrives before it expires the client can act accordingly. But this violates the RPC principle of making local and remote calls look the same. In this case the programmer is clearly writing code that wouldn’t be necessary if it were a local call.
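As a sketch of what that extra code looks like, the snippet below wraps a call in a timeout using Python's `concurrent.futures`. The names `remote_call` and `call_with_timeout` are hypothetical, and the unresponsive server is simulated with an event that is never set during the call.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError as CallTimeout

# Simulates an unresponsive server: not set until after the call has timed out.
server_released = threading.Event()

def remote_call():
    """Hypothetical remote call whose response does not arrive in time."""
    server_released.wait()
    return "response"

def call_with_timeout(fn, seconds):
    """Run fn in a worker thread and give up waiting after `seconds`."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn)
    try:
        return ("ok", future.result(timeout=seconds))
    except CallTimeout:
        # The client cannot tell which case it is in: slow server,
        # dead server, lost request or lost response.
        return ("timeout", None)
    finally:
        pool.shutdown(wait=False)

outcome = call_with_timeout(remote_call, 0.1)
server_released.set()  # let the worker thread finish so the program can exit
```

None of this would exist around a local call, and the timeout value itself is a guess the programmer has to make.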
When a remote call is made, the client must also deal with every exceptional case that might happen. This is closely related to the problem above and also breaks the transparency promise.
Multiple execution semantics
Since the communication goes over a faulty network, how should client and server behave when a message is lost? A local call executes exactly once, but achieving the same semantics in a remote call requires extra code, such as retries on the client and duplicate detection on the server. Fake it ’til you make it.
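The difference between the semantics is easier to see with a non-idempotent operation. In this sketch, with all names (`Server`, `LossyNetwork`, `call_with_retry`) invented for illustration, the network deterministically drops the first response. A client that simply retries makes the deposit run twice; a server that remembers request ids replays the earlier answer instead of executing again.

```python
class Server:
    def __init__(self, deduplicate):
        self.balance = 0
        self.deduplicate = deduplicate
        self.seen = {}  # request id -> response already produced

    def deposit(self, request_id, amount):
        if self.deduplicate and request_id in self.seen:
            return self.seen[request_id]  # replay the answer, do not re-execute
        self.balance += amount
        self.seen[request_id] = self.balance
        return self.balance

class LossyNetwork:
    """Delivers every request but loses the first response."""

    def __init__(self, server):
        self.server = server
        self.dropped = False

    def send(self, request_id, amount):
        response = self.server.deposit(request_id, amount)
        if not self.dropped:
            self.dropped = True
            return None  # the response got lost on its way back
        return response

def call_with_retry(network, request_id, amount, attempts=3):
    """Client side: resend the same request until a response arrives."""
    for _ in range(attempts):
        response = network.send(request_id, amount)
        if response is not None:
            return response
    raise TimeoutError("no response after retries")

naive = Server(deduplicate=False)
naive_reply = call_with_retry(LossyNetwork(naive), "req-1", 100)

careful = Server(deduplicate=True)
careful_reply = call_with_retry(LossyNetwork(careful), "req-1", 100)
```

The naive server ends with a balance of 200 for a single logical deposit of 100, while the deduplicating one ends with 100. The price is that the server must now store request ids and decide when it is safe to forget them.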
RPC can allow clients written in one language to call servers written in another. The framework must then provide a way to pass parameters so that they are mapped correctly on the other end. Types like booleans and integers are easily mapped, but others like arrays, lists, and structs may pose difficulties.
If a language has pointers then a bigger problem appears. Should the pointer be passed over or the thing it points to? What if the pointer is referencing some element of a complex structure? Should the entire structure be sent?
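Python has references rather than raw pointers, but the same question shows up. In this sketch a small linked structure is built from dicts and JSON stands in for the wire format (both are assumptions for illustration). Locally, `increment` changes the node the caller shares; remotely, the node and everything reachable from it is copied, and the caller's structure is left untouched.

```python
import json

# A tiny linked structure: head -> middle -> tail
tail = {"value": 3, "next": None}
middle = {"value": 2, "next": tail}
head = {"value": 1, "next": middle}

def increment(node):
    node["value"] += 1

def remote_increment(payload: str) -> str:
    """Hypothetical remote side: it only ever sees a copy of the node."""
    node = json.loads(payload)
    increment(node)
    return json.dumps(node)

# Local call: the callee mutates the very node that head points to.
increment(middle)
local_view = head["next"]["value"]

# Remote call: middle and everything it points to (tail) go over the wire.
payload = json.dumps(middle)
reply = json.loads(remote_increment(payload))
remote_view = head["next"]["value"]  # unchanged by the remote call
```

After the remote call `reply["value"]` is 4 while `head["next"]["value"]` is still 3, so the client has to copy results back by hand if it wants the local behaviour.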
The same problem occurs if the language allows global variables and a procedure that uses them is suddenly changed to execute remotely instead of locally.
There is also the problem of how each machine represents its data: little-endian vs big-endian, structure alignment, etc.
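A quick way to see the byte-order problem is Python's `struct` module. The sketch below packs the same 32-bit integer both ways and then reads little-endian bytes as if they were big-endian, which is what happens when the two ends assume different layouts without an agreed wire format.

```python
import struct

value = 1

little = struct.pack("<I", value)  # least significant byte first
big = struct.pack(">I", value)     # most significant byte first

# A big-endian reader handed little-endian bytes gets a different number.
misread = struct.unpack(">I", little)[0]

# With standard (non-native) sizes there is no padding: 1 byte + 4 bytes.
# Native alignment ("@bI") may add padding, depending on the platform.
packed_size = struct.calcsize("<bI")
```

Here `misread` is 16777216 instead of 1, and `packed_size` is 5. An RPC framework has to pick one representation and convert on both ends.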
You can see again that this also breaks the transparency promise of RPC.
Solutions or improvements
Some RPC frameworks try to tackle the problems above. They might offer async calls by default, cancellation mechanisms for in-progress RPCs, streaming capabilities, connection backoff, load balancing, filters and many other features attempting to overcome the problems mentioned before. You can take a look at gRPC and Finagle for examples of frameworks with some of these capabilities.
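To give a feel for what async calls and cancellation buy you, here is a sketch using plain `asyncio`. It does not use the gRPC or Finagle APIs; `remote_call` is a hypothetical coroutine where a sleep stands in for network and server time.

```python
import asyncio

async def remote_call(delay, value):
    """Hypothetical async RPC: the sleep stands in for the round trip."""
    await asyncio.sleep(delay)
    return value

async def main():
    # Both calls are in flight at once; the caller is not blocked on either.
    fast = asyncio.create_task(remote_call(0.01, "fast"))
    slow = asyncio.create_task(remote_call(10, "slow"))

    result = await fast

    # The slow answer is no longer needed, so cancel it instead of waiting.
    slow.cancel()
    try:
        await slow
        cancelled = False
    except asyncio.CancelledError:
        cancelled = True
    return result, cancelled

outcome = asyncio.run(main())
```

Note that this only cancels the client's wait. Whether the server stops working on the request depends on the framework propagating the cancellation.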
Another way of thinking is to avoid RPC altogether. This leads to different models, such as REST and message queuing frameworks. It is also possible to make a clear distinction between local and remote calls, as Erlang does, providing all the necessary building blocks to deal with the problems that come once you are over a network.
- Andrew S. Tanenbaum and Robbert van Renesse. A Critique of the Remote Procedure Call Paradigm. Proc. European Teleinformatics Conf. (EUTECO 88), North-Holland, Amsterdam, 1988, pp. 775-783.
- Jim Waldo, Geoff Wyant, Ann Wollrath, and Sam Kendall. A Note on Distributed Computing. Sun Microsystems Laboratories, 1994.
- Steve Vinoski. Convenience Over Correctness. IEEE Internet Computing, Jul./Aug. 2008, pp. 89-92.