Everything developers need to know about proxy servers
Proxy servers are a fundamental component of almost every distributed system. This post explains what proxies are, what kinds of proxies exist, and why they are so useful.
Proxies are a fundamental component of almost every distributed system. The average person has only a vague understanding of the purpose of proxy servers and often associates them with unblocking media content from other countries. As a developer, however, you should know that proxy servers do much more and are vital for businesses.
This post explains what proxies are, what kinds of proxies exist, and why they are so useful when designing a distributed system.
What is a proxy?
Let's start with a general definition. In real life, a proxy is a person who is given the power or authority to do something for someone else. For example, if you are not available to vote, you can nominate another person to act as your proxy and vote for you.
In distributed systems, a proxy is a server acting as an intermediary between a set of clients and a set of servers. We can distinguish two categories of proxies:
- forward proxy: the proxy acts on behalf of the clients
- reverse proxy: the proxy acts on behalf of the servers
Forward Proxies
A forward proxy is a server that acts on behalf of the client. If the proxy has been configured correctly, when a client wants to communicate with a server, its requests don't go directly to the server but to the proxy.
The proxy then performs the following steps:
- forwards the requests to the proper server
- waits for the responses
- sends the responses back to the corresponding client
Notice that this way the server doesn't know who the client is. It sees only the source IP address of the proxy, not that of the client. So a forward proxy can protect and hide the identity of the client. Virtual Private Networks (VPNs) rely on essentially the same principle.
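The three steps above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: no real networking is involved, and the class names (`Request`, `OriginServer`, `ForwardProxy`) and the IP addresses are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    source_ip: str  # address the receiver sees as the sender
    path: str


@dataclass
class OriginServer:
    seen_sources: list = field(default_factory=list)

    def handle(self, request: Request) -> str:
        # The server can only observe the address of the immediate sender.
        self.seen_sources.append(request.source_ip)
        return f"content of {request.path}"


@dataclass
class ForwardProxy:
    ip: str
    upstream: OriginServer

    def handle(self, request: Request) -> str:
        # Re-issue the request with the proxy's own address as the source,
        # wait for the response, and hand it back to the caller.
        outbound = Request(source_ip=self.ip, path=request.path)
        return self.upstream.handle(outbound)


server = OriginServer()
proxy = ForwardProxy(ip="203.0.113.10", upstream=server)
body = proxy.handle(Request(source_ip="192.168.1.25", path="/index.html"))
```

After this runs, the client has its response in `body`, while `server.seen_sources` contains only the proxy's address.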
What are the main advantages of using a forward proxy?
- clients can use a proxy to reach servers that they could not otherwise access
- a system administrator can configure a proxy to selectively forward or block certain requests from the clients (e.g. allow or block access to the pages of specific sites)
- a proxy can log or monitor requests
- a proxy can cache responses (e.g. frequently used pages, so that they don't have to be retrieved from the web on every request)
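Blocking, logging, and caching can be combined in a few lines. The sketch below is a minimal illustration, not a production design: `FilteringCachingProxy`, `fake_fetch`, and the `.example` hostnames are hypothetical, and the `fetch` callable stands in for the real network call.

```python
from urllib.parse import urlsplit


class FilteringCachingProxy:
    def __init__(self, fetch, blocked_hosts):
        self.fetch = fetch  # callable(url) -> str, stands in for the network
        self.blocked_hosts = set(blocked_hosts)
        self.cache = {}
        self.log = []

    def get(self, url):
        host = urlsplit(url).hostname
        if host in self.blocked_hosts:
            # Policy decision: refuse to forward the request at all.
            self.log.append(("BLOCKED", url))
            return None
        if url in self.cache:
            # Serve the stored copy without contacting the origin server.
            self.log.append(("HIT", url))
            return self.cache[url]
        self.log.append(("MISS", url))
        body = self.fetch(url)
        self.cache[url] = body
        return body


calls = []


def fake_fetch(url):
    # Records every request that actually reaches the "origin".
    calls.append(url)
    return f"<html>{url}</html>"


proxy = FilteringCachingProxy(fake_fetch, blocked_hosts={"blocked.example"})
first = proxy.get("http://news.example/today")
second = proxy.get("http://news.example/today")
denied = proxy.get("http://blocked.example/page")
```

Only the first request reaches the origin. A real cache would also need expiry and would have to respect the caching headers sent by the server, which this sketch leaves out.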
Reverse Proxies
A reverse proxy is a server that acts on behalf of another server. If the reverse proxy has been configured correctly, the requests of the client are not received directly by the server, but by the reverse proxy.
Once again, the next steps are quite obvious. The reverse proxy forwards the requests to the proper server, waits for the responses, and sends the responses back to the corresponding client.
Notice how the client has no idea that it is communicating with the reverse proxy. The responses are returned to the client as if they originated from the server itself. So a reverse proxy can protect and hide the identity of the server.
Let's consider a practical example. Suppose that you type http://francofernando.com in your browser. Your browser makes a DNS query to get the IP address of my website. If my website used a reverse proxy, DNS would return the IP address of the reverse proxy instead of the address of the server hosting my website.
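The DNS example can be mimicked with a toy lookup table. This is a simplified sketch under assumptions: the hostname `shop.example`, both IP addresses, and the classes `BackendServer` and `ReverseProxy` are made up for illustration, and dictionaries stand in for DNS and the network.

```python
class BackendServer:
    """Origin server living on a private address (hypothetical)."""

    def __init__(self, private_ip):
        self.private_ip = private_ip

    def handle(self, path):
        return {"status": 200, "body": f"page for {path}"}


class ReverseProxy:
    def __init__(self, public_ip, backend):
        self.public_ip = public_ip
        self.backend = backend

    def handle(self, path):
        # Forward to the backend, wait for its response, then return it
        # stamped with the proxy's own public address.
        response = self.backend.handle(path)
        return {**response, "responder_ip": self.public_ip}


backend = BackendServer(private_ip="10.0.0.5")
proxy = ReverseProxy(public_ip="198.51.100.7", backend=backend)

# Toy DNS table: the public name resolves to the proxy, not the backend.
dns = {"shop.example": proxy.public_ip}
servers_by_ip = {proxy.public_ip: proxy}

resolved_ip = dns["shop.example"]
response = servers_by_ip[resolved_ip].handle("/products")
```

From the client's point of view, everything happened at `198.51.100.7`; the backend's private address never appears in the response.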
What are the main advantages of using reverse proxies?
- Security and anonymity: by intercepting requests directed to the backend servers, a reverse proxy not only protects their identities but also acts as an additional line of defense against attacks. A reverse proxy can also expose multiple servers behind a single URL, regardless of how the local area network is structured.
- Performance: reverse proxies can cache static content, such as HTML/CSS/JS files, photos, and videos, reducing the overall latency of the system. They can also take load off the servers by performing additional tasks such as data compression and decompression, data encryption and decryption, and HTTP authentication.
- Load balancing: reverse proxies can distribute client requests across a group of servers in a way that maximizes speed and capacity utilization, ensuring that no single server gets overloaded.
Introducing reverse proxies in a system also has some disadvantages. The main one is that the complexity of the overall system increases. Not only is a reverse proxy an additional component in the system, it is also a single point of failure. So it is usually necessary to make it redundant, and configuring multiple reverse proxies further increases the complexity.
As always when designing a system, deciding whether to use a reverse proxy layer is a matter of trade-offs.
Some examples of reverse proxy implementations are Nginx, Caddy, HAProxy, and Squid.
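To make the load balancing point concrete, here is a minimal sketch of round-robin selection, one of the simplest distribution strategies. It is an assumption chosen for illustration: the post does not prescribe a particular algorithm, and the class name and backend names are hypothetical.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        # cycle() repeats the backend list endlessly, in order.
        self._backends = cycle(backends)

    def pick(self):
        # Each call returns the next backend in the rotation.
        return next(self._backends)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
assignments = [balancer.pick() for _ in range(9)]
load = Counter(assignments)
```

With nine requests and three backends, each backend receives exactly three. Real load balancers often use richer strategies (for example weighting or counting open connections) and skip unhealthy backends, which this sketch does not attempt.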
Forward vs Reverse Proxies
Notice how forward and reverse proxies have exactly opposite interaction patterns.
Forward proxies are configured by the clients. They send out requests on behalf of the clients and receive responses from the servers.
Reverse proxies are configured by the system administrators. They receive requests on behalf of the server and send out responses to the client.
API Gateways
In microservices architectures, reverse proxies can also act as API gateways. An API gateway is a single entry point for external clients that implements an abstraction layer hiding all the details of the microservices behind it.
The API gateway handles the requests from the clients in one of two ways:
- routing the requests directly to the appropriate services
- dispatching the requests to multiple services and aggregating the results
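Both ways of handling a request can be sketched in a few lines. This is a simplified illustration under assumptions: plain functions stand in for the microservices, and `ApiGateway`, `user_service`, `order_service`, `user_dashboard`, the paths, and the sample data are all hypothetical.

```python
def user_service(user_id):
    # Stand-in for a call to a "users" microservice.
    return {"id": user_id, "name": "Ada"}


def order_service(user_id):
    # Stand-in for a call to an "orders" microservice.
    return [{"order_id": 1, "total": 30}, {"order_id": 2, "total": 12}]


def user_dashboard(user_id):
    # Aggregation: one client request fans out to two services.
    return {"user": user_service(user_id), "orders": order_service(user_id)}


class ApiGateway:
    def __init__(self):
        self.routes = {}

    def register(self, path, handler):
        self.routes[path] = handler

    def handle(self, path, **params):
        handler = self.routes.get(path)
        if handler is None:
            return {"status": 404}
        return {"status": 200, "data": handler(**params)}


gateway = ApiGateway()
gateway.register("/users", user_service)        # direct routing
gateway.register("/dashboard", user_dashboard)  # dispatch and aggregate

routed = gateway.handle("/users", user_id=7)
aggregated = gateway.handle("/dashboard", user_id=7)
missing = gateway.handle("/unknown")
```

The client makes a single call to `/dashboard` and gets the combined result, without knowing that two services were involved.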
Rather than providing a one-size-fits-all API, the API gateway can also expose different APIs for different clients. It is also not uncommon to have separate API gateways for web applications, mobile applications, and external third-party applications.
API gateways provide many benefits:
- reduce the volume of requests and traffic by combining multiple API calls
- allow a flexible implementation of the internal microservices by decoupling them from the clients
- make it easier to collect logs and metrics and to monitor activity
Conclusion
Proxy servers are a fundamental component of distributed systems and can be extremely useful in many scenarios. I hope this post helped you better understand what proxy servers are and which kinds of proxies you can use when designing a system.
If you liked this post, follow me on Twitter to get more related content daily!