Proxygen source code review
Published:
Source code review of Proxygen, a C++ HTTP library.
Interfaces and how to use them
Proxygen is Facebook's collection of C++ HTTP libraries. The basic components are HTTPServer, RequestHandlerFactory, and RequestHandler, all under the /proxygen/httpserver directory.
The official repository ships sample usage under /proxygen/httpserver/samples/, including echo and static HTTP server samples.
I will take the static server as an example. The main() function is in StaticServer.cpp. It first initializes folly and the IP configs. Then you define the options structure, in which handlerFactories is the important field.
options.handlerFactories = RequestHandlerChain()
    .addThen<StaticHandlerFactory>()
    .build();
The StaticHandlerFactory class is implemented by the user in StaticServer.cpp, and it returns a new static handler. Every specific handler factory should inherit from RequestHandlerFactory, an abstract class that defines the interface.
class StaticHandlerFactory : public RequestHandlerFactory {
 public:
  void onServerStart(folly::EventBase* /*evb*/) noexcept override {}
  void onServerStop() noexcept override {}
  RequestHandler* onRequest(RequestHandler*, HTTPMessage*) noexcept override {
    return new StaticHandler;
  }
};
The role of a handler in the whole system is this: one handler is attached to each request, and it interacts with the operating system or with the upstream servers that the client asked for.
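To make the factory and handler relationship concrete, here is a minimal sketch written against stand-in types rather than Proxygen itself. ToyRequest, ToyHandler, ToyHandlerFactory, and StaticLikeFactory are hypothetical names. The sketch only assumes what is stated above: a factory derives from an abstract base and produces a fresh handler for each request. It uses std::unique_ptr for ownership, which is a simplification of the raw pointer returned in the real code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an incoming request.
struct ToyRequest {
  std::string path;
};

// Hypothetical stand-in for a per-request handler.
class ToyHandler {
 public:
  explicit ToyHandler(std::string path) : path_(std::move(path)) {}
  // Pretends to serve the request and describes the work done.
  std::string handle() const { return "serving " + path_; }

 private:
  std::string path_;
};

// Abstract factory: the pure virtual function makes it non-instantiable.
class ToyHandlerFactory {
 public:
  virtual ~ToyHandlerFactory() = default;
  virtual std::unique_ptr<ToyHandler> onRequest(const ToyRequest& req) = 0;
};

// Concrete factory: creates a new handler for every request it sees.
class StaticLikeFactory : public ToyHandlerFactory {
 public:
  std::unique_ptr<ToyHandler> onRequest(const ToyRequest& req) override {
    ++created_;
    return std::make_unique<ToyHandler>(req.path);
  }
  int created() const { return created_; }

 private:
  int created_ = 0;
};

// Usage: two requests yield two independent handlers.
int handlersForTwoRequests() {
  StaticLikeFactory factory;
  auto a = factory.onRequest({"/index.html"});
  auto b = factory.onRequest({"/logo.png"});
  assert(a.get() != b.get());
  return factory.created();
}
```

Because each request gets its own handler object, per-request state can live in plain member variables without any locking between requests.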
Then options and the IP configs are passed to the server. Your simple static HTTP server is complete!
HTTPServer server(std::move(options));
server.bind(IPs);
// Start HTTPServer mainloop in a separate thread
std::thread t([&] () {
  server.start();
});
t.join();
return 0;
Introduction to each component
HTTPServer
HTTPServer.cpp abstracts server actions such as start and stop. The server is configured through bind() and codecFactory. When starting, it first gets an event base to manage events in a thread pool. Most of the threading, I/O, and event management in Proxygen is built on the folly library developed by Facebook.
After getting an event base, it does the following in startTcpServer():
- HandlerCallbacks is registered as an observer of the thread pool. HandlerCallbacks implements interfaces such as threadStarted and threadStopped.
- AcceptorFactory creates a new HTTPServerAcceptor, and it is passed into bootstrap_.
- bootstrap_ sets childHandler and the thread pools for accepting connections and for I/O handling.
Then the server will start the main event loop.
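The observer registration in the first step can be sketched as follows. This is a simplified model, not the folly or Proxygen code: ToyPoolObserver, ToyThreadPool, and RecordingObserver are hypothetical names, and worker threads are replaced by plain function calls so that the notification order is easy to follow.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical observer interface, modeled on the threadStarted and
// threadStopped callbacks mentioned above.
class ToyPoolObserver {
 public:
  virtual ~ToyPoolObserver() = default;
  virtual void threadStarted(int workerId) = 0;
  virtual void threadStopped(int workerId) = 0;
};

// Hypothetical pool: it only keeps observers and notifies them.
class ToyThreadPool {
 public:
  void addObserver(std::shared_ptr<ToyPoolObserver> o) {
    observers_.push_back(std::move(o));
  }
  // Simulates a worker thread coming up.
  void startWorker(int id) {
    for (auto& o : observers_) o->threadStarted(id);
  }
  // Simulates a worker thread shutting down.
  void stopWorker(int id) {
    for (auto& o : observers_) o->threadStopped(id);
  }

 private:
  std::vector<std::shared_ptr<ToyPoolObserver>> observers_;
};

// Observer that records events, standing in for HandlerCallbacks.
class RecordingObserver : public ToyPoolObserver {
 public:
  void threadStarted(int id) override {
    log.push_back("started " + std::to_string(id));
  }
  void threadStopped(int id) override {
    log.push_back("stopped " + std::to_string(id));
  }
  std::vector<std::string> log;
};

// Usage: register the observer, then simulate one worker's lifetime.
std::vector<std::string> observeOneWorker() {
  auto observer = std::make_shared<RecordingObserver>();
  ToyThreadPool pool;
  pool.addObserver(observer);
  pool.startWorker(0);
  pool.stopWorker(0);
  assert(observer->log.size() == 2);
  return observer->log;
}
```

The point of the pattern is that per-thread setup and teardown code runs at the right moment without the pool knowing anything about HTTP.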
RequestHandler
RequestHandler is an abstract class. Its interfaces are implemented by specific handlers such as StaticHandler, which handles static requests.
ResponseHandler acts as the client for RequestHandler subclasses and provides methods to send responses back.
Methods of RequestHandler are registered as callback functions in the HTTPSession module, to finish tasks at different stages. Some important methods include:
onRequest()
onBody()
onUpgrade()
onEOM()
onError()
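The order in which these callbacks fire for a simple request with a body can be sketched as follows. The types are stand-ins, not Proxygen's: ToyRequestHandler, CollectingHandler, and deliver() are hypothetical, and the signatures are simplified to strings, whereas the real methods take Proxygen types.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical handler interface using the callback names listed above.
class ToyRequestHandler {
 public:
  virtual ~ToyRequestHandler() = default;
  virtual void onRequest(const std::string& headers) = 0;
  virtual void onBody(const std::string& chunk) = 0;
  virtual void onUpgrade(const std::string& protocol) = 0;
  virtual void onEOM() = 0;
  virtual void onError(const std::string& reason) = 0;
};

// Handler that accumulates the body and records which callbacks ran.
class CollectingHandler : public ToyRequestHandler {
 public:
  void onRequest(const std::string&) override { events.push_back("onRequest"); }
  void onBody(const std::string& chunk) override {
    body += chunk;
    events.push_back("onBody");
  }
  void onUpgrade(const std::string&) override { events.push_back("onUpgrade"); }
  void onEOM() override { events.push_back("onEOM"); }
  void onError(const std::string&) override { events.push_back("onError"); }

  std::string body;
  std::vector<std::string> events;
};

// Hypothetical driver playing the role of the session: headers first,
// then each body chunk, then end of message.
void deliver(ToyRequestHandler& h,
             const std::string& headers,
             const std::vector<std::string>& chunks) {
  h.onRequest(headers);
  for (const auto& c : chunks) h.onBody(c);
  h.onEOM();
}

// Usage: a request whose body arrives in two chunks.
std::string collectBody() {
  CollectingHandler h;
  deliver(h, "GET / HTTP/1.1", {"ab", "cd"});
  assert(h.events.front() == "onRequest" && h.events.back() == "onEOM");
  return h.body;
}
```

Since the body may arrive in several pieces, a handler typically buffers or streams in onBody() and only finalizes its response in onEOM().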
RequestHandlerAdaptor
This class acts as an adaptor that converts HTTPTransactionHandler to RequestHandler.
public:
explicit RequestHandlerAdaptor(RequestHandler* requestHandler);
It implements two abstract classes: HTTPTransactionHandler and ResponseHandler. It can send responses directly back to clients, or interact with the Transport.
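A minimal model of this two-sided adaptor is sketched below. All types are hypothetical stand-ins (ToyTransactionHandler, ToyResponseHandler, ToyRequestHandler, ToyAdaptor, HelloHandler), and "sending" just appends to a vector instead of writing to a transport. The sketch only illustrates the idea of one object implementing both interfaces and forwarding between them.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Transaction-facing interface: what the transport layer calls.
class ToyTransactionHandler {
 public:
  virtual ~ToyTransactionHandler() = default;
  virtual void onHeadersComplete(const std::string& headers) = 0;
  virtual void onEOM() = 0;
};

// Response-facing interface: what the application handler calls.
class ToyResponseHandler {
 public:
  virtual ~ToyResponseHandler() = default;
  virtual void sendBody(const std::string& body) = 0;
};

// Application handler, written against the response interface only.
class ToyRequestHandler {
 public:
  virtual ~ToyRequestHandler() = default;
  void setResponseHandler(ToyResponseHandler* r) { downstream_ = r; }
  virtual void onRequest(const std::string& headers) = 0;
  virtual void onEOM() = 0;

 protected:
  ToyResponseHandler* downstream_ = nullptr;
};

// The adaptor implements both interfaces and forwards between them.
class ToyAdaptor : public ToyTransactionHandler, public ToyResponseHandler {
 public:
  explicit ToyAdaptor(ToyRequestHandler* upstream) : upstream_(upstream) {
    upstream_->setResponseHandler(this);
  }
  void onHeadersComplete(const std::string& headers) override {
    upstream_->onRequest(headers);
  }
  void onEOM() override { upstream_->onEOM(); }
  // Stand-in for writing the response to the transport.
  void sendBody(const std::string& body) override { sent.push_back(body); }

  std::vector<std::string> sent;

 private:
  ToyRequestHandler* upstream_;
};

// Application handler that answers once the request is complete.
class HelloHandler : public ToyRequestHandler {
 public:
  void onRequest(const std::string&) override {}
  void onEOM() override { downstream_->sendBody("hello"); }
};

// Usage: drive the adaptor from the transaction side only.
std::vector<std::string> runAdaptor() {
  HelloHandler app;
  ToyAdaptor adaptor(&app);
  ToyTransactionHandler& txnSide = adaptor;
  txnSide.onHeadersComplete("GET / HTTP/1.1");
  txnSide.onEOM();
  assert(adaptor.sent.size() == 1);
  return adaptor.sent;
}
```

The application handler never sees the transaction-facing interface, which is what lets it stay independent of the wire protocol.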
Overview of Proxygen architecture
The architecture of Proxygen is shown in the diagram in its official README. A complete description of how it processes an HTTP request is given in a comment in HTTPTransaction.h:
/**
* An HTTPTransaction represents a single request/response pair
* for some HTTP-like protocol. It works with a Transport that
* performs the network processing and wire-protocol formatting
* and a Handler that implements some sort of application logic.
*
* The typical sequence of events for a simple application is:
*
* * The application accepts a connection and creates a Transport.
* * The Transport reads from the connection, parses whatever
* protocol the client is speaking, and creates a Transaction
* to represent the first request.
* * Once the Transport has received the full request headers,
* it creates a Handler, plugs the handler into the Transaction,
* and calls the Transaction's onIngressHeadersComplete() method.
* * The Transaction calls the Handler's onHeadersComplete() method
* and the Handler begins processing the request.
* * If there is a request body, the Transport streams it through
* the Transaction to the Handler.
* * When the Handler is ready to produce a response, it streams
* the response through the Transaction to the Transport.
* * When the Transaction has seen the end of both the request
* and the response, it detaches itself from the Handler and
* Transport and deletes itself.
* * The Handler deletes itself at some point after the Transaction
* has detached from it.
* * The Transport may, depending on the protocol, process other
* requests after -- or even in parallel with -- that first
* request. Each request gets its own Transaction and Handler.
*
* For some applications, like proxying, a Handler implementation
* may obtain one or more upstream connections, each represented
* by another Transport, and create outgoing requests on the upstream
* connection(s), with each request represented as a new Transaction.
*
* With a multiplexing protocol like SPDY on both sides of a proxy,
* the cardinality relationship can be:
*
*                 +-----------+     +-----------+     +-------+
*   (Client-side) | Transport |1---*|Transaction|1---1|Handler|
*                 +-----------+     +-----------+     +-------+
*                                                         1
*                                                         |
*                                                         |
*                                                         1
*                                   +---------+     +-----------+
*                     (Server-side) |Transport|1---*|Transaction|
*                                   +---------+     +-----------+
*
* A key design goal of HTTPTransaction is to serve as a protocol-
* independent abstraction that insulates Handlers from the semantics
* different of HTTP-like protocols.
*/
From starting listening to new connections coming in
In the last section, I introduced several components used to build a simple HTTP server with Proxygen. Here I explain what happens behind those interfaces, from the moment the server starts listening to the moment it accepts a new connection.
In the main function, we configure our options and pass them to the server. options holds a list of HandlerFactory objects that can handle different events.
options.handlerFactories = RequestHandlerChain()
    .addThen<StaticHandlerFactory>()
    .build();
AcceptorFactory receives the HandlerFactory list through its constructor parameters.
// AcceptorFactory
auto factory = std::make_shared<AcceptorFactory>(
    options_, // includes the HandlerFactory list
    codecFactory,
    accConfig,
    sessionInfoCb_);
Then a bootstrap is started. bootstrap comes from the wangle library. Its work here is done mainly by three functions:
bootstrap_[i].childHandler(factory);
bootstrap_[i].group(accExe, exe);
bootstrap_[i].bind(addresses_[i].address);
bootstrap.group() sets the thread pools for accepting connections and for I/O. In group(), AcceptorFactory creates a new acceptor and binds it to a new thread pool through newAcceptor(eventBase), which is called in wangle::threadStarted().
In bind(), a new AsyncServerSocket is built by newSocket() of socketFactory. An asynchronous callback for accepting new connections is bound to the AsyncServerSocket, together with the thread it runs on:
socketFactory->addAcceptCB(socket, worker, worker->getEventBase());
When a new connection comes in, acceptCB() is called and several functions in wangle are executed. The last of them is onNewConnection() in HTTPServerAcceptor.
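The register-then-dispatch flow described above can be modeled in a few lines. This is a sketch under simplifying assumptions, not the folly or wangle implementation: ToyServerSocket, ToyAcceptor, and simulateIncoming() are hypothetical, there is no real socket or event loop, and a connection is represented by a bare integer.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical listening socket: stores one accept callback and invokes
// it whenever a connection "arrives".
class ToyServerSocket {
 public:
  using AcceptCallback = std::function<void(int fd)>;

  void setAcceptCallback(AcceptCallback cb) { callback_ = std::move(cb); }

  // Simulates the event loop noticing a new connection.
  void simulateIncoming(int fd) {
    if (callback_) callback_(fd);
  }

 private:
  AcceptCallback callback_;
};

// Hypothetical acceptor: the end of the callback chain.
class ToyAcceptor {
 public:
  void onNewConnection(int fd) { accepted.push_back(fd); }
  std::vector<int> accepted;
};

// Usage: bind the acceptor to the socket, then deliver two connections.
std::vector<int> acceptTwo() {
  ToyAcceptor acceptor;
  ToyServerSocket socket;
  socket.setAcceptCallback(
      [&acceptor](int fd) { acceptor.onNewConnection(fd); });
  socket.simulateIncoming(7);
  socket.simulateIncoming(8);
  assert(acceptor.accepted.size() == 2);
  return acceptor.accepted;
}
```

The socket knows nothing about HTTP. It only knows which callback to invoke, which is why the same bootstrap code can serve different protocols.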
In Proxygen, HTTPServerAcceptor::onNewConnection() calls HTTPSessionAcceptor::onNewConnection(). In this function, a new session is started:
session->setSessionStats(downStreamSessionStats_);
Acceptor::addConnection(session);
session->startNow();
Handling an HTTP request
When a new HTTP request is read in, Proxygen uses an external HTTP parser library to parse it and saves the result as an HTTPMessage object. The HTTP parser it uses is based on the parser used by Nginx.
HTTPCodec implements the callback functions defined in the HTTP parser library. onIngress() is the first function of HTTPCodec to be called. It starts parsing the message with:
size_t bytesParsed = http_parser_execute(&parser_,
                                         getParserSettings(),
                                         (const char*)buf.data(),
                                         buf.length());
The callbacks are set in getParserSettings() and are invoked while http_parser_execute() runs. Make sure that every callback returns 0 on success and 1 on failure. Unexpected errors can occur if the return values are wrong.
const http_parser_settings* HTTP1xCodec::getParserSettings() {
  static http_parser_settings parserSettings = [] {
    http_parser_settings st;
    st.on_message_begin = HTTP1xCodec::onMessageBeginCB;
    st.on_url = HTTP1xCodec::onUrlCB;
    st.on_header_field = HTTP1xCodec::onHeaderFieldCB;
    st.on_header_value = HTTP1xCodec::onHeaderValueCB;
    st.on_headers_complete = HTTP1xCodec::onHeadersCompleteCB;
    st.on_body = HTTP1xCodec::onBodyCB;
    st.on_message_complete = HTTP1xCodec::onMessageCompleteCB;
    st.on_reason = HTTP1xCodec::onReasonCB;
    st.on_chunk_header = HTTP1xCodec::onChunkHeaderCB;
    st.on_chunk_complete = HTTP1xCodec::onChunkCompleteCB;
    return st;
  }();
  return &parserSettings;
}
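To illustrate why the return values matter, here is a toy callback-driven parser. It is not the http_parser library: ToyParser, ToyParserSettings, toyExecute(), and ToyCodec are hypothetical, and the "protocol" is just newline-separated lines. The sketch assumes the same conventions as the text above: a table of function pointers, static callbacks that recover their object from a user pointer, 0 to continue, and nonzero to abort.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct ToyParser {
  void* data = nullptr;  // user pointer, set by whoever owns the parser
  bool failed = false;
};

// Hypothetical settings table: one function pointer per parser event.
struct ToyParserSettings {
  int (*on_line)(ToyParser*, const char* at, std::size_t len) = nullptr;
};

// Splits the buffer on '\n' and fires on_line for each complete line.
// Stops at the first nonzero return value. Returns the bytes consumed.
std::size_t toyExecute(ToyParser* p, const ToyParserSettings* s,
                       const char* buf, std::size_t len) {
  std::size_t start = 0;
  for (std::size_t i = 0; i < len; ++i) {
    if (buf[i] != '\n') continue;
    if (s->on_line && s->on_line(p, buf + start, i - start) != 0) {
      p->failed = true;
      return start;  // everything before the rejected line
    }
    start = i + 1;
  }
  return start;
}

// Hypothetical codec: a static callback casts the user pointer back to
// the object that owns the state.
struct ToyCodec {
  int lines = 0;

  static int onLineCB(ToyParser* p, const char* at, std::size_t len) {
    auto* self = static_cast<ToyCodec*>(p->data);
    if (std::string(at, len) == "BAD") return 1;  // signal failure
    ++self->lines;
    return 0;  // signal success
  }
};

// Usage: wire the codec into the parser and run it over the input.
std::size_t parseWithCodec(ToyCodec& codec, const std::string& input,
                           bool* failed) {
  ToyParserSettings settings;
  settings.on_line = ToyCodec::onLineCB;
  ToyParser parser;
  parser.data = &codec;
  std::size_t n = toyExecute(&parser, &settings, input.data(), input.size());
  *failed = parser.failed;
  assert(n <= input.size());
  return n;
}
```

If onLineCB returned 1 for a good line by mistake, parsing would stop early and report an error even though the input was valid, which is the kind of surprise the warning above is about.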
The on...() functions in HTTPSession are in turn called from the functions of HTTPCodec. The call chain is as follows:
HTTP1xCodec::onHeadersComplete() -> HTTPSession::onHeadersComplete() -> HTTPDownstreamSession::setupOnHeadersComplete()
In HTTPSession, a new controller is allocated by calling HTTPSessionBase::getController(). In HTTPDownstreamSession, the controller creates a new handler by calling acceptor_->newHandler(), and the handler is attached to an HTTPTransaction by calling
handler = SimpleController::getRequestHandler();
Transaction->setHandler(handler);
In newHandler() of HTTPServerAcceptor, each HandlerFactory's onRequest() is called, and a new RequestHandlerAdaptor is returned:
HTTPTransactionHandler* HTTPServerAcceptor::newHandler(
    HTTPTransaction& txn,
    HTTPMessage* msg) noexcept {
  SocketAddress clientAddr, vipAddr;
  txn.getPeerAddress(clientAddr);
  txn.getLocalAddress(vipAddr);
  msg->setClientAddress(clientAddr);
  msg->setDstAddress(vipAddr);
  // Create filters chain
  RequestHandler* h = nullptr;
  for (auto& factory: handlerFactories_) {
    h = factory->onRequest(h, msg);
  }
  return new RequestHandlerAdaptor(h);
}
RequestHandlerAdaptor then interacts with ResponseHandler and sends responses to clients.
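The loop in newHandler() threads one handler through every factory, which is what makes filter chains possible. Below is a minimal sketch of that idea with stand-in types. ToyHandler, ToyFactory, StaticLike, LoggingFilter, and buildChain() are hypothetical, ownership uses std::unique_ptr instead of raw pointers, and the ordering follows the loop shown above, not necessarily the order in which factories are added through RequestHandlerChain.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical handler: a filter wraps an upstream handler.
class ToyHandler {
 public:
  virtual ~ToyHandler() = default;
  virtual std::string handle(const std::string& req) = 0;
};

class ToyFactory {
 public:
  virtual ~ToyFactory() = default;
  // Receives the handler built so far (possibly null) and returns the
  // handler to use from now on.
  virtual std::unique_ptr<ToyHandler> onRequest(
      std::unique_ptr<ToyHandler> upstream) = 0;
};

// Terminal handler: produces the response.
class StaticLike : public ToyHandler {
 public:
  std::string handle(const std::string& req) override { return "file:" + req; }
};

// Filter: wraps the upstream handler and decorates its result.
class LoggingFilter : public ToyHandler {
 public:
  explicit LoggingFilter(std::unique_ptr<ToyHandler> up) : up_(std::move(up)) {}
  std::string handle(const std::string& req) override {
    return "[log]" + up_->handle(req);
  }

 private:
  std::unique_ptr<ToyHandler> up_;
};

class StaticLikeFactory : public ToyFactory {
 public:
  std::unique_ptr<ToyHandler> onRequest(std::unique_ptr<ToyHandler>) override {
    return std::make_unique<StaticLike>();
  }
};

class LoggingFactory : public ToyFactory {
 public:
  std::unique_ptr<ToyHandler> onRequest(
      std::unique_ptr<ToyHandler> upstream) override {
    assert(upstream);  // a filter needs something to wrap
    return std::make_unique<LoggingFilter>(std::move(upstream));
  }
};

// Same loop shape as newHandler(): pass the handler through each factory.
std::unique_ptr<ToyHandler> buildChain(
    const std::vector<std::unique_ptr<ToyFactory>>& factories) {
  std::unique_ptr<ToyHandler> h;
  for (auto& f : factories) h = f->onRequest(std::move(h));
  return h;
}

// Usage: a terminal handler followed by one filter.
std::string runChain(const std::string& req) {
  std::vector<std::unique_ptr<ToyFactory>> factories;
  factories.push_back(std::make_unique<StaticLikeFactory>());
  factories.push_back(std::make_unique<LoggingFactory>());
  return buildChain(factories)->handle(req);
}
```

With a single factory, as in the static server sample, the loop runs once and the result is simply that factory's handler.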