I have wanted to demystify what goes on behind the Python Flask framework. How does defining something as simple as app.route handle HTTP requests? How does app.run create a server and maintain it?
To demystify Flask, I had two options: read the Flask code end to end and understand it, or reverse engineer Flask by building one on my own. I chose the latter, and this blog is a step-by-step log of how it went.
Side Note
If you are new to Flask, then might be a good place to start.
Reverse engineering started in my head. I am going to be working with just two files, ownflask.py and demo.py.
# demo.py
from flask import Flask

app = Flask(__name__)

@app.route("/", methods=["GET", "POST"])
def hello():
    return "hello"

if __name__ == "__main__":
    app.run()
We need a class Flask which initializes an app object.
The Flask class has a method run, which starts a server.
The Flask class also has a route method that registers the endpoints.
#ownflask.py
class Flask:
    def __init__(self, name):
        self.name = name

    def run(self):
        pass

    def route(self, path, methods):
        def wrapper(f):
            pass
        return wrapper
That gives us the basic skeleton. Let's add the functionality one piece at a time. Python's http.server module provides an HTTPServer class, so let's use that.
In Flask, app.run is responsible for starting a development web server. The server then listens for HTTP requests and responds to them.
#ownflask.py
from http.server import HTTPServer, BaseHTTPRequestHandler

class Flask:
    ...
    def run(self, server_class=HTTPServer, handler_class=BaseHTTPRequestHandler, port=8000):
        server_address = ('', port)
        print(f"Running server in port {port}")
        httpd = server_class(server_address, handler_class)
        httpd.serve_forever()
In demo.py, change from flask to from ownflask to work with the module we just created, and run demo.py. On hitting http://127.0.0.1:8000, you get a 501 error in the browser, since we haven't implemented anything to handle the incoming request.
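The 501 can be reproduced without a browser. The following is a minimal sketch, not part of ownflask: probe_status and QuietHandler are hypothetical names, and the server binds to port 0 so the OS picks a free port. It starts a bare HTTPServer with a handler that defines no do_GET and reports the status of a single GET.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class QuietHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: adds no do_GET, only silences request logging."""

    def log_message(self, format, *args):
        pass


def probe_status(handler_class=QuietHandler):
    """Start a throwaway server, send one GET, and return the HTTP status."""
    server = HTTPServer(("127.0.0.1", 0), handler_class)  # port 0: OS picks one
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/")
        status = conn.getresponse().status
        conn.close()
        return status
    finally:
        server.shutdown()
        server.server_close()


print(probe_status())  # 501
```

The base handler looks for a do_GET method, finds none, and answers 501 Not Implemented, which is why the next step is to subclass it.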
The app.route method in Flask registers an endpoint. When an HTTP request comes in, Flask maps it to the associated function call. These routes are maintained in a global object so that the request handler can refer to them. For our ownflask, let's use global dictionaries.
Here I have two dictionaries: routes maps each path to its associated function, and route_methods maps each path to the HTTP methods it supports.
routes = {}
route_methods = {}

class Flask:
    ...
    def route(self, path, methods):
        def wrapper(f):
            routes[path] = f
            route_methods[path] = methods
            # Return f so the decorated name still refers to the function
            return f
        return wrapper
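To see what the decorator does at import time, here is a small self-contained sketch of the same registration idea. It repeats the route method so that it runs on its own; returning f from wrapper is an addition in this sketch so that the name hello stays bound to the function.

```python
routes = {}
route_methods = {}


class Flask:
    def __init__(self, name):
        self.name = name

    def route(self, path, methods):
        def wrapper(f):
            # Runs once, when the decorated function is defined
            routes[path] = f
            route_methods[path] = methods
            return f  # assumption in this sketch: hand the function back
        return wrapper


app = Flask(__name__)


@app.route("/", methods=["GET", "POST"])
def hello():
    return "hello"


print(routes["/"] is hello)  # True
print(route_methods["/"])    # ['GET', 'POST']
print(routes["/"]())         # hello
```

Both dictionaries are filled before app.run is ever called, so the request handler only has to look things up.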
When running our server, we have used a BaseHTTPRequestHandler. From the Python documentation, it is clear that we have to extend it to support handling requests.
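from http.server import SimpleHTTPRequestHandler

# Use this class as handler_class in Flask.run so the server dispatches to it
class RequestHandler(SimpleHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-type", "application/json")
        self.end_headers()
        self.wfile.write(str.encode("Handling GET"))

    def do_POST(self):
        pass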
The above snippet sends Handling GET as a response regardless of what the route function returns. Let's change that.
dir(self) shows that self.path holds the URL. By looking it up in the routes dict, we can call the respective function.
class RequestHandler(SimpleHTTPRequestHandler):
    def do_GET(self):
        resp = routes[self.path]()
        ...
        ...
        self.wfile.write(str.encode(resp))
Parameters can reach a route in two ways, as part of the path or as a query string:
/book/<int:id>
/book?id=10
The first one would require some form of regex in the routes and in the way we store them, so let's handle it later. For now, let's make hello world greet a name passed as http://127.0.0.1:8000?name=Joe
The current code fails with a KeyError since the query string is also part of self.path.
KeyError: '/?name=Joe'
To parse this and separate the URL path from the query params, we will use urllib.parse.
import urllib.parse as urlparse

class RequestHandler(SimpleHTTPRequestHandler):
    def do_GET(self):
        path = urlparse.urlparse(self.path).path
        qs = urlparse.parse_qs(urlparse.urlparse(self.path).query)
        resp = routes[path]()
        self.send_response(200)
        self.send_header("Content-type", "application/json")
        self.end_headers()
        self.wfile.write(str.encode(resp))
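The two urllib.parse calls can be tried on their own. A minimal sketch, where split_target is a hypothetical helper name and the inputs are the kind of strings that show up in self.path:

```python
from urllib.parse import parse_qs, urlparse


def split_target(raw_path):
    """Split a request target into (path, query params as a dict of lists)."""
    parsed = urlparse(raw_path)
    return parsed.path, parse_qs(parsed.query)


print(split_target("/?name=Joe"))         # ('/', {'name': ['Joe']})
print(split_target("/book?id=10&id=11"))  # ('/book', {'id': ['10', '11']})
print(split_target("/"))                  # ('/', {})
```

Note that parse_qs maps every key to a list, because a key may repeat in the query string. That is why a route has to index with [0] to get a single value.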
In Flask, the routing function can access the request params via the global request object. In our case, for the hello route to access query params, we need a means of passing them in.
class Request:
    def __init__(self, request, method):
        self.request = request
        self.method = method
        self.path = urlparse.urlparse(request.path).path
        self.qs = urlparse.parse_qs(urlparse.urlparse(request.path).query)
        self.headers = request.headers

class RequestHandler(SimpleHTTPRequestHandler):
    def do_GET(self):
        request = Request(self, "GET")
        resp = routes[request.path](request)
        self.send_response(200)
        self.send_header("Content-type", "application/json")
        self.end_headers()
        self.wfile.write(str.encode(resp))
With the current state, the call fails with hello() takes 0 positional arguments but 1 was given. Let's make hello accept the request.
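# methods is a required argument of our route(), so pass it explicitly
@app.route("/", methods=["GET"])
def hello(request):
    # Single quotes inside the f-string keep this valid before Python 3.12
    return f"hello {request.qs['name'][0]}"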
If we modify the hello endpoint to return a dict instead of a str, we will receive an error.
It happens because str.encode expects a str and we are handing it a dict. To fix this, we should first serialize the response dict to a JSON string and then encode it.
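import json  # at the top of ownflask.py

def do_GET(self):
    request = Request(self, "GET")
    resp = routes[request.path](request)
    # Serialize dict responses to a JSON string before encoding
    if isinstance(resp, dict):
        resp = json.dumps(resp)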
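The failure and the fix are easy to reproduce in isolation. A minimal sketch, where to_body is a hypothetical helper name that does the same conversion the handler does before writing to wfile:

```python
import json


def to_body(resp):
    """Turn a route's return value (str or dict) into bytes for the socket."""
    if isinstance(resp, dict):
        resp = json.dumps(resp)
    return str.encode(resp)


# Encoding a dict directly is what triggers the error
try:
    str.encode({"status": "success"})
except TypeError as err:
    print("TypeError:", err)

print(to_body("hello"))                # b'hello'
print(to_body({"status": "success"}))  # b'{"status": "success"}'
```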
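class Request:
    def __init__(self, request, method):
        ...
        ...
        # Read exactly content-length bytes of the body (0 if the header is absent)
        self.content_length = int(self.headers.get('content-length', 0))
        self.body = request.rfile.read(self.content_length)
        # Fall back to an empty dict when the body is not valid JSON
        try:
            self.json = json.loads(self.body)
        except json.decoder.JSONDecodeError:
            self.json = {}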
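The body handling can be exercised without a socket. This is a sketch under stated assumptions: read_json_body is a hypothetical helper, a plain dict stands in for the handler's headers object (the real one is case-insensitive, a dict is not), and io.BytesIO stands in for rfile.

```python
import io
import json


def read_json_body(headers, rfile):
    """Read content-length bytes from rfile and parse them as JSON."""
    length = int(headers.get("content-length", 0))
    body = rfile.read(length)
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        # Empty or non-JSON bodies fall back to an empty dict
        return {}


payload = b'{"task": "write blog"}'
headers = {"content-length": str(len(payload))}

print(read_json_body(headers, io.BytesIO(payload)))  # {'task': 'write blog'}
print(read_json_body({}, io.BytesIO(b"")))           # {}
print(read_json_body({"content-length": "8"}, io.BytesIO(b"not json")))  # {}
```

Reading exactly content-length bytes matters on a live connection: reading without a limit would block, waiting for the client to close the socket.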
@app.route("/todo", methods=["POST"])
def todo(request):
    return {"status": "success", "data": request.json}
Right now, if you hit /todo from the browser, you will get the response. This is wrong, since we have clearly defined that /todo only supports POST requests. This is where route_methods comes in really handy.
def do_GET(self):
    ...
    if "GET" not in route_methods[request.path]:
        # 405 Method Not Allowed (401 would mean "unauthorized")
        self.send_response(405)
        self.send_header("Content-type", "application/json")
        self.end_headers()
        self.wfile.write(str.encode(f"{request.path} {request.method} not supported"))
        return
Looks like we are repeating ourselves a lot, so let's move the response-writing code to a common function.
class RequestHandler(SimpleHTTPRequestHandler):
    ...
    ...
    def write_response(self, response, status_code):
        self.send_response(status_code)
        self.send_header("Content-type", "application/json")
        self.end_headers()
        if isinstance(response, dict):
            response = json.dumps(response)
        self.wfile.write(str.encode(response))
The final do_GET and do_POST methods look like this.
def do_GET(self):
    request = Request(self, "GET")
    if "GET" not in route_methods[request.path]:
        # 405 Method Not Allowed
        self.write_response("Method not supported", 405)
        return
    resp = routes[request.path](request)
    self.write_response(resp, 200)

def do_POST(self):
    request = Request(self, "POST")
    if "POST" not in route_methods[request.path]:
        self.write_response("Method not supported", 405)
        return
    resp = routes[request.path](request)
    self.write_response(resp, 200)
# Shared helpers on RequestHandler: unknown paths get a 404, and
# do_GET / do_POST both delegate to process_request
def not_found(self, request):
    return self.write_response(f"{request.path} 404 NOT FOUND", 404)

def method_not_supported(self, request):
    # 405 Method Not Allowed
    return self.write_response(f"{request.path} {request.method} not supported", 405)

def process_request(self, request):
    if request.path not in routes:
        return self.not_found(request)
    # Reject the request when its method is NOT registered for this path
    if request.method not in route_methods[request.path]:
        return self.method_not_supported(request)
    resp = routes[request.path](request)
    self.write_response(resp, 200)

def do_GET(self):
    request = Request(self, method='GET')
    return self.process_request(request)

def do_POST(self):
    request = Request(self, method='POST')
    return self.process_request(request)
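The order of checks in process_request (unknown path first, then unsupported method, then the route call) can be tried without a server. A minimal sketch, assuming a hypothetical dispatch helper that returns a (status, body) tuple instead of writing to the socket, and using 405 Method Not Allowed for the unsupported-method case:

```python
routes = {}
route_methods = {}


def dispatch(path, method, request=None):
    """Return (status_code, body), following the same order of checks."""
    if path not in routes:
        return 404, f"{path} 404 NOT FOUND"
    if method not in route_methods[path]:
        return 405, f"{path} {method} not supported"
    return 200, routes[path](request)


# Register one POST-only route by hand, as the route decorator would
routes["/todo"] = lambda request: {"status": "success"}
route_methods["/todo"] = ["POST"]

print(dispatch("/todo", "POST"))   # (200, {'status': 'success'})
print(dispatch("/todo", "GET"))    # (405, '/todo GET not supported')
print(dispatch("/missing", "GET")) # (404, '/missing 404 NOT FOUND')
```

Checking the path before the method matters: looking up route_methods for an unregistered path would raise a KeyError.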
At this point, if you write a small multithreading script and hit our server, it will hang, because HTTPServer handles one request at a time and is not designed to serve multiple requests concurrently. Let's replace it with ThreadingHTTPServer.
from http.server import ThreadingHTTPServer

class Flask:
    ...
    def run(self, name):
        ...
        ...
        # ThreadingHTTPServer takes the same (address, handler) arguments as HTTPServer
        self.server = ThreadingHTTPServer((self.host, self.port), RequestHandler)
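One way to check that requests really are handled concurrently is a barrier that only opens when two requests are inside the handler at the same time. This is a sketch, not part of ownflask: BarrierHandler, fetch and run_demo are hypothetical names, and the 5 second timeout is an arbitrary choice.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Opens only when two handler calls are waiting on it simultaneously
barrier = threading.Barrier(2, timeout=5)


class BarrierHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        try:
            barrier.wait()
            status = 200
        except threading.BrokenBarrierError:
            status = 503  # the second request never arrived in time
        self.send_response(status)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo output clean


def fetch(port, results, index):
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=10)
    conn.request("GET", "/")
    results[index] = conn.getresponse().status
    conn.close()


def run_demo(server_class=ThreadingHTTPServer):
    server = server_class(("127.0.0.1", 0), BarrierHandler)  # port 0: OS picks one
    threading.Thread(target=server.serve_forever, daemon=True).start()
    results = [None, None]
    clients = [
        threading.Thread(target=fetch, args=(server.server_port, results, i))
        for i in range(2)
    ]
    for client in clients:
        client.start()
    for client in clients:
        client.join()
    server.shutdown()
    server.server_close()
    return results


print(run_demo())  # [200, 200]
```

With the plain HTTPServer passed as server_class, the first request would wait alone at the barrier until the timeout, so both requests should come back as 503 instead.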
At this point, I was happy with what I had accomplished and had already posted about it, and I was nudged in the direction of exploring WSGIServer.
from wsgiref.simple_server import WSGIServer

class Flask:
    ...
    def run(self, name):
        ...
        ...
        self.server = WSGIServer((self.host, self.port), HttpReqHandler)
How is WSGIServer different from HTTPServer, when the interfaces look the same?
How can we plug ownflask in to work with Gunicorn?
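As a starting point for both questions, here is a rough sketch of what a WSGI entry point could look like. It is an assumption about a possible direction, not how ownflask is built: wsgi_app and call_app are hypothetical names, the route functions take the WSGI environ dict instead of our Request object, and query strings, methods and JSON are left out. WSGI servers do not call do_GET or do_POST. They call a single callable with (environ, start_response).

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical route table: path -> function taking the WSGI environ
routes = {"/": lambda environ: "hello"}


def wsgi_app(environ, start_response):
    """A WSGI callable: the server passes in the request as a dict."""
    path = environ.get("PATH_INFO", "/")
    if path not in routes:
        body = f"{path} 404 NOT FOUND".encode()
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [body]
    body = routes[path](environ).encode()
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_app(path):
    """Call the app directly, the way a WSGI server would, without a socket."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the other required keys
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(wsgi_app(environ, start_response))
    return captured["status"], body


print(call_app("/"))      # ('200 OK', b'hello')
print(call_app("/nope"))  # ('404 Not Found', b'/nope 404 NOT FOUND')
```

A callable of this shape is what wsgiref.simple_server.make_server accepts as its app argument, and it is also what Gunicorn expects when pointed at a module:variable pair such as demo:wsgi_app (assuming the callable lives in demo.py).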