PDF processing API using Flask + Heroku + Gunicorn
Hello! While working on a personal project, I needed to process PDF files with Python in the cloud. To do this, I learned to use Flask, Gunicorn and Heroku. In this post I will show how I did it.
Heroku is a platform-as-a-service (PaaS) that enables developers to build, run, and operate applications entirely in the cloud.
Gunicorn is a Python WSGI HTTP server for UNIX. It is compatible with many web frameworks, easy to deploy, light on server resources, and quite fast.
Flask is a minimalist framework written in Python that allows you to create web applications quickly and with a minimum number of lines of code.
Creating the server locally
First of all, we will create and test the server locally.
To do this, create your working folder, enter it and then create a Python virtual environment:
python -m venv env
Then activate it. On Windows:
.\env\Scripts\activate
On Linux or macOS:
source env/bin/activate
.gitignore
This file keeps the external dependencies in the virtual environment from being uploaded to Heroku, since they will be installed automatically from requirements.txt. Create a file called .gitignore that contains only the line env/. You can do this easily with the command echo env/ > .gitignore on both UNIX and Windows.
server.py
Now it’s time to create our web application. In my case, I’ll call the file server.py. I’ll give the basic code to create an application that processes PDF files, but your application can be anything you want.
To start working with Flask and Gunicorn, install them with pip install flask flask_cors gunicorn. If you are going to follow this example and edit PDFs, also install PyPDF2 with pip install PyPDF2. To use it, look at the following code:
To test the server locally, start it with python .\server.py (or python server.py on Linux and macOS). It will be available at http://localhost:5000.
Procfile
To deploy the server on Heroku using Gunicorn we will need a file called Procfile, which in our case will contain the following:
web: gunicorn server:app
Here server is the name of the Python file (without the .py extension) and app is the name of the Flask instance inside it.
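To make the server:app notation more concrete, here is a rough sketch of how a string of the form module:attribute can be turned into a Python object. It is only an illustration and not Gunicorn's actual loading code. The resolve_target helper is a hypothetical name, and json:dumps stands in for server:app so that the snippet runs on its own.

```python
import importlib


def resolve_target(target):
    """Split 'module:attribute' and return the named object from that module."""
    module_name, _, attribute = target.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# With server.py on the import path, resolve_target("server:app") would return
# the Flask instance. A standard-library target keeps this example self-contained.
dumps = resolve_target("json:dumps")
print(dumps({"ok": True}))  # prints {"ok": true}
```

This is why the names in the Procfile must match your file and variable names exactly: a typo in either half means the module or the attribute cannot be found.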
requirements.txt
This is the last file we need. It lists all the dependencies of the project, and Heroku needs it in order to install them. To generate it, run pip freeze > requirements.txt with the virtual environment activated, so that only the packages of this project are included.
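To get a feel for what ends up in requirements.txt, the sketch below prints the installed packages as name==version lines using only the standard library (Python 3.8 or later). It only approximates the output of pip freeze and is not how pip implements it. The pinned_requirements helper is a hypothetical name.

```python
from importlib.metadata import distributions


def pinned_requirements():
    """Return sorted 'name==version' lines for every installed distribution."""
    lines = set()
    for dist in distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries with broken metadata
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


for line in pinned_requirements():
    print(line)
```

Run inside the activated virtual environment, this should list Flask, Gunicorn and the other packages installed earlier, each pinned to the version you tested with.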
Heroku
To deploy to Heroku easily, install the Heroku CLI tool. Once it is installed and you have logged in (with heroku login), go to the working folder that contains requirements.txt, server.py and Procfile, and run heroku create to create the application.
Now it’s time to create a git repository with our code and finally deploy it.
git init
heroku git:remote -a <your app name, for example shrouded-shelf-55195 >
git add .
git commit -am "make it better"
git push heroku master
If everything went well, the last command will deploy your application to Heroku! 😀 (If your local branch is named main rather than master, run git push heroku main instead.)