Deploying NLP Model on Heroku using Flask, NLTK, and Git-LFS

Pulkit Rathi
Published in Analytics Vidhya
6 min read · Nov 28, 2020

Flask app + Git-LFS + Heroku

I was trying to deploy my NLP model for IMDb reviews on Heroku, ran into a lot of issues, and had to look up a separate resource for each error.

This article is a quick guide to debugging the errors you might encounter while deploying your NLP model on Heroku. So, if you hit any of the following issues, just follow the steps and you’ll be good to go.

A) Pre-requisites:

✔️ Working Flask App: You should have a working Flask app. You can check that it works by running it locally (127.0.0.1).

✔️ Heroku CLI: You need the Heroku CLI to see the errors. Download the setup from here and install it. Open cmd and enter heroku login, which redirects you to a web page. After logging in, enter the command below in cmd to see the logs.

heroku logs --tail --app <your_heroku_app_name>

✔️ Git-LFS (Optional): Use Git-LFS only when you have files larger than 100MB. I used it in my project because my word2vec pickle file was larger than 100MB. Download the setup from here. Open Git Bash in your local repo directory and follow the steps below. This example is for .pickle files, but you can do the same for any file or file type.

git lfs install
# git lfs track creates (or updates) the .gitattributes file
git lfs track "*.pickle"
git add .
git commit -m "Add lfs support for .pickle files"
git push origin master

B) Errors & Solutions:

Note: You can see all these errors in the logs in Heroku CLI.

1. at=error code=H14 desc="No web processes running" method=GET path="/": This error can occur when there is no Procfile in your repo, when the Procfile was saved with an extension such as "Procfile.txt", when its contents are wrong (for example, the wrong Python file name for the Flask application), or when no dynos are assigned to run the app.

H14 Error

Solution:

  • Check that your remote repo has a Procfile with no extension.
  • Check that the Procfile has the correct contents. In the example below, api.py is the name of the Python file that contains the Flask application, and app is the name of the application object (app = Flask(__name__) in api.py).
web: gunicorn api:app
  • Run the command below in cmd to assign a dyno to your app.
heroku ps:scale web=1 --app <your_heroku_app_name>
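
The first two checks above can also be run locally before pushing. The sketch below is only an illustration: check_procfile is a hypothetical helper name, and it assumes the Procfile lives in the repo root and that the web process is declared on a line starting with web:.

```python
from pathlib import Path


def check_procfile(repo_dir="."):
    """Return a list of problems found with the Procfile in repo_dir."""
    repo = Path(repo_dir)
    problems = []
    procfile = repo / "Procfile"

    if not procfile.is_file():
        # Look for files such as Procfile.txt that were saved with an extension.
        lookalikes = sorted(p.name for p in repo.glob("Procfile.*") if p.is_file())
        if lookalikes:
            problems.append(
                f"Found {lookalikes}, rename to 'Procfile' with no extension"
            )
        else:
            problems.append("No Procfile found in the repo root")
        return problems

    # Collect the lines that declare the web process type.
    web_lines = [
        line.strip()
        for line in procfile.read_text().splitlines()
        if line.strip().startswith("web:")
    ]
    if not web_lines:
        problems.append("Procfile has no 'web:' process type")
    return problems
```

Calling check_procfile() from the repo root returns an empty list when nothing looks wrong. It does not check dynos, which still have to be scaled with the heroku ps:scale command.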

2. at=error code=H10 desc="App crashed" method=GET path="/": H10 means the web process crashed. One cause is specifying the wrong app name in the Procfile.

H10 Error

Solution:

  • Check that the name of the Flask object in your project and the app name in the Procfile are the same.
#for e.g. if you have -> app = Flask(__name__) with api.py as main Flask file then Procfile should look like:
web: gunicorn api:app
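
In api:app, the part before the colon is the Python module and the part after it is the variable holding the Flask object. Here is a rough way to test such a string locally. It only imitates the lookup with importlib and is not gunicorn's own code, and resolve_app is a hypothetical helper name.

```python
import importlib


def resolve_app(target):
    """Resolve a 'module:attribute' string such as 'api:app'."""
    module_name, _, attr_name = target.partition(":")
    if not module_name or not attr_name:
        raise ValueError(f"expected 'module:attribute', got {target!r}")
    # Raises ImportError if there is no importable module with that name.
    module = importlib.import_module(module_name)
    # Raises AttributeError if the module does not define that variable.
    return getattr(module, attr_name)
```

Running resolve_app("api:app") from the repo root should return your Flask object. An ImportError points to a wrong file name in the Procfile, and an AttributeError points to a wrong object name.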

3. NLTK LookupError: This error occurs when you forget to add the nltk.txt file to the repo. This file lists the names of the NLTK resources that Heroku should download for the app to work properly.

NLTK LookupError

You can also check your Heroku deploy logs for this.

Solution:

  • Add nltk.txt to the root directory of your repo, alongside the Flask application, and list in it the NLTK resources your app uses.
nltk.txt
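
As far as I understand the format, nltk.txt is a plain text file with one resource name per line. The sketch below writes such a file. write_nltk_txt is a hypothetical helper, and the resource names in the usage note are only examples, so list whatever your own app loads.

```python
from pathlib import Path


def write_nltk_txt(resources, path="nltk.txt"):
    """Write one NLTK resource name per line, skipping blanks and duplicates."""
    names = []
    for name in resources:
        name = name.strip()
        if name and name not in names:
            names.append(name)
    Path(path).write_text("\n".join(names) + "\n")
    return names
```

For example, write_nltk_txt(["stopwords", "punkt"]) produces a two-line file, assuming those are the resources your app needs.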

4. Git-LFS Error: I was using Git-LFS to push .pickle files to my remote repo. But, the sad part is that there is no native support for Git-LFS in Heroku. I got the following error because of it:

_pickle.UnpicklingError: invalid load key, 'v'.

UnpicklingError
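
The 'v' in this message is a useful clue. When LFS is not set up on the deploy side, the repo contains a small text pointer in place of the real file, and a Git LFS pointer begins with the word version. The sketch below checks whether a file is such a pointer. is_lfs_pointer is a hypothetical helper name, and that this was the cause in my case is a likely explanation, not something the logs state.

```python
# First bytes of every Git LFS pointer file (spec v1).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at path looks like a Git LFS pointer file."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX
```

If is_lfs_pointer returns True for your .pickle file on the server, the real file was never downloaded, and pickle.load fails on the first byte, which is the letter v.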

Solution:

You need to do 3 things to integrate Git-LFS with Heroku:

i) Create a “Personal Access Token” for your GitHub account. Go to your GitHub profile ➜ Settings ➜ Developer Settings ➜ Personal access tokens ➜ Generate new token. Save this token somewhere safe.

ii) Add the Heroku buildpack for Git-LFS: You can add the required buildpack using either the Heroku CLI or the Heroku dashboard.

For CLI method: run the below command in cmd.

heroku buildpacks:add \
https://github.com/raxod502/heroku-buildpack-git-lfs \
-a <your_heroku_app_name>

For Heroku dashboard: Go to Settings in your Heroku dashboard ➜ Buildpacks ➜ Add buildpack ➜ Enter “https://github.com/raxod502/heroku-buildpack-git-lfs” in URL field ➜ Save Changes.

You can now see the buildpack added to the list of buildpacks.

Heroku Settings: Buildpacks

iii) Add a config variable to your Heroku app. This step can also be done with either the Heroku CLI or the Heroku dashboard. The key is HEROKU_BUILDPACK_GIT_LFS_REPO and the value is the URL of the GitHub remote repo from which you want to download Git LFS assets. See here for details on the syntax. The URL should look like this:

https://<token_generated_in_first_step>@github.com/<user_name>/<remote_repo_name>.git
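
A stray space or special character in this URL is easy to miss. The sketch below assembles the value from its three parts. lfs_repo_url is a hypothetical helper name, and the percent-encoding is a precaution I added, not something the buildpack requires.

```python
from urllib.parse import quote


def lfs_repo_url(token, user, repo):
    """Build the HEROKU_BUILDPACK_GIT_LFS_REPO value from its three parts."""
    # quote(..., safe="") percent-encodes anything that is not URL-safe.
    token, user, repo = (quote(part.strip(), safe="") for part in (token, user, repo))
    return f"https://{token}@github.com/{user}/{repo}.git"
```

Pass the token from the first step, your user name, and the repo name. Avoid committing the token itself to the repo.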

For CLI method: run the below command in cmd.

heroku config:set HEROKU_BUILDPACK_GIT_LFS_REPO=<URL_stated_above> --app <your_heroku_app_name>

For Heroku dashboard: Go to Settings in your Heroku dashboard ➜ Config Vars ➜ Reveal Config Vars ➜ Enter HEROKU_BUILDPACK_GIT_LFS_REPO in key field and the URL in value field ➜ Click Add.

You can now see the config var you just added.

Heroku Settings: Config Vars

Note: A free GitHub account gives you only 1GB/month of bandwidth for Git-LFS files. Every time you deploy your app on Heroku with LFS assets, the download counts against your quota. So use it wisely, or Git-LFS will be disabled for your account until you upgrade your data plan. I learned it the hard way 😬😅

That’s all folks! Do comment and share if you like the article.

Happy Deploying !!

And don’t forget to give your 👏 !

Pulkit Rathi
Analytics Vidhya

Data Science & Machine Learning Enthusiast | Data Engineer at Dell Technologies | Always Curious