Using Python and the mindee API to automate expense reports
It used to be that employees had to file expense reports by hand, or at least draft them on a spreadsheet. Then computers came along, with built-in spreadsheets that made it easier to generate and track reports. But what if you could automate the data entry itself and generate expense reports directly from images of receipts?
In this article, I’ll show you the steps to automate expense reports using Python.
The steps to automate expense reports will be:
- Create an Account on the mindee Platform
- Arrange an API Key
- Install the “mindee” Package
- Import Dependencies
- Write Helper Functions
- Load, Parse and Extract the Data from the Expense Receipts
- Export Results to a Table
- Save Table to .csv File
Let’s get began!
1. Create an Account on the mindee Platform
For this automation, to avoid having to write custom code for detecting the text in the images of the receipts, we’ll use a Python package called mindee, which comes with an API that lets you do all of that with just a few lines of code.
Although the professional version of this package is paid, they offer 250 pages a month for free, which for individuals should be more than enough to automate their personal expense reports.
To create the account do the following:
- Head over to the mindee platform website
- Sign up
2. Arrange an API Key
To set up your API key do the following:
- Click on “Create a new API”
- Select the “Expense Receipt” API
- Copy your API key and save it.
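One way to keep the saved key out of your source code is to read it from an environment variable. The sketch below assumes a variable named MINDEE_API_KEY, which is a name I picked for illustration and not something the mindee platform requires.

```python
import os


def load_api_key(env_var: str = "MINDEE_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The default variable name is a hypothetical choice for this example.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"No API key found. Set the {env_var} environment variable first."
        )
    return api_key
```

The value returned by load_api_key() can then be passed wherever the key is needed, instead of pasting the key into the script.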
3. Install the ‘mindee’ Package
To install the mindee package run:
pip install mindee
4. Import Dependencies
For this project we will be using the following packages:
mindee
pandas
random
glob
matplotlib
seaborn
If you don't have them in your local environment, install them with pip install <package>. Note that random and glob ship with Python's standard library, so they need no installation.
Now we can import our dependencies:
from mindee import Client, documents
import random
import pandas as pd
import glob
# Sanity check using pandas and matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
5. Write Helper Functions
For this automation, we’ll need 3 helper functions: one for extracting the expense data after getting the response from the mindee API, and another for converting time to meal type (in this example my expense report requires me to explicitly state the meal type, like lunch or dinner, for food expenses). Finally, a third function creates the final table with all of our data.
The code will be:
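A minimal sketch of what these three helpers could look like is given here, under a few assumptions of mine: the parsed receipt exposes date, total_amount, category and time fields that each carry a .value, plus a filename attribute; anything before 18:00 counts as lunch; and the column names of the final table are purely illustrative. The small test at the end draws random times of day, which is also an assumption about how the helper was checked.

```python
import random
from typing import Any, Tuple


def _value_of(field: Any) -> Any:
    """Return field.value, or None when the field is missing."""
    return getattr(field, "value", None)


def extract_expenses_data(expense_data: Any) -> Tuple[Any, Any, Any, Any, Any]:
    """Pull date, amount, filename, category and time out of a parsed receipt.

    Assumption: the attribute names below match the parsed receipt object.
    """
    date = _value_of(getattr(expense_data, "date", None))
    amount = _value_of(getattr(expense_data, "total_amount", None))
    category = _value_of(getattr(expense_data, "category", None))
    time = _value_of(getattr(expense_data, "time", None))
    filename = getattr(expense_data, "filename", None)
    return date, amount, filename, category, time


def convert_time_to_meal_type(time: str, dinner_starts_at: int = 18) -> str:
    """Map an "HH:MM" string to "Lunch" or "Dinner".

    Assumption: any hour before `dinner_starts_at` counts as lunch.
    """
    hour = int(time.split(":")[0])
    return "Lunch" if hour < dinner_starts_at else "Dinner"


def create_table(date_list, amount_list, category_list,
                 time_list, meal_type_list, filenames_list):
    """Build the final table as a pandas DataFrame (illustrative column names)."""
    import pandas as pd  # imported lazily so the other helpers work without pandas

    return pd.DataFrame({
        "Date": date_list,
        "Amount": amount_list,
        "Category": category_list,
        "Time": time_list,
        "Meal Type": meal_type_list,
        "Filename": filenames_list,
    })


# Quick test: ten random times of day and the meal type each one maps to
times = [f"{random.randint(10, 22)}:{random.randint(0, 59):02d}" for _ in range(10)]
print(times)
for t in times:
    print(t, convert_time_to_meal_type(t))
```

The 18:00 cutoff is only a default; passing a different dinner_starts_at adapts the helper to other reporting rules.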
# Output of the tests written above
['13:51', '11:49', '22:13', '19:57', '10:32', '20:47', '20:40', '14:27', '14:41', '15:06']
13:51 Lunch
11:49 Lunch
22:13 Dinner
19:57 Dinner
10:32 Lunch
20:47 Dinner
20:40 Dinner
14:27 Lunch
14:41 Lunch
15:06 Lunch
6. Load, Parse and Extract the Data from the Expense Receipts
Now, all we have to do is:
1. Instantiate our mindee client using the API key we obtained
# Instantiate a new client
mindee_client = Client(api_key="Your API KEY")
2. Initialize some empty lists that will contain the extracted data
date_list = []
amount_list = []
category_list = []
time_list = []
meal_type_list = []
filenames_list = []
3. Load the image of an expense receipt and feed it to the mindee API
image = "./expense_images/1669895159779.jpg"
input_doc = mindee_client.doc_from_path(image)
api_response = input_doc.parse(documents.TypeReceiptV4)
expense_data = api_response.document
expense_data
<mindee.documents.receipt.receipt_v4.ReceiptV4 at 0x7f9685b278b0>
The output will be a mindee object that’s tailored for expense receipts (there are probably multiple options, so feel free to investigate that in the official documentation on the mindee platform).
4. Extract the expenses information from the API response
date, amount, filename, category, time = extract_expenses_data(expense_data)
5. Convert the time of day information into relevant meal type information
This example is very specific to my particular case, so you might change this function according to the types of expenses you have. But here, what I’m doing is transforming a string like 13:30 to lunch and a string like 20:30 to dinner.
if not time:
    meal_type = "Unknown"
else:
    meal_type = convert_time_to_meal_type(time)
6. Append the extracted information to their corresponding lists
In this case, I’m only doing this for a single receipt, but when doing it for multiple receipts the list approach will make more sense.
date_list.append(date)
# I'm replacing the . with , here because the final report goes on
# a google sheet which takes `,` instead of `.` for float numbers.
amount_list.append(str(amount).replace(".", ","))
category_list.append(category)
time_list.append(time)
meal_type_list.append(meal_type)
filenames_list.append(filename)
Now that we know each step in detail, the complete source code for running this on multiple expense receipts:
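The steps above could be combined into a loop over a folder of receipts, ending with the table and the .csv export, roughly as sketched here. Several things are assumptions of mine: the function names, the column names, the output file name expense_report.csv, the 18:00 lunch/dinner cutoff, and the receipt attributes (date, total_amount, category, time, filename) read from the mindee response. The parser is passed in as a function so the loop can be tried without calling the API.

```python
import glob
import os
from typing import Callable, Dict, Iterable, List, Tuple


def convert_time_to_meal_type(time: str) -> str:
    """Compact stand-in for the helper from step 5 (18:00 cutoff is an assumption)."""
    return "Lunch" if int(time.split(":")[0]) < 18 else "Dinner"


def find_receipts(folder: str, pattern: str = "*.jpg") -> List[str]:
    """Return the sorted paths of all receipt images in a folder."""
    return sorted(glob.glob(os.path.join(folder, pattern)))


def make_mindee_parser(api_key: str) -> Callable[[str], Tuple]:
    """Build a function that parses one image with the mindee receipt API."""
    from mindee import Client, documents  # lazy import; needs `pip install mindee`

    client = Client(api_key=api_key)

    def parse_receipt(image_path: str) -> Tuple:
        receipt = client.doc_from_path(image_path).parse(documents.TypeReceiptV4).document
        # Attribute names below are assumptions about the parsed receipt object
        return (receipt.date.value, receipt.total_amount.value, receipt.filename,
                receipt.category.value, receipt.time.value)

    return parse_receipt


def collect_expenses(image_paths: Iterable[str],
                     parse_receipt: Callable[[str], Tuple]) -> Dict[str, list]:
    """Run every receipt through the parser and gather the results column by column."""
    columns: Dict[str, list] = {
        "date": [], "amount": [], "category": [],
        "time": [], "meal_type": [], "filename": [],
    }
    for path in image_paths:
        date, amount, filename, category, time = parse_receipt(path)
        meal_type = convert_time_to_meal_type(time) if time else "Unknown"
        columns["date"].append(date)
        # Decimal comma, because the final report goes on a google sheet
        columns["amount"].append(str(amount).replace(".", ","))
        columns["category"].append(category)
        columns["time"].append(time)
        columns["meal_type"].append(meal_type)
        columns["filename"].append(filename)
    return columns


def save_report(columns: Dict[str, list], csv_path: str = "expense_report.csv"):
    """Export the results to a table and save that table to a .csv file."""
    import pandas as pd  # lazy import so the loop above works without pandas

    table = pd.DataFrame(columns)
    table.to_csv(csv_path, index=False)
    return table


# Example wiring (not executed here):
# paths = find_receipts("./expense_images")
# columns = collect_expenses(paths, make_mindee_parser("Your API KEY"))
# table = save_report(columns)
```

Keeping the parser separate from the loop also makes it easy to swap in a fake parser while testing, so no free-tier pages are spent on trial runs.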
There you have it! You automated the boring task of reporting your expenses from images of receipts! As a final check, it’s always good to take a look at the final results to make sure the information you’re getting is consistent with the actual data in the expense receipts.
For that, we can use matplotlib to visualize each receipt side by side with the text data extracted from it.
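A rough sketch of such a side-by-side check, assuming each receipt's extracted data is available as a dictionary of field names and values. The function names and the figure layout are my own choices for illustration.

```python
from typing import Dict, List


def format_receipt_caption(row: Dict[str, object]) -> str:
    """Turn one receipt's extracted fields into a multi-line caption."""
    return "\n".join(f"{key}: {value}" for key, value in row.items())


def show_receipts_with_data(image_paths: List[str],
                            rows: List[Dict[str, object]]) -> None:
    """Plot each receipt image on the left and its extracted fields on the right."""
    import matplotlib.pyplot as plt  # lazy import; the caption helper needs no plotting

    if not image_paths:
        return  # nothing to plot

    # squeeze=False keeps `axes` two-dimensional even for a single receipt
    fig, axes = plt.subplots(len(image_paths), 2,
                             figsize=(10, 5 * len(image_paths)), squeeze=False)
    for (image_ax, text_ax), path, row in zip(axes, image_paths, rows):
        image_ax.imshow(plt.imread(path))
        image_ax.axis("off")
        text_ax.text(0.05, 0.5, format_receipt_caption(row),
                     fontsize=14, va="center", transform=text_ax.transAxes)
        text_ax.axis("off")
    plt.tight_layout()
    plt.show()
```

Each row of the figure then pairs one receipt with what was read from it, which makes mismatches easy to spot by eye.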
...
...
...
I'm showing just a couple of images with limited information for privacy reasons, but the overall idea is here.
Okay, the results seem consistent! There we have it, a neat automation to save you some time every month!