AI for productivity
Summarize your Emails with GPT4
Gmail API + OpenAI GPT 4 → better email
I read a lot of newsletters; the one I keep up with the most (and actually open) is TLDR.
I find it tedious to open up each individually; I prefer to have them all on one page.
For example, on my website, I have the Hackernews feed fetched from their API, displayed minimally with just the hyperlinked title and a speech balloon emoji that directed me to the comments.
All in one page. Simple and clean.
On a whim, I coded it. Here’s what the email looks like.
Since you clicked on this article, I’m assuming you want to also “summarize” your emails.
This article is a tutorial for you to recreate what I did.
I’ll show you how to read emails from specific Gmail senders, pass it to OpenAI GPT 4 with your custom prompt, and send the response back to your email.
So fire up your IDE (I recommend Cursor) and follow along.
All the code is in this Deepnote notebook.
Setup: Prerequisites
Here are some prerequisites
- Python 3.10 or higher
- Gmail account
- Google Cloud account with Gmail API enabled
- OpenAI API key
Once you have these, clone my GitHub
git clone https://github.com/benthecoder/gmail_llm.git
cd gmail_llm
Install some Python packages
pip install -r requirements.txt
Set up Google API credentials. This way, your code has access to your account.
There are many ways; this is one of them:
- Create a new OAuth 2.0 Client ID (instructions)
- Download the JSON file and rename it to
credentials.json
- Put
credentials.json
in the project directory.
Rename .env.local
to .env
and set your key
OPENAI_API_KEY=<YOUR_API_KEY>
Now, you should be good to go!
Configurations
I’m using the GPT 4 Turbo model with 128k context length, so it can handle more emails; feel free to switch this out.
You can also change the prompt to your liking. Look for system_message
in the function summarize_email
You can now run python main.py
It should generate a token file in the background and prompt you with a few questions.
Here’s a sample output.
This code should also be generalizable to other use cases you have. If you want to tweak it or are curious about how it works, keep reading 👇.
Gmail stuff
Creating the Gmail client
Here, the credential file is read to create a token file, which expires after a day. So, remember to refresh this token when it expires.
With a Gmail client, you can perform Gmail operations.
Filtering emails
You can build a query to filter your emails based on sender and dates. More options are in the docs.
Fetching the emails
Emails are fetched with the users.messages.list
API.
Parsing the emails
We only want the text from our emails, so it checks for the text/plain
mimeType
, and encode it in ASCII format so it’s readable.
Here’s what it looks like parsing the emails from TLDR.
And the body of one of the emails.
Sending emails
AI stuff
GPT 4 Turbo (JSON Output)
Some things to note here
- I’m joining all the emails into one big string. There could be better ways to first format the data, maybe in a database.
- It does a retry on fail using tenacity and stops after three attempts.
- It checks for token length using tiktoken
- Temperature is set to 0.9 arbitrarily; play around with this value and the prompt.
- It enforces JSON output with
response_format={“type”: “json_object”}
Output format
If OpenAI is behaving nicely and I get a JSON output from it, I convert it into a markdown format. You could prompt OpenAI to give this to you directly, but having it output JSON first is more reliable. Plus, it lets you utilize the data for other purposes if you decide to.
Next steps
This is an MVP, and there are a lot of improvements that can be made to this project.
Here are a few I can think of now:
- Improve Link Extraction: Develop methods for accurately parsing and extracting links from newsletters.
- Zapier Integration for Automation: Set up a workflow in Zapier to automate the execution of your Python script, processing emails, and sending summaries.
- Database Integration: Store extracted links and relevant information in a database for better organization and long-term data management.
- Monitoring and Logging: Implement systems to track the performance of your script and log any errors or issues for easier troubleshooting.
- Token Count Optimization: Research and apply techniques to optimize the use of tokens in GPT-4 requests to reduce costs and improve efficiency.
- Experiment with Output Formats: Explore different formats for presenting the extracted links, such as article-style summaries or other creative formats.
Thanks for reading!
Be sure to follow the bitgrit Data Science Publication to keep updated!
Want to discuss the latest developments in Data Science and AI with other data scientists? Join our Discord server!
Follow Bitgrit below to stay updated on workshops and upcoming competitions!
Discord | Website | Twitter | LinkedIn | Instagram | Facebook | YouTube