This post is part of my series, 12 Years of Gmail, which looks at the data Google has accumulated on me over the past 12 years of using various Google services and documents the learning experience of developing an open source Python project (Takeout Inspector) to analyze that data.
Jumping back into Python has been just as fun as my first experiences with it. After brushing off some of the dust, I have managed to put together a (very) small package that does a few basic things with a Google Takeout Mail (mbox) file:
- Parses and standardizes the format of email addresses;
- Imports key message data into an SQLite database;
- Produces simple graphs of top recipients and senders.
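To make the first two steps concrete, here is a minimal sketch of how they could look. This is not Takeout Inspector's actual code: the function names `normalize_addresses` and `import_message`, the `messages` table, and its two columns are all hypothetical, and lowercasing the whole address is a simplifying assumption. Only the standard library's `email.utils.getaddresses` and `sqlite3` are used.

```python
import sqlite3
from email.utils import getaddresses


def normalize_addresses(header_value):
    """Return the lowercase, de-duplicated addresses found in one header value.

    Hypothetical helper: lowercasing the whole address is an assumption made
    here so that "Jane@Example.com" and "jane@example.com" count as one person.
    """
    seen = []
    for _name, address in getaddresses([header_value]):
        address = address.strip().lower()
        if address and address not in seen:
            seen.append(address)
    return seen


def import_message(conn, sender, recipients):
    """Store one row per (sender, recipient) pair in the hypothetical table."""
    rows = [
        (s, r)
        for s in normalize_addresses(sender)
        for r in normalize_addresses(recipients)
    ]
    conn.executemany(
        "INSERT INTO messages (sender, recipient) VALUES (?, ?)", rows
    )


# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (sender TEXT, recipient TEXT)")

import_message(
    conn,
    "Jane Doe <Jane@Example.com>",
    "bob@example.com, Al <AL@example.com>",
)

# Recipient counts, most frequent first: the raw data behind a "top recipients" graph.
top = conn.execute(
    "SELECT recipient, COUNT(*) FROM messages "
    "GROUP BY recipient ORDER BY COUNT(*) DESC, recipient"
).fetchall()
```

In a real run, the `sender` and `recipients` strings would come from the `From` and `To` headers of each message in the mbox file, and the grouped counts in `top` could be handed to whatever plotting library produces the graphs.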
Parsing Email Addresses
    import mailbox

    email = mailbox.mbox('/path/to/email.mbox')

    # The number of emails in the mbox file.
    print len(email.keys())
    114407

    # The "Delivered-To" header of the first email.
    print email …