Examining the Remnants of a Small DDoS Attack

Posted on 03 December 2016 in Technology • Tagged with apache, botnets, ddos, drupal, ip, logs, sqlite

On Sunday (27 November 2016) a small website that I advise on was the victim of a DDoS attack that managed to knock the site offline. I received notice on Monday that the website was not working. I was able to ssh to the web server and quickly found that the database service was stopped. After a brief examination of the database logs (nothing too out of the ordinary), I started the service back up and sure enough the website came back online. As the website runs on Drupal, I logged in to take a peek at the Recent log messages and found hundreds of records of login attempts from a lot of different IP addresses. User accounts on the website are only used by administrators to update content, so it was clear that the site was hit by a DDoS attack!

After getting things back online, I poked around the various log files to try to get an idea of what happened. The Drupal watchdog logs seemed to indicate that the attack started around 15:22 EST and overloaded the server's memory around 15:42 EST. The Apache server's access logs, however, revealed that the attack started closer …


Continue reading

Loading Plotly Graphs on Demand with Waypoints

Posted on 23 November 2016 in Technology • Tagged with graphing, javascript, plotly, waypoints

In my last post, 12 Years of Gmail, Part 4: Chat, I included eight Plotly graphs on a single page. All the graphs worked correctly, but the page was taking almost four seconds to render any content at all and six to eight seconds to load completely without cached elements. By contrast, the landing page of chrxs.net takes less than a second to load, with visual content rendering almost immediately. The site is intentionally designed to be lightweight and uses very few resources on a standard load. But Plotly graphs require a big (1MB+ uncompressed) JavaScript file in order to load with all the bells and whistles. What can be done to improve this slow load time, particularly when many graphs are on a single page?

Film strip before optimization (webpagetest.org)

The page load film strip above shows almost three whole seconds before any content is rendered. The obvious first step was to move the loading of Plotly's large JavaScript file from the page head (which loads before content is rendered) to the end of the page body, theoretically allowing the page's content to be partially loaded and rendered earlier. However, doing this created a bit of …


Continue reading

12 Years of Gmail, Part 4: Chat

Posted on 18 November 2016 in Technology • Tagged with 12 years of gmail, chat, graphing, plotly, python, takeout inspector

This post is part of my series, 12 Years of Gmail, taking a look at the data Google has accumulated on me over the past 12 years of using various Google services and documenting the learning experience developing an open source Python project (Takeout Inspector) to analyze that data.

With the Finishing Touches in place, it's finally time to start looking at some of the data in my Google Takeout Mail export file. What better to start with than the Google Talk (or Google Chat, as I will refer to it) content stored within!

I am starting with Chat because I was surprised to find it all stored in the export file. It makes sense, as chat history is accessible from the Chats link in the old Gmail interface (I couldn't find an equivalent in Inbox). My surprise led to curiosity and my curiosity led to obsession with trying to figure out how the Chat data is stored and what information each message contains. It turns out there are quite a few things that can be gleaned from these chat messages -


Continue reading

12 Years of Gmail, Part 3: Finishing Touches

Posted on 12 November 2016 in Technology • Tagged with 12 years of gmail, configparser, names, graphing, plotly, python, takeout inspector

This post is part of my series, 12 Years of Gmail, taking a look at the data Google has accumulated on me over the past 12 years of using various Google services and documenting the learning experience developing an open source Python project (Takeout Inspector) to analyze that data.

After spending last week Bootstrapping things and, somewhat related, working my way around Pelican, today I have tried to tie up loose ends so I can start spending more time thinking about what information I can get from all this data. While the package is far from complete, these "finishing touches" ended up being the three themes of this morning's work -

  1. Implementing a settings file
  2. Customizing Plotly graphs
  3. Generating random names

Implementing a Settings File

While thinking about how to customize graphs (more on that below) and allow for changes to styles without too much effort, it struck me that there is likely some common ("Pythonic") way to handle settings. And, of course, there is - it's called ConfigParser and it's extremely handy.

To get my feet wet, I created a settings.cfg file with the following contents:

;settings.cfg
[mail]
anonymize = False
db_file = data/email …
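Reading a file like the one above back into Python could look roughly like the sketch below. It assumes Python 3, where the module is named configparser (ConfigParser in Python 2), and it parses the settings from a string so that it runs on its own. The db_file value is a hypothetical placeholder, not the value from the real settings file.

```python
import configparser

# Hypothetical settings text mirroring the [mail] section above.
# The db_file value is a placeholder, not the author's actual path.
SETTINGS_TEXT = """
[mail]
anonymize = False
db_file = data/example.db
"""

config = configparser.ConfigParser()
# config.read('settings.cfg') would load the same content from disk.
config.read_string(SETTINGS_TEXT)

# getboolean() understands values such as True/False, yes/no and on/off.
anonymize = config.getboolean('mail', 'anonymize')
db_file = config.get('mail', 'db_file')

print(anonymize, db_file)
```

Using getboolean rather than get matters here: a plain get would return the string 'False', which is truthy in Python.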

Continue reading

12 Years of Gmail, Part 2: Bootstrapping

Posted on 08 November 2016 in Technology • Tagged with 12 years of gmail, mailbox, graphing, plotly, python, sqlite, takeout inspector

This post is part of my series, 12 Years of Gmail, taking a look at the data Google has accumulated on me over the past 12 years of using various Google services and documenting the learning experience developing an open source Python project (Takeout Inspector) to analyze that data.

Jumping back into Python has been just as fun as my first experiences with it. After brushing off some of the dust, I have managed to put together a (very) small package that does a couple of basic things with a Google Takeout Mail (mbox) file:

  1. Parses and standardizes the format of email addresses;
  2. Imports key message data into an SQLite database;
  3. Produces simple graphs of top recipients and senders.
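The first two steps, plus the query that would feed a "top recipients" graph, can be sketched as follows. This is only an illustration and not the Takeout Inspector code: the normalize_addresses helper, the recipients table and the sample headers are all made up for the example, and it is written for Python 3 using only the standard library.

```python
import sqlite3
from email.utils import getaddresses


def normalize_addresses(header_value):
    """Return the lowercased addresses found in one header value."""
    pairs = getaddresses([header_value])
    return [addr.strip().lower() for _name, addr in pairs if addr]


# Made-up headers standing in for messages read from an mbox file.
messages = [
    {'from': 'Alice Example <Alice@Example.com>',
     'to': 'bob@example.org, "Carol" <CAROL@example.net>'},
    {'from': 'bob@example.org',
     'to': 'Alice Example <alice@example.com>'},
    {'from': 'carol@example.net',
     'to': 'ALICE@example.com'},
]

# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE recipients (sender TEXT, recipient TEXT)')

for msg in messages:
    sender = normalize_addresses(msg['from'])[0]
    for recipient in normalize_addresses(msg['to']):
        conn.execute('INSERT INTO recipients VALUES (?, ?)',
                     (sender, recipient))

# Count messages per recipient, most frequent first.
top_recipients = conn.execute(
    'SELECT recipient, COUNT(*) AS n FROM recipients '
    'GROUP BY recipient ORDER BY n DESC, recipient'
).fetchall()

print(top_recipients)
```

Normalizing before inserting means that differently cased or differently named variants of one address are counted together. The resulting rows could then be handed to any graphing library.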

Parsing Email Addresses

The mailbox Python module makes it very simple to get an mbox file into Python and play around using the mailbox.Mailbox and email.message.Message classes. Here is an example using my mbox file:

import mailbox
email = mailbox.mbox('/path/to/email.mbox')

# The number of emails in the mbox file.
print len(email.keys())
114407

# The "Delivered-To" header of the first email.
print email …

Continue reading

12 Years of Gmail, Part 1: Google Takeout

Posted on 28 October 2016 in Technology • Tagged with 12 years of gmail, email, google takeout, python, takeout inspector

This post is part of my series, 12 Years of Gmail, taking a look at the data Google has accumulated on me over the past 12 years of using various Google services and documenting the learning experience developing an open source Python project (Takeout Inspector) to analyze that data.

I have been slowly migrating off of a Gmail email address for a couple of months now - I established this domain, selected an email provider, set up SPF, DMARC, etc. and finally created myself a new email address. I updated the address in all of the obvious places, but still found myself using Gmail frequently to keep up. At some point I realized that the only way to finish the migration would be to do something with all the email I had hoarded away in Gmail.

When I made the transition to Gmail (from a mail server in my basement) back in 2004, I found some tool that pulled all my existing email into Gmail using POP. So, I thought to myself in 2016, I'll just do that again! I fired up Thunderbird, set up Gmail POP access and started downloading. At some point, thousands of emails in, I decided …


Continue reading