Follow Ham was created in one afternoon as an exercise in machine learning using the Twitter API.

Follow Ham is a recommendation service for Twitter users. It scans a user's followers and uses machine learning to classify each account as “spam” or “ham”, then recommends which ones to follow back.

To decide whether an account is “spam” or “ham”, I use classical machine learning methods: I built a classifier using decision tree induction, similar to Quinlan’s C4.5 algorithm, on a set of features that includes a user’s follower count, following count, account age, tweets per day and other factors. For training data, I collected spam reports sent to @spam over several weeks to identify a few thousand known spammers, and hand-picked another two thousand non-spam accounts from my own and others’ following lists.
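As a rough illustration of the split-selection step in C4.5-style tree induction, here is a minimal Python sketch. It is not the code Follow Ham ran (the site was written in PHP), and the function names, feature keys and the four toy accounts are all invented for the example. It scores each candidate threshold on a numeric feature by gain ratio, the criterion C4.5 uses, and returns the best single split.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, labels, feature, threshold):
    """Gain ratio of the binary split rows[feature] <= threshold."""
    left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
    right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing is worthless
    total = len(labels)
    w_left, w_right = len(left) / total, len(right) / total
    info_gain = entropy(labels) - (w_left * entropy(left) + w_right * entropy(right))
    split_info = -(w_left * math.log2(w_left) + w_right * math.log2(w_right))
    return info_gain / split_info


def best_split(rows, labels):
    """Return (gain_ratio, feature, threshold) for the best single split."""
    best = (0.0, None, None)
    for feature in rows[0]:
        values = sorted({row[feature] for row in rows})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            score = gain_ratio(rows, labels, feature, threshold)
            if score > best[0]:
                best = (score, feature, threshold)
    return best


# Toy data, invented for illustration only (not from the real training set).
accounts = [
    {"followers": 12, "following": 1900, "age_days": 5, "tweets_per_day": 80.0},
    {"followers": 30, "following": 2000, "age_days": 12, "tweets_per_day": 45.0},
    {"followers": 450, "following": 300, "age_days": 700, "tweets_per_day": 4.0},
    {"followers": 1200, "following": 800, "age_days": 1000, "tweets_per_day": 9.5},
]
labels = ["spam", "spam", "ham", "ham"]

score, feature, threshold = best_split(accounts, labels)
print(feature, threshold, score)
```

A full tree would apply `best_split` recursively to each side of the chosen split until the labels are pure or no split improves the score; C4.5 also prunes the resulting tree, which this sketch leaves out.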

The result is a system that identifies low-quality accounts very quickly, helping you decide which of your followers to follow back without checking each account yourself. The more people use it, the more profiles it sees, and those profiles can be used to improve the algorithm in the future.

Follow Ham is a PHP application built on Twitter’s API and backed by a MySQL database. The website design and logo are original except for the background image, which was licensed from a vector art gallery and was aptly named “spam attack”.

Release Date:

January 2010

My Role:

Technologies Used:

PHP, MySQL, Twitter API

Current Status:

No longer available