What is a Twitter spam follower? Here is my definition:

A Twitter account that uses manipulative methods to attract other Twitter users to follow it. Usually it follows you through some automated method, hoping that you will follow back.

This post has two parts: the first is my thoughts about spam followers, and the second is the data I have been collecting, so you can see for yourself.

1   Thoughts

I believe there are programs that let you automatically follow other Twitter users, and that's really bad in my opinion. You may think this is no big deal; please take a look at the following chart, then think again.

http://lh6.ggpht.com/_CLdf4ORfzWk/S2eI4UeN-QI/AAAAAAAACcU/dh2KStZQsBw/s800/tc_1m.png

The chart above shows the followers count of my old Twitter account, livibetter [1], over the last 30 days. I used to block those bad Twitter accounts, but at the beginning of this year I decided to stop blocking for a month so I could show you how serious the issue is. The real follower counts are actually a little higher for each day, because some accounts follow you and then unfollow you after a short time (or get suspended by Twitter).

I have been using Twitter since 11/21/2007. I am not very active, but I believe I have observed this kind of spamming well. The reasons for this kind of spam following are:

  • They want you to read their spam tweets. When they follow you, you may go check their profile page, and that achieves their goal.
  • They want a higher followers count. Some Twitter users really do not care who follows them; they just follow back.

They go about spam following in these ways:

  • Track specific topics, then follow
  • Follow users who tweet about trending topics
  • Follow the followers of specific popular users: This is a really smart strategy. Say a spam follower is targeting dieting; he can find some notable users and follow those users' followers. He would have a higher success rate, since those followers are already interested in dieting.
  • Follow whoever has just tweeted

The reason for following in the ways above is to make sure they spam follow active users. Why? Because they want you to follow back, and if you are inactive, how could you possibly follow back? Basically, the more active you are (the more often you tweet), the more spam followers you will get.

This kind of thing must be stopped. I know Twitter has been suspending accounts for abnormal behavior [2], but it's not enough. Many people are still able to cheat (yes, it's cheating). Basically, Twitter lets you follow 2000 users without bothering you as long as you are not aggressive, but I would say 200 is a more appropriate number if this is really about social networking. If you claim you have 2000 friends, I feel bad for your so-called friends; you are treating them cheaply.

If you want to follow a real friend who will actually read your tweets, then I don't think you should follow an account that already follows more than two or three hundred Twitter users, because there is no way that user could read your tweets. It gets tweets from hundreds of users; how could anyone read that many? The only way your tweet would be noticed is through a mention, and only if that account actually reads tweets on its Twitter home page.

Here is a list of warning signs to watch for in your new followers:

  • You have no idea who the followers are.
  • Accounts that have followed more than 200 users.
  • Accounts that only tweet via services that tweet automatically from feeds, like twitterfeed. It's really sad to see such a good service involved in such bad behavior.
  • Accounts that have never tweeted like a person.
  • Accounts that have only tweeted tweets with links.
  • Accounts that have sexy girl avatars or default Twitter avatars.
  • Accounts whose following/followers counts bounce up and down every hour.
  • Accounts that follow you although you have no reason to DM them. To DM a user, that user must follow you first. Some services' Twitter accounts let you send commands by DM; followers like those are fine.
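
The checklist above can be turned into a rough score. The following is only a sketch in Python 3 (the scripts at the end of this post are Python 2). The function name `suspicion_score`, the flags `via_feed_service`, `only_links`, and `default_avatar`, and the one-point-per-item weighting are assumptions made for illustration; only `friends_count` is a field that appears in the followers data used later in this post.

```python
# Rough scoring of a new follower against the checklist above.
# The flag names and the equal weighting are illustrative assumptions.

def suspicion_score(account):
    """Return how many checklist items an account dict matches."""
    score = 0
    # Follows more than 200 users.
    if account.get('friends_count', 0) > 200:
        score += 1
    # Tweets only through an automatic feed service (hypothetical flag).
    if account.get('via_feed_service', False):
        score += 1
    # Every tweet contains a link (hypothetical flag).
    if account.get('only_links', False):
        score += 1
    # Still uses the default avatar (hypothetical flag).
    if account.get('default_avatar', False):
        score += 1
    return score


example = {'friends_count': 1800, 'only_links': True, 'default_avatar': False}
print(suspicion_score(example))
```

A higher score only means the account deserves a closer look; the items that need human judgment (for example, whether an account tweets like a person) are left out on purpose.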

2   See some real data

As I said, I stopped blocking at the beginning of this year. Now take a look at my followers count for the last three months:

http://lh4.ggpht.com/_CLdf4ORfzWk/S2eI4WDjR4I/AAAAAAAACcY/U68iP3O6MBA/s800/tc_3m.png

I don't think I need to add anything.

Next, I will show you some processed data. The raw data are my followers lists, collected via the API in JSON format. The date range is from 15:00 1/19/2010 to 08:00 2/2/2010 (13 days and 17 hours); all times are UTC+8. I downloaded my followers list every hour.

Here is how to read the data:

  • A line starting with F means some Twitter users followed me in that hour; a line starting with U means some Twitter users unfollowed me in that hour.
  • If a Twitter user unfollowed me, then
    • The time (duration) after the screen name is how long that user had been following me.
    • The following lines show any changes in that user's following/followers counts.
    • The last line shows three numbers: how many users that account followed, how many followers it gained, and how many users it would have to follow to gain 1,000 followers.
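
The third number on that last line is a simple proportion. Here is a minimal sketch of it in Python 3; the function name `follows_per_1000_followers` and the sample numbers are hypothetical, while the formula and the fallback to 0.0 mirror what the analysis script at the end of this post does.

```python
def follows_per_1000_followers(followed, gained):
    """Scale follows made per follower gained up to 1,000 followers.

    Returns 0.0 when no followers were gained, as the analysis script does.
    """
    if gained > 0:
        return 1000.0 * followed / gained
    return 0.0


# Hypothetical account: it followed 500 users and gained 50 followers.
print(follows_per_1000_followers(500, 50))
```

With these made-up numbers, the account would have to follow 10,000 users to gain 1,000 followers at the same rate.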

Here is the processed data:

http://sites.google.com/site/livibetter/blog-files/results.png?attredirects=0

As you can see, some users unfollowed me around 0-4 days after following me. Some even unfollowed me within an hour of following; a few times I have seen it happen within 10 minutes. In just over 13.5 days, I was followed 31 times and unfollowed 17 times, a net gain of 14 followers. Do some simple math: 2 years / (13 days + 17 hours) * 14 followers ~= 746 followers. If I had not been blocking them, I would have gained at least 746 such followers since I started using Twitter.
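
The arithmetic can be checked with a few lines of Python 3. This is only a worked version of the estimate above; it assumes a year of 365 days and that the net gain stays constant over the whole two years.

```python
from datetime import timedelta

observed = timedelta(days=13, hours=17)   # length of the data collection
two_years = timedelta(days=2 * 365)       # assumption: 365-day years
net_gain = 31 - 17                        # followed 31 times, unfollowed 17 times

# Dividing one timedelta by another gives a plain float in Python 3.
estimate = two_years / observed * net_gain
print(round(estimate))
```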

3   Conclusion

Why should we do anything about these followers? Twitter has lots of users and I am just one of them, so the real impact could be a million times greater. Don't you think this kind of spamming helps the Fail Whale surface from the ocean? If you don't help stop this, more and more spammers (and even normal users) will think it is an easy way to get a higher followers count. However, I believe most people would not follow such spam followers back; the spammers just create a bunch of accounts that follow one another in a mess, along with some real users. Please don't let them; don't let it ever happen on Twitter (maybe it already has?).

By the simple calculation in the previous section, I might have 746 followers who are not interested in me. You may say 746 is not a big number, but I am just one of thousands upon thousands of Twitter users. For every follower and for each of your tweets, Twitter inevitably has to do some processing. Each instance is only a small cost, but multiply it by 746: is that still a small cost? Then multiply it by a million: is that a small cost?

In my collected data, one account shows 952 follows within 3.5 days, which also means 952 API calls (it is almost impossible for a human to do 952 follows within 3.5 days). The actual number of API calls could be double that or even greater, because that account also unfollowed users, and some follows would be missed due to the hourly data downloads.
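
To put that number in perspective, here is the average rate worked out in Python 3. It assumes the follows were spread evenly around the clock, which the hourly data cannot confirm.

```python
follows = 952
hours = 3.5 * 24                  # 3.5 days expressed in hours

per_hour = follows / hours        # average follows per hour
minutes_between = 60 / per_hour   # average gap between two follows

print('%.1f follows per hour' % per_hour)
print('one follow every %.1f minutes' % minutes_between)
```

That is roughly eleven follows every hour, day and night, for three and a half days.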

Writing such an automated program is not rocket science; it's fairly simple: track, follow, and unfollow. But I have some words for those who develop these programs: shame on you! I believe Twitter provides its API so that Twitter clients can offer a better user experience, not so that clients can do automatic follows, unfollows, or whatever else in the name of automation. Why on earth would you need automation on a social networking website like Twitter?

Those spam followers are just like junk mail in your mailbox, but they stay forever if you don't clean them up. (You do clean up your mailbox, don't you?) Please don't let your real followers be stuck with that trash. Your followers list should not be like a wastebasket.

Stop them by blocking and/or reporting them for spam (if they do spam) today, for society's sake!

4   Supplement

4.1   Another kind of spam, RT bot

If you pay attention to your mentions, you will sometimes find yourself RT'd by bots. They track keywords and retweet. I really don't know why we would need such a thing. There is already a search function with RSS feeds of the results; we don't need RT bots, and they only make Twitter worse.

I report them for spam, and I think you should do the same.

4.2   Have I got a real follower?

Yes, I have, but it's rare. The last time was about half a month ago. At first I thought that account was just another spam follower, but we had conversations later.

4.3   What am I going to do next?

After this post, I might first clean up my followers list. Second, I might block those who unfollowed me after following for only a short time; I won't let them get away with it! Then I might modify my code to get a weekly report of who tries to trick me, so I can block them.
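
Such a report could look roughly like the following Python 3 sketch. It is not the code used for this post: the function name `short_follows`, the simplified snapshot format (a time plus a set of user ids), and the 4-day limit (taken from the 0-4 day range seen in the data above) are all assumptions.

```python
from datetime import datetime, timedelta


def short_follows(snapshots, limit=timedelta(days=4)):
    """Find users who unfollowed within `limit` of following.

    snapshots: list of (datetime, set_of_user_ids), sorted by time.
    Returns {user_id: how_long_they_followed}.
    """
    first_seen = {}
    result = {}
    previous = set()
    for when, ids in snapshots:
        # Users present now but not in the previous snapshot just followed.
        for uid in ids - previous:
            first_seen[uid] = when
        # Users present before but missing now just unfollowed.
        for uid in previous - ids:
            duration = when - first_seen[uid]
            if duration <= limit:
                result[uid] = duration
        previous = ids
    return result


# Made-up snapshots: user 3 follows and leaves within an hour.
t0 = datetime(2010, 1, 19, 15)
snaps = [
    (t0, {1, 2}),
    (t0 + timedelta(hours=1), {1, 2, 3}),
    (t0 + timedelta(hours=2), {1, 2}),
    (t0 + timedelta(days=6), {2}),
]
print(short_follows(snaps))
```

In this example only user 3 is reported; user 1 also unfollowed, but after six days, which is beyond the limit. Durations are only as precise as the snapshot interval.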

4.4   The code I use

There are two scripts that I used to download the data and do simple analysis. They are just for reference [3], so I put them here.

For downloading followers list:

#!/usr/bin/env python
# Python 2 script: saves the followers list, at most once per hour.


import datetime
import os
import sys
import urllib2


SCREEN_NAME = 'livibetter'


STATUSES_FOLLOWERS = 'http://twitter.com/statuses/followers/%s.json' % SCREEN_NAME
# One file per hour, named YYYYMMDDHH.json
TIMETAG = datetime.datetime.now().strftime('%Y%m%d%H')
DIRNAME = os.path.expanduser('~/followers')
if not os.path.exists(DIRNAME):
    os.makedirs(DIRNAME)
FILENAME = os.path.expanduser('%s/%s.json' % (DIRNAME, TIMETAG))


def main():

    if os.path.exists(FILENAME):
        print 'Already has the data for this hour'
        return

    try:
        u = urllib2.urlopen(STATUSES_FOLLOWERS)
        json = u.read()
        u.close()
        f = open(FILENAME, 'w')
        f.write(json)
        f.close()
        print 'Done.'
    except urllib2.HTTPError, e:
        print >> sys.stderr, 'Error: %s' % repr(e)
        return


if __name__ == '__main__':
    main()

For analysis:

#!/usr/bin/env python
# Python 2 script. Run it from the directory that contains followers/.


import datetime as dt
import glob
import json
import re
import sys


def print_counts(followers, id):

    # Find the first snapshot in which this user appears.
    for i in range(0, len(followers)):
        if id in followers[i][1]:
            break
    start = i
    last = start
    acc_count = 0
    fler_count = sys.maxint
    frnd_count = sys.maxint
    # Walk forward until the user disappears, printing every change in counts.
    for i in range(start, len(followers)):
        if id not in followers[i][1]:
            break
        last = i
        fler = followers[i][1][id]
        if fler_count != fler['followers_count'] or frnd_count != fler['friends_count']:
            # Accumulate only increases in the number of users followed.
            if fler['friends_count'] > frnd_count:
                acc_count += fler['friends_count'] - frnd_count
            fler_count = fler['followers_count']
            frnd_count = fler['friends_count']
            print ' '*22 + ': % 5s/% 5s/% 5s  %s' % (frnd_count, fler_count, fler['statuses_count'], followers[i][0])

    # Compare the last snapshot that still contains the user with the first.
    gain_flers = followers[last][1][id]['followers_count'] - followers[start][1][id]['followers_count']
    if gain_flers > 0:
        # Follows needed to gain 1,000 followers at the observed rate.
        ratio = 1000.0 * acc_count / gain_flers
    else:
        ratio = 0.0
    print ' '*22 + ': % 5s/% 5s>% 5d' % (acc_count, gain_flers, ratio)


def main():

    _RE = re.compile(r'(\d{4})(\d{2})(\d{2})(\d{2})\.json')

    # Load every hourly snapshot as (timestamp, {user id: selected fields}).
    followers = []
    for filename in glob.iglob('followers/*.json'):
        m = _RE.search(filename)
        if not m:
            continue
        p_followers = json.load(open(filename, 'r'))
        flers = {}
        for fler in p_followers:
            new_fler = {}
            for key in ['id', 'screen_name', 'followers_count', 'friends_count', 'statuses_count']:
                new_fler[key] = fler[key]
            flers[fler['id']] = new_fler
        followers.append((dt.datetime(*[int(d) for d in m.groups()]), flers))
        sys.stdout.write('.')
        sys.stdout.flush()
    print
    followers.sort()

    # Everyone in the first snapshot counts as following since that time.
    first_follow = {}
    for id in followers[0][1].keys():
        first_follow[id] = followers[0][0]

    for i in range(1, len(followers)):
        set_p = set(followers[i - 1][1].keys())
        set_n = set(followers[i][1].keys())
        # New followers: ids never seen in any earlier snapshot.
        ids = set_n - set(first_follow.keys())
        if ids:
            print 'F', followers[i][0], ':',
            for id in ids:
                first_follow[id] = followers[i][0]
                print followers[i][1][id]['screen_name'],
            print

        # Unfollowers: ids in the previous snapshot but not in this one.
        ids = set_p - set_n
        if ids:
            print 'U', followers[i][0], ':'
            for id in ids:
                print '% 22s: %s' % (followers[i - 1][1][id]['screen_name'], followers[i][0] - first_follow[id])
                print_counts(followers, id)

if __name__ == '__main__':
    main()

[1] My new Twitter account is lyjl. If you have a question about why I created a new account, please read this post.
[2] I have 160+ accounts in my block list; 83 of them have been suspended by Twitter.
[3] They don't read well, but I still need to save them somewhere.