As of 2016-02-26, there will be no more posts for this blog. s/blog/pba/

Sudoku is a game, generator, solver, and statistics calculator. It has command-line options and a well-documented manual page, plus a nice UI to play with.

https://lh3.googleusercontent.com/-_8Z9dXgUa8g/VplrexUcDUI/AAAAAAAAI6w/LEbJcIOk57o/s800-Ic42/sudoku.O3k4.gif

In the game, you can save or load a board and request a hint, which might even tell you which digit to try. It can also solve the board if you have given up.

Sudoku was released into the public domain by Michael Kennett in 2005-07, and Peter Spiess-Knafl has continued it since 2015-02-12. It is written in C with ncurses and runs on Windows and Unix-like systems. The current version is git-61e3f39 (2015-03-13, post v1.0.4 (2015-02-28)).

GitHub has just pushed out the latest feature: Contributions. I took a screenshot of my profile page:

I love this new feature. You can clearly see what you have done for others' repositories, how much, and when. That calendar is definitely the spotlight of the entire feature. Longest streak is probably the most interesting number among the statistics to me. Only an 8-day coding streak, I must code harder!

Coincidentally, I was thinking of something I'd call Contribution Ratio. It's basically calculating commits to others' repositories (direct commits and commits via pull requests are all counted) over total commits. Then you have a percentage; the higher it is, the more you contribute. However, 0% doesn't really mean you have no contributions: you may be maintaining a popular open source project under your own account name.
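To make the idea concrete, here is a minimal sketch of how such a ratio could be computed. The function name and the sample numbers are made up for illustration, and it assumes you already have the two commit counts from somewhere; it doesn't talk to GitHub at all.

```python
def contribution_ratio(commits_to_others: int, total_commits: int) -> float:
    """Return the share of commits made to others' repositories, in percent."""
    if total_commits <= 0:
        raise ValueError("total_commits must be positive")
    if not 0 <= commits_to_others <= total_commits:
        raise ValueError("commits_to_others must be between 0 and total_commits")
    return 100.0 * commits_to_others / total_commits


# Hypothetical numbers: 12 commits to others' repositories out of 150 in total.
ratio = contribution_ratio(12, 150)
print(f"Contribution Ratio: {ratio:.1f}%")
```

A ratio of 0% here still says nothing about how useful your own repositories are to others, which is the caveat mentioned above.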

Off-topic: as you can see, I now use GetSatisfaction to handle feedback; previously it was Google Moderator. If you have any suggestions, please add them there.

I want to see such a number because I feel I haven't contributed anything recently, and I would like to know how low my number is. However, I didn't start on it, because it'd be just a number, even though it wouldn't take long to code. If you want to see such a number, add your vote to that idea.

Yes, I know it's still 2012. When the clock tells me it's 2013, I will be too lazy to summarize it. Besides, Doomsday is coming, so we probably won't see the new year. Yeap, I know. Again. Only silly people will believe Doomsday is true, right? How could they not see the real threat is Juno?

I am moving the range one month earlier, so when this post says 2012, it means from December 2011 to November 2012. First, here is the summary in case you just want to read the numbers.

Summary

  • Blog: 331 posts and 85 comments.
  • Code: 17,591 additions and 17,874 deletions.
  • Gentoo: 1,056 merges and 999 un-merges.
  • Emails: 750 mails.
  • Last.fm: 4,711 scrobbles ~ 9.8 days (if each track is 3 minutes long).
  • Television: 1,240 episodes ~ 52 days.
  • Film: 29 films ~ 2 days.
  • mt: 265.
  • gad: 35.47 (yjlv) + 76.18 (brps) = 111.65.

Blog

Basic numbers

331 Posts       378.730 per year   31.561 per month
 85 Comments     97.257 per year    8.105 per months  0.257 per post

First post                     <-  0.9 years ->                      Last post
One-liner text using jQuery... <-  10 months -> Stack Overflow testing revi...
2012-01-16 11:53:00-08:00      <-   319 days ->      2012-11-30 14:34:00-08:00

331 Posts    210 Updated (after 11 days, 3:10:58.704524 in average)

106,338 Words     321.263 per post
541,550 Chars   1,636.103 per post
  1,574 Labels      4.755 per post

In addition, there were 10 spam comments. The most used word is, well, "I," 3,170 times. I really need to quit that, heck, it's I++.

Only 321 words per post; that's not very long. I hope my posts will reach 1,000 words on average in the future, though I won't keep that in mind when writing. Blogging shouldn't be like that; it should go with the flow of mood, not the word count.
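The per-year and per-month rates in the basic numbers can be reproduced with a little arithmetic. This is only a sketch inferred from the printed figures, not taken from the analyzer's source: it assumes the span is counted in whole days, a year is 365 days, and a month is one twelfth of a year. The function name is made up.

```python
from datetime import datetime


def posting_rates(count: int, first: datetime, last: datetime):
    """Return (whole days, per-year rate, per-month rate) for a count of items."""
    days = (last - first).days      # whole days only; leftover hours are dropped
    per_year = count / days * 365   # assumes a 365-day year
    per_month = per_year / 12       # assumes a month is 1/12 of a year
    return days, per_year, per_month


# Both timestamps share the -08:00 offset, so naive datetimes are enough here.
first_post = datetime(2012, 1, 16, 11, 53)
last_post = datetime(2012, 11, 30, 14, 34)

days, per_year, per_month = posting_rates(331, first_post, last_post)
print(f"{days} days, {per_year:.3f} per year, {per_month:.3f} per month")
print(f"{106338 / 331:.3f} words per post")
```

With these assumptions, the results match the 319 days, 378.730 per year, 31.561 per month, and 321.263 words per post shown above.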

Most commented posts

   25 ( 29.4%): Full referrer URL in Google Analytics reports
    9 ( 10.6%): Sigh, glad I still have Disqus on my side
    5 (  5.9%): Bored? Have some random eats with recipes on ActiveState Code
    5 (  5.9%): Let your readers decide when to load Disqus
    4 (  4.7%): The Reading list in Blogger Dashboard
    3 (  3.5%): GitHub following and watching graph
    3 (  3.5%): Silly retard filename for removal
    3 (  3.5%): Three years with Gentoo
    3 (  3.5%): Custom date time format in Google Docs Spreadsheets
    2 (  2.4%): Fus Ro Dah!
I can't even remember that 7 of them were written in 2012. It's not that I forgot the content; there are just too many posts for me to recall when I wrote each one.

Distribution of posting and comments

. By Year and Month ..........................................................

YYYY-MM  Posts                             |                          Comments
2012-01  16                           #####|                                 0
2012-02  42                   #############|##                               2
2012-03  94 ###############################|#####                            5
2012-04  89   #############################|############################### 27
2012-05  32                      ##########|##                               2
2012-06  24                         #######|####                             4
2012-07  10                             ###|##########################      23
2012-08   0                                |                                 0
2012-09   0                                |#########                        8
2012-10   0                                |#####                            5
2012-11  24                         #######|##########                       9

. By Day of Month ............................................................

Day  Posts                               |                            Comments
 01   9                ##################|####                               1
 02  16 #################################|                                   0
 03  11            ######################|########################           5
 04   6                      ############|########################           5
 05   9                ##################|####                               1
 06  16 #################################|##############                     3
 07  12          ########################|################################## 7
 08  10              ####################|###################                4
 09  12          ########################|################################## 7
 10  10              ####################|################################## 7
 11  12          ########################|#############################      6
 12   5                        ##########|########################           5
 13  15    ##############################|###################                4
 14   8                  ################|                                   0
 15   9                ##################|####                               1
 16  14      ############################|                                   0
 17   6                      ############|#########                          2
 18  12          ########################|##############                     3
 19  11            ######################|#########                          2
 20   8                  ################|#########                          2
 21   9                ##################|#########                          2
 22  15    ##############################|                                   0
 23  14      ############################|########################           5
 24  10              ####################|####                               1
 25  16 #################################|                                   0
 26  10              ####################|#########                          2
 27  12          ########################|####                               1
 28  15    ##############################|##############                     3
 29   5                        ##########|#########                          2
 30   7                    ##############|##############                     3
 31   7                    ##############|####                               1

. By Hour of Day .............................................................

Hour Posts                               |                            Comments
 01   9                ##################|####                               1
 02  16 #################################|                                   0
 03  11            ######################|########################           5
 04   6                      ############|########################           5
 05   9                ##################|####                               1
 06  16 #################################|##############                     3
 07  12          ########################|################################## 7
 08  10              ####################|###################                4
 09  12          ########################|################################## 7
 10  10              ####################|################################## 7
 11  12          ########################|#############################      6
 12   5                        ##########|########################           5
 13  15    ##############################|###################                4
 14   8                  ################|                                   0
 15   9                ##################|####                               1
 16  14      ############################|                                   0
 17   6                      ############|#########                          2
 18  12          ########################|##############                     3
 19  11            ######################|#########                          2
 20   8                  ################|#########                          2
 21   9                ##################|#########                          2
 22  15    ##############################|                                   0
 23  14      ############################|########################           5
 24  10              ####################|####                               1

Labels

29 (  1.8%): Bash
28 (  1.8%): Python
26 (  1.7%): thought
22 (  1.4%): JavaScript
22 (  1.4%): Blogger
17 (  1.1%): shell scripting
13 (  0.8%): Gentoo
12 (  0.8%): Song of the Day
12 (  0.8%): Google
10 (  0.6%): email

Numbers from Google Analytics


Compared to 2011, the numbers have doubled, even though four months have no new posts. In Blogger Stats, the pageviews are 172,645; that's 42.7% more than the 120,945 in Google Analytics.

Top posts haven't changed much; three posts written in 2012 made the list.


As for traffic sources, more people come from search engines, which is not what I wanted at all.

Television shows and films

In 2012, I watched 1,240 episodes of television shows and 29 films. As a rough estimate: 1,240 * 1h + 29 * 2h = 1,240 hours + 58 hours = 1,298 hours ~= 54 days.

Holy cow, 54 days! I spent nearly two months watching television shows and films in 2012.

I wish I was wrong, but I keep very detailed records. After I watch an episode, I use VimNotes to take a record with a timestamp. One episode per line, so grep and wc can get the correct numbers. I have been keeping records since October 2010.
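The counting and the rough estimate can be sketched in a few lines of Python, doing roughly what a grep piped into wc -l would do. The note format below (timestamp, kind, title) is purely an assumption for illustration; the actual notes are not shown in this post, and the function names are made up.

```python
import re

# Hypothetical notes: one record per line as "YYYY-MM-DD HH:MM kind title".
notes = """\
2012-03-01 21:05 tv Some Show S01E01
2012-03-01 22:01 tv Some Show S01E02
2012-03-02 20:30 film Some Film
this line is not a record
"""


def count_records(text: str, kind: str) -> int:
    """Count lines that look like a timestamped record of the given kind."""
    pattern = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2} " + re.escape(kind) + " ")
    return sum(1 for line in text.splitlines() if pattern.match(line))


def estimated_days(episodes: int, films: int,
                   hours_per_episode: float = 1, hours_per_film: float = 2) -> float:
    """Rough viewing time in days, using flat per-item durations."""
    return (episodes * hours_per_episode + films * hours_per_film) / 24


print(count_records(notes, "tv"), count_records(notes, "film"))
print(f"{estimated_days(1240, 29):.1f} days")
```

The flat one hour per episode and two hours per film are the same rough averages used in the estimate above, so the real figure could be off in either direction.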

If I were able to count my time spent on YouTube, that would definitely add a lot.

Code


17,591 additions and 17,874 deletions. This only counts some of my own repositories.

Hold your congratulations, don't get too excited for me.


It may look amazing, reaching one thousand pageviews a day. In fact, the record seems to be 1,056 pageviews when one post got into reddit.

You can see there is a significant increase in the last few days; that's when I made Disqus load with the page instead of on a button click. I wanted Disqus Discovery or, to be more precise, its related content, since I had removed related posts. But it really came with a price: bloated pageviews.

There were a lot of http://disqus.com/embed/comment/ referrers:


Unfortunately, it's not like viewers click on related content and read another post of mine. Somehow, embed.js requests the page, sometimes more than once:


I also checked another website which has Discovery shown: same issue. If Discovery isn't shown, this won't happen. There must be a bug, because there is no need to load the page more than once. Well, it shouldn't be even once, actually. I can't really think of any reason for this.

Right now, I have reverted to button-loading Disqus. It's not worth making readers' browsers load unnecessary stuff. Disqus still loads a lot of stuff.

I stumbled on Alexa.com for stats about my domain, yjl.im. Alexa.com claims on its front page that it tracks 30 million sites, which would put my website at about the 99th percentile (430,285th). That only means there are a lot of crappy or pretty idle websites around. I truly doubt my website can rank this high; besides, Alexa.com's statistics are not comprehensive, so they can't show the whole picture.
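For what it's worth, the percentile follows directly from the rank and the claimed total. A tiny sketch, assuming the 30 million figure really is the number of ranked sites (the function name is made up):

```python
def percentile_from_rank(rank: int, total: int) -> float:
    """Percentage of ranked sites that sit below the given rank."""
    return 100.0 * (1 - rank / total)


percentile = percentile_from_rank(430_285, 30_000_000)
print(f"{percentile:.1f}")
```

That comes out at about 98.6, which rounds to the 99th percentile.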

My site has 102 links-in, but more than half of them are from scrapers of Alexa or those domain listing directories I hate. Once again, this only proves that most websites are garbage.

I really wish people would visit my site from home and not mostly via search engines. I don't want my site to be mainly technical, but that's what it is, at least for now. Someday my site will be read by everyone; that's the goal.

Anyway, it's still fun to read some statistics. I find the demographics data amusing. To put it in words, Alexa concludes:

Relative to the overall population of internet users, Yjl.im's users are disproportionately male, and they are disproportionately childless people under the age of 35 who browse from work and have postgraduate educations.


It seems young, male, and highly educated adults are my primary readers. The childless part doesn't mean much to me. I am also curious about marital status, but it's not included. If it were no children, single, and never married, would that mean nerds?

Visitors from India are the top readers, followed by the United States. Interestingly, Tunisia is in the top 5; not sure why.


As I said, Alexa doesn't show the full picture; it's not accurate. If I am correct, it relies on its toolbar browser plugin. According to Google Analytics, within a similar 3-month range, the United States is the top source, India only ranks 4th, and Tunisia isn't in the top 10. Too bad that, of the demographics in Google Analytics, only location is helpful.


As you can see, that's a huge difference; Alexa is nowhere close to reality. Unfortunately, some people still refer to its statistics. Its data is gathered by a limited method. Even worse, it encourages you to put some code on your website in order to get a better ranking, so the data gets unfairly distorted.

It's been almost two weeks since the announcement of Blogger Export Analyzer (BEA). I have added a few statistics to BEA:

  • More charts of published time by Year, Month, and Hour.
  • Most used words.
  • Post updating statistics.

For these additions, here is a sample output:

- Posts ----------------------------------------------------------------------

   941 Posts    899 Updated (after 256 days, 14:30:00.346941 in average)

   250,853 Words     266.581 per post
 1,339,093 Chars   1,423.053 per post
     5,021 Labels      5.336 per post

. 266 most used words ........................................................

8,234 i       7,205 the     7,128 to      4,752 a       4,340 you
4,298 is      3,499 and     3,242 it      2,904 of      2,693 in
<SNIP>

- Posts and Comments Published Time ------------------------------------------

. By Year and Month ..........................................................

YYYY-MM  Posts                             |                          Comments
2008-09  18                           #####|                                 1
2008-10  25                        ########|##                               6
<SNIP>
2012-03  94 ###############################|#########                       20
2012-04  65           #####################|############                    26

. By Year ....................................................................

Year Posts                               |                            Comments
2008 151             ####################|############################      81
2009 192       ##########################|################################# 93
2010 236 ################################|##############################    86
2011 145              ###################|#########################         72
2012 217    #############################|######################            63

. By Month of Year ...........................................................

Month  Posts                              |                           Comments
  01    90           #####################|###########                      30
  02   114     ###########################|################                 42
<SNIP>
  11   126  ##############################|#############                    35
  12    77              ##################|################################ 81

. By Day of Month ............................................................

Day  Posts                               |                            Comments
 01  29              ####################|########                          11
 02  38       ###########################|###                                4
<SNIP>
 30  22                   ###############|#######                           10
 31  22                   ###############|#####                              7

. By Hour of Day .............................................................

Hour Posts                               |                            Comments
 01  29              ####################|########                          11
 02  38       ###########################|###                                4
<SNIP>
 23  32            ######################|#############                     17
 24  30             #####################|#######                           10

For the additional charts, I wanted to see if I have a particular posting hour. In the most used words section, as you can see, I use "I" an awful lot: 8,234 times, about 3.3% of all the words I wrote. As for the updating statistics, 95.5% of posts have been updated at least once.
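A most-used-words count like the one in the sample output can be sketched with collections.Counter. The tokenization here (lowercase, letters with an optional apostrophe part) is only an assumption; BEA may well split words differently, so real counts could differ. The function name and sample text are made up.

```python
import re
from collections import Counter


def most_used_words(text: str, n: int = 5):
    """Return (word, count, percent of all words) for the n most common words."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    total = len(words)
    return [(word, count, 100.0 * count / total)
            for word, count in Counter(words).most_common(n)]


# Made-up sample text, just to show the shape of the result.
sample = "I think I write I a lot and I know it"
for word, count, percent in most_used_words(sample, 3):
    print(f"{count:>5,} {word:<8} {percent:5.1f}%")

# The share quoted above: 8,234 uses of "i" out of 250,853 words.
print(f"{100.0 * 8234 / 250853:.1f}%")
```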

I created BEA because I desired to see some numbers, and I did. I will continue to add new stuff when something comes to my mind. If you have suggestions, feel free to leave a comment or create a new issue.

Blogger Export Analyzer (BEA) is a simple analyzer for the Blogger Export XML file, which I created to get some statistics from the exported data. I was hoping the Blogger Stats data would be part of the Export file, but it isn't. The code is written in Python 3 and licensed under the MIT License.

The following is a sample output:

= Blogger Export Analyzer 0.0.2 ==============================================

  YJL --verbose by Yu-Jie Lin
  Outputs directly from me <strike>about almost everything</strike>...

- General --------------------------------------------------------------------

       930 Posts       258.727 per year   21.561 per month
       391 Comments    108.777 per year    9.065 per months  0.420 per post
         2 Pages
         0 Drafts
     2,041 Labels

First post                     <-  3.6 years ->                      Last post
Let's Make Some Garbages       <-  43 months -> Multitasking with storytell...
2008-09-13 16:13:00-07:00      <-  1312 days ->      2012-04-18 10:56:00-07:00

- Posts ----------------------------------------------------------------------

   246,339 Words     264.881 per post
 1,315,911 Chars   1,414.958 per post
     4,970 Labels      5.344 per post

- Comments -------------------------------------------------------------------

   34 out of 391 Comments are not counted in this section.

. Top Commenters .............................................................

  125 ( 35.0%): livibetter
    9 (  2.5%): Calidan
    7 (  2.0%): Vajrasar
    6 (  1.7%): Derick Dalton Lee
    5 (  1.4%): Mario César
    4 (  1.1%): Jain
    4 (  1.1%): Guilherme Lino
    3 (  0.8%): zizukabi
    3 (  0.8%): MHazell
    3 (  0.8%): Lenama7

. Most Commented Posts .......................................................

   34 (  9.5%): Get ready for this Falling Snow Season!
   24 (  6.7%): Stick div at top after scrolling
   17 (  4.8%): Using Django's I18N in Google App Engine
   16 (  4.5%): Adobe AIR 1.5 on Fedora 10 x86_64
   14 (  3.9%): Migrating to tmux from GNU/Screen
   10 (  2.8%): Follow mouse for x11grab of FFmpeg
    9 (  2.5%): Sigh, glad I still have Disqus on my side
    9 (  2.5%): jQuery plugin jk navigation
    9 (  2.5%): Bad value X-UA-Compatible for attribute http-equiv on eleme...
    8 (  2.2%): Installing Woopra 1.2 beta on Ubuntu amd64

. Most Commented Posts Over Days Since Published aka. Popular Posts ..........

1.000: Sigh, glad I still have Disqus on my side
0.400: One Day Without Shoes 2012
0.375: Silly retard filename for removal
0.308: The Reading list in Blogger Dashboard
0.111: Better Bitbucket Explore
0.062: Earth Hour, a one-hour globally fanatic phenomenon?
0.038: Follow mouse for x11grab of FFmpeg
0.033: Disquise
0.030: Stick div at top after scrolling
0.027: Get ready for this Falling Snow Season!

- Posts and Comments by Month ------------------------------------------------

YYYY-MM Posts                             |                           Comments
2008-09  18                          #####|                                  1
2008-10  25                        #######|##                                6
2008-11  57             ##################|#####                            11
2008-12  51               ################|##############################   63
2009-01  32                     ##########|##########                       23
2009-02  13                           ####|###                               8
2009-03  27                       ########|########                         18
2009-04  41                  #############|####                             10
2009-05  14                           ####|#                                 3
2009-06   1                               |#                                 4
2009-07   1                               |                                  1
2009-08   1                               |                                  2
2009-09   1                               |                                  0
2009-10  27                       ########|##                                6
2009-11  18                          #####|####                              9
2009-12  16                          #####|####                              9
2010-01  16                          #####|#                                 3
2010-02   3                               |#                                 4
2010-03   2                               |                                  0
2010-04   9                             ##|                                  0
2010-05  19                         ######|                                  0
2010-06   2                               |                                  0
2010-07   1                               |                                  0
2010-08  53               ################|                                  0
2010-09  62            ###################|###########################      57
2010-10   8                             ##|#                                 3
2010-11  51               ################|#####                            11
2010-12  10                            ###|###                               8
2011-01  26                       ########|                                  1
2011-02  56              #################|#######                          16
2011-03   5                              #|####                              9
2011-04   0                               |                                  2
2011-05   0                               |#                                 3
2011-06   0                               |#                                 4
2011-07  13                           ####|                                  2
2011-08  18                          #####|######                           14
2011-09  27                       ########|#####                            12
2011-10   0                               |#                                 4
2011-11   0                               |#                                 4
2011-12   0                               |                                  1
2012-01  16                          #####|#                                 3
2012-02  42                  #############|######                           14
2012-03  94 ##############################|#########                        20
2012-04  54              #################|##########                       22

- General --------------------------------------------------------------------

     2,041 Labels labled      4,970 times      2.435 Labeled per label

. Most Labeled Labels ........................................................

  117 (  2.4%): OldBlogBlogarbage
   95 (  1.9%): OldBlogTuxWearsFedora
   93 (  1.9%): Python
   74 (  1.5%): OldBlogGetCtrlBack
   72 (  1.4%): Bash
   63 (  1.3%): JavaScript
   61 (  1.2%): OldBlogmakeYJL
   58 (  1.2%): thought
   51 (  1.0%): Google
   50 (  1.0%): Blogger

. Least Labeled Rate .........................................................

 1396 ( 68.4%) Labels labeled   1 times
  293 ( 14.4%) Labels labeled   2 times
  131 (  6.4%) Labels labeled   3 times
   62 (  3.0%) Labels labeled   4 times
   38 (  1.9%) Labels labeled   5 times
   24 (  1.2%) Labels labeled   6 times
   19 (  0.9%) Labels labeled   7 times
   16 (  0.8%) Labels labeled   8 times
   12 (  0.6%) Labels labeled   9 times
    4 (  0.2%) Labels labeled  10 times

The output is plain text like my Google Analytics report, and I don't intend to add options for customization unless someone requests them with a good reason. The statistics you see in the image on the right are pretty much all I have planned to have. I will only add new stuff when I get a new idea.

One number I wanted to see is the label count, which has confirmed that my labeling has gone out of control: 1,396 labels are used only once. I always knew I was unable to tame my labeling misbehavior. ;p
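The "labels used N times" breakdown is just a count of counts. Here is a minimal sketch with made-up data; how BEA actually reads labels from the export file is not shown here, so the input is simply assumed to be a list of label lists, one per post, and the function name is hypothetical.

```python
from collections import Counter


def label_usage_histogram(posts_labels):
    """Return (usage per label, how many labels were used exactly N times)."""
    usage = Counter(label for labels in posts_labels for label in labels)
    times_used = Counter(usage.values())
    return usage, times_used


# Made-up sample: each inner list holds the labels of one post.
posts_labels = [["Python", "Bash"], ["Python"], ["thought"], ["Python", "Gentoo"]]
usage, times_used = label_usage_histogram(posts_labels)

total_labels = len(usage)
for times in sorted(times_used):
    labels = times_used[times]
    print(f"{labels:5d} ({100.0 * labels / total_labels:5.1f}%) Labels labeled {times:3d} times")
```

Applied to the numbers above, 1,396 out of 2,041 labels gives the 68.4% shown for labels used only once.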

Updated at 2012-02-25T23:08:40Z: This has nothing to do with Blogger. It seems Google Analytics' tracking script will detect if you hold the account. If you do, then it shows you the interface. Here is a screenshot when I view the homepage of my blog:





When I was writing my previous post, I saw this after I hit the preview button:


We found no clickthroughs for this page. Try adjusting the date range or select another page.

You got it perfectly right, Google Analytics! Because it's a totally new post, how could you find any clicks? If you did, either you are a fortune teller or something has gone haywire.

But I don't mind this showing up when I edit my old posts. It would be nice to know some statistics. The only downside is that it takes a few seconds to load the Google Analytics frame every time you hit the preview button.

Since Blogger pushed out the new Stats feature, I have begun to see more fake referrers in the Stats tab and in Google Analytics. Fake referrers are a very common kind of spam, but with Blogger Stats you need to do nothing to read the statistics. It's convenient for bloggers and for spammers.


The people who create crappy websites have targeted us now. If they spam not only for traffic but also try to hack into your computer, that would be very bad.

Don't click on any referrer if you have no idea what it is. I hate spammers!

I just tried to add two entity counts to my app's statistics page. Then I found out that the statistics API (released on 10/13/2009, version 1.2.6) is not available on the development server.

You can run the following code without errors:
from google.appengine.ext.db import stats
global_stat = stats.GlobalStat.all().get()

But global_stat is always None.

So I ended up with code as follows:
db_blog_count = memcache.get('db_blog_count')
if db_blog_count is None:
    blog_stat = stats.KindStat.all().filter('kind_name =', 'Blog').get()
    if blog_stat is None:
        db_blog_count = 'Unavailable'
    else:
        db_blog_count = blog_stat.count
    memcache.set('db_blog_count', db_blog_count, 3600)

The documentation doesn't explicitly mention whether the statistics are available on the development server (maybe I didn't read carefully), and neither do the Release Notes.

PS. I know the code is awful: str and int types mixed, terrible. But I am too lazy to add an if clause in the template file to check if db_blog_count is None or something like -1, or anything else that represents the data not being available.

PS2. The fourth line should be just if blog_stat:, with the two assignments that follow swapped, if you know what I mean.
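For the type-mixing complaint in the first PS, one way out is to keep the count as either an int or None and only turn None into text at display time. The sketch below is not App Engine code: the dict stands in for memcache, and fetch_blog_count is a hypothetical placeholder for the KindStat query that returns None, the way the development server behaves. All names are made up for illustration.

```python
def fetch_blog_count():
    """Hypothetical stand-in for the KindStat query; None means no statistics."""
    return None


def get_blog_count(cache: dict, fetch=fetch_blog_count):
    """Return the cached count, or fetch it; None means the data is unavailable."""
    count = cache.get('db_blog_count')
    if count is None:
        count = fetch()
        if count is not None:
            # Only real numbers are cached, so a missing statistic is retried.
            cache['db_blog_count'] = count
    return count


def display(count) -> str:
    """Do the None check once, at the presentation layer."""
    return 'Unavailable' if count is None else str(count)


cache = {}  # plain dict standing in for memcache
print(display(get_blog_count(cache)))
print(display(get_blog_count(cache, fetch=lambda: 42)))
```

One difference from the code above: this sketch doesn't cache the unavailable state, so it would query again on every request until statistics exist. Whether that is acceptable depends on how expensive the query is.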