As of 2016-02-26, there will be no more posts for this blog. s/blog/pba/
Showing posts with label feed.

tl;dr http://pipes.yahoo.com/pipeslivibetter/newrubygems

Note

Yahoo! Pipes is gone. (2015-12-02T03:06:52Z)

I have been monitoring Python packages for quite some time. A few days ago, I wanted to expand to the gems on RubyGems.org. Sadly, it doesn't even have a feed for newly created packages.

The best option for me is to use /api/v1/activity/latest of the Activity API (which has XML, JSON, and YAML formats), with the help of Yahoo! Pipes:

$ curl 'https://rubygems.org/api/v1/activity/latest.json'

You can grab the feed at http://pipes.yahoo.com/pipeslivibetter/newrubygems.

The source looks like:

https://lh5.googleusercontent.com/-HQApdtvInWE/U51BqIsDs7I/AAAAAAAAGac/wpEmKs6uzF0/s800/New%2520gems%2520feed%25202014-06-15--14%253A46%253A35.png

It only took me 10 minutes or less to figure out how to make such a conversion. This was my first time using Yahoo! Pipes to convert JSON to an RSS feed.
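With Yahoo! Pipes gone, the same JSON-to-RSS conversion can be roughly sketched in Python. This is not what the pipe did internally, only an illustration. The field names `name`, `version`, `info`, and `project_uri` are assumptions about the API response, and the sample gem is made up.

```python
import json
import xml.etree.ElementTree as ET


def gems_to_rss(gems, title="New RubyGems", link="https://rubygems.org/"):
    """Build an RSS 2.0 document (as a string) from a list of gem dicts."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Newly created gems"
    for gem in gems:
        item = ET.SubElement(channel, "item")
        # Field names are assumed; missing ones fall back to empty strings.
        ET.SubElement(item, "title").text = "%s %s" % (
            gem.get("name", ""), gem.get("version", ""))
        ET.SubElement(item, "link").text = gem.get("project_uri", "")
        ET.SubElement(item, "description").text = gem.get("info", "")
    return ET.tostring(rss, encoding="unicode")


# Made-up stand-in for the body returned by the curl command above.
sample = json.loads(
    '[{"name": "example-gem", "version": "0.1.0", '
    '"info": "A made-up gem.", '
    '"project_uri": "https://rubygems.org/gems/example-gem"}]'
)
rss_text = gems_to_rss(sample)
```

Fetching the real JSON and serving `rss_text` somewhere is left out on purpose; the point is only the mapping from one gem to one feed item.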

Although it works well and is very simple, this isn't my ideal solution. I actually asked someone who knows Ruby to add feed support and a timestamp to RubyGems.org's source code. That's not going to happen through the person I asked, so if you know Ruby and have time, please think about adding such a feed to it.

I am thinking of creating a couple of issues, but I'm not really sure if I should, since I don't think I can even call myself a Ruby user.

Until then, well, this feed will do just fine.

The PyPI newest packages feed contains 40 entries, that's 40 new packages. As of writing, it spans back about 21 hours, and 3 of the 40 are those "A simple printer of nested lists" spams. I had even filed an issue about those on the PyPI issue tracker, and I was told:

Richard Jones: I regularly clean out these modules. Just try to ignore them.

As you can guess, I am really sick of those. Every day, about three of them are spam. Going through that feed with a 7.5% spam rate is a waste of time, and that's not an insignificant number by any means. I just cannot believe that I waited three months to do something.

So, here is a cleaner feed[1], with the help of Yahoo! Pipes once again:

https://lh4.googleusercontent.com/-Rrd6uWsUEPU/U3tFBW4fsGI/AAAAAAAAGSw/CyWZVKHA9-s/s800/Yahoo%2520Pipes%2520for%2520filtering%2520PyPI%2520packages%25202014-05-20--20%253A03%253A44.png

PyPI Packages Filtered

You can see two filters: the first one is for the spam, and the second is for other legit packages that I am not interested in. Off-topic: the quality of packages is usually low, some are not even ready, and links send you to 404s. I really don't know why those people bother wasting others' time and their own.
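The two-filter idea is easy to sketch in Python as well. The spam pattern comes from the spam title quoted earlier; the "not interested" pattern is purely hypothetical, since the post does not say which packages were filtered out, and the entry dicts are made-up stand-ins for feed items.

```python
import re

# First filter: known spam titles.
SPAM_PATTERNS = [re.compile(r"a simple printer of nested lists", re.I)]
# Second filter: legit but uninteresting packages (hypothetical example).
UNINTERESTING_PATTERNS = [re.compile(r"\bexample-framework\b", re.I)]


def keep(entry):
    """Return True if the entry passes both filters."""
    text = "%s %s" % (entry.get("title", ""), entry.get("description", ""))
    blocked = SPAM_PATTERNS + UNINTERESTING_PATTERNS
    return not any(pattern.search(text) for pattern in blocked)


def filter_entries(entries):
    """Drop every entry that matches a spam or uninteresting pattern."""
    return [entry for entry in entries if keep(entry)]


entries = [
    {"title": "nester 1.0", "description": "A simple printer of nested lists"},
    {"title": "example-framework-plugin 0.2", "description": "A plugin"},
    {"title": "usefultool 0.1", "description": "Does something useful"},
]
filtered = filter_entries(entries)
```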

This action was actually prompted by someone asking about my Ultimate YJL feed, which reminded me of the usefulness of Yahoo! Pipes. I should have thought about using Yahoo! Pipes sooner: this pipe only took me about 10 minutes to lay down, after 3 months of wasting time on skipping through spam.

Only 26 packages got through; 14 were removed. That's 14 fewer actions to take every day. 14 keypresses might not sound like a lot, but multiply by 365 and it's 5,110 presses a year. And if you are a mouse clicker, you are wasting even more.

What, mileage? It's a mouse, not a flight!


[1] http://pipes.yahoo.com/pipeslivibetter/pypipackagesfiltered is gone with Yahoo! Pipes.

This is the second time I noticed; I had forgotten about it after I first saw that the Updates tabs were gone. This time, I missed the timing for resolving some issues of a project to which I have commit permission.

I relied on the feed of projects I starred. Google Code provided such a feed before, but it has been gone for at least two months. For now, I have subscribed to 4 project feeds: Issues, Downloads, Source changes, and Wiki updates.

I didn't check the starred projects feed because it simply returned no entries instead of a 404, which is what it should return now that the Updates function is disabled. So, it never occurred to me that the feed was essentially rendered useless when Google Code disabled the Updates function.

There is an open issue and a discussion thread. They have grown too long, and I didn't read much of them. But I feel Google developers and other developers are two different species.

Please star that issue if you also used Project Updates. Hopefully they will finally understand that some developers rely on it very much.

Sadly, Google Webmaster Tools is going to remove Subscriber stats, which I read on a weekly basis. Google said there are alternatives or replacements available, such as FeedBurner:

Subscriber stats reports the number of subscribers to a site's RSS or Atom feeds. This functionality is currently provided in Feedburner, ...

Well, the fact is that it is not. I have never seen exactly the same statistics in FeedBurner (I will explain why they are not the same soon). Clearly the poster, Jonathan Simon, Webmaster Trends Analyst, has no idea what information their products (Subscriber stats and FeedBurner's Subscribers tab) provide and what they are based on. At least, he is not familiar with both of them together.

Please, allow me to explain. Firstly, look at Subscriber stats (before it gets wiped):


You can see there are two entries: one is the Atom feed and the other is the RSS feed, basically Blogger's default setting. The types may not be the same if you redirect to other feed services. You don't see such detailed information in FeedBurner:


The subscriber counts of the two feeds are lumped into Google Feedfetcher, and that's not what I want. It shows that FeedBurner is not a replacement, and you can barely call it an alternative. FeedBurner does not give you the information you have in Webmaster Tools; it's a summation across different sources. From the explanation:

Google Feedfetcher

Feedfetcher is how Google grabs RSS or Atom feeds when users subscribe to them in Google Reader or iGoogle. Subscriber counts include Google Reader and the iGoogle. Feedfetcher collects and periodically refreshes these user-initiated feeds, but does not index them in Blog Search or Google's other search services.

At first glance, it seems to say that it is the same data as in Subscriber stats, but it is not, because that's not what you see: 62 != 21 + 3. Secondly, again, it's a summation, and I would like to have more detail.

As you can see in the screenshot, it's the old FeedBurner interface. I should have switched to the new interface to see if the information is there. Wait, where is the link to switch the interface version?

It's gone and I didn't notice until now. They probably dropped the development some time ago and didn't even bother to post an announcement on FeedBurner's blog. Oh, yeah, FeedBurner does have a blog; the last post was published in October 2010. You figure it out, right?

Why 62 != 21 + 3?

First of all, you need to know where the 21 and 3 come from. They are statistics provided by the Google Feedfetcher bot in its User-Agent. If you want to get the numbers for your Blogger blog, you need to redirect the feed to a piece of code that intercepts the user agent and then redirects (again) to the real feed, for instance /feeds/posts/default?orderby=update, which I think is actually where FeedBurner gets your content.

The bottom line is, if FeedBurner does not provide such detailed information, you need an intermediate server to catch that information in order to get individual numbers instead of just a summation.

Back to the question: Why does my blog have 62 subscribers in FeedBurner when Webmaster Tools only reports 21 + 3?
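To make the "catching" part concrete, here is a small Python sketch that pulls the subscriber count out of a User-Agent string. The sample strings follow the format I believe Feedfetcher used ("N subscribers" inside the parentheses); treat that format and the `feed-id` values as assumptions, not something taken from real logs.

```python
import re

# Matches "21 subscribers" or "1 subscriber" anywhere in the string.
SUBSCRIBERS_RE = re.compile(r"(\d+)\s+subscribers?", re.I)


def subscriber_count(user_agent):
    """Return the subscriber count reported in a User-Agent, or None."""
    match = SUBSCRIBERS_RE.search(user_agent or "")
    return int(match.group(1)) if match else None


# Assumed examples of what the bot might send for the two feed types.
atom_ua = ("Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; "
           "21 subscribers; feed-id=0000)")
rss_ua = ("Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; "
          "3 subscribers; feed-id=1111)")

counts = {"atom": subscriber_count(atom_ua), "rss": subscriber_count(rss_ua)}
```

Because each feed URL is requested separately, recording the count per requested URL is what would keep the Atom and RSS numbers apart.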

This is because I have other Blogger blogs' feeds redirected to the same FeedBurner feed after I imported them into this blog.

Now, you should be aware of what the problems are:
  1. You can't distinguish the subscribers of different types of feed.
  2. You can't tell the subscribers apart if they are redirected from different sources.
These two problems are fundamentally the same problem. FeedBurner sums up all subscriber counts reported by Feedfetcher, for direct or redirected requests, with or without a query string. As long as the redirected requests are made to the same burned feed URL, they are all summed up as one number.

If you don't care, that's fine; you are satisfied with one big chunk of a number. I care more about this kind of detail. I want to know what type of feed is being subscribed to and how many subscribers come from my old blogs' feeds. Once Webmaster Tools removes Subscriber stats, I can't know that unless I write a simple script to record the numbers. It's not hard, but I do not want to do that.

In fact, it only takes a few essential lines on Google App Engine. I wonder if there is a service which redirects to a specific location and logs all HTTP headers for incoming requests.
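For what it's worth, those few essential lines might look something like the plain WSGI sketch below. It is not tied to App Engine, the host in `REAL_FEED` is a placeholder, and the feed path is copied as written in this post.

```python
import logging

logger = logging.getLogger("feed_redirect")

# Placeholder host; the path is the one mentioned in the post.
REAL_FEED = "http://example.blogspot.com/feeds/posts/default?orderby=update"


def request_headers(environ):
    """Collect HTTP request headers from a WSGI environ as a dict."""
    return {
        key[5:].replace("_", "-").title(): value
        for key, value in environ.items()
        if key.startswith("HTTP_")
    }


def app(environ, start_response):
    """Log the request headers, then redirect to the real feed."""
    # Where this ends up depends on how logging is configured by the host.
    logger.info(
        "path=%s query=%s headers=%r",
        environ.get("PATH_INFO", ""),
        environ.get("QUERY_STRING", ""),
        request_headers(environ),
    )
    start_response("302 Found", [("Location", REAL_FEED)])
    return [b""]
```

Any WSGI server can run `app`; for a quick local try, `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` is enough.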

However, there is a way to get the Atom feed subscriber count, which I have known about for a long time. If I search for my blog in Google Reader, I get exactly 21 subscribers reported for my blog's feed. The problem is that the search function is somewhat slow, and it feels strange to do such a task every week.

[edited 2012-04-29T16:05:52Z: An easier way to get the subscriber count in Google Reader is to subscribe to the feed and get the count via Feed settings... View details and statistics. I guess I have to shamelessly subscribe to my own blog feed for the count.]

To be fair, I am not surprised that the poster doesn't know about this (I assume). Google has too many products already, and it's hard to know every bit. If you tested their employees, I am sure most of them couldn't even name half of the products which are still in active development. In my case, you need to have such experience to know there are some details missing in FeedBurner's report.

My conclusion is that FeedBurner's Subscribers tab is not a replacement for Subscriber stats in Webmaster Tools. Unfortunately, I will have to live with it.

Besides this issue, there are other removals, such as the generation of robots.txt. I have no problem with that, because I never used it. But some may find it handy; maybe they should open source that part as a standalone page.

The last one is Site Performance, which I had used once. I did want to see more from it, but I didn't know why there was no new data. Google Analytics has since started to provide much more thorough data, down to each single page, which I posted about a few days ago.

I just spent around 30 minutes clicking on profiles on Google Plus, trying to find some profiles that have Blogger blogs listed in the Contributor to section. But I can't find one, not even Blogger's Plus.

I have been seeing some traffic from Google Plus (http://plus.url.google.com/...) and the URL attached to that referral link is FeedBurner's. So, it's clear to me that someone was able to click a link from my blog's feed on Google Plus.

But I have never posted my blog posts on Plus when I published a new post, so it can't be me, can it? And that might not be a link via FeedBurner but a direct link; I am not sure what Blogger puts in the Plus post for the blog post link.

Anyway, it's not me, after 30 minutes of trying to find a profile. I was thinking of adding a profile to a circle (follow?) so I could see what it is. Well, actually, I have no idea about anything on Plus. I just wanted to reproduce it by finding a profile like mine.

I didn't find one, but finally I realized that I could just search for the post title, then I found the source. It was someone who shared via Google Reader. Duh, all mysteries are solved at once.

Google Reader uses my FeedBurner feed and it can post on Plus when you want to share, a feature I stopped using because I don't want to share on Plus. I used to add my comments, but since Google started pushing and squeezing everything of Google into that 4-letter word P-L-U-S, I don't like doing so any more.

During the profile hunting, I saw a few posts about programming. Some of them have code included. Heck! That's just like reading NASA launch procedure programming code in variable-width text. It's gonna crash, at least in my brain. And...

Wait, where was I? Oh, right, the referrer thing...

So, what does this post tell you?

Simple, only two things:
  1. I wasted 30 minutes on clicking profiles.
  2. You just wasted 3 minutes on reading this post.

In the last few days, I got three Disqus comments and Disqus did email me about them. I use email notification to know if I have new comments.

Two of them were marked as spam by Gmail.

But both times, I only found out by seeing the comment on the post's page. The first time, I was editing that post. The second time, moments ago, I scrolled down the home page to see how many posts I had published this month using the Archive dropdown list. I saw that the post at the bottom had one comment, and that's how I learned about that comment.

In the first case, the comment has six links, five to YouTube and one to Vimeo. It's a real comment, not spam at all. The second comment has no links, only a simple question: "What is the table id?"

It's a real one, too. As I said before, spam detection isn't the solution. It's not fighting but avoiding the truth, which is that we have lots of spam bombing us. It's like someone who hates cockroaches or mice but doesn't kill them: he catches them and moves them out of his house. But they keep coming back and breeding more and more. All of this guy's energy is spent moving them out. Silly.

These two incidents were not the first. I had saved a few from the spam folder a few times before, and they were lucky. I currently have 450 spam emails over the last 30-day period, so that's 15 spam emails a day. I guess I will clean/check my spam folder every day from now on.

I now subscribe to Disqus comments feed of this blog. Just in case.

May I call myself a Spam Detection Victim if someday someone does want to give me one million dollars and Gmail puts it into the spam folder?

I have just merged five old blogs of mine into this blog, yes five blogs! See how many I have here.

364 posts merged, that is.

If you are reading this post with a mass of posts in your feed aggregator, that's because I also redirected the old feeds to this blog's feed. You should unsubscribe from the old feeds and subscribe to /feeds/posts/default.

I will write a post about how I merged those posts into this blog. I am still doing some tasks to finish up.

PS. just a side note, it seems that Blogger has a good protection mechanism:

2010-09-28--11:02:48

PPS. This post becomes #500.

Google announced their latest creation: Google New.

Nice but still missing a couple of things I would like to see:
I think a simple feed would be a huge plus for Google New. You can manually subscribe to every product's blog feed. But you wouldn't be interested in all Google products, so it's unlikely that you would find all the feeds and subscribe to them, not to mention that it's not easy to subscribe to all of them or even to learn that there is a new product.

So, if there is a feed for this, I might unsubscribe from all Google feeds and use only this one.

There are also a couple of interesting things in the source code of Google New:

    <script src="/newproducts/js/modernizr-1.5.min.js"></script>

    <script src="/newproducts/js/main.js?10"></script>
  </head>
  <!--[if lt IE 7 ]><body class="ie6"><![endif]-->
  <!--[if IE 7 ]><body class="ie7"><![endif]-->
  <!--[if IE 8 ]><body class="ie8"><![endif]-->
  <!--[if IE 9 ]><body class="ie9"><![endif]-->
  <!--[if (gt IE 9)|!(IE)]><!--><body><!--<![endif]-->

It uses the Modernizr library and includes some messy code for IEs. IE is always so special, even between its own different versions.

I just updated to the latest nightly build. At first, I noticed the loading indicator was gone and there is now a loading progress bar above background tabs when they are loading pages, which is very nice and neat.


But then, I found out that the feed icon in the location bar is missing! It is discussed here, and I honestly can't understand why they develop backwards. (I didn't read the comments.)

Do I have to get an RSS Subscription Extension (by Mozilla) add-on like the one in Chrome?

Currently, the fastest way I know of to see if there are feeds is to check the Page Info dialog.
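Outside the browser, checking a page for feeds comes down to looking for `<link rel="alternate">` elements with a feed MIME type. Here is a small Python sketch of that check; the sample HTML is made up, and fetching the page is left out.

```python
from html.parser import HTMLParser

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate"> feed elements."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower()
        if "alternate" in rels and mime in FEED_TYPES and attr.get("href"):
            self.feeds.append(attr["href"])


def find_feeds(html):
    """Return the feed URLs advertised in an HTML document."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


sample_html = """
<html><head>
<link rel="stylesheet" type="text/css" href="/style.css">
<link rel="alternate" type="application/atom+xml" href="/feeds/posts/default">
<link rel="alternate" type="application/rss+xml" href="/feeds/posts/default?alt=rss" />
</head><body></body></html>
"""
feeds = find_feeds(sample_html)
```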

Google Reader started to recommend blogs, and then some special items, for your own personal tastes. I like them. But there is a problem still unresolved for blog recommendations: some just keep showing up in that list again and again, even every day. (Google Reader gives you new recommendations every day.)

Recently, I started to read recommended items.


I have problems with it too: Not interested has no shortcut key, so I have to use the mouse to click on it after I open the item using the o shortcut key. Hence I rarely use that function, and that causes the recommendations to become less interesting to me.

The other issue is common to both recommended items and recommended sources: we should have a blacklist feature, something like Never show this source again.

I have very special requirements for feeds I would like to read in Google Reader. First, the feed must not be an excerpt-only feed. Second, the feed must still be presented in a good style in Reader. The opened item in the screenshot above is an example. It's an excerpt and the layout has been removed, and it's not an interesting item to me.

And the recommended sources should never give you sources which have no recent items, as you will see an empty page unless you switch to All items mode. Well, a new option to choose whether we want those sources would be great.

Just a few thoughts about Google Reader, but wgasa, just use Mark all as read.

Warning

This project is dead and some links are removed. (2015-12-02T00:31:55Z)

A quick post. I wrote a script for GAE that generates a feed of the latest topics/posts on Gentoo Forums.

Note

Yahoo! Pipes is gone and all links have been removed from this post. (2015-12-11T02:14:17Z)

I saw a posting on FriendFeed: the poster wants a Super Feed which has the contents of his FriendFeed Feed and his Comments + Likes Feed. I think Yahoo Pipes can do this task easily, and it does. However, Pipes is extremely slow right now; it wasn't about three months ago. In other words, I haven't used Pipes for three months. I hope the issue is just temporary.

Here is the full view of this Pipe:

http://3.bp.blogspot.com/_CLdf4ORfzWk/SNLXd163TMI/AAAAAAAABOo/LdCJdxC87nA/s600-R/FFF%2BCL.png

It uses three types of modules:

  • Fetch Feed: At the top, they grab your two feeds.
  • Union: In the middle, it combines all its inputs, which are your two feeds in this case.
  • Sort: You need to sort the output of Union, because the output of Union is all items in the first feed, then all items in the second feed. That isn't what we expect; it should be sorted by published date (time).

I think you can clone my pipe and replace the feed sources with yours (in the Fetch Feed modules). If you can't, just drag and drop by following the figure above. You can also check out the RSS feed directly.
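Since the pipe itself no longer exists, here is roughly what Union followed by Sort amounts to, sketched in Python. The items are made-up dicts with a `published` datetime, and sorting newest first is my assumption about the order the pipe produced.

```python
from datetime import datetime, timezone


def union_and_sort(*feeds):
    """Concatenate feeds (Union), then order items by date (Sort)."""
    merged = [item for feed in feeds for item in feed]
    # Newest first; this direction is an assumption.
    return sorted(merged, key=lambda item: item["published"], reverse=True)


def at(hour):
    """Helper for the made-up sample data: a UTC time on one fixed day."""
    return datetime(2008, 9, 18, hour, 0, tzinfo=timezone.utc)


# Stand-ins for the two Fetch Feed modules.
own_feed = [
    {"title": "my post B", "published": at(12)},
    {"title": "my post A", "published": at(8)},
]
comments_likes_feed = [
    {"title": "liked C", "published": at(10)},
]
super_feed = union_and_sort(own_feed, comments_likes_feed)
```

Without the sort, `super_feed` would list both of the first feed's items before anything from the second, which is exactly the problem the Sort module fixes.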