As of 2016-02-26, there will be no more posts for this blog. s/blog/pba/
Showing posts with label shell scripting.

When I was browsing, I saw this entry, Block the 6700 worst spamhosts: (URL edited for plain text file)
wget -q -O - | grep ^127 >> /etc/hosts

As of writing (2012-03-24T08:01:18Z), the list, made by Dan Pollock, has grown to 9,502 domains. That is insane! See how many spam websites we have; although not all of them are spam, some of the entries are legitimate advertising distributors.

To be honest, I was really tempted to use it, but the huge number of entries held me back completely.

If you want to try it, I can propose a short script to run as a system cron task. I didn't test it and I am writing it on the fly, so use it at your own risk:
cd /etc
# just in case you haven't saved the current hosts as hosts.local
[[ ! -f hosts.local ]] && exit 1
if [[ "$(curl -z hosts.hosts -o hosts.hosts -s -L -w '%{http_code}')" == "200" ]]; then
  cat hosts.local hosts.hosts > hosts
fi

You will need to run this once as root first:

cp /etc/hosts{,.local}
The script concatenates your current hosts with the one downloaded from the website. Set up a daily cron task for it; it will only download the file when it has been updated, using the method described in this blog post.
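For example, a daily cron entry could look like the following. The schedule and the script path are assumptions; point it at wherever you saved the script above:

```shell
# Hypothetical system crontab entry (e.g. in /etc/crontab or /etc/cron.d/):
# run the hosts updater every day at 03:00 as root.
0 3 * * * root /usr/local/sbin/update-hosts.sh
```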

Be sure to read the comments on the website, which also provide some different modifications and even an RSS feed for notifications.

A few days ago, I read Bash Script Templates and thought it might be a good idea to start one of my own. I later found two more posts about the topic; such posts are not easy to find. Either the search engine was playing with me or I was too dumb to enter the perfect search terms.

I read them and took some concepts from all three. Some I like, some I don't. I mixed them together with my own thoughts and created a new repository:. Having its own repo could be a good idea, so you can contribute to it easily.

It's still a work in progress, but the first version already has 177 lines (159 sloc). I have been thinking that a library could be a better idea, especially when I was writing the parse_options function. It could work like argument parsing in C or in Python with their libraries, and the code would be cleaner.
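For illustration, a parse_options-style helper in plain Bash can be built on the getopts builtin. This is just a sketch with made-up option names (-v, -o), not the template's actual code:

```shell
#!/usr/bin/env bash
# Sketch of option parsing with the getopts builtin; -v and -o are
# hypothetical options chosen for the example.

VERBOSE=0
OUTPUT=''

parse_options() {
  local opt
  while getopts 'vo:h' opt; do
    case $opt in
      v) VERBOSE=1 ;;        # enable verbose output
      o) OUTPUT=$OPTARG ;;   # output file name
      h) echo "Usage: $0 [-v] [-o FILE] ARGS..."; exit 0 ;;
      *) exit 1 ;;           # getopts already printed an error
    esac
  done
  shift $((OPTIND - 1))
  ARGS=("$@")                # remaining positional arguments
}

parse_options -v -o out.txt foo bar
echo "verbose=$VERBOSE output=$OUTPUT args=${ARGS[*]}"
# prints: verbose=1 output=out.txt args=foo bar
```

The nice part of keeping this in a function is that a template only needs to fill in the case branches.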

I had also thought of creating a user interface, where you could click buttons or answer questions to enable/disable features, and a template would then be generated. But that's a lot of work, and I am not sure it's worth it. I didn't go that way eventually; it's just a simple file.

Please be aware that I haven't used it yet, so I don't really know whether it's practical enough. If you run into trouble, please open an issue. Any feedback is welcome; leave a comment.

As I mentioned in the previous post about Call Stack, the thing is ready on GitHub. It's a library for easy logging; here is a quick screenshot of the examples:

As you can see, it's colorful, a little too fancy; I hope it didn't blind you. It's new, so I hope someone can jump in and improve it. There are some things in it I don't like, e.g. the templating.

It's easy to use: just source it, then you are good to go. Please read more on GitHub.

More than a year ago, I wrote It is very helpful to me; I use it to show the uptime and the Portage tree's freshness. I recently moved it to GitHub, even though it's only one short script, less than 100 lines, 19 of which are MIT License text. It still deserves its own repository.

It's fairly easy to use, you just feed it the number of seconds.
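The core arithmetic is simple. Here is a minimal sketch of the idea (not the actual script, which has more options), assuming the hypothetical function name sec_to_human:

```shell
# Convert a number of seconds into a human-readable duration,
# dropping zero-valued units and pluralizing where needed.
sec_to_human() {
  local s=$1 d h m
  local parts=()
  (( s < 0 )) && (( s = -s ))          # a negative value is treated as an age
  (( d = s / 86400, s %= 86400 ))
  (( h = s / 3600,  s %= 3600 ))
  (( m = s / 60,    s %= 60 ))
  (( d )) && parts+=("$d day$( (( d > 1 )) && echo s )")
  (( h )) && parts+=("$h hour$( (( h > 1 )) && echo s )")
  (( m )) && parts+=("$m minute$( (( m > 1 )) && echo s )")
  (( s )) && parts+=("$s second$( (( s > 1 )) && echo s )")
  (( ${#parts[@]} )) || parts=('0 seconds')
  echo "${parts[*]}"
}

sec_to_human 61        # prints: 1 minute 1 second
sec_to_human 31622401  # prints: 366 days 1 second
```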

After moving it to GitHub, I added a few features because of a picky alignment issue of mine. I have a script which records Unix timestamps, computes the ages of those timestamps relative to the current time, and shows them using They were not aligned, and it looked really ugly, as you can see in the first case below.

$ ./ 1 2 60 61
1 second
2 seconds
1 minute
1 minute 1 second
$ ./ -p -P 1 2 60 61
 1 second 
 2 seconds
 1 minute 
 1 minute   1 second 
$ ./ -p -P -a 1 2 60 61
 0 days  0 hours  0 minutes  1 second 
 0 days  0 hours  0 minutes  2 seconds
 0 days  0 hours  1 minute   0 seconds
 0 days  0 hours  1 minute   1 second 

Now it looks great. I also updated the test script:

$ ./ 
Passed:          0            => "0 seconds"
Passed:          1            => "1 second"
Passed:         -1            => "1 second"
Passed:          2            => "2 seconds"
Passed:         60            => "1 minute"
Passed:         61            => "1 minute 1 second"
Passed:       3599            => "59 minutes 59 seconds"
Passed:       3600            => "1 hour"
Passed:       3601            => "1 hour 1 second"
Passed:       3660            => "1 hour 1 minute"
Passed:       3661            => "1 hour 1 minute 1 second"
Passed:      86400            => "1 day"
Passed:     172799            => "1 day 23 hours 59 minutes 59 seconds"
Passed:     259199            => "2 days 23 hours 59 minutes 59 seconds"
Passed:   31622401            => "366 days 1 second"
Passed:          1 -P         => "1 second "
Passed:         60 -P         => "1 minute "
Passed:         60 -P -p      => " 1 minute "
Passed:         60 -P -p0     => "01 minute "
Passed:         60 -p -a      => " 0 days  0 hours  1 minute  0 seconds"
Passed:         60 -P -a      => "0 days 0 hours 1 minute  0 seconds"
Passed:         60 -P -p -a   => " 0 days  0 hours  1 minute   0 seconds"
0 failures of 22 tests.
1652 conversions per second via function calls.
183 conversions per second via script executions.

If you compare the performance, it's down about 10% after the new features were added, but I think I can live with that.

I recently found out I could use qlop -gH package or qlop -tH package to get merge time:

But it doesn't have an option to list the packages merged in a session, so I wrote a script that parses /var/log/emerge.log on its own.

I tried to mimic the result format.

The merge time calculation is different from qlop's, so you might see a difference of a few seconds. The script uses sed to filter out unwanted merge log entries and keep the last merge, then uses awk to format the output. You probably noticed the interrupted entry in the screenshot1 above; it's the result of a user interruption (pressing Ctrl+C) while merging. The timestamp is the start time, not the end time of merging as shown by qlop; I am just too lazy to change my code.
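To give a rough idea of the parsing (this is a sketch, not the actual script): each ">>> emerge" start line in emerge.log can be paired with its "::: completed emerge" line by package name, and the merge time is the difference of the two leading timestamps. The sample lines below follow the standard emerge.log format, but treat the exact patterns as assumptions and adjust them to your log:

```shell
# Build a tiny sample log; for real use, point awk at /var/log/emerge.log.
sample=$(mktemp)
cat > "$sample" <<'EOF'
1332576000:  >>> emerge (1 of 2) app-misc/screen-4.0.3 to /
1332576123:  ::: completed emerge (1 of 2) app-misc/screen-4.0.3 to /
EOF

# -F: makes $1 the Unix timestamp; split() picks the package atom out of
# the rest of the line, and completed - start gives the merge time.
merge_times=$(awk -F: '
  />>> emerge \(/ { split($0, a, " "); start[a[7]] = $1 }
  /::: completed emerge/ {
    split($0, a, " "); pkg = a[8]
    if (pkg in start) printf "%s: %d seconds\n", pkg, $1 - start[pkg]
  }' "$sample")

echo "$merge_times"   # prints: app-misc/screen-4.0.3: 123 seconds
rm -f "$sample"
```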

Of course, there is also the last sync time.

[1] The screenshot shows the result of feeding the script a hand-modified raw log.

$ sqlite3 places.sqlite 'select rev_host, sum(visit_count), sum(visit_count) * 100.0 / (select sum(visit_count) from moz_places) from moz_places group by rev_host order by sum(visit_count) desc limit 20;' | sed 's/\.|/ /;s/|/ /' | while read rev_host count percent ; do printf '%7d (%6.2f%%) %s\n' $count $percent $(echo "$rev_host" | rev) ; done
  18182 ( 23.25%)
  14258 ( 18.23%)
   4585 (  5.86%)
   3652 (  4.67%)
   2994 (  3.83%)
   2973 (  3.80%)
   1809 (  2.31%) localhost
   1683 (  2.15%)
   1338 (  1.71%)
   1175 (  1.50%) dfed
   1033 (  1.32%)
    991 (  1.27%)
    764 (  0.98%)
    740 (  0.95%)
    658 (  0.84%)
    655 (  0.84%)
    569 (  0.73%)
    559 (  0.71%)
    552 (  0.71%)
    521 (  0.67%)

I really need to quit visiting this page.

You can find places.sqlite in ~/.mozilla/firefox/<profile>/. The counts do not match what the Page Info dialog reports; I have no idea what causes the differences. The Page Info dialog gives smaller counts.

I didn't see any place where you can get such a list, so I decided to dig into Firefox's database. It's quite interesting that it has a reverse host field rather than a host field: the characters of the host string in reverse. To get the original order back, just pass it to rev.
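In shell, undoing the reversal is a one-liner. The sample value below is made up; as far as I can tell rev_host is stored with a trailing dot, which is why the one-liner in the query above strips ".|" with sed first:

```shell
rev_host='moc.elpmaxe.www.'                    # hypothetical moz_places.rev_host value
host=$(printf '%s\n' "${rev_host%.}" | rev)    # drop the trailing dot, then reverse
echo "$host"                                   # prints: www.example.com
```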