As of 2016-02-26, there will be no more posts for this blog. s/blog/pba/

I haven't used Google App Engine for a while, and I just needed to deploy a one-line change in my code. When I tried to push the code, I was prompted about an SDK update. It can be ignored, but I always update the SDK when a newer version is around.

I wrote the following Bash script to handle the update for that one-line change:


This script downloads the latest version, checks the SHA1 checksum (which I never bothered with when I updated manually), removes the existing version, then unzips into the current directory. It doesn't compare versions and only works with the Python SDK for Linux.

It could be buggy: I ran it exactly once after I finished writing it, and it did what I expected. So, basically, it's untested. Use at your own risk.
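
For illustration only, here is a rough Python sketch of the same steps; the download URL pattern, the version, and the expected checksum are assumptions you would fill in from the download page yourself:

#!/usr/bin/env python
# Rough sketch only: download the Linux/Python SDK zip, verify its SHA-1,
# remove the old SDK directory, and unzip into the current directory.
# The URL pattern, version, and checksum below are placeholders/assumptions.
import hashlib
import os
import shutil
import urllib2
import zipfile

VERSION = '1.7.3'                     # target version (placeholder)
EXPECTED_SHA1 = '<sha1-from-download-page>'
URL = ('https://googleappengine.googlecode.com/files/'
       'google_appengine_%s.zip' % VERSION)

def update_sdk():
    zip_name = 'google_appengine_%s.zip' % VERSION
    # Download the SDK zip.
    data = urllib2.urlopen(URL).read()
    with open(zip_name, 'wb') as f:
        f.write(data)
    # Verify the SHA-1 checksum before touching the existing SDK.
    if hashlib.sha1(data).hexdigest() != EXPECTED_SHA1:
        raise SystemExit('SHA-1 mismatch, aborting')
    # Remove the existing SDK directory, then unzip the new one here.
    if os.path.isdir('google_appengine'):
        shutil.rmtree('google_appengine')
    zipfile.ZipFile(zip_name).extractall('.')

if __name__ == '__main__':
    update_sdk()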

I used to do those steps manually. It didn't really take much time, less than a minute of typing commands I would guess, but it's a boring task whenever the Google App Engine SDK gets a new update, and that's quite frequent. As of 1.7.3, counting from the beginning of 2012 (v1.6.2), it's the ninth release of the year, roughly one release per month.

I have wasted quite some time on that task, but not anymore. I don't even need to search for the download page (I never bookmarked it); from now on I only need to run the script whenever I'm told a new version is available.

Warning

The project is dead and some links have been removed from this post. (2015-12-02T00:25:07Z)

ItchApe is a simple service that lets you show off your ape's current itch to the world. You scratch your ape, and its itch can be read by anyone.

1   Features

  • An itch can be described in up to 140 characters. (It's not a bird, it's an ape!) Every character is shown literally; no HTML takes effect.

2   Notes

  • An itch is kept for up to an hour, but there is no guarantee, since itches are stored in the memory cache.
  • Itches are not stored in the datastore at all. Once they are gone from the memory cache, they are gone.

3   Get started

3.1   Adopt an Ape

You need to adopt an ape first: after you submit your Secret Phrase, you will get a Secret Key and an Ape ID. Make sure you write down all three pieces of information.

3.2   Install the code

Once you have your Ape ID, you can install the following HTML code:

<script src="https://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js"></script>
<script src="http://lilbtn.appspot.com/itchape/itchape.js"></script>
<script>get_itch('<YOUR_APE_ID>', 'itchdiv')</script>
<div>My ItchApe: <span id="itchdiv"></span></div>

The itch will be shown in itchdiv. It may read like:

My ItchApe: This is my itch (3 minutes ago)

3.3   Scratch your ape

You can scratch your ape by entering the description of the itch along with the phrase, key, and ID.

3.4   Scripts

There are two basic Bash scripts, one for scratching and one for getting the itch; you can download them from Google Code.

4   Developer Information

4.1   Rendered code

The HTML code rendered by /itchape/itchape.js looks like:

<span class="itch">The description of itch.</span> <span class="itch_timesince">(3 minutes ago)</span>

4.2   /itchape/getitch.json API

If you want to write your own script, here is how to get the itch. Send a GET request to http://lilbtn.appspot.com/itchape/getitch.json?ape_id=<APE_ID>; the returned data is JSON, or JSONP if you also pass callback in the query string:

{
  "ape_says": "...",
  "itch": "...",
  "scratched_at": 123456789.123
}
  • ape_says is actually the status/error message; it may have any of the values listed in the ape_says section below.
  • itch is the description of the itch.
  • scratched_at is the time the ape got scratched, in seconds since the Unix epoch, as a float.
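
For example, a minimal Python client for this call might look like the following sketch (standard library only; the Ape ID is a placeholder):

# Minimal sketch of a getitch.json client; APE_ID is a placeholder.
import json
import urllib
import urllib2

APE_ID = '<YOUR_APE_ID>'
url = ('http://lilbtn.appspot.com/itchape/getitch.json?' +
       urllib.urlencode({'ape_id': APE_ID}))
data = json.loads(urllib2.urlopen(url).read())
print data['ape_says']
if data.get('itch'):
    print '%s (scratched at %s)' % (data['itch'], data['scratched_at'])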

4.3   /itchape/scratch API

A GET request returns a normal page; a POST request is the API call for scratching.

You need to supply secret_phrase, secret_key, ape_id, and itch. If the call succeeds, the data is sent back as if you had made a getitch.json call; if not, you will get this JSON: {"ape_says":"I'm not your ape"}.

You can also supply callback for JSONP.
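
A scratch call could then be sketched like this (every value is a placeholder):

# Sketch of scratching via a POST request; all values here are placeholders.
import json
import urllib
import urllib2

params = urllib.urlencode({
    'secret_phrase': '<YOUR_SECRET_PHRASE>',
    'secret_key': '<YOUR_SECRET_KEY>',
    'ape_id': '<YOUR_APE_ID>',
    'itch': 'Right between the shoulders',
})
resp = urllib2.urlopen('http://lilbtn.appspot.com/itchape/scratch', params)
data = json.loads(resp.read())
print data['ape_says']  # e.g. "Oooh... that feels good!" on success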

4.4   ape_says (error message)

  • "Yeah, I was itching for that!": An itch description is retrieved successfully.
  • "Not itching, yet!": There is no data in memory cache for that Ape ID.
  • "I'm not your ape!": The phrase, key, and ID do not match, there you cant scratch this ape.
  • "Oooh... that feels good!": Scratch is successful and wonderful.

You have to parse these messages; there are no error codes or a simple true/false to tell whether the call succeeded. Apes don't know what an API is; they say what they want.

5   Support

If you have anything to report or request, please submit an issue to the issue tracker.

Warning

The projects are dead and some links have been removed from this post. (2015-12-02T00:35:06Z)

Twimonial and Let Secrets Out (LSO) are my 7th and 6th GAE apps; I created both within a month. I decided to post about them because I couldn't get anyone to use them.

You can read about why I created LSO on its blog; the code is licensed under the modified BSD license. Here is a screenshot of it:

https://lh6.googleusercontent.com/-omMX0E2efTM/Sya5PyOkUvI/AAAAAAAACYI/ib3Dnms1hbU/s800/gae-gallery-screenshot.png

As the title describes, it's a place that lets you post your secrets anonymously. I believe I created it for the good of the world, but I just couldn't get it to the people who need it.

Twimonial is a webapp that lets you read or add testimonials about other Twitter users. I got the idea when I saw someone tweet a Follow Friday recommendation. It was just a list of usernames, and I wondered: would anyone really follow someone just by seeing that? I really doubted it; at least, I would not. And a screenshot of it:

https://lh5.googleusercontent.com/-2wwZgfYquEc/Sya5WVbpRZI/AAAAAAAACYM/CoBu7vOD0dE/s800/screenshot.png

So I thought: what if I could read more about those Twitter users? That would be a testimonial, and here comes Twimonial. The code is not released, not because I didn't want to, but because I wondered why I should waste my time again; not many people are interested in my stuff. The only code I got someone (probably only one person) to use is BRPS, which might be the only thing I could say I really made! I think I should also mention a failure of mine, I Thank.

If you are interested in participating or giving feedback, feel free to contact me or leave a comment. If you would use any of them, I just want to say: you are the best!

PS. I also submitted a link to reddit for Twimonial.

Note

Someone asked if I could make Twimonial support Identi.ca, and I did, but as a separate app called Dentimonial. Go check it out if you are an Identi.ca user. (2009-12-16)

https://lh3.googleusercontent.com/-aBJbS_QjOhw/SyghgeYpCNI/AAAAAAAACYQ/V4A4lSYCZHU/s800/screenshot.png

This article, Updating Your Model's Schema, is already great and clear, but it does not have a complete code example, so I decided to make one and write down some explanations, just in case I need it later.

There are four steps to remove a property from a data model:
  1. Make the model inherit from db.Expando if it does not already.
  2. Remove the obsolete property from the model definition.
  3. Delete the attribute (the property) from each entity: del entity.obsolete.
  4. Switch back to inheriting from db.Model if that is what the model originally inherited from.

How to actually do it:

Assume a model looks like:
class MyModel(db.Model):
    foo = db.TextProperty()
    obsolete = db.TextProperty()

Redefine the model as:
class MyModel(db.Expando):
#class MyModel(db.Model):
    foo = db.TextProperty()
#   obsolete = db.TextProperty()

Make sure the model inherits from db.Expando, and comment out (or just delete) the line with the obsolete property.

Here is the example code that deletes the attribute (the property):

from google.appengine.runtime import DeadlineExceededError

def del_obsolete(self):

    count = 0
    last_key = ''
    try:
        q = MyModel.all()
        cont = self.request.get('continue')
        if cont:
            q.filter('__key__ >=', db.Key(cont))
        q.order('__key__')
        entities = q.fetch(100)
        while entities:
            for entity in entities:
                last_key = str(entity.key())
                try:
                    del entity.obsolete
                except AttributeError:
                    pass
                entity.put()
                count += 1
            q.filter('__key__ >', entities[-1].key())
            entities = q.fetch(100)
    except DeadlineExceededError:
        self.response.out.write('%d processed, please continue to %s?continue=%s' % (count, self.request.path_url, last_key))
        return
    self.response.out.write('%d processed, all done.' % count)

Note that this snippet is to be used as a webapp.RequestHandler's get method, so it has self.response.

It uses entities' keys to walk through every entity, which is efficient and safe. You may also want to put your application into maintenance mode to prevent other code from adding new entities; even though new entities appear to get ever-increasing keys, there is no need to waste CPU time on them, since new entities have no obsolete property.

Because it has to go through all entities, it takes a lot of time to process, so a mechanism to continue with the rest of the entities is necessary. The code catches google.appengine.runtime.DeadlineExceededError if it cannot finish within one request, then returns a link which lets you continue where it left off. If you have lots of entities, you may want to use a task instead of manual continuation, as sketched below. You may also want to cap the number of entities processed in one request, say 1000.
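
Here is a rough sketch of that task-based continuation. It assumes the taskqueue API (google.appengine.api.taskqueue; on older SDKs it lived under google.appengine.api.labs.taskqueue), and the handler simply re-enqueues itself when the deadline hits instead of asking you to follow a link:

# Rough sketch, not the article's code: resume via the task queue.
from google.appengine.api import taskqueue
from google.appengine.ext import db, webapp
from google.appengine.runtime import DeadlineExceededError

class DelObsoleteTask(webapp.RequestHandler):
    def post(self):
        last_key = self.request.get('continue')
        q = MyModel.all()
        if last_key:
            q.filter('__key__ >=', db.Key(last_key))
        q.order('__key__')
        try:
            entities = q.fetch(100)
            while entities:
                for entity in entities:
                    last_key = str(entity.key())
                    try:
                        del entity.obsolete
                    except AttributeError:
                        pass
                    entity.put()
                q.filter('__key__ >', entities[-1].key())
                entities = q.fetch(100)
        except DeadlineExceededError:
            # Resume from where we stopped by enqueueing another task.
            taskqueue.add(url=self.request.path, params={'continue': last_key})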

Once it has done its job, change the model definition back to db.Model and remove the obsolete property line:
class MyModel(db.Model):
    foo = db.TextProperty()


That's it.

I needed to count how many entities of kind Blog have the boolean property accepted set to True, but I suddenly realized that OFFSET in a query is of no use to me (in fact, it is not really useful at all).

In SDK 1.1.0, OFFSET does what you would expect on the Development Server if you are new to GAE and have SQL experience, but it still behaves differently than on the Production Server.

Basically, if you have 1002 entities of Blog and you want to get the 1002nd one, the following will not get you that entity:
q = Blog.all()
# Doing filter here
# Order here
# Then fetch
r = q.fetch(1, 0)[0] # 1st
r = q.fetch(1, 1)[0] # 2nd
r = q.fetch(1, 999)[0] # 1000th
r = q.fetch(1, 1000)[0] # 1001st
r = q.fetch(1, 1001)[0] # 1002nd

You will get an exception on the last one like:
BadRequestError: Offset may not be above 1000.
BadRequestError: Too big query offset.
The first one is from the Production Server, the second from the Development Server.

The OFFSET only takes effect after the datastore has:
  1. filtered the data (WHERE clause),
  2. sorted the data (ORDER clause), and
  3. truncated the result to the first 1001 entities (even though count() returns at most 1000).
Only after filtering, sorting, and truncating to the first 1001 entities do you get your OFFSET. If you have read Updating Your Model's Schema, it warns you:
A word of caution: when writing a query that retrieves entities in batches, avoid OFFSET (which doesn't work for large sets of data) and instead limit the amount of data returned by using a WHERE condition.
So the only way is to filter the data (a WHERE clause), and you need a unique property if you want to walk through all entities.

The amazing thing is that you don't need to create a new property; there is already one in all of your kinds: __key__ in the query, i.e. the key.

The benefits of using it:
  • No additional property,
  • No additional index (because it is already built by default), and
  • As a consequence of the two above, no additional datastore quota; both indexes and properties consume quota.
Here is the code snippet that I use to count Blog entities; you should be able to adapt it if you need to process data:
def get_count(q):
    r = q.fetch(1000)
    count = 0
    while True:
        count += len(r)
        if len(r) < 1000:
            break
        q.filter('__key__ >', r[-1])
        r = q.fetch(1000)
    return count

q = db.Query(blog.Blog, keys_only=True)
q.order('__key__')
total_count = get_count(q)

q = db.Query(blog.Blog, keys_only=True)
q.filter('accepted =', True)
q.order('__key__')
accepted_count = get_count(q)

q = db.Query(blog.Blog, keys_only=True)
q.filter('accepted =', False)
q.order('__key__')
blocked_count = get_count(q)

Note that:
  • You should remove keys_only=True if you need to process data, and you will then need to use r[-1].key() to filter; see the sketch after this list.
  • You should add resume functionality, because this really uses a lot of CPU time when working on a large set of data.
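
Here is a minimal sketch of the same key walk over full entities (no keys_only), for example to re-save every Blog entity; the processing function is just a stand-in:

def process_all(q, fn):
    # Walk all entities in key order, 1000 at a time, applying fn to each.
    r = q.fetch(1000)
    while r:
        for entity in r:
            fn(entity)
        if len(r) < 1000:
            break
        q.filter('__key__ >', r[-1].key())  # note .key() on an entity here
        r = q.fetch(1000)

q = blog.Blog.all()
q.order('__key__')
process_all(q, lambda e: e.put())  # replace the lambda with real processing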

I just downloaded the data from one of my App Engine applications by following Uploading and Downloading. I used the new and experimental bulkloader.py to download data into a SQLite3 database; you don't need to create the Loader/Exporter classes with this new method.

It does explain how to download and upload, but the upload instructions only cover the production server. You have to dig into the command-line options yourself; it's not complicated.

Here is a complete example to dump data:
$ python googleappengine/python/bulkloader.py --dump --kind=Kind --url=http://app-id.appspot.com/remote_api --filename=app-id-Kind.db /path/to/app.yaml/
[INFO ] Logging to bulkloader-log-20091111.001712
[INFO ] Throttling transfers:
[INFO ] Bandwidth: 250000 bytes/second
[INFO ] HTTP connections: 8/second
[INFO ] Entities inserted/fetched/modified: 20/second
[INFO ] Opening database: bulkloader-progress-20091111.001712.sql3
[INFO ] Opening database: bulkloader-results-20091111.001712.sql3
[INFO ] Connecting to brps.appspot.com/remote_api
Please enter login credentials for app-id.appspot.com
Email: [email protected]
Password for [email protected]:
.[INFO ] Kind: No descending index on __key__, performing serial download
.......................................................................................................................................................................................
.................................
[INFO ] Have 2160 entities, 0 previously transferred
[INFO ] 2160 entities (0 bytes) transferred in 134.6 seconds

And the following uploads to the Development Server using the SQLite3 database we just downloaded (not the CSV):
$ python googleappengine/python/bulkloader.py --restore --kind=Kind --url=http://localhost:8080/remote_api --filename=app-id-Kind.db --app_id=app-id
[INFO ] Logging to bulkloader-log-20091111.004013
[INFO ] Throttling transfers:
[INFO ] Bandwidth: 250000 bytes/second
[INFO ] HTTP connections: 8/second
[INFO ] Entities inserted/fetched/modified: 20/second
[INFO ] Opening database: bulkloader-progress-20091111.004013.sql3
Please enter login credentials for localhost
Email: [email protected] <- This does not matter, type anything
Password for [email protected]: <- Does not matter
[INFO ] Connecting to localhost:8080/remote_api
[INFO ] Starting import; maximum 10 entities per post
........................................................................................................................................................................................................................
[INFO ] 2160 entities total, 0 previously transferred
[INFO ] 2160 entities (0 bytes) transferred in 31.3 seconds
[INFO ] All entities successfully transferred

You will need to specify the app id, and it must match the app the Development Server is running.

This may not be needed once bulkloader.py is stable.

I just tried to add two entity counts to my app's statistics page. Then I found out that the statistics API (released on 2009-10-13 in version 1.2.6) is not available on the development server.

You can run the following code without errors:
from google.appengine.ext.db import stats
global_stat = stats.GlobalStat.all().get()

But global_stat is always None.

So I ended up with code as follows:
db_blog_count = memcache.get('db_blog_count')
if db_blog_count is None:
    blog_stat = stats.KindStat.all().filter('kind_name =', 'Blog').get()
    if blog_stat is None:
        db_blog_count = 'Unavailable'
    else:
        db_blog_count = blog_stat.count
    memcache.set('db_blog_count', db_blog_count, 3600)

The documentation didn't explicitly mention whether the statistics are available on the development server or not (maybe I didn't read carefully enough), and neither did the Release Notes.

PS. I know the code is awful, with str/int types mixed, terrible. But I am too lazy to add an if clause in the template file to check whether db_blog_count is None, or something like -1, or anything else that represents unavailable data.

PS2. The fourth line should really be just if blog_stat:, with the next two assignments swapped; see the snippet below, if you know what I meant.
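
That is, the middle of the snippet would become:

blog_stat = stats.KindStat.all().filter('kind_name =', 'Blog').get()
if blog_stat:
    db_blog_count = blog_stat.count
else:
    db_blog_count = 'Unavailable'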

Warning

This project is dead and some links are removed. (2015-12-02T00:31:55Z)

A quick post: I wrote a script for GAE that generates a feed of the latest Gentoo Forums topics/posts.

Google App Engine just announced that the free quota will be reduced in 90 days, by 2009-05-25. The changes in detail are:
  • CPU Time: 46.3 hours down to 6.5 hours, about 14% remaining.
  • Bandwidth In/Out: 10.0 GB down to 1.0 GB, 10% remaining.
It's not all reductions: they also doubled the storage quota, from 0.5 GB to 1.0 GB.

If you sign in to the dashboard now, you will need to agree to the new Terms of Service, and then you will see the new billing section. The most important change in the ToS is possibly 4.4: "You may not develop multiple Applications to simulate or act as a single Application or otherwise access the Service in a manner intended to avoid incurring fees."

Even though the free CPU Time and Bandwidth quotas are cut by a lot, my apps will stay within the free quota; they are not hot. :)

Recently, my GAE application started to get a few timeouts on datastore operations.

Here is a sample traceback:
datastore timeout: operation took too long.
Traceback (most recent call last):
File "/base/python_lib/versions/1/google/appengine/ext/webapp/__init__.py", line 498, in __call__
handler.get(*groups)
File "/base/data/home/apps/brps/1.330624965687476780/index.py", line 104, in get
p = post.get(blog_id, post_id)
File "/base/data/home/apps/brps/1.330624965687476780/brps/post.py", line 85, in get
p = db.run_in_transaction(transaction_update_relates, blog_id, post_id, relates)
File "/base/python_lib/versions/1/google/appengine/api/datastore.py", line 1451, in RunInTransaction
raise _ToDatastoreError(err)
File "/base/python_lib/versions/1/google/appengine/api/datastore.py", line 1637, in _ToDatastoreError
raise errors[err.application_error](err.error_detail)
Timeout: datastore timeout: operation took too long.

Here is how you can catch it:
from google.appengine.api.datastore_errors import Timeout

try:
    pass
except Timeout:
    pass
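
As a rough usage sketch, you could retry the call a few times before giving up; get_post() below is a hypothetical stand-in for the datastore work shown in the traceback above:

from google.appengine.api.datastore_errors import Timeout

def get_with_retry(blog_id, post_id, retries=3):
    # Retry the datastore operation on Timeout, re-raising on the last attempt.
    for attempt in range(retries):
        try:
            return get_post(blog_id, post_id)  # hypothetical datastore call
        except Timeout:
            if attempt == retries - 1:
                raise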

Updated on 2008-11-22: 1.1.6 had a short life; it has been replaced by 1.1.7 in order to fix #877.

Google has released version 1.1.6 of the Google App Engine SDK. You can read the announcement blog post and the release notes. The main improvements are closer matching of the production server environment and more capabilities for model key names.

I categorized the release notes:

Development Server
Production Server

Datastore
URLFetch
  • URLFetch response headers are combined. #412
  • URLFetch now uses original method when following a redirect. #363
  • URLFetch logs a warning when using a non standard port. #436
  • URLFetch allows integers as values in request headers.
Miscellaneous
  • Fixed an issue with regular expressions in static_files in app.yaml. #711
  • Support the bufsize positional arg in open()/file().
  • lstat is aliased to stat.
  • appcfg handles index building errors more gracefully.
  • Fixed an issue with symlinks in the path to the Python core libraries.
Links: Discussion thread for 1.1.6

Today, Last Tweets ran into another issue. It's about encoding:

d = {'msg': u'Is still rather 17\xb0 in Auckland.Brr'}
print d['msg']
# Is still rather 17° in Auckland.Brr
print pickle.dumps(d, 0)
# "(dp0\nS'msg'\np1\nVIs still rather 17\xb0 in Auckland.Brr\np2\ns."

As you can see, \xb0 is not ASCII. If you assign the pickled result to a db.TextProperty, you will see a traceback like:

pickle.dumps(d, 0).decode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb0 in position 34: ordinal not in range(128)
TextProperty tries to decode with the ASCII codec if you assign a str.

A bigger character set can resolve this issue:

print pickle.dumps(d, 0).decode('latin-1')
# u"(dp0\nS'msg'\np1\nVIs still rather 17\xb0 in Auckland.Brr\np2\ns."
to_db = pickle.dumps(d, 0).decode('latin-1')
print pickle.loads(to_db.encode('latin-1'))
# {'msg': u'Is still rather 17\xb0 in Auckland.Brr'}
print pickle.loads(to_db.encode('latin-1'))['msg']
# Is still rather 17° in Auckland.Brr

Working code should look like:

model.my_text = db.Text(pickle.dumps(my_dict), encoding='latin-1') 
When the model sets my_text, it sees a db.Text object and won't try to decode it. I think you can also give it a unicode object directly (not tested):
model.my_text = pickle.dumps(my_dict).decode('latin-1') 

On Development Server

When I use the Mail API with sendmail, following the example in the Sending Mail doc, the recipient has to be a pure email address. It cannot be
User <[email protected]>
or sendmail complains:
INFO     2008-10-29 06:57:53,884 mail_stub.py] MailService.Send
INFO     2008-10-29 06:57:53,884 mail_stub.py]   From: [email protected]
INFO     2008-10-29 06:57:53,885 mail_stub.py]   To: User <[email protected]>
INFO     2008-10-29 06:57:53,885 mail_stub.py]   Subject: Your account has been approved
INFO     2008-10-29 06:57:53,885 mail_stub.py]   Body:
INFO     2008-10-29 06:57:53,885 mail_stub.py]     Content-type: text/plain
INFO     2008-10-29 06:57:53,885 mail_stub.py]     Data length: 261
/bin/sh: -c: line 0: syntax error near unexpected token `newline'
/bin/sh: -c: line 0: `sendmail User <[email protected]>'
ERROR    2008-10-29 06:57:53,927 mail_stub.py] Error sending mail using sendmail: [Errno 32] Broken pipe
I think this can be fixed by patching the mail_stub.py.

On Production Server

The sender is also restricted, per the documentation:
The sender must be the email address of a registered administrator for the application, or the address of the current signed-in user.
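
Putting the two constraints together, a send that should work in both places might look like this sketch (addresses are placeholders):

# Sketch only: plain recipient address (no "User <...>" form, for the
# sendmail-backed development server) and an administrator as the sender.
from google.appengine.api import mail

mail.send_mail(
    sender='[email protected]',      # a registered admin of the app
    to='[email protected]',            # plain address, no display name
    subject='Your account has been approved',
    body='Welcome aboard!')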