2008年7月23日水曜日

UnicodeEncodeError と webapp.RequestHandler

class MainPage(webapp.RequestHandler):
 の中で
str.decode("utf-8",'ignore')
 とすると
UnicodeEncodeError
 となるようなのですが、
webapp.RequestHandler を介さない別のところで処理している場合、
エラーは発生していない。

UnicodeEncodeError はまだよく理解できていないところがありますが。

メモ

UnicodeEncodeError: 'ascii' codec can't encode characters in position 422-424
: ordinal not in range(128)

Unicode 文字列をバイト列に符号化 (encode) するとき, range(128) つまり 0 から 127 までの文字コードしか扱えない 'ascii' エンコーディングが使われ,日本語文字に対して例外 UnicodeEncodeError が起こったもの
http://www.okisoft.co.jp/esc/cygwin-15a.html

webapp を調べたよ (前編) - Google App Engine
http://d.hatena.ne.jp/hamatsu1974/20080422/1208802967

2008年7月17日木曜日

DeadlineExceededError

DeadlineExceededError のエラー対応を検討

対応例
http://stage.vambenepe.com/archives/category/implementation

結果 google/appengine/runtime/apiproxy.py を参考に

from google.appengine import runtime
from google.appengine.runtime import apiproxy_errors
from google3.apphosting.runtime import _apphosting_runtime___python__apiproxy

これらを import して対応

except runtime.DeadlineExceededError:

File "/base/python_lib/versions/1/google/appengine/runtime/apiproxy.py", line 161, in Wait
rpc_completed = _apphosting_runtime___python__apiproxy.Wait(self)
DeadlineExceededError
だけでなく CancelleError というものもある。
File "/base/python_lib/versions/1/google/appengine/runtime/apiproxy.py", line 189, in CheckSuccess
raise self.exception
CancelledError: The API call datastore_v3.Count() was explicitly cancelled.

2008年7月8日火曜日

Key Limitations: 500 bytes

key_name を設定すると、その分 key が長くなる。
parent を設定すると、親の key が自分の key の先頭に付く。
parent が parent を持っていると、孫の key は長くなる。
限界が 500 byte

従って、当然、 親を設定し、さらにこれに親が設定されていれば
ancestor is  により祖先を検索条件に指定することができる。
b = db.GqlQuery("select * from Blog where ancestor is :1 ", granpa.key())


Tools for storing data: Keys

* Key corresponds to the Bigtable row for an Entity
* Bigtable accessible as a distributed hashtable
* Get() by Key: Very fast! No scanning, just copying data


* Limitations:
o Only one ID or key_name per Entity
o Cannot change ID or key_name later
o 500 bytes

2008年7月7日月曜日

Building Scalable Web Applications/Building a Blog


from google.appengine.ext import db

class BlogIndex(db.Model):
max_index = db.IntegerProperty(required=True,default=0)

class BlogEntry(db.Model):
index = db.IntegerProperty(required=True)
title = db.StringProperty(required=True)
body = db.TextProperty(required=True)


def post_entry(blogname, title, body):
def txn():
blog_index = BlogIndex.get_by_key_name(blogname)
if blog_index is None:
blog_index = BlogIndex(key_name=blogname)
new_index = blog_index.max_index
blog_index.max_index += 1
blog_index.put()
new_entry = BlogEntry(key_name=blogname + str(new_index),parent=blog_index, index=new_index,title=title, body=body)
new_entry.put()
db.run_in_transaction(txn)

def get_entry(blogname, index):
entry = BlogEntry.get_by_key_name(blogname + str(index),parent=Key.from_path('BlogIndex',blogname))
return entry


def get_entries(start_index):
extra = None
if start_index is None:
entries = BlogEntry.gql(
'ORDER BY index DESC').fetch(POSTS_PER_PAGE + 1)
else:
start_index = int(start_index)
entries = BlogEntry.gql(
'WHERE index <= :1 ORDER BY index DESC', start_index).fetch(POSTS_PER_PAGE + 1) if len(entries) > POSTS_PER_PAGE:
extra = entries[-1]
entries = entries[:POSTS_PER_PAGE]
return entries, extra


Building Scalable Web Applications から

Disk Seek を意識している。
Oracle などで Create sequence するように Counter 専用のModel を作成して、
increment("xxxx")するようにしたほうがよいのか。


Writes are expensive!
  • Datastore is transactional: writes require disk access
    • Disk access means disk seeks


  • Rule of thumb: 10ms for a disk seek
  • Simple math:
    • 1s / 10ms = 100 seeks/sec maximum


  • Depends on:
    • The size and shape of your data
    • Doing work in batches (batch puts and gets)

Reads are cheap!
  • Reads do not need to be transactional, just consistent


  • Data is read from disk once, then it's easily cached
  • All subsequent reads come straight from memory


  • Rule of thumb: 250usec for 1MB of data from memory
  • Simple math:
    • 1s / 250usec = 4GB/sec maximum
    • For a 1MB entity, that's 4000 fetches/sec


Tools for storing data: Entities
  • Fundamental storage type in App Engine
  • Set of property name/value pairs
  • Most properties indexed and efficient to query
  • Other large properties not indexed (Blobs, Text)


  • Think of it as an object store, not relational
    • Kinds are like classes
    • Entities are like object instances
  • Relationship between Entities using Keys
    • Reference properties
    • One to many, many to many





Tools for storing data: Entity groups 2

Hierarchical

  • Each Entity may have a parent
  • A "root" node defines an Entity group
  • Hierarchy of child Entities can go many levels deep
    • Watch out! Serialized writes for all children of the root


Datastore scales wide

  • Each Entity group has serialized writes
  • No limit to the number of Entity groups to use in parallel
  • Think of it as many independent hierarchies of data







class CounterConfig(db.Model):
 name = db.StringProperty(required=True)
 num_shards = db.IntegerProperty(required=True,default=1)

class Counter(db.Model):
 name = db.StringProperty(required=True)
 count = db.IntegerProperty(required=True,default=0)

def get_count(name):
 total = 0
 for counter in Counter.gql('WHERE name = :1', name):
  total += counter.count
 return total

def increment(name):
 config = CounterConfig.get_or_insert(name,name=name)
 def txn():
  index = random.randint(0, config.num_shards - 1)
  shard_name = name + str(index)
  counter = Counter.get_by_key_name(shard_name)
  if counter is None:
   counter = Counter(key_name=shard_name, name=name)
  counter.count += 1
  counter.put()
 db.run_in_transaction(txn)

increment("test")
print get_count("test")


def get_count(name):
 total = memcache.get(name)
 if total is None:
 total = 0
 for counter in Counter.gql('WHERE name = :1', name):
  total += counter.count
  memcache.add(name, str(total), 60)
 return total

def increment(name):
 config = CounterConfig.get_or_insert(name,name=name)
 def txn():
  index = random.randint(0, config.num_shards - 1)
  shard_name = name + str(index)
  counter = Counter.get_by_key_name(shard_name)
  if counter is None:
   counter = Counter(key_name=shard_name,name=name)
  counter.count += 1
  counter.put()
 db.run_in_transaction(txn)
 memcache.incr(name)















http://sites.google.com/site/io/building-scalable-web-applications-with-google-app-engine

2008年7月2日水曜日

The parent of an entity is defined when the entity is created, and cannot be changed later.

The parent of an entity is defined when the entity is created, and cannot be changed later.
後から親は替えられない。
http://code.google.com/appengine/docs/datastore/keysandentitygroups.html

Entity Groups, Ancestors and Paths

Every entity belongs to an entity group, a set of one or more entities that can be manipulated in a single transaction. Entity group relationships tell App Engine to store several entities in the same part of the distributed network. A transaction sets up datastore operations for an entity group, and all of the operations are applied as a group, or not at all if the transaction fails.

When the application creates an entity, it can assign another entity as the parent of the new entity. Assigning a parent to a new entity puts the new entity in the same entity group as the parent entity.

An entity without a parent is a root entity. An entity that is a parent for another entity can also have a parent. A chain of parent entities from an entity up to the root is the path for the entity, and members of the path are the entity's ancestors. The parent of an entity is defined when the entity is created, and cannot be changed later.

Every entity with a given root entity as an ancestor is in the same entity group. All entities in a group are stored in the same datastore node. A single transaction can modify multiple entities in a single group, or add new entities to the group by making the new entity's parent an existing entity in the group.

2008年7月1日火曜日

インデックスが必要です no matching index found. This query needs this index

TemplateSyntaxError: Caught an exception while rendering: no matching index found
This query needs this index:
と指摘されたからといって、すぐに index.yaml を修正してはいけない。

1,000 件以上ある kind の先頭のいくらかのデータを
特に検索条件なしで ordery by をかけて fetch するだけのためにも
index が必要になることがあるが、
この index の作成には非常に時間がかかる。

それと並び替えなしの single-property には index は不要
Creating a composite index failed: ascending single-property indexes are not necessary

MacOS Sequoia で Apache+PHP の再設定

 新しいXcodeが使いたかったので MacOS15(Sequoia) にアップグレードしたところ、やはりApacheでPHPが動かなくなっていた。 結論としては brew, openssl, php, httpd を全て再度インストールしたところ動くようになった。 以下、作業ロ...