python计数器 - Python的隐藏功能





Doctest : documentation and unit-testing at the same time.

Example extracted from the Python documentation:

def factorial(n):
    """Return the factorial of n, an exact integer >= 0.

    If the result is small enough to fit in an int, return an int.
    Else return a long.

    >>> [factorial(n) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> factorial(-1)
    Traceback (most recent call last):
        ...
    ValueError: n must be >= 0

    Factorials of floats are OK, but the float must be an exact integer:
    """

    import math
    if not n >= 0:
        raise ValueError("n must be >= 0")
    if math.floor(n) != n:
        raise ValueError("n must be exact integer")
    if n+1 == n:  # catch a value like 1e300
        raise OverflowError("n too large")
    result = 1
    factor = 2
    while factor <= n:
        result *= factor
        factor += 1
    return result

def _test():
    import doctest
    doctest.testmod()    

if __name__ == "__main__":
    _test()

Named formatting

% -formatting takes a dictionary (also applies %i/%s etc. validation).

>>> print "The %(foo)s is %(bar)i." % {'foo': 'answer', 'bar':42}
The answer is 42.

>>> foo, bar = 'question', 123

>>> print "The %(foo)s is %(bar)i." % locals()
The question is 123.

And since locals() is also a dictionary, you can simply pass that as a dict and have % -substitions from your local variables. I think this is frowned upon, but simplifies things..

New Style Formatting

>>> print("The {foo} is {bar}".format(foo='answer', bar=42))

They're the magic behind a whole bunch of core Python features.

When you use dotted access to look up a member (eg, xy), Python first looks for the member in the instance dictionary. If it's not found, it looks for it in the class dictionary. If it finds it in the class dictionary, and the object implements the descriptor protocol, instead of just returning it, Python executes it. A descriptor is any class that implements the __get__ , __set__ , or __delete__ methods.

Here's how you'd implement your own (read-only) version of property using descriptors:

class Property(object):
    def __init__(self, fget):
        self.fget = fget

    def __get__(self, obj, type):
        if obj is None:
            return self
        return self.fget(obj)

and you'd use it just like the built-in property():

class MyClass(object):
    @Property
    def foo(self):
        return "Foo!"

Descriptors are used in Python to implement properties, bound methods, static methods, class methods and slots, amongst other things. Understanding them makes it easy to see why a lot of things that previously looked like Python 'quirks' are the way they are.

Raymond Hettinger has an excellent tutorial that does a much better job of describing them than I do.


字典有一个get()方法

字典有一个'get()'方法。 如果你做了['钥匙']并且钥匙不在那里,你会得到一个异常。 如果你做了d.get('key'),如果'key'不存在,你将返回None。 您可以添加第二个参数来获取该项目而不是None,例如:d.get('key',0)。

这对于添加数字这样的东西很有用:

sum[value] = sum.get(value, 0) + 1


for ... else语法(请参阅http://docs.python.org/ref/for.html

for i in foo:
    if i == 0:
        break
else:
    print("i was never 0")

除非中断被调用,否则“else”块通常会在for循环结束时执行。

上面的代码可以模拟如下:

found = False
for i in foo:
    if i == 0:
        found = True
        break
if not found: 
    print("i was never 0")

将值发送到生成器函数 。 例如有这个功能:

def mygen():
    """Yield 5 until something else is passed back via send()"""
    a = 5
    while True:
        f = (yield a) #yield a and possibly get f in return
        if f is not None: 
            a = f  #store the new value

您可以:

>>> g = mygen()
>>> g.next()
5
>>> g.next()
5
>>> g.send(7)  #we send this back to the generator
7
>>> g.next() #now it will yield 7 until we send something else
7

Interactive Interpreter Tab Completion

try:
    import readline
except ImportError:
    print "Unable to load readline module."
else:
    import rlcompleter
    readline.parse_and_bind("tab: complete")


>>> class myclass:
...    def function(self):
...       print "my function"
... 
>>> class_instance = myclass()
>>> class_instance.<TAB>
class_instance.__class__   class_instance.__module__
class_instance.__doc__     class_instance.function
>>> class_instance.f<TAB>unction()

You will also have to set a PYTHONSTARTUP environment variable.


Main messages :)

import this
# btw look at this module's source :)

De-cyphered :

The Zen of Python, by Tim Peters

美丽胜于丑陋。
显式比隐式更好。
简单胜于复杂。
复杂比复杂更好。
Flat is better than nested.
Sparse is better than dense.
可读性计数。
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess. There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


Re-raising exceptions :

# Python 2 syntax
try:
    some_operation()
except SomeError, e:
    if is_fatal(e):
        raise
    handle_nonfatal(e)

# Python 3 syntax
try:
    some_operation()
except SomeError as e:
    if is_fatal(e):
        raise
    handle_nonfatal(e)

The 'raise' statement with no arguments inside an error handler tells Python to re-raise the exception with the original traceback intact , allowing you to say "oh, sorry, sorry, I didn't mean to catch that, sorry, sorry."

If you wish to print, store or fiddle with the original traceback, you can get it with sys.exc_info(), and printing it like Python would is done with the 'traceback' module.


iter()可以采用可调参数

例如:

def seek_next_line(f):
    for c in iter(lambda: f.read(1),'\n'):
        pass

iter(callable, until_value)函数重复调用callable并产生结果,直到返回until_value


函数参数解包

您可以使用***将列表或字典解压缩为函数参数。

例如:

def draw_point(x, y):
    # do some magic

point_foo = (3, 4)
point_bar = {'y': 3, 'x': 2}

draw_point(*point_foo)
draw_point(**point_bar)

非常有用的快捷方式,因为列表,元组和字典被广泛用作容器。


创建生成器对象

如果你写

x=(n for n in foo if bar(n))

你可以走出发生器并将其分配给x。 现在这意味着你可以做

for n in x:

这样做的好处是,你不需要中间存储,如果你这样做,你将需要中间存储

x = [n for n in foo if bar(n)]

在某些情况下,这会导致显着的加速。

您可以将多个if语句附加到生成器的末尾,基本上可以复制嵌套for循环:

>>> n = ((a,b) for a in range(0,2) for b in range(4,6))
>>> for i in n:
...   print i 

(0, 4)
(0, 5)
(1, 4)
(1, 5)

可读的正则表达式

在Python中,您可以将正则表达式分成多行,为匹配命名并插入注释。

示例详细语法(从Dive into Python ):

>>> pattern = """
... ^                   # beginning of string
... M{0,4}              # thousands - 0 to 4 M's
... (CM|CD|D?C{0,3})    # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 C's),
...                     #            or 500-800 (D, followed by 0 to 3 C's)
... (XC|XL|L?X{0,3})    # tens - 90 (XC), 40 (XL), 0-30 (0 to 3 X's),
...                     #        or 50-80 (L, followed by 0 to 3 X's)
... (IX|IV|V?I{0,3})    # ones - 9 (IX), 4 (IV), 0-3 (0 to 3 I's),
...                     #        or 5-8 (V, followed by 0 to 3 I's)
... $                   # end of string
... """
>>> re.search(pattern, 'M', re.VERBOSE)

示例命名匹配(来自Regular Expression HOWTO

>>> p = re.compile(r'(?P<word>\b\w+\b)')
>>> m = p.search( '(((( Lots of punctuation )))' )
>>> m.group('word')
'Lots'

感谢字符串连接,你也可以在不使用re.VERBOSE情况下写一个正则表达式。

>>> pattern = (
...     "^"                 # beginning of string
...     "M{0,4}"            # thousands - 0 to 4 M's
...     "(CM|CD|D?C{0,3})"  # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 C's),
...                         #            or 500-800 (D, followed by 0 to 3 C's)
...     "(XC|XL|L?X{0,3})"  # tens - 90 (XC), 40 (XL), 0-30 (0 to 3 X's),
...                         #        or 50-80 (L, followed by 0 to 3 X's)
...     "(IX|IV|V?I{0,3})"  # ones - 9 (IX), 4 (IV), 0-3 (0 to 3 I's),
...                         #        or 5-8 (V, followed by 0 to 3 I's)
...     "$"                 # end of string
... )
>>> print pattern
"^M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$"

小心可变的默认参数

>>> def foo(x=[]):
...     x.append(1)
...     print x
... 
>>> foo()
[1]
>>> foo()
[1, 1]
>>> foo()
[1, 1, 1]

相反,您应该使用表示“未给定”的标记值,并将其替换为默认的可变值:

>>> def foo(x=None):
...     if x is None:
...         x = []
...     x.append(1)
...     print x
>>> foo()
[1]
>>> foo()
[1]

获取python正则表达式分析树来调试你的正则表达式。

正则表达式是python的一大特性,但调试它们可能会很痛苦,并且正确表达式很容易导致错误。

幸运的是,python可以通过将未记录的实验性隐藏标志re.DEBUG (实际上是128)传递给re.compile来打印正则表达式分析树。

>>> re.compile("^\[font(?:=(?P<size>[-+][0-9]{1,2}))?\](.*?)[/font]",
    re.DEBUG)
at at_beginning
literal 91
literal 102
literal 111
literal 110
literal 116
max_repeat 0 1
  subpattern None
    literal 61
    subpattern 1
      in
        literal 45
        literal 43
      max_repeat 1 2
        in
          range (48, 57)
literal 93
subpattern 2
  min_repeat 0 65535
    any None
in
  literal 47
  literal 102
  literal 111
  literal 110
  literal 116

一旦你理解了语法,你就可以发现你的错误。 在那里我们可以看到,我忘了在[/font]

当然,你可以将它与任何你想要的标志结合起来,比如评论正则表达式:

>>> re.compile("""
 ^              # start of a line
 \[font         # the font tag
 (?:=(?P<size>  # optional [font=+size]
 [-+][0-9]{1,2} # size specification
 ))?
 \]             # end of tag
 (.*?)          # text between the tags
 \[/font\]      # end of the tag
 """, re.DEBUG|re.VERBOSE|re.DOTALL)

装饰

Decorators允许在另一个函数中包装函数或方法,该函数可以添加功能,修改参数或结果等。您可以在函数定义的上方一行写入装饰器,并以“at”符号(@)开头。

示例显示了一个print_args装饰器,它在调用装饰函数的参数之前先打印它:

>>> def print_args(function):
>>>     def wrapper(*args, **kwargs):
>>>         print 'Arguments:', args, kwargs
>>>         return function(*args, **kwargs)
>>>     return wrapper

>>> @print_args
>>> def write(text):
>>>     print text

>>> write('foo')
Arguments: ('foo',) {}
foo

Nested list comprehensions and generator expressions:

[(i,j) for i in range(3) for j in range(i) ]    
((i,j) for i in range(4) for j in range(i) )

These can replace huge chunks of nested-loop code.


To add more python modules (espcially 3rd party ones), most people seem to use PYTHONPATH environment variables or they add symlinks or directories in their site-packages directories. Another way, is to use *.pth files. Here's the official python doc's explanation:

"The most convenient way [to modify python's search path] is to add a path configuration file to a directory that's already on Python's path, usually to the .../site-packages/ directory. Path configuration files have an extension of .pth, and each line must contain a single path that will be appended to sys.path. (Because the new paths are appended to sys.path, modules in the added directories will not override standard modules. This means you can't use this mechanism for installing fixed versions of standard modules.)"


切片运算符中的step参数。 例如:

a = [1,2,3,4,5]
>>> a[::2]  # iterate over the whole list in 2-increments
[1,3,5]

特殊情况x[::-1]是'x颠倒'的有用成语。

>>> a[::-1]
[5,4,3,2,1]

如果您不喜欢使用空白来表示作用域,则可以通过发出以下命令来使用C风格{}:

from __future__ import braces




hidden-features