Dr. Dobb's Journal February 1998: What's New in Python 1.5?

By Guido van Rossum

Guido, Python's creator, works at the Corporation for National Research Initiatives in Reston, Virginia. He can be contacted at [email protected].

Python 1.5 has some powerful improvements over previous versions of the language. I'll briefly describe some of the major modifications here. For more information, see the Python web site at http://www.python.org/.

Packages. Perhaps the most important change is the addition of packages. A Python "package" is a named collection of modules, grouped together in a directory. A similar feature was available in earlier releases through the ni module (named after the Knights Who Say "new import"), but was found to be too important to be optional. Starting with 1.5, it is a standard feature, reimplemented in C, although it is not exactly compatible with ni.

A package directory must contain a file __init__.py -- this prevents subdirectories that happen to be on the path or in the current directory from accidentally preempting modules with the same name. (The __init__.py file was optional with ni.) When the package is first imported, the __init__.py file is loaded in the package namespace. (This is the other main incompatibility.)

For example, the package named "test" (in the Python 1.5 library) contains the expanded regression test suite. The driver for the regression test is the submodule regrtest, and the tests are run by invoking the function main() in this submodule. There are several ways to invoke it:

import test.regrtest
test.regrtest.main()

If you don't want to use fully qualified names for imported functions and modules, you can write:

from test import regrtest
regrtest.main()

or even:

from test.regrtest import main
main()
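As a rough sketch of the directory layout described above, the following builds a throwaway package in a temporary directory and imports a submodule from it. The names demo_pkg, greet, and loaded are hypothetical and used only for illustration, and the helper modules (tempfile.mkdtemp, importlib) assume a current Python interpreter rather than 1.5 itself.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk: a directory plus an __init__.py file.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")   # hypothetical package name
os.mkdir(pkg_dir)

# __init__.py marks the directory as a package; its code runs in the
# package namespace the first time the package is imported.
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("loaded = 1\n")

# A submodule with a main() function, mirroring test.regrtest.main().
with open(os.path.join(pkg_dir, "greet.py"), "w") as f:
    f.write("def main():\n    return 'hello from demo_pkg.greet'\n")

# Make the parent directory importable, then import the submodule.
sys.path.insert(0, root)
importlib.invalidate_caches()

import demo_pkg.greet

result = demo_pkg.greet.main()   # fully qualified call
init_ran = demo_pkg.loaded       # name defined by __init__.py
```

Removing the __init__.py file from this layout would mean the directory is no longer treated as the kind of package described here.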

Assertions. There's now an assert statement to ease the coding of input requirements and algorithm invariants. For example,

assert x >= 0

will raise an AssertionError exception when x is negative. The argument can be any Boolean expression. An optional second argument can give a specific error message; for example:

assert L <= x <= R, "x out of range"

Once a program is debugged, the assert statements can be disabled without editing the source code by invoking the Python interpreter with the -O command-line flag. This also removes code like this:

if __debug__: statements

This form can be used for coding more complicated requirements, such as a loop asserting that all items in a list have the same type.
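A minimal sketch of such a loop might look like the following. The function name check_same_type and its message text are hypothetical; the point is only that the whole block under if __debug__: is skipped when the interpreter runs with -O.

```python
def check_same_type(items):
    """Return items unchanged; in debug mode, verify they share one type."""
    if __debug__:
        # This whole block is skipped when Python runs with -O.
        for item in items[1:]:
            assert type(item) is type(items[0]), "mixed types in list"
    return items

check_same_type([1, 2, 3])            # passes silently

try:
    check_same_type([1, "two", 3])
    mixed_detected = False
except AssertionError:
    mixed_detected = True             # reached only when asserts are enabled
```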

Perl-style regular expressions. A new module, re, provides a new interface to regular expressions. The regular expression syntax supported by this module is identical to that of Perl 5.0 to the extent that this is feasible, with Python-specific extensions to support named subgroups. The interface has been redesigned to allow sharing of compiled regular expressions between multiple threads. A new form of string literals, dubbed "raw strings" and written as r"...", has been introduced, in which backslash interpretation by the Python parser is turned off. Example 5, for instance, searches for identifiers and integers in its argument string.

import re, sys
text = sys.argv[1]
prog = re.compile(
    r"\b([a-z_]\w*|\d+)\b",
    re.IGNORECASE)
hit = prog.search(text)
while hit:
    print hit.span(1),
    print hit.group(1)
    hit = prog.search(text,
                      hit.end(0))

Example 5: Using Python 1.5 regular expressions.
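The named subgroups and raw strings mentioned above can be sketched as follows. The pattern and the sample input "timeout = 30" are made up for illustration; the (?P<name>...) syntax and the group() method are part of the re module.

```python
import re

# (?P<name>...) is the Python-specific syntax for a named subgroup.
pattern = re.compile(r"(?P<key>[a-z_]\w*)\s*=\s*(?P<value>\d+)",
                     re.IGNORECASE)

hit = pattern.search("timeout = 30")   # hypothetical input string
key = hit.group("key")                 # look a subgroup up by name
value = hit.group("value")

# In a raw string the backslash reaches the re module untouched, so the
# raw form and the doubled-backslash form denote the same three characters.
same_pattern = (r"\d+" == "\\d+")
```

Named groups keep working when the pattern is later extended with additional parenthesized groups, which is not true of numeric group indexes.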

Standard exception classes. All standard exceptions are now classes. There's a (shallow) hierarchy of exceptions, with Exception at the root of all exception classes, and its subclass StandardError as the base class of all standard exception classes. Since this is a potential compatibility problem (some code that expects exception objects to be string objects will inevitably break), it can be turned off by invoking the Python interpreter with the -X command-line flag. To minimize the incompatibilities, str() of a class object returns the full class name (prefixed with the module name) and list/tuple assignment now accepts any sequence with the proper length on the right side.
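One practical consequence of a class hierarchy is that an except clause naming a base class also catches its subclasses. The sketch below illustrates this with ArithmeticError and LookupError; the helper classify() is hypothetical, and the example deliberately avoids StandardError so that it also runs on current interpreters, where that name no longer exists.

```python
def classify(func):
    """Call func() and report which branch of the hierarchy it raised from."""
    try:
        func()
    except ArithmeticError:      # base class of ZeroDivisionError, OverflowError
        return "arithmetic"
    except LookupError:          # base class of IndexError and KeyError
        return "lookup"
    except Exception:            # root of the hierarchy catches the rest
        return "other"
    return "ok"

kinds = [
    classify(lambda: 1 / 0),       # ZeroDivisionError
    classify(lambda: [][0]),       # IndexError
    classify(lambda: {}["k"]),     # KeyError
    classify(lambda: int("x")),    # ValueError
    classify(lambda: None),        # no exception
]
```

The order of the except clauses matters: the most general class goes last, otherwise it would shadow the more specific ones.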

Performance. The 1.5 implementation has been benchmarked as being up to twice as fast as Python 1.4. The standard Python benchmark, pystone, is now included in the test package (import test.pystone; test.pystone.main()).

The biggest speed increase is obtained in the dictionary lookup code. It is aided by a better, more uniformly randomizing hash function for string objects, and automatic "string interning" for all identifiers used in a program (this turns string comparisons into more efficient pointer comparisons). Some new dictionary methods make faster code possible if you don't mind changing your program: d.clear(), d.copy(), d.update(), d.get().
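A short sketch of the four dictionary methods listed above, using a made-up counts dictionary:

```python
counts = {"spam": 2}

# d.get(key, default) combines a membership test and a lookup in one call.
eggs = counts.get("eggs", 0)

# d.copy() makes a shallow copy; d.update() merges another dictionary in,
# overwriting values for keys that already exist.
merged = counts.copy()
merged.update({"eggs": 1, "spam": 3})

# The copy is independent of the original at the top level.
original_spam = counts["spam"]

# d.clear() empties the dictionary in place.
counts.clear()
```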

Other speed increases include some inlining of common operations and improved flow control in the main loop of the virtual machine.

I/O speed has also been improved. On some platforms (notably Windows), file.read() on large files is dramatically faster because it now checks the file size and allocates a buffer of that size, instead of extending the buffer a few KB at a time.

Miscellaneous. The default module search path is chosen much more intelligently, so that a binary distribution for UNIX no longer requires a fixed installation directory. There are also provisions for site additions to the path without recompilation.

If you are embedding Python in an application of your own, you will appreciate the vastly simplified linking process -- everything is now in a single library. There's also much improved support for non-Python threads, multiple interpreters, and explicit finalization and reinitialization of the interpreter.

For those of us who like to read the source, the code now uses a uniform naming scheme (the "Great Renaming") wherein all names have a "Py" prefix. For example, the function known as getlistitem() is now called PyList_GetItem().

DDJ


Copyright © 1998, Dr. Dobb's Journal

