INFO-LINK




How Perl Saved the Human Genome Project


DDJ, Spring97: How Perl Saved Human Genome Project

Lincoln is the director of informatics at the MIT Center for Genome Research, and the author of How to Set Up and Maintain a World Wide Web Site (Addison-Wesley, 1996). He can be contacted at lstein@genome.wi.mit.edu.


Insert

The human genome project was inaugurated at the beginning of the decade as an ambitious international effort to determine the complete DNA sequence of human beings and several experimental animals. The justification for this undertaking is both scientific and medical. By understanding the genetic makeup of an organism in excruciating detail, we hope to better understand how organisms develop from single eggs into complex multicellular beings, how food is metabolized and transformed into the constituents of the body, and how the nervous system assembles itself into a smoothly functioning ensemble. From the medical point of view, the wealth of knowledge that will come from knowing the complete DNA sequence will greatly accelerate the process of finding the causes of, and potential cures for, human diseases.

Six years after its inception, the genome project is ahead of schedule. Detailed maps of humans and experimental animals have been completed (mapping out the DNA using a series of landmarks is an obligatory first step before determining the complete DNA sequence). The sequence of the smallest model organism—yeast—is nearly complete, and the sequence of the next smallest—a tiny soil-dwelling worm—isn't far behind. Large-scale sequencing efforts for human DNA started at several centers several months ago and will be in full swing within the year.

The scale of the human DNA sequencing project is enough to send your average UNIX system administrator running for cover. From a descriptive standpoint, DNA is a very long string consisting of the four letters G, A, T, and C. These letters are abbreviations for the four chemical units that form the "rungs" of the DNA double-helix ladder. The goal of the project is to determine the order of letters in the string. The size of the string is impressive, but not particularly mind-boggling: 3*109 letters long, or some three gigabytes of storage space if you use one byte to store each letter and you use no compression.

Three gigabytes is substantial, but certainly manageable by today's standards. Unfortunately, this is just the storage space for the finished data. The storage requirements for the experimental data needed to determine this sequence is comparatively vast. The essential problem is that DNA sequencing technology is currently limited to reading stretches of, at most, 500 contiguous letters. In order to determine longer sequences, the DNA must be sequenced as small overlapping fragments called "reads." Then the jigsaw puzzle must be reassembled by algorithms that look for areas where the sequences match.

Because the DNA sequence is not random (similar, but not entirely identical, motifs appear many times throughout the genome) and because DNA sequencing technology is noisy and error-prone, each region of DNA must be sequenced five to ten times in order to reliably assemble the reads into the true sequence. This increases the amount of data to manage by an order of magnitude. In addition, all the associated information that goes along with laboratory work must be stored: who performed the experiment, when it was performed, which section of the genome was sequenced, the identity and version of the software used to assemble the sequence, any comments to be attached to the experiment, and so forth. Generally, one also wants to store the raw output from the machine that performs the sequencing itself. Each 500 letters of sequence generates a data file that's 20­30 KB long!

That's not the whole of it. It's not enough just to determine the sequence of the DNA. Within the sequence are functional areas scattered among long stretches of nonfunctional areas. There are genes, control regions, structural regions, and even a few viruses that got entangled in human DNA long ago and persist as fossilized remnants. Because the genes and control regions are responsible for health and disease, they have to be identified and marked as the DNA sequence is assembled. This type of annotation generates even more data that needs to be tracked and stored.

Altogether, it is estimated that some one to ten terabytes of information will be generated by the conclusion of the human genome project.

So What's Perl Got to Do With It?

From the beginning, researchers realized that informatics would have to play a large role in the genome project. An informatics core formed an integral part of every genome center that was created. The mission of this core was twofold: to provide computer support and database services for their affiliated laboratories, and to develop data analysis and management software for use by the genome community as a whole.

Initial results of the informatics groups' efforts were mixed. Things were slightly better on the laboratory-management side of the coin. Some groups attempted to build large, monolithic systems on top of complex relational databases. However, they were thwarted time and again by the highly dynamic nature of biological research. By the time the software engineers had designed, implemented, and debugged a system that could deal with the ins and outs of a complex laboratory protocol, new technology had superseded the protocol, and the software engineers had to go back to the drawing board.

Most groups, however, learned to build modular, loosely coupled systems with parts that could be swapped in and out without retooling the entire system. In my group, for example, we discovered that many data-analysis tasks involve a sequence of semi-independent steps.

Consider the steps that may be performed on a bit of newly sequenced DNA. First, there's a basic quality check on the sequence: Is it long enough, and are the number of ambiguous letters below the maximum limit? Then, there's the "vector check." For technical reasons, the human DNA must be passed through a bacterium before it can be sequenced (this is the process of "cloning"). Not infrequently, the human DNA gets lost somewhere in the process, and the sequence that's read consists entirely of the bacterial vector. The vector check ensures that only human DNA gets into the database.

Next, there's a check for repetitive sequences. Human DNA is full of repetitive elements that make fitting the sequencing jigsaw puzzle together challenging. The repetitive-sequence check tries to match the new sequence against a library of known repetitive elements. The penultimate step is to attempt to match the new sequence against other sequences in a large community database of DNA sequences. Often, a match at this point will provide a clue to the function of the new DNA sequence. After performing all these checks, the sequence (along with the information that's been gathered about it along the way) is loaded into the local laboratory database.

The process of passing a DNA sequence through these independent, analytic steps looks like a pipeline, and we realized that a UNIX pipe could handle the job. We developed a simple Perl-based data-exchange format called "boulderio" that allowed loosely coupled programs to add information to a pipe-based I/O stream. Boulderio is based on tag/value pairs. A Perl module makes it easy for programs to reach into the input stream, pull out only the tags it is interested in, do something with them, and drop new tags into the output stream. Tags that the program isn't interested in are just passed through to standard output so that other programs in the pipeline can get to them.

Using this type of scheme, the process of analyzing a new DNA sequence looks something like Example 1. This is not exactly the set of scripts that we use, but it's close enough. A file containing the new DNA sequence is processed by the Perl script name_sequence.pl; its only job is to give the sequence a new unique name and to put it into boulderio format. Figure 1 shows sample output from name_sequence.pl.

Example 1: Using several scripts to analyze a new DNA sequence.

name_sequence.pl < new.dna | quality_check.pl | vector_check.pl |\
        find_repeats.pl | search_big_database.pl | load_lab_database.pl 

Figure 1: Sample output from name_sequence.pl.

NAME=L26P93.2
SEQUENCE=GATTTCAGAGTCCCAGATTTCCCCCAGGGGGTTTCCAGAGAGCCC...... 

This output is next passed to the quality checking program, which looks for the SEQUENCE tag, runs the quality checking algorithm, and writes its conclusion to the data stream. The data stream now looks like Figure 2.

Figure 2: Output from the quality-checking program.

NAME=L26P93.2
SEQUENCE=GATTTCAGAGTCCCAGATTTCCCCCAGGGGGTTTCCAGAGAGCCC......
QUALITY_CHECK=OK

Next, the data stream enters the vector checker. It pulls the SEQUENCE tag out of the stream and runs the vector-checking algorithm. The data stream now looks like Figure 3.

Figure 3: The data stream from the vector checker.

NAME=L26P93.2
SEQUENCE=GATTTCAGAGTCCCAGATTTCCCCCAGGGGGTTTCCAGAGAGCCC......
QUALITY_CHECK=OK
VECTOR_CHECK=OK
VECTOR_START=10
VECTOR_LENGTH=300

This continues down the pipeline, until at last, the load_lab_database.pl script collates all the data collected, makes some final conclusions about whether the sequence is suitable for further use, and enters all the results into the laboratory database.

One of the nice features of the boulderio format is that multiple sequence records can be processed sequentially in the same UNIX pipeline. An equal sign marks the end of one record and the beginning of the next; see Figure 4. There is also a way to create subrecords within records, allowing for structured data types.

Figure 4: Multiple sequence records stored in boulderio format.

NAME=L26P93.2
SEQUENCE=GATTTCAGAGTCCCAGATTTCCCCCAGGGGGTTTCCAGAGAGCCC......
=
NAME=L26P93.3
SEQUENCE=CCCCTAGAGAGAGAGAGCCGAGTTCAAAGTCAAAACCCATTCTCTCTCCTC...
= 

Example 2 is a script that processes the boulderio format. It uses an object-oriented style in which records are pulled out of the input stream, modified, and dropped back in. More information about the boulderio format and the Perl libraries to manipulate it can be found at http://www.genome.wi.mit.edu/ftp/pub/software/boulderio/.

Example 2: A script that processes the boulderio format.

use Boulder::Stream;
$stream = new Boulder::Stream;
while ($record = $stream->read_record('NAME','SEQUENCE')) {
  $name = $record->get('NAME');
  $sequence = $record->get('SEQUENCE');
    .
  [continue processing]
    .
  $record->add(QUALITY_CHECK=>"OK");
  $stream->write_record($record);
} 

One interesting occurrence is that multiple informatics groups independently converged on solutions that were similar to the boulderio idea. For example, several groups involved in the worm sequencing project began using a data-exchange format called ".ace." Although this format was initially designed as the data dump and reload format for the ACE database (a specialized database for biological data), it happens to use a tag/value format that's very similar to boulderio. Soon, .ace files were being processed by Perl script pipelines and loaded into the ACE database at the very last step.

Perl was also useful in other aspects of laboratory management. For example, many centers (including my own) use web-based interfaces for displaying the status of projects and allowing researchers to take action. Perl scripts are the perfect engine for web CGI scripts. Similarly, Perl scripts run e-mail database query servers, supervise cron jobs, prepare nightly reports summarizing laboratory activity, create instruction files to control robots, and handle almost every other information management task that a busy genome center needs.

As far as laboratory management went, the informatics cores were reasonably successful. The systems-integration picture, however, was not so rosy.

The Challenge of Systems Integration

The problem will be familiar to anyone who has worked on a large, loosely organized software project. Despite the best intentions, the project begins to drift. Programmers go off to work on ideas that interest them, modules that need to interface with one another are designed independently, and the same problems get solved several times in different, mutually incompatible ways. When the time comes to put all the parts together, nothing works.

This happened in the genome project. Although everyone was working on the same problems, no two groups took exactly the same approach. Programs to solve a given problem were written multiple times, duplicating effort. While a given piece of software wasn't guaranteed to work better than its counterpart developed elsewhere, you could always count on it to sport its own idiosyncratic user interface and data format. A typical example is the central algorithm that assembles thousands of short DNA reads into an ordered set of overlaps. At last count, there were at least six different programs in widespread use, and no two of them use the same data input or output formats.

This lack of interchangeability presents a terrible dilemma for the genome centers. Without interchangeability, an informatics group is locked into using the software that it developed in-house. If another genome center develops a better software tool to attack the same problem, a tremendous effort is required by the first center to retool their system in order to use that tool.

The long-range solution to this problem is to come up with uniform data-interchange standards that genome software must adhere to. This would allow common modules to be swapped in and out easily. However, standards require time to develop, and while the various groups are involved in discussion and negotiation, there is still an urgent need to adapt existing software to the immediate needs of the genome centers.

This is where Perl again came to the rescue. In February 1996, a summit was called in Cambridge, England, in part to deal with the data-interchange problem. Despite the fact that the two groups involved were close collaborators and superficially seemed to be using the same tools to solve the same problems, on closer inspection, nothing they were doing was exactly the same.

The main software components in a DNA-sequencing project are:

  • A trace editor to analyze, display, and allow biologists to edit the short DNA read chromatograms from DNA-sequencing machines.

  • A read assembler to find overlaps between the reads and assemble them in long contiguous sections.

  • An assembly editor to view the assemblies and make changes in places where the assembler went wrong.

  • A database to keep track of everything.

Over the course of a few years, the two groups had developed suites of software that worked well in their respective groups. Following the familiar genome center model, some of the components were developed in house while others were imported from outside. Perl was used as the glue to fit these pieces together (Figure 5). Between each pair of interacting modules were one or more Perl scripts responsible for massaging the output of one module into the expected input for another.

Figure 5: Perl was used to aid communication between components developed separately by two groups.

Diagram

When the time came to interchange data, however, the two groups hit a snag. Between them, they were now using two trace editors, three assemblers, two assembly editors, and (thankfully) one database. If two Perl scripts were required for each pair of components (one for each direction), as many as 62 different scripts would be needed to handle all the possible interconversion tasks. Each time the input or ouput format of one of these modules changed, 14 scripts might need to be examined and fixed.

Figure 6 shows the solution that was worked out during this meeting. The two groups decided to adopt a common data-exchange format known as CAF (the exact designation of this acronym was forgotten over the course of the meeting). CAF would contain a superset of the data that each of the analysis and editing tools needed. For each module, two Perl scripts would be responsible for converting from CAF into whatever format Module A expected ("CAF2ModuleA") and converting Module A's output back into CAF ("ModuleA2CAF"). This simplified the programming and maintenance task considerably. Now there were only 16 Perl scripts to write; when one module changed, only two scripts would need to be examined.

Figure 6: A common data-exchange format (CAF) solution.

Diagram

This episode is not unique. Perl has been the solution of choice for genome centers whenever they need to exchange data or to retrofit one center's software module to work with another center's system.

Conclusion

Perl has become the software mainstay for computation within genome centers as well as the glue that binds them together. Although genome informatics groups are constantly tinkering with other "high-level" languages such as Python, Tcl, and recently Java, nothing comes close to Perl's popularity. How has Perl achieved this remarkable position? Several factors are responsible.

  • Perl is remarkably good for slicing, dicing, twisting, wringing, smoothing, summarizing, and otherwise mangling text. Although the biological sciences do involve a good deal of numeric analysis, most of the primary data is still text: clone names, annotations, comments, bibliographic references. Even DNA sequences are textlike. Converting incompatible data formats is a matter of text mangling combined with some creative guesswork. Perl's powerful regular-expression-matching and string-manipulation operators simplify this job in a manner unequaled by any other modern language.

  • Perl is forgiving. Biological data is often incomplete, fields can be missing, a field that is expected to be present once occurs several times (because, for example, an experiment was run in triplicate) or the data gets entered by hand and doesn't quite fit the expected format. Perl doesn't particularly mind if a value is empty or contains odd characters. Regular expressions can be written to detect and correct a variety of common errors in data entry. Of course, this flexibility can also be a curse, as I'll discuss in more detail later.

  • Perl is component-oriented. Perl encourages people to write their software in small modules, either using Perl library modules or the classic UNIX tool-oriented approach. External programs can easily be incorporated into a Perl script using a pipe, system call, or socket. The dynamic loader introduced in Perl 5 allows people to extend the Perl language with C routines or to make entire compiled libraries available for the Perl interpreter. An effort is currently under way to gather all the world's collected wisdom about biological data into a set of modules called "bioPerl."

  • Perl programs are easy to write and fast to develop. The interpreter doesn't require you to declare all your function prototypes and data types in advance, new variables spring into existence as needed, and calls to undefined functions only cause an error when the function is needed. The debugger works well with emacs and allows a comfortable interactive style of development.

  • Perl is a good prototyping language. Because Perl is quick and dirty, it often makes sense to prototype new algorithms in Perl before moving them to a fast compiled language. Sometimes, Perl is fast enough that the algorithm doesn't have to be ported. More frequently, one can write a small core of the algorithm in C, compile it as a dynamically loaded module or external executable, and leave the rest of the application in Perl (for an example of a complex genome-mapping application implemented in this way, see http://www.genome.wi.mit.edu/ftp/pub/software/RHMAPPER/).

  • Perl is a good language for web CGI scripting and is growing in importance as more labs turn to the Web for publishing their data.

While my experience in using Perl in a genome center environment has been extremely favorable overall, Perl does have its problems, too. Its relaxed programming style leads to many errors that stricter languages would catch. For example, Perl lets you read from a variable without predeclaring it or even assigning to it. This is a useful shortcut in some circumstances, but can be a disaster when you've simply mistyped the variable's name. (This type of error can be avoided using Perl's -w command-line switch and the new use strict pragma.)

There are more subtle gotchas in the language that are not so easy to fix. A major one is Perl's lack of type checking. Strings, floats, and integers all interchange easily. While this greatly speeds development, it can cause major headaches.

Consider a typical genome center Perl script that's responsible for recording the information of short named subsequences within a larger DNA sequence. When the script was written, the data format was expected to consist of tab-delimited fields: a string followed by two integers representing the name and the starting position and length of a DNA subsequence within a larger sequence. An easy way to parse this would be to split() the string into a list using the code in Example 3. Later in this script, some arithmetic is performed with the two integer values and the result is written to a database or to standard output for further processing.

Example 3: Splitting a string into a list.

($name,$start_position,$length) = split("\t"); 

Then one day, the input file's format changes without warning. Someone bumps the field count up by one by sticking a comment field between the name and the first integer. Now the unknowing script assigns a string to a variable that's expected to be numeric and silently discards the last field on the line. Rather than crashing or returning an error code, the script merrily performs integer arithmetic on a string, assuming a value of zero for the string (unless it happens to start with a digit). Although the calculation is meaningless, the output may look perfectly good, and the error may not be caught until much later.

In short, when the genome project was foundering in a sea of incompatible data formats, rapidly changing techniques, and monolithic data-analysis programs, Perl saved the day. Though it's not perfect, Perl seems to meet the needs of the genome centers remarkably well, and is usually the first tool we turn to when we have a problem to solve.


Around the Web

An Events Based Algorithm for Distributing Concurrent Tasks on Multi-Core Architectures

Here's a programming model which enables scalable parallel performance on multi-core shared memory architectures.

Quick Read

Swarm: A True Distributed Programming Language

The Swarm prototype is a simple stack-based language, akin to a primitive version of the Java bytecode interpreter.

Quick Read

Key Software Development Trends

Several trends are emerging within the area of software development. Here are some of the most important trends S. Somasegar has been thinking about recently.

Quick Read

Understanding Parallel Performance

Understanding parallel performance. How do you know when good is good enough?

Quick Read

Short and Tweet: Experiments on Recommending Content from Information Streams

The authors used 12 algorithms to study the URL recommendation on Twitter as a means of better directing attention in information streams.

Quick Read



Video

Forty finalists will gather in Washington, D.C. from March 11-16 to compete for $630,000 in awards.; DDJ; Intel; science; Dr. Dobb's talks with Commonsware's Mark Murphy about what's involved in developing software for the Android operating system; Android; apple; DDJ; tablet development; The new method uses analytics technology developed by the Mayo and IBM collaboration, Medical Imaging Informatics Innovation Center, and has proven a 95 percent accuracy rate in detecting aneurysm.; Algorithm; DDJ; diagnostics; ibm; imaging; T-Mobile USA is enabling phone calls to Haiti without charges for international long distance through January 31 and retroactive to the earthquake on January 12; DDJ; mobile; wireless; Al Williams gives you a demor of One-Der: The One Instruction CPU; DDJ; At the 2010 International Consumer Electronics Show, the auto industry's first working smartphone application was unveiled; DDJ; mobile; The Bluetooth Special Interest Group (SIG) has announced the adoption of BLUETOOTH low energy wireless technology.; bluetooth; DDJ; wireless; IBM has unveiled its list of five innovations that have the potential to change how people live, work and play in cities around the world over the next five to ten years; DDJ; ibm; TeliaSonera's LTE mobile broadband commercial network in Stockholm is now the fastest and largest in the world.; broadband; DDJ; ericsson; mobile; Google has introduced, google Goggles, a visual search application on Android devices that allows users to search for objects using images rather than words; Android; DDJ; google; mobile; Visual Search Applications; Dr. Dobb's talks with David Intersimone, Vice President of Developer Relations and Chief Evangelist at Embarcadero Technologies, about RAD Studio 2010, SQL optimization and his reflections on the software industry.; database programming; DDJ; sql; Researchers from Intel Labs have created an experimental, 48-core Intel processor or "single-chip cloud computer."; cloud computing; DDJ; Intel; multicore; parallelism; The Large Hadron Collider will produce roughly 15 million gigabytes of data annually, to be accessed by a distributed computing and data storage infrastructure called the LHC Computing Grid.; CERN; DDJ; grid computing; physics; A mobile handheld device designed to let users can point, shoot and listen to printed text.; DDJ; Intel; mobile; Ericsson has become the first vendor to prove end to end interoperability in TD-LTE, another standard of 4G radio technologies designed to increase the capacity and speed of mobile telephone networks.; DDJ; ericsson; mobile; TD-LTE; According to a recent study, 80 percent of US respondents feel there are unspoken rules about mobile technology usage, and approximately 69 percent agreed that violations of these unspoken mobile manners are unacceptable.; DDJ; Intel; mobile; IBM and Canonical will introduce a software package for netbooks and other thin client devices in Africa. This is the first cloud- and premise-based Linux netbook software package offered by IBM and Canonical.; cloud computing; DDJ; ibm; His unprecedented ability to manipulate individual atoms signaled a quantum leap forward in in nanoscience experimentation and heralded in the age of nanotechnology.; DDJ; ibm; nanotechnology; IBM honored for its invention of the Blue Gene family of supercomputers. Adobe founders also recognized.; adobe; DDJ; ibm; Former U.S. President Bill Clinton addressed thousands of online entrepreneurs from around the world gathered for the third APEC Business Advisory Council SME Summit in Hangzhou, China.; DDJ; e-business; With free cooling for several months a year, Sweden is an ideal location for cost-efficient data centers.; data centers; DDJ; PNC Bank introduces a new mobile App for the iPhone and iPod touch that provides Virtual Wallet customers with a high-def view of their money while on the go.; DDJ; iphone; The Swedish LTE site will be part of a commercial network scheduled to go live in 2010, bringing data rates far above what is possible in today's mobile broadband networks.; DDJ; ericsson; mobile broadband; Nanotechnology advancement could lead to smaller, faster, more energy efficient computer chips.; circuit boards; DDJ; nanotech; semiconductor; Dr Dobbs talks with with Claudia Backus, Senior Director of Ecosystem Programs at Motorola, regarding the company's recently released MotoDEV Studio for their Android-powered phones.; Android; DDJ; mobile; motodev; The Extremadura Regional Government of Spain and IBM have launched an electronic prescription system in 680 pharmacies in western Spain.; DDJ; ibm; Ericsson to Acquire Majority of Nortel's North American Wireless Business; DDJ; ericsson; mobile; telecom; Nintendo's Wii Sports Resort is an immersive, expansive active-play game that includes a dozen resort-themed activities.; DDJ; nintendo; video games; OnStar can remotely send a signal to the electronic system in the subscriber's stolen vehicle and the vehicle will not be able to be re-started.; cellular; DDJ; wireless; In celebration of the historic Apollo Moon landing, Google has released Moon in Google Earth.; DDJ; google; Ericsson has been awarded contracts with the three telecom operators in China to provide fixed broadband access.; broadband; DDJ; mobile; tv; wireless; Dr. Dobb's talks with Adobe's Adam Lehman about the upcoming release of ColdFusion specifically optimized for Flash and Adobe AIR platform delivery.; adobe; ColdFusion; DDJ; eclipse; Companies team to develop computing device and chipset architectures that will combine the performance of powerful computers with high-bandwidth mobile broadband communications and ubiquitous Internet connectivity.; broadband; DDJ; Intel; mobile; nokia; Adobe Systems and HTC recently announced that the new HTC Hero will be the first Android phone to ship with support for Adobe Flash Platform technology.; adobe; Android; cell phones; DDJ; flash; mobile; mobility; 3.2 million Euros awarded across eight prize categorie recognizing world-class scientific research and artistic creation.; DDJ; A parody of Paul Simon's "50 Ways to Leave Your Lover," but for software security nerds.; DDJ; sql; Dr. Dobb's Mike Riley talks with Jim Manias of Advanced Systems Concepts.  In this conversation, Jim discusses the new ActiveBatch 7 and how it can provide significant productivity gains for application developers and business process owners alike.; ActiveBatch; DDJ; Sun cofounder Scott McNealy and Oracle CEO Larry Ellison discussed Java's role in computing. Sun has also released OpenSolaris 2009.06.; DDJ; java; opensolaris; oracle; sun; Spotlight on NATO's centre of excellence on cyber defense in Tallinn, Estonia.; cyber defense; DDJ; nework security; security; Create Data Access Layers in ASP.NET; DDJ; In this demonstration you will learn how to layout a WPF application. We will explore the major layout panels that come with WPF, contrasting them with each other and describing when to use each.; DDJ; web development; windows; wpf; The Intel Foundation has announced the top winners of the Intel International Science and Engineering Fair; DDJ; Intel; News; science; Matt Hester demonstrates Internet Explorer’s 8 new feature Selectors API for utilizing CSS selectors for quick and easy element lookups.; DDJ; IE8; microsoft; windows; The NATO Virtual Silk Highway provides affordable, high-speed Internet access via satellite to the academic communities of the Caucasus and Central Asia.; DDJ; On a Windows Mobile device, applications are typically not closed down, but they stay in the background. Maarten Struys shows you a simple way to preserve battery power inside your own applications.; DDJ; microsoft; power consumption; windows; Windows Mobile Devices; Cadillac is now offering wireless Internet access with its CTS sedan.; DDJ; wireless broadband; By default, Windows Mobile Standard (Smartphone) applications launched from Visual Studio are not accessible on the device/emulator once they are minimized. In this video, Jim Wilson demonstrates two simple techniques to solve the problem.; DDJ; microsoft; smartphone; VIsual Studio; Mike Riley talks with the brass from Everypoint, creators of the NEMO mobile application development platform.; DDJ; Developers; development environments; mobile applications; Symmetric and asymmetric encryption algorithms, the SHA256 hash encryption algorithms, and how to implement in a simple application using Microsoft's Azure Services Platform.; Azure; DDJ; encryption; microsoft; security; windows; T-Mobile has introduced the Sidekick LX, which features enhanced video capability.; DDJ; Mobile Smartphone; Bluetooth 3.0 offers speedier transmission of large amounts of video, music and photos between devices wirelessly.; bluetooth; DDJ; mobile networks; wireless broadband; Cities around the world are battling with stressed transportation networks, so IBM has announced plans for three new smart rail projects in China, Taiwan and The Netherlands.; DDJ; ibm; ILOG; CASMOBOT is a Nintendo Wii remote controlled slope lawn mower.; DDJ; Denmark; nintendo wii; research; robotics; Project ensures documents, images, video and other Internet-based data growing at over 100 terabytes per month will live on for future generations; data storage; DDJ; history; Intenet; research; Sun Microsystems; Dr. Dobb's talks with Dave McAllister, Director of Standards and Open Source for Adobe, about the Open Screen Project.; adobe; DDJ; Open Screen Project; open source; The Facebook Connect SDK provides the code to let third-party developers embed hooks into their applications so users can connect to their Facebook accounts and exchange information using iPhone apps.; apple; cocoa; DDJ; Facebook; iphone; Mars in Google Earth Updated; DDJ; google; google earth; Google mars; red planet; The Sun Cloud is built on the Sun Open Cloud Platform that leverages the best in world-class open source technologies. The Sun Open Cloud Platform brings together Java, MySQL, OpenSolaris and OpenStorage.; cloud computing; DDJ; java; open solaris; sun; DDJ; High School; Intel; science; ILOG Elixir is a suite of professional user interface controls that gives developers a rich collection of innovative and interactive data display components for Adobe Flex and Adobe Air.; adobe; air; DDJ; elixir; flash; flex; ILOG; The inaugural San Diego Science Festival being held this month is touted as one of the largest multicultural, multigenerational, multidisciplinary celebrations of science ever seen on the West Coast; DDJ; lockheed; News; science; IBM has announced Innov8 version 2, a new version of its serious game that helps students and professionals hone their business and technology skills in a compelling, familiar video game format.; DDJ; ibm; serious games; Swiss Automobile Visionary Frank M. Rinderknecht builds a concept car with adaptive energy concept and iPhone controls.; apple; Concept Car; DDJ; iphone; j; siemens; Two-Year Plan to Focus on 32 Nanometer Manufacturing Technology; 32 nanometer technology; chip; cpu; DDJ; gpu; Intel; manufacturing; Nehalem; Westmere; New version features ocean layer, historical imagery, and more.; DDJ; google; Dr. Dobb's talks with Marty Alchin, author of "Pro Django" about his book and the deep internals of the Django framework.; DDJ; Django; A new content-authoring solution for learning professionals; adobe; DDJ; toolkits; web authoring; In a Second Life setting, Danny Coward discusses Java FX with Dr. Dobb's Jon Erickson.; DDJ; java; JavaFX; sun; The Core i7 processor is the first member of a new family of Nehalem processor designs with new technologies that boost performance on demand.; chip; DDJ; Intel; processors; Dan Diephouse, creator of XFire, a high-performance open-source SOAP framework (which became the Apache CXF project), shares the five common mistakes in SOA governance and insight about the Apache CXF and Mule RESTpack development environments.; apache; Apache CXF; DDJ; mule; open source; soa; soap; Xfire; Adrian Kaehler and Gary Bradski discuss the Open Computer Vision Library (sourceforge.net/projects/opencvlibrary/) and their book "Learning OpenCV".; DDJ; Open Computer Vision Library; OpenCV; In the first part of this two-part interview, Stephen Wolfram reflects on the 20-year anniversary of Wolfram Research.; DDJ; Mathematica; Mathematics; science; In the second part of this two-part interview, Stephen Wolfram discusses his book "A New Kind of Science."; DDJ; Mathematica; Mathematics; science; Nick Hodges talks about Delphi 2009, a RAD tool for Windows, and Delphi Prism, a database engine for Windows, Mac OS X, and Linux.; DDJ; delphi; RAD; windows; Dr. Dobb's talks with Tony Lombardo, lead Technical Evangelist at Infragistics, about all new UI tools for Windows and .NET.; .net; DDJ; silverlight; ui; windows; wpf; Dr. Dobb's talks with Eric Schulz about his International Mathematica User's Conference 2008 presentation on the Mathematica Essentials Palette and the future digital educational material; DDJ; Mathematica; Mathematics; Dr. Dobb's talks with ActiveState's Trent Mick about the recently released Komodo IDE 5.0.; DDJ; ide; open source; Dr. Dobb's talks with Continuity Logic's Kris Carlson about "Why We Die: Simulation of the Evolution of Senescence" and why he programs with Mathematica's functional programming language.; DDJ; functional programming; Mathematica; simulation; Ericsson collaborates with Intel; DDJ; ericsson; Intel; Mobile technology; Dr. Dobb's talks with Schoeller Porter about the grid and cloud versions of Mathematica; clouds; DDJ; Grid; Mathematica; Dr Dobb's interviews Yehuda Katz, maintainer of the Merb project, about the advantages this highly optimized Ruby on Rails alternative offers to web application developers.; DDJ; Ruby on Rails; Dr. Dobb's talks with Thomas Roman, Professor of Mathematics at Central Connecticut State University, about "Mathematica Visualization in a Theoretical Physics Problem - Negative Energy in an Unusual Quantum State."; DDJ; Mathematica; physics; quantum; science; The Forbidden City: Beyond Space & Time is a fully immersive, three-dimensional virtual world that recreates a visceral sense of space and time.; Blade Server; China; DDJ; ibm; linux; mac; online; virtual world; windows; Dr. Dobb's interviews open source luminary Miguel de Icaza about his latest milestone of achieving Microsoft .NET 2.0 Framework compatibility with the Mono Project .; DDJ; Dr. Dobb/s interviews Paul Kimmel, author of "LINQ Unleashed for C#", about Microsoft's new query technology that lets developers poll any information from any data source regardless of location or structure. I; C#; DDJ; Dr. Dobb's; LINQ; microsoft; It takes a supercomputer to build a super car. ; DDJ; HPC; simulation; Dr. Dobb's shows how to install and execute cross-platform scripting languages on the Windows Mobile platform. In this installment, Mike Riley examines Perl for Windows Mobile devices.; DDJ; mobile devices; perl; windows; Dr. Dobb's shows how to install and execute cross-platform scripting languages on the Windows Mobile platform. In this installment, Mike Riley examines Python CE which is optimized for Windows Mobile devices.; DDJ; mobile devices; python; windows; Dr. Dobb's shows how to install and execute cross-platform scripting languages on the Windows Mobile platform. In this installment, Mike Riley examines Ruby for Windows Mobile devices.; DDJ; mobile devices; ruby; windows; Young participants at ITU TELECOM ASIA 2008 in Bangkok, Thailand received free laptops as part of ITU’s initiative to promote affordable devices to increase access to information and communication technologies.; communication; DDJ; itu; Currently technical strategist to Microsoft's Chief Software Architect, Rebecca Norlander has had a tremendous impact on Excel, Internet Explorer, Windows XP SP2, and Windows Vista Security. ; DDJ; microsoft; Contributing authors to the book "Beautiful Code" got together at Dr. Dobb's SD West Conference in March, 2008. Part 1 of 3.; DDJ; programming; software development; Contributing authors to the book "Beautiful Code" got together at Dr. Dobb's SD West Conference in March, 2008. Part 2 of 3.; DDJ; programming; software development; Contributing authors to the book "Beautiful Code" got together at Dr. Dobb's SD West Conference in March, 2008. Part 3 of 3.; DDJ; programming; software development; Anders Hejlsberg discusses C#, Turbo Pascal, and what it means to design a programming language. ; C#; DDJ; microsoft; Turbo Pascal; Solar powered laptops given to youths at ITU Asia 2008.; DDJ; News; telecommunications; IBM breakthrough stands to impact future direction of information technology.; DDJ; Mike Riley spoke to ActiveState's Jeff Hobbes about the new features in Tcl Dev Kit and Perl Dev Kit including the code coverage and hot-spot analysis tool and Mac OSX support.; DDJ; Tim O'Reilly addressed the OSCON convention in his Wednesday keynote titled "Degrees of Freedom, Open Source in the Wed 2.0 Era.; DDJ;


Enabling People and Organizations to Harness the Transformative Power of Technology