DependencyAnalyzer.py
Let's examine the DependencyAnalyzer.py module. The entry point is the get_project_dependencies() function. It accepts the project directory, the static libraries directory, the include search path, and a list of file extensions, and it returns a list of the project directories that the current project depends on. The algorithm is simple: get the dependencies of each file in the project that has one of the listed extensions, then prune duplicates.
import glob
import os

def get_project_dependencies(project_dir, libs_dir, search_path,
                             extensions=['.cpp', '.hpp', '.c', '.h']):
    """Get all the projects in the libs dir that the target project depends on

    The algorithm gets the dependencies of every file in the current project
    and keeps a list of all the directories the files reside in.

    @project_dir: the target project
    @libs_dir: the name of the static libraries parent dir (e.g. 'nta')
    @search_path: the list of include directories
    @extensions: the list of file extensions that are checked for dependencies
    """
    files = glob.glob(os.path.join(project_dir, '*.*'))
    all_dependencies = []
    for f in files:
        if not os.path.isfile(f):
            continue
        if os.path.splitext(f)[1] not in extensions:
            continue
        file_dependencies = []
        get_file_dependencies(f, libs_dir, file_dependencies, search_path)
        all_dependencies += file_dependencies

    # Collect the directory of every dependency, dropping duplicates
    temp = [os.path.dirname(f) for f in all_dependencies]
    dependencies = []
    for p in temp:
        if p not in dependencies:
            dependencies.append(p)

    # The project's own directory is not a dependency. Dependencies are stored
    # as absolute paths, so compare against the absolute project dir, and do
    # not fail if the project contained no matching files.
    project_dir = os.path.abspath(project_dir)
    if project_dir in dependencies:
        dependencies.remove(project_dir)
    return dependencies
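The pruning step at the end of the function can be tried in isolation. The following is a minimal sketch, not part of ibs: the helper name unique_dirs() and the sample paths are hypothetical, and it uses a set for the membership test where the listing uses a list scan. It shows how a flat list of dependency files collapses into an ordered list of distinct directories with the project's own directory left out.

```python
import os

def unique_dirs(paths, exclude=None):
    """Return the parent directories of paths in first-seen order, without duplicates."""
    seen = set()
    result = []
    for p in paths:
        d = os.path.dirname(p)
        if d == exclude or d in seen:
            continue  # skip the project's own dir and anything already recorded
        seen.add(d)
        result.append(d)
    return result

# Hypothetical dependency files, as get_file_dependencies() might collect them
files = [
    "/src/app/main.cpp",
    "/src/nta/utils/Log.hpp",
    "/src/nta/math/Array.hpp",
    "/src/nta/utils/Random.hpp",
]
print(unique_dirs(files, exclude="/src/app"))
# ['/src/nta/utils', '/src/nta/math']
```

Keeping the first-seen order, rather than converting to a set and back, keeps the result stable from run to run.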
How do you find the dependencies of a file? Via a recursive scan of its #include statements. There is a little more going on, because the whole point of the exercise is to figure out which projects you depend on so you can build them. System #includes that refer to libraries that are already built are irrelevant. So, the prefix serves as a filter that limits the search to #include statements whose path starts with the prefix.
def get_file_dependencies(filename, prefix, file_dependencies, search_path):
    """Get all the projects in the prefix dir that the target file depends on

    All the filenames it #includes are extracted using get_file_includes().
    The dependencies of each dependency are extracted recursively.

    @filename: the target filename
    @prefix: the prefix of interesting dependencies
    @file_dependencies: the list of dependencies (grows as the function trundles along)
    @search_path: the list of include directories
    """
    filename = os.path.abspath(filename)
    # Each file is processed only once; this also stops circular #includes
    if filename in file_dependencies:
        return
    file_dependencies.append(filename)

    with open(filename) as f:
        text = f.read()
    includes = get_file_includes(text, prefix, search_path)
    includes = [i[0] for i in includes if starts_with_prefix(i[0], prefix, search_path)]
    for i in includes:
        if i not in file_dependencies:
            get_file_dependencies(i, prefix, file_dependencies, search_path)
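The recursion relies on the shared file_dependencies list to avoid visiting a file twice, which is also what keeps it from looping forever on headers that include each other. Here is a minimal sketch of that idea on an in-memory include graph instead of real files. The function collect_dependencies() and the graph dictionary are hypothetical names used only for illustration.

```python
def collect_dependencies(name, includes_of, visited):
    """Depth-first walk over an include graph, recording each file once."""
    if name in visited:
        return  # already processed; this also breaks include cycles
    visited.append(name)
    for included in includes_of.get(name, []):
        collect_dependencies(included, includes_of, visited)

# Hypothetical include graph: a.cpp includes b.h and c.h,
# while b.h and c.h include each other (a cycle)
graph = {
    "a.cpp": ["b.h", "c.h"],
    "b.h": ["c.h"],
    "c.h": ["b.h"],
}
visited = []
collect_dependencies("a.cpp", graph, visited)
print(visited)
# ['a.cpp', 'b.h', 'c.h']
```

Passing the accumulator in as an argument, as the listing does, means every level of the recursion sees the same list of visited files.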
The get_file_dependencies() function uses the helper function get_file_includes() to get all the relevant #include statements. The helper tests every line of the file against a regular expression compiled with Python's re module. The expression accepts both double quotes (") and angle brackets (< >) around the included file, and it tolerates leading whitespace as well as trailing whitespace or comments. The regex also uses groups, the parts of the expression enclosed in parentheses, as in (.*). Groups let the function extract the interesting parts of each line directly, without further parsing.
The entire function is a nice example of using regular expressions in Python. A successful match yields a match object whose groups hold the included filename and whatever follows the closing delimiter. For each match, the function stores a 3-tuple of the filename, the whole line, and the trailing text. It then filters the tuples by prefix and resolves each remaining filename against the search path.
import re

# Pick up both forms of #include statements ("..." and <...>)
# Also take care of comments following the #include statement
# The raw string avoids invalid-escape warnings; the first group is non-greedy
# so that a quote or '>' inside a trailing comment is not swallowed
include_re = re.compile(r'\s*#include [<"](.*?)[>"](.*)')

def get_file_includes(text, prefix, search_path):
    """Get the filenames from #include statements in a text

    The text usually comes from a source file. If prefix
    is not empty it will return only include statements
    whose content (following the first quote or angle bracket)
    begins with the prefix.

    The algorithm is to extract the relative filename using a regex,
    then join it to each directory on the search path and keep
    the first resulting path that exists.
    """
    # Collect (filename, whole line, trailing text) for every #include line
    includes = []
    lines = text.split('\n')
    for line in lines:
        m = include_re.match(line)
        if m is not None:
            includes.append((m.group(1), line, m.group(2)))

    # Keep the prefixed includes that exist somewhere on the search path
    results = []
    for i in includes:
        if not i[0].startswith(prefix):
            continue
        for d in search_path:
            full_path = os.path.join(d, i[0])
            if os.path.exists(full_path):
                results.append((full_path, i[1], i[2]))
                break
    return results
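To see what the groups capture and how the search path lookup behaves, here is a small self-contained sketch. It makes some assumptions: the pattern is a variant of the one in the listing, written as a raw string with a non-greedy first group; the resolve() helper, the sample lines, and the temporary directory layout are hypothetical and exist only for this illustration.

```python
import os
import re
import tempfile

# Variant of the article's pattern (raw string, non-greedy first group: assumptions)
include_re = re.compile(r'\s*#include [<"](.*?)[>"](.*)')

def resolve(relative, search_path):
    """Return the first existing path formed from search_path, or None."""
    for d in search_path:
        full = os.path.join(d, relative)
        if os.path.exists(full):
            return full
    return None

samples = [
    '#include "nta/utils/Log.hpp"',
    '   #include <nta/math/Array.hpp>  // numeric arrays',
    '#include <vector>',
    'int x = 0;',
]
matches = [include_re.match(s) for s in samples]
names = [m.group(1) for m in matches if m is not None]
print(names)
# ['nta/utils/Log.hpp', 'nta/math/Array.hpp', 'vector']

# Build a throwaway include tree that contains only nta/utils/Log.hpp
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'nta', 'utils'))
    open(os.path.join(root, 'nta', 'utils', 'Log.hpp'), 'w').close()
    search_path = ['/nonexistent', root]
    found = resolve('nta/utils/Log.hpp', search_path)     # resolves under root
    missing = resolve('nta/math/Array.hpp', search_path)  # None: no such file
```

The last sample line does not match at all, and the system header vector matches but would be dropped by a prefix filter such as 'nta'.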
Conclusion
This article discussed the architecture and implementation of the generic core of ibs. It demonstrated several interesting aspects of the architecture and the code: separating generic logic from custom logic using lightweight plug-ins (dynamically loaded helper modules), generating build files from templates (objects that manage text files with placeholders and substitution dicts), and automatically discovering dependencies by matching #include statements with regular expressions. The next article will delve into the implementation of a specific build system (NetBeans 6) within ibs and demonstrate how the sage development manager Isaac and the dedicated Bob "the Builder" use it to build their enterprise "Hello, World!" system.


