November 18, 2009
recls 100% .NET
Matthew Wilson
Building a 100% C# implementation of recls for .NET.
Matthew Wilson is a software development consultant and trainer for Synesis Software who helps clients to build high-performance software that does not break, specializing in C++ and C#/.NET. He is the author of the books Imperfect C++ and Extended STL, a columnist for ACCU, and a resident guru at Dr. Dobb's CodeTalk, focusing on Windows technologies. He can be contacted at matthew@synesis.com.au.
Several years ago I wrote the column Positive Integration for C/C++ Users Journal and later Dr. Dobb's Journal, which discussed issues involved in adapting C/C++ libraries to other languages. The main exemplar project used was recls ("recursive ls") [1], a platform-independent recursive filesystem search library written in C and C++, and with a C API. Adaptation to numerous languages (including Ch, C#/.NET (via P/Invoke), D, Java, Python, and Ruby) was examined, covering the development of the library from versions 1.0 through 1.6. Since that time, the library has continued to evolve, and now stands at 1.8. A new C/C++ version, 1.9, will be released in the coming weeks.
I have long planned to rework the library implementation. The two main changes will be a substantial refactoring of the source files and packaging for the core library and the C++ layer, and a rewrite of some/all of the language mappings in the form of full "100%" implementations. This article describes the first of these, a 100% C# implementation of recls for .NET. For clarity I'll refer to the original stream of work as recls 1.x and the new .NET library as recls 100% .NET in this article.
The reasons for these changes are:
Despite being written entirely in C#, the implementation of recls 100% .NET is larger than can be fully covered here, so I will focus on the interesting design points, the language features used, and the differences in functionality between recls 1.x and recls 100% .NET.
API Differences
The first difference is a cosmetic one. To placate FxCop [2], and also to clearly distinguish the new recls .NET API from the old for anyone who wishes to port their code to it, I changed the old recls namespace to Recls.
Similarly, the RECLS_FLAG enumeration is now SearchOptions (see Listing 1), and its enumerators are Files not FILES, Directories not DIRECTORIES, and so on. There are also fewer enumerators. Notably absent from the original [3] are RECURSIVE, LINKS, DEVICES, NO_FOLLOW_LINKS, DIRECTORY_PARTS, DETAILS_LATER, PASSIVE_FTP, and ALLOW_REPARSE_DIRS. The changes reflect the intended increase in portability and improvements to discoverability and transparency [4, 5] of the new API, based on user feedback.
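The renaming convention can be illustrated with a minimal sketch of a .NET-style flags enumeration. This is not the actual SearchOptions definition: only the names Files and Directories come from the text above, while the type name SketchSearchOptions, the None member, the numeric values, and the Includes helper are assumptions made for illustration.

```csharp
using System;

// Hypothetical stand-in for illustration only; NOT the real Recls.SearchOptions.
[Flags]
public enum SketchSearchOptions
{
    None        = 0x0000,   // assumed member
    Files       = 0x0001,   // name taken from the article; value assumed
    Directories = 0x0002,   // name taken from the article; value assumed
}

public static class SearchOptionsSketch
{
    // Returns true if every bit in 'flag' is also set in 'options'.
    public static bool Includes(SketchSearchOptions options, SketchSearchOptions flag)
    {
        return (options & flag) == flag;
    }

    public static void Main()
    {
        SketchSearchOptions options =
            SketchSearchOptions.Files | SketchSearchOptions.Directories;

        // [Flags] makes ToString() list the combined names.
        Console.WriteLine(options);                                      // Files, Directories
        Console.WriteLine(Includes(options, SketchSearchOptions.Files)); // True
    }
}
```

PascalCase enumerator names combined with the [Flags] attribute are what FxCop-style naming rules expect, and they read naturally at the call site when options are OR-ed together.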
Listing 1: The SearchOptions enumeration
The FileEntry class is gone, replaced by the IEntry interface (see Listing 2). The FtpSearch class goes entirely, as the first version of recls 100% .NET does not support FTP search. The DirectoryParts class is no longer externally visible; the DirectoryParts getter-property now returns (an instance implementing) the interface IDirectoryParts; see Listing 3. The FileSearch class goes, and search is now provided by the (static) FileSearcher class.
Listing 2: The IEntry interface
Listing 3: The IDirectoryParts interface
IEntry vs. FileEntry
Table 1 compares the public interfaces of the old FileEntry class and recls 100% .NET's IEntry interface. The differences, highlighted in bold, involve changes to both syntax and semantics, and result from lessons learned by users of recls 1.x.
Table 1: Mappings Between Old and New Entry class/interface Methods and Properties
Drive changed from a character to a string to reduce the hassle of manipulating UNC-based paths: users can now deal with a single property, rather than a drive-letter character in one property and a (UNC) drive string in another. The spellings of UNCDrive and IsUNC changed to follow .NET idiom. The Size property changed from ulong to long to be CLS compatible (for example, so that it can be used from VB.NET and other .NET languages that don't support unsigned integral types). IsLink and ShortFile had to fall by the wayside because the library must be implemented 100% in terms of CLR facilities (without resorting to P/Invoke). The Attributes property was added so that recls stays relevant as the CLR evolves the set of file attributes it makes available to managed programmers.
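The CLS point about Size can be sketched as follows. The interface and class names here (ISizedSketch, SizedSketch) are hypothetical and are not part of recls; the sketch only shows how marking an assembly CLSCompliant makes the compiler check publicly visible types, which is why a signed long is the safe choice for a public property.

```csharp
using System;

// Ask the compiler to check this assembly's public surface for CLS compliance.
[assembly: CLSCompliant(true)]

// Hypothetical interface for illustration; not the recls IEntry definition.
public interface ISizedSketch
{
    // long (System.Int64) is CLS-compliant, so any CLS consumer language can read it.
    long Size { get; }

    // Declaring the property as ulong instead would draw a compiler warning
    // that the type is not CLS-compliant:
    // ulong RawSize { get; }
}

public sealed class SizedSketch : ISizedSketch
{
    private readonly long size;

    public SizedSketch(long size)
    {
        // A signed type admits negative values, so reject them explicitly.
        if (size < 0)
        {
            throw new ArgumentOutOfRangeException("size", "size must not be negative");
        }
        this.size = size;
    }

    public long Size
    {
        get { return size; }
    }
}

public static class ClsSketch
{
    public static void Main()
    {
        ISizedSketch entry = new SizedSketch(1234);
        Console.WriteLine(entry.Size); // 1234
    }
}
```

The trade-off is that the type no longer rules out negative sizes by construction, so the sketch validates the value in the constructor instead.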
There are also some semantic changes. The form of the file extension has changed, and now includes the dot, so "abc.net" will have an extension of ".net", rather than "net" as was the case with recls 1.x. Since this is a breaking change, I've removed the previous name, FileExt, and given it a new name FileExtension. (This also fits better with the .NET way of doing things, which is to avoid unnecessary contractions in names.)
It's useful to be able to paste the extension to another file name without having to pollute client code with logic to determine whether or not to insert the dot. Now, all of the following combinations will reproduce the full path (and, to be useful, may be used in combination with other strings to build correctly-formed new paths):
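The convenience of a dot-inclusive extension can be sketched using only System.IO.Path from the .NET base class library, which follows the same convention; recls itself is not used here, and the helper name WithExtension is hypothetical. The helper works on bare file names only, since Path.GetFileNameWithoutExtension discards any directory part.

```csharp
using System;
using System.IO;

public static class ExtensionSketch
{
    // Hypothetical helper: swaps the extension of 'fileName' for 'newExtension',
    // which is expected to carry its own leading dot (e.g. ".bak"), or be empty.
    public static string WithExtension(string fileName, string newExtension)
    {
        return Path.GetFileNameWithoutExtension(fileName) + newExtension;
    }

    public static void Main()
    {
        string file = "abc.net";
        string stem = Path.GetFileNameWithoutExtension(file); // "abc"
        string ext  = Path.GetExtension(file);                // ".net", dot included

        // Plain concatenation reproduces the name; no dot-insertion logic is needed.
        Console.WriteLine(stem + ext);                        // abc.net

        // A name with no extension yields "", so no stray dot appears.
        Console.WriteLine(WithExtension("README", Path.GetExtension("README"))); // README

        Console.WriteLine(WithExtension(file, ".bak"));       // abc.bak
    }
}
```

Had the extension been reported without the dot, the caller would need to test for the empty case before deciding whether to insert a separator.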