Special Search Functions
As UNIX programmers will know, the stat() system call provides status information about a given path, in the form of the struct stat type. The recls core C API provides the function Recls_Stat(), which provides the same kind of information in the form of the recls_info_t type (a multi-attribute type analogous to IEntry). Several recls mappings provide a stat()/Stat() method that returns a file entry object, or null/nil if no such entry exists. I have found this a handy tool over the years, particularly when working in Python and Ruby, and I wanted to continue to offer it for .NET users, as FileSearcher.Stat(). This method returns null if the entry does not exist, returns an instance implementing IEntry representing the filesystem entry if it can be accessed, or throws an exception if it cannot. (In other words, System.IO.FileNotFoundException and System.IO.DirectoryNotFoundException are caught, and null returned.)
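The null-or-entry contract can be illustrated with a small stand-in built only on System.IO. This is a sketch of the semantics described above, not the recls implementation: the StatSketch class, and its use of FileSystemInfo in place of IEntry, are assumptions made purely for illustration.

```csharp
using System;
using System.IO;

public static class StatSketch
{
    // Hypothetical stand-in for FileSearcher.Stat(): returns null when nothing
    // exists at the path, otherwise an object describing the entry. Any other
    // exception (e.g. UnauthorizedAccessException) propagates to the caller.
    public static FileSystemInfo Stat(string path)
    {
        try
        {
            // GetAttributes() works for both files and directories, and throws
            // if the entry is absent.
            FileAttributes attributes = File.GetAttributes(path);

            if ((attributes & FileAttributes.Directory) != 0)
            {
                return new DirectoryInfo(path);
            }
            return new FileInfo(path);
        }
        catch (FileNotFoundException)
        {
            return null;
        }
        catch (DirectoryNotFoundException)
        {
            return null;
        }
    }
}
```

A caller can then write a single null test, rather than wrapping every lookup in its own try/catch.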
The other function set, FileSearcher.CalculateDirectorySize(), does exactly what it says on the tin: it calculates the size of a directory, as the sum of the sizes of all files in that directory or in any of its sub-directories (up to a given depth). Since this is an expensive operation, I chose not to have directory size automatically calculated during a Search()-based enumeration. But it's a useful thing to have available, as in the following example, which displays the sizes of all immediate subdirectories of the current directory:
foreach(IEntry entry in FileSearcher.Search(
    null, null, SearchOptions.Directories,
    0 // Don't recurse
  ))
{
  Console.WriteLine("{0} : {1}", entry.Path
    , FileSearcher.CalculateDirectorySize(entry.Path));
}
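To make the "sum of file sizes, down to a given depth" definition concrete, here is a minimal sketch using only System.IO. It is not the recls implementation. The DirectorySizeSketch class is hypothetical, and the depth convention (0 means no recursion, a negative value means unlimited) is an assumption of this sketch.

```csharp
using System;
using System.IO;

public static class DirectorySizeSketch
{
    // depth == 0 : count only the files directly inside 'directory'
    // depth  > 0 : also descend that many levels of sub-directories
    // depth  < 0 : descend without limit (an assumption of this sketch)
    public static long Calculate(string directory, int depth)
    {
        long total = 0;

        foreach (string file in Directory.GetFiles(directory))
        {
            total += new FileInfo(file).Length;
        }

        if (depth != 0)
        {
            foreach (string sub in Directory.GetDirectories(directory))
            {
                // A negative depth is passed down unchanged; a positive one
                // is decremented per level.
                total += Calculate(sub, depth < 0 ? depth : depth - 1);
            }
        }

        return total;
    }
}
```

Every call visits each file beneath the starting directory, which is why it makes sense to run this on demand and not for every directory an enumeration returns.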
Path Utility Functions
As well as the FileSearcher methods, recls 100% .NET provides a number of additional utility functions via the static class PathUtil (see Listing 9).
public static class PathUtil
{
  public static string DeriveRelativePath(string origin, string target);
  public static string CanonicalizePath(string path);
  public static string GetAbsolutePath(string path);
  public static string GetDirectoryPath(string path);
  public static string GetFile(string path);
  public static string GetDrive(string path);
}
Each of these provides functionality that is essential to the proper working of recls's searching and that either is not available in the CLR's path manipulation facilities or corrects a defective alternative there:
- DeriveRelativePath(), CanonicalizePath(), and GetDrive() do not have CLR equivalents
- GetAbsolutePath() appends a trailing slash to drive-only UNC paths, such as "\\server\share", in the same way that System.IO.Path.GetFullPath() does for drive-only volume paths, such as "C:"
- PathUtil.GetDirectoryPath() yields the directory path -- a recls notion of encapsulating drive (for operating systems that have the concept of a drive) + directory -- and corrects the (in my opinion) defective behaviour of System.IO.Path.GetDirectoryName(), which returns the empty string when given a root path such as "C:\" or "\\server\share\"
- PathUtil.GetFile() yields the file component - file name + extension - of a path and works correctly with UNC paths such as "\\server\share" (for which System.IO.Path.GetFileName() returns "share"!)
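The CLR behaviours described in the list above are easy to check on your own system. The following sketch prints what System.IO.Path.GetDirectoryName() and System.IO.Path.GetFileName() return for a few root and UNC paths. The PathProbe class is hypothetical and does not use PathUtil. The results depend on the platform and runtime version (backslash is a separator only on Windows), so no expected output is given.

```csharp
using System;
using System.IO;

public static class PathProbe
{
    // Makes null and empty results visibly different in the output.
    private static string Quote(string value)
    {
        return value == null ? "(null)" : "\"" + value + "\"";
    }

    // Reports what the CLR's own functions return for one path.
    public static string Describe(string path)
    {
        return string.Format(
            "{0}: GetDirectoryName={1}, GetFileName={2}",
            Quote(path),
            Quote(Path.GetDirectoryName(path)),
            Quote(Path.GetFileName(path)));
    }

    // Prints the report for a handful of root, UNC, and ordinary paths.
    public static void PrintAll()
    {
        string[] samples =
        {
            @"C:\",
            @"\\server\share",
            @"\\server\share\",
            @"C:\dir\file.ext",
        };

        foreach (string sample in samples)
        {
            Console.WriteLine(Describe(sample));
        }
    }
}
```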
Extension Methods
With C# 3 comes the ability to enhance the (apparent) operations available on existing types by the use of Extension Methods [8, 9]. I've taken advantage of this for recls 100% .NET by adding the ForEach, Select, and Where methods, as shown in Listing 10. We'll see an example of how these are used (with LINQ [8, 9]) shortly.
public static class SearchExtensions
{
  public static void ForEach(
    this IEnumerable<IEntry> sequence
  , Action<IEntry> action
  )
  {
    foreach(IEntry entry in sequence)
    {
      action(entry);
    }
  }
  public static IEnumerable<TTarget> Select<TTarget>(
    this IEnumerable<IEntry> sequence
  , Func<IEntry, TTarget> function
  )
  {
    foreach(IEntry entry in sequence)
    {
      yield return function(entry);
    }
  }
  public static IEnumerable<IEntry> Where(
    this IEnumerable<IEntry> sequence
  , Func<IEntry, bool> predicate
  )
  {
    foreach(IEntry entry in sequence)
    {
      if(predicate(entry))
      {
        yield return entry;
      }
    }
  }
}
In C++ terms, this is akin to a partial template specialization, because the extension methods are defined only for IEnumerable<IEntry>.
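The following self-contained sketch illustrates that restriction. The reduced IEntry interface and the FakeEntry, SketchExtensions, and ExtensionDemo types are hypothetical stand-ins, so that the example compiles without the recls assembly. Only the shape of ForEach is taken from Listing 10.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, much-reduced stand-in for Recls.IEntry.
public interface IEntry
{
    string Path { get; }
}

public sealed class FakeEntry : IEntry
{
    public FakeEntry(string path) { Path = path; }
    public string Path { get; private set; }
}

public static class SketchExtensions
{
    // Declared only for sequences of IEntry, as in Listing 10.
    public static void ForEach(
        this IEnumerable<IEntry> sequence, Action<IEntry> action)
    {
        foreach (IEntry entry in sequence)
        {
            action(entry);
        }
    }
}

public static class ExtensionDemo
{
    public static List<string> CollectPaths(IEnumerable<IEntry> entries)
    {
        var paths = new List<string>();

        // Reads as if ForEach were a member of the sequence type.
        entries.ForEach(entry => paths.Add(entry.Path));

        // By contrast, the following would not compile, because no ForEach
        // extension is declared for IEnumerable<string>:
        //
        //   new[] { "a", "b" }.ForEach(s => Console.WriteLine(s));

        return paths;
    }
}
```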
Predicates or Functions?
There was one interesting twist here, in implementing Where. Since it requires a predicate -- a decision function that returns a Boolean value -- I originally defined it in terms of System.Predicate<T>, which is a delegate defined as follows:
namespace System
{
public delegate bool Predicate<T>(T arg);
}
That works fine with IEnumerable<IEntry>, as in Listing 11.
namespace WhereDemo
{
  using Recls;
  using System;
  class WhereDemo
  {
    public static void Run()
    {
      // with lambda expression
      foreach(IEntry entry in FileSearcher.Search(null, null)
        .Where((e) => e.IsReadOnly))
      {
        Console.WriteLine(entry);
      }
      // with anonymous delegate
      foreach(IEntry entry in FileSearcher.Search(null, null)
        .Where(delegate(IEntry e) { return e.IsReadOnly; }))
      {
        Console.WriteLine(entry);
      }
    }
  }
}
However, if we add in a "using System.Linq;" statement to the WhereDemo namespace, we get a compile error (with some namespace qualifications removed for clarity):
error CS0121: The call is ambiguous between the following methods or properties: 'System.Linq.Enumerable.Where<Recls.IEntry>(IEnumerable<IEntry>, System.Func<IEntry,bool>)' and 'Recls.SearchExtensions.Where(IEnumerable<IEntry>, System.Predicate<IEntry>)'
What appears to be happening here is that the compiler resolves the lambda expression (e) => e.IsReadOnly (or the equivalent anonymous delegate expression, also shown) to System.Func<IEntry, bool>, rather than System.Predicate<IEntry>. Func is declared as follows:
namespace System
{
public delegate TResult Func<T, TResult>(T arg);
}
Consequently, the two possible Where (extension) functions each have one precisely matching argument and one possibly matching argument, hence the ambiguity. This is why I had to implement the recls Where extension in terms of System.Func<IEntry, bool>, giving two precisely matching arguments, and removing the ambiguity. Obviously, if the C# team ever decide to change the compiler to interpret one-parameter Boolean-returning anonymous delegates / lambda expressions as System.Predicate<>, any such "partial specialisations" will be broken, so I'm guessing that'll never happen, and we just need to get used to using System.Func<T, bool>, even though a predicate makes more sense.
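The resolution can be reproduced without recls. In the sketch below, Entry, EntryExtensions, and OverloadDemo are hypothetical stand-ins, and the Calls counter exists only so that the chosen overload can be observed. Both System.Linq and the SketchLib namespace are imported at the same level, which is the situation that produced the CS0121 error.

```csharp
using System;
using System.Collections.Generic;

namespace SketchLib
{
    // Hypothetical stand-in for Recls.IEntry.
    public sealed class Entry
    {
        public Entry(string path, bool isReadOnly)
        {
            Path = path;
            IsReadOnly = isReadOnly;
        }
        public string Path { get; private set; }
        public bool IsReadOnly { get; private set; }
    }

    public static class EntryExtensions
    {
        // Incremented on each call, so callers can see this overload was chosen.
        public static int Calls;

        // Declared with Func<Entry, bool>. Declaring the parameter as
        // Predicate<Entry> instead is what makes the lambda call below
        // ambiguous with System.Linq.Enumerable.Where.
        public static IEnumerable<Entry> Where(
            this IEnumerable<Entry> sequence, Func<Entry, bool> predicate)
        {
            ++Calls;
            return Filter(sequence, predicate);
        }

        private static IEnumerable<Entry> Filter(
            IEnumerable<Entry> sequence, Func<Entry, bool> predicate)
        {
            foreach (Entry entry in sequence)
            {
                if (predicate(entry))
                {
                    yield return entry;
                }
            }
        }
    }
}

namespace Demo
{
    using System.Linq;  // brings Enumerable.Where<T> into scope
    using SketchLib;    // brings EntryExtensions.Where into scope

    public static class OverloadDemo
    {
        public static List<string> ReadOnlyPaths(IEnumerable<Entry> entries)
        {
            var result = new List<string>();

            // Both Where methods are candidates here; the non-generic
            // SketchLib one is selected.
            foreach (Entry match in entries.Where(e => e.IsReadOnly))
            {
                result.Add(match.Path);
            }

            return result;
        }
    }
}
```

With both parameter types matching exactly, C#'s overload resolution prefers the non-generic method over the generic Enumerable.Where<T>, so the call compiles and binds to the library's own Where.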


