[C#] How to filter Directory.EnumerateFiles with multiple criteria?


Answers

Stripped from the LINQ context, this comes down to how to find out if a file matches a list of extensions. System.IO.Path.GetExtension() is a better choice here than String.EndsWith(). The multiple || can be replaced with .Contains() or .IndexOf() depending on the collection.

var extensions = new HashSet<string>(StringComparer.OrdinalIgnoreCase)  
   {  ".mp3", ".wma", ".mp4", ".wav" };

...  s => extensions.Contains(Path.GetExtension(s))
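
Put together, a minimal sketch of that filter might look like the following. The class and method names (`ExtensionFilter`, `HasWantedExtension`, `FindMedia`) are invented for illustration; only `Directory.EnumerateFiles`, `Path.GetExtension` and `HashSet<string>` come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ExtensionFilter
{
    // Case-insensitive set, so ".MP3" and ".mp3" are treated alike.
    private static readonly HashSet<string> Extensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { ".mp3", ".wma", ".mp4", ".wav" };

    // Hypothetical helper: true when the path's extension is in the set.
    public static bool HasWantedExtension(string path)
    {
        return Extensions.Contains(Path.GetExtension(path));
    }

    // Hypothetical helper: lazily walks 'root' and keeps only matching files.
    public static IEnumerable<string> FindMedia(string root)
    {
        return Directory
            .EnumerateFiles(root, "*.*", SearchOption.AllDirectories)
            .Where(HasWantedExtension);
    }
}
```

Because `Path.GetExtension` returns the extension including the leading dot, the entries in the set need the dot as well.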

Question

I have the following code:

List<string> result = new List<string>();

foreach (string file in Directory.EnumerateFiles(path,"*.*",  
      SearchOption.AllDirectories)
      .Where(s => s.EndsWith(".mp3") || s.EndsWith(".wma")))
       {
          result.Add(file);                 
       }

It works fine and does what I need. Except for one small thing. I would like to find a better way to filter on multiple extensions. I would like to use a string array with filters such as this:

string[] extensions = { "*.mp3", "*.wma", "*.mp4", "*.wav" };

What is the most efficient way to do this using .NET Framework 4.0/LINQ? Any suggestions?

As an occasional programmer, I'd appreciate any help :-)




The most elegant approach is probably:

var directory = new DirectoryInfo(path);
var masks = new[] { "*.mp3", "*.wav" };
var files = masks.SelectMany(directory.EnumerateFiles);

But it might not be the most efficient.
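
The reason is that `SelectMany` calls `EnumerateFiles` once per mask, so the directory is scanned once for every entry in `masks`. If that matters, one possible alternative is to turn the masks into a set of extensions and scan only once. This is a sketch that assumes every mask has the simple form `*.ext`; the names `MaskFilter`, `ToExtensionSet` and `EnumerateMatching` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class MaskFilter
{
    // Assumption: each mask looks like "*.ext". Stripping the leading '*'
    // leaves ".ext", which is the shape Path.GetExtension returns.
    public static HashSet<string> ToExtensionSet(IEnumerable<string> masks)
    {
        return new HashSet<string>(
            masks.Select(m => m.TrimStart('*')),
            StringComparer.OrdinalIgnoreCase);
    }

    // Scans the tree once and keeps files whose extension is in the set.
    public static IEnumerable<string> EnumerateMatching(string root, IEnumerable<string> masks)
    {
        HashSet<string> extensions = ToExtensionSet(masks);
        return Directory
            .EnumerateFiles(root, "*", SearchOption.AllDirectories)
            .Where(f => extensions.Contains(Path.GetExtension(f)));
    }
}
```

Masks with wildcards in other positions (for example `report*.mp3`) would need real pattern matching and are outside this sketch.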




How to read all files in a particular folder and filter by more than one type?

You can concatenate the two results like this:

foreach (string file in Directory.EnumerateFiles(folderPath, "*.txt").Concat(Directory.EnumerateFiles(folderPath, "*.bmp")))
{
    // (code here)
}

Or make it a function, like so:

    IEnumerable<string> EnumerateFiles(string folderPath, params string[] patterns)
    {
        return patterns.SelectMany(pattern => Directory.EnumerateFiles(folderPath, pattern));
    }

    void Later()
    {
        foreach (var file in EnumerateFiles(".", "*.config", "*.exe"))
        {
            // (code here)
        }
    }
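
One thing to watch with this helper: if two patterns overlap (say `*.txt` and `a*`), the same file comes back twice. A possible variation adds `Distinct`. The class name `PatternSearch` is invented for illustration, and the case-insensitive comparer is an assumption that fits a case-insensitive file system such as a default Windows volume.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class PatternSearch
{
    // Same idea as the helper above, but each path is reported only once
    // even when several patterns match it.
    public static IEnumerable<string> EnumerateFiles(string folderPath, params string[] patterns)
    {
        return patterns
            .SelectMany(pattern => Directory.EnumerateFiles(folderPath, pattern))
            // Assumption: paths differing only in case refer to the same file.
            .Distinct(StringComparer.OrdinalIgnoreCase);
    }
}
```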



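var filteredFiles = Directory
    .GetFiles(path, "*.*")
    // Explicit case-insensitive ordinal comparison; the leading dot avoids matching names such as "myaspx".
    .Where(file => file.EndsWith(".aspx", StringComparison.OrdinalIgnoreCase)
                || file.EndsWith(".ascx", StringComparison.OrdinalIgnoreCase))
    .ToList();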

Edit 2014-07-23

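On .NET 4.0 and later you can use Directory.EnumerateFiles, which streams results instead of building the full array first:

var filteredFiles = Directory
    .EnumerateFiles(path) //<--- available since .NET 4.0
    .Where(file => file.EndsWith(".aspx", StringComparison.OrdinalIgnoreCase)
                || file.EndsWith(".ascx", StringComparison.OrdinalIgnoreCase))
    .ToList();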

Directory.EnumerateFiles in MSDN
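
The benefit of the lazy enumeration shows most clearly when the caller does not need the whole list. As a rough sketch (the class and method names `LazySearch` and `FirstWebForm` are hypothetical), stopping at the first match means the rest of the folder is never walked:

```csharp
using System;
using System.IO;
using System.Linq;

public static class LazySearch
{
    // Hypothetical helper: returns the first .aspx/.ascx file in 'path',
    // or null if there is none. Enumeration stops at the first hit.
    public static string FirstWebForm(string path)
    {
        return Directory
            .EnumerateFiles(path)
            .FirstOrDefault(file =>
                file.EndsWith(".aspx", StringComparison.OrdinalIgnoreCase) ||
                file.EndsWith(".ascx", StringComparison.OrdinalIgnoreCase));
    }
}
```

Calling `ToList()` on the sequence, by contrast, still visits every file, so the gain there is mainly in not materializing the intermediate array.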




Caselessly comparing strings in C#

The Remarks section of the MSDN article should explain things. Essentially, the reason is compatibility across different culture settings.




When comparing strings you should always use an explicit StringComparison member. The String functions are somewhat inconsistent in how they choose to compare strings. The only way to guarantee the comparison used is to a) memorize all of them (this includes both you and everyone on your team) or b) use an explicit comparison for every function.

It's much better to be explicit and not rely on group knowledge being perfect. Your teammates will thank you for this.

Example:

if ( StringComparer.OrdinalIgnoreCase.Equals(a, b) )
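
The same idea carries over to the string methods themselves, most of which have an overload that takes a `StringComparison` value. A short sketch, with helper names (`ExplicitComparison`, `SameName`, `HasSuffix`, `Mentions`) invented for illustration:

```csharp
using System;

public static class ExplicitComparison
{
    // Equality; the static form also copes with null arguments.
    public static bool SameName(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    // Suffix test, e.g. for file extensions.
    public static bool HasSuffix(string value, string suffix)
    {
        return value.EndsWith(suffix, StringComparison.OrdinalIgnoreCase);
    }

    // Substring search; IndexOf returns -1 when the term is absent.
    public static bool Mentions(string text, string term)
    {
        return text.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```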

Using ToLower for comparison has two problems I can think of off the top of my head:

  1. It allocates memory. Comparison functions should not allocate memory unless they absolutely have to.
  2. Strings can be lowered in several ways, most notably ordinal or culture-sensitive lowering. Which way does .ToLower() work? Personally, I don't know. It's much better to pass an explicit culture than rely on the default.
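
For reference, the parameterless `ToLower()` uses the current culture, while `ToLowerInvariant()` uses the invariant culture. The sketch below contrasts a `ToLower`-based suffix check with an explicit ordinal one; the class and method names are hypothetical.

```csharp
using System;
using System.Globalization;

public static class LoweringPitfall
{
    // ToLower-based check with an explicitly supplied culture.
    // Allocates a lowered copy of 'fileName' on every call.
    public static bool EndsWithViaToLower(string fileName, string suffix, CultureInfo culture)
    {
        return fileName.ToLower(culture).EndsWith(suffix, StringComparison.Ordinal);
    }

    // Explicit ordinal, case-insensitive check; no lowered copy is needed.
    public static bool EndsWithOrdinalIgnoreCase(string fileName, string suffix)
    {
        return fileName.EndsWith(suffix, StringComparison.OrdinalIgnoreCase);
    }
}
```

On a system with Turkish culture data available, `EndsWithViaToLower("PHOTO.GIF", ".gif", new CultureInfo("tr-TR"))` would be expected to return false, because that culture lowercases `I` to the dotless `ı`. `EndsWithOrdinalIgnoreCase("PHOTO.GIF", ".gif")` does not depend on culture and returns true.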