Find sequence in IEnumerable<T> using Linq


3 Answers

The code you say you want to be able to use isn't LINQ, so I don't see why it need be implemented with LINQ.

This is essentially the same problem as substring searching (indeed, an enumeration where order is significant is a generalisation of "string").

Computer science has studied this problem extensively for a long time, so you get to stand on the shoulders of giants.

Some reasonable starting points are:

http://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm

http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm

http://en.wikipedia.org/wiki/Rabin-karp

Even just the pseudocode in the Wikipedia articles is enough to port to C# quite easily. Look at the descriptions of performance in different cases and decide which cases your code is most likely to encounter.
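As a rough illustration of such a port, here is a minimal sketch of the Knuth-Morris-Pratt approach written as the FindSequence extension method the question asks for. The class name SequenceExtensions, the optional comparer parameter, the -1 result for "not found" and the choice to treat an empty pattern as matching at index 0 are all assumptions made for this example, not something the question specifies. The pattern is buffered into an array, while the large sequence is enumerated only once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceExtensions
{
    // Returns the zero-based index of the first occurrence of pattern in source,
    // or -1 if there is none. An empty pattern is treated as matching at index 0
    // (an assumption for this sketch).
    public static int FindSequence<T>(this IEnumerable<T> source, IEnumerable<T> pattern,
                                      IEqualityComparer<T> comparer = null)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (pattern == null) throw new ArgumentNullException(nameof(pattern));
        comparer = comparer ?? EqualityComparer<T>.Default;

        T[] p = pattern.ToArray();
        if (p.Length == 0) return 0;

        // failure[i] = length of the longest proper prefix of p[0..i]
        // that is also a suffix of p[0..i].
        int[] failure = new int[p.Length];
        for (int i = 1, k = 0; i < p.Length; i++)
        {
            while (k > 0 && !comparer.Equals(p[i], p[k])) k = failure[k - 1];
            if (comparer.Equals(p[i], p[k])) k++;
            failure[i] = k;
        }

        int matched = 0; // number of pattern elements matched so far
        int index = 0;   // position of the current source element
        foreach (T item in source)
        {
            // On a mismatch, fall back to the longest prefix that can still match.
            while (matched > 0 && !comparer.Equals(item, p[matched])) matched = failure[matched - 1];
            if (comparer.Equals(item, p[matched])) matched++;
            if (matched == p.Length) return index - p.Length + 1;
            index++;
        }
        return -1;
    }
}
```

With this sketch, new[] { 1, 2, 3, 4, 5 }.FindSequence(new[] { 3, 4 }) returns 2. Whether KMP beats a simple nested-loop search depends on the data; for short patterns with few repeated prefixes the difference is likely to be small.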

Question

What is the most efficient way to find a sequence within an IEnumerable<T> using LINQ?

I want to be able to create an extension method which allows the following call:

int startIndex = largeSequence.FindSequence(subSequence);

The match must be adjacent and in order.




Why there is no IndexOf(string value)-like method in Linq?

It sounds like you're referring to List<T>.FindIndex().

--EDIT--

A more general method for any IEnumerable<T> would be:-

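public static int FindIndex<T>(this IEnumerable<T> source, Predicate<T> equality)
{
    // Pair each item with its position and return the position of the first match.
    // Returns -1 when nothing matches, mirroring List<T>.FindIndex(),
    // instead of throwing as a bare First() would.
    return source
        .Select((item, index) => new {Item = item, Index = index})
        .Where(x => equality(x.Item))
        .Select(x => x.Index)
        .DefaultIfEmpty(-1)
        .First();
}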



I have an extension method that uses the existing Contains() method. I find it more intuitive than using Intersect() or Except().

public static bool ContainsAll<T>(this IEnumerable<T> source, IEnumerable<T> values)
{
    return values.All(value => source.Contains(value));
}



Determine if a sequence contains all elements of another sequence using Linq

Count? How about Not Any?

bool contained = !subset.Except(superset).Any();
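Note that Except() works with set semantics, so the one-liner above ignores both order and duplicates. The small sketch below wraps it in a helper to make that visible; the names ExceptDemo and IsSubset are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptDemo
{
    // True when every distinct element of subset also occurs in superset.
    // Order and repetition of elements play no part in the result.
    public static bool IsSubset<T>(IEnumerable<T> subset, IEnumerable<T> superset)
    {
        return !subset.Except(superset).Any();
    }
}
```

Against a superset of { 1, 2, 3 }, IsSubset returns true for { 3, 1 } (order is ignored) and for { 1, 1, 1 } (duplicates are ignored), and false for { 1, 4 }. This is a membership test, so it does not tell you whether the elements appear adjacent and in order.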


