c# tutorial Why doesn't String.Contains call the final overload directly?



windows form controls in c# tutorial (1)

Been a while (years) since I've looked at assembly, and I know close to nothing about MSIL and JIT, so it would be a nice exercise - couldn't resist, so here's just a bit of, possibly redundant, empirical data. Does the IndexOf overload get inlined?

Here's a tiny Console app:

class Program
{
    static void Main(string[] args)
    {
        "hello".Contains("hell");
    }
}

The JIT generates this in an optimized Release build, Any CPU, running in 32 bit. I've shortened the addresses, and removed some irrelevant lines:

--- ...\Program.cs 
            "hello".Contains("hell");
[snip]
17  mov         ecx,dword ptr ds:[0320219Ch] ; pointer to "hello"
1d  mov         edx,dword ptr ds:[032021A0h] ; pointer to "hell"
23  cmp         dword ptr [ecx],ecx 
25  call        680A6A6C                     ; String.Contains()
[snip]

The call at 0x00000025 goes here:

String.Contains

00  push        0                 ; startIndex = 0
02  push        dword ptr [ecx+4] ; count = this.Length (second DWORD of String)
05  push        4                 ; comparisonType = StringComparison.Ordinal
07  call        FF9655A4          ; String.IndexOf()
0c  test        eax,eax 
0e  setge       al                ; if (... >= 0)
11  movzx       eax,al 
14  ret 

Sure enough, it seems to call, directly, the final String.IndexOf overload with four arguments: three pushed; one in edx (value: "hell"); this ("hello") in ecx. To confirm, this is where the call at 0x00000005 goes:

00  push        ebp 
01  mov         ebp,esp 
03  push        edi 
04  push        esi 
05  push        ebx 
06  mov         esi,ecx                  ; this ("hello")
08  mov         edi,edx                  ; value ("hell")
0a  mov         ebx,dword ptr [ebp+10h] 
0d  test        edi,edi                  ; if (value == null)
0f  je          00A374D0 
15  test        ebx,ebx                  ; if (startIndex < 0)
17  jl          00A374FB 
1d  cmp         dword ptr [esi+4],ebx    ; if (startIndex > this.Length)
20  jl          00A374FB 
26  cmp         dword ptr [ebp+0Ch],0    ; if (count < 0)
2a  jl          00A3753F 
[snip]

... which would be the body of:

public int IndexOf(string value, 
                   int startIndex, 
                   int count, 
                   StringComparison comparisonType)
{
  if (value == null)
    throw new ArgumentNullException("value");
  if (startIndex < 0 || startIndex > this.Length)
    throw new ArgumentOutOfRangeException("startIndex",
             Environment.GetResourceString("ArgumentOutOfRange_Index"));
  if (count < 0 || startIndex > this.Length - count)
    throw new ArgumentOutOfRangeException("count",
             Environment.GetResourceString("ArgumentOutOfRange_Count"));
  ...
}

The String.Contains method looks like this internally

public bool Contains(string value)
{
   return this.IndexOf(value, StringComparison.Ordinal) >= 0;
}

The IndexOf overload that is called looks like this

public int IndexOf(string value, StringComparison comparisonType)
{
   return this.IndexOf(value, 0, this.Length, comparisonType);
}

Here another call is made to the final overload, which then calls the relevant CompareInfo.IndexOf method, with the signature

public int IndexOf(string value, int startIndex, int count, StringComparison comparisonType)

Therefore, calling the final overload would be the fastest (albeit may be considered a micro optimization in most cases).

I may be missing something obvious but why does the Contains method not call the final overload directly considering that no other work is done in the intermediate call and that the same information is available at both stages?

Is the only advantage that if the signature of the final overload changes, only one change needs to be made (that of the intermediate method), or is there more to the design than that?

Edit from the comments (see update 2 for speed difference explanation)

To clarify the performance differences I'm getting in case I've made a mistake somewhere: I ran this benchmark (looped 5 times to avoid jitter bias) and used this extension method to compare against the String.Contains method

public static bool QuickContains(this string input, string value)
{
   return input.IndexOf(value, 0, input.Length, StringComparison.OrdinalIgnoreCase) >= 0;
}

with the loop looking like this

for (int i = 0; i < 1000000; i++)
{
   bool containsStringRegEx = testString.QuickContains("STRING");
}
sw.Stop();
Console.WriteLine("QuickContains: " + sw.ElapsedMilliseconds);

In the benchmark test, QuickContains seems about 50% faster than String.Contains on my machine.

Update 2 (performance difference explained)

I've spotted something unfair in the benchmark which explains a lot. The benchmark itself was to measure case-insensitive strings but since String.Contains can only perform case-sensitive searches, the ToUpper method was included. This would skew the results, not in terms of final output, but at least in terms of simply measuring String.Contains performance in non case-sensitive searches.

So now, if I use this extension method

public static bool QuickContains(this string input, string value)
{
   return input.IndexOf(value, 0, input.Length, StringComparison.Ordinal) >= 0;
}

use StringComparison.Ordinal in the 2 overload IndexOf call and remove ToUpper, the QuickContains method actually becomes the slowest. IndexOf and Contains are pretty much on par in terms of performance. So clearly it was the ToUpper call skewing the results of why there was such a discrepancy between Contains and IndexOf.

Not sure why the QuickContains extension method has become the slowest. (Possibly related to the fact that Contains has the [__DynamicallyInvokable, TargetedPatchingOptOut("Performance critical to inline across NGen image boundaries")] attribute?).

Question still remains as to why the 4 overload method isn't called directly but it seems performance isn't impacted (as Adrian and delnan pointed out in the comments) by the decision.





contains