What are the differences between Generics in C# and Java… and Templates in C++?

11 months late, but I think this question is ready for some Java Wildcard stuff.

This is a syntactical feature of Java. Suppose you have a method:

public <T> void Foo(Collection<T> thing)

And suppose you don't need to refer to the type T in the method body. You're declaring a name T and then only using it once, so why should you have to think of a name for it? Instead, you can write:

public void Foo(Collection<?> thing)

The question mark asks the compiler to pretend that you declared a normal named type parameter that only needs to appear once in that spot.

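For a plain wildcard like this, there's nothing you can do that you can't also do with a named type parameter (which is how these things are always done in C++ and C#). The one exception is a lower-bounded wildcard such as ? super T, because a Java type parameter cannot declare a lower bound.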
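
To make the equivalence concrete, here is a minimal Java sketch. The class WildcardDemo and both method names are made up for illustration and are not from the original post. The two methods accept the same arguments and behave identically; the wildcard version just skips the throwaway name.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical demo class; all names here are illustrative only.
class WildcardDemo {
    // Named type parameter: T is declared but only used once.
    static <T> int countNonNullNamed(Collection<T> thing) {
        int count = 0;
        for (T item : thing) {
            if (item != null) {
                count++;
            }
        }
        return count;
    }

    // Wildcard: same behaviour, no name to invent.
    static int countNonNull(Collection<?> thing) {
        int count = 0;
        for (Object item : thing) {
            if (item != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ann", null, "Bob");
        System.out.println(countNonNullNamed(names)); // prints 2
        System.out.println(countNonNull(names));      // prints 2
    }
}
```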

I mostly use Java and generics are relatively new. I keep reading that Java made the wrong decision or that .NET has better implementations etc. etc.

So, what are the main differences between C++, C#, Java in generics? Pros/cons of each?

C++ rarely uses the “generics” terminology. Instead, the word “templates” is used and is more accurate. Templates describe one technique to achieve a generic design.

C++ templates are very different from what both C# and Java implement for two main reasons. The first reason is that C++ templates don't only allow compile-time type arguments but also compile-time const-value arguments: template arguments can be integers or even pointers to functions. This means that you can do some quite funky stuff at compile time, e.g. calculations:

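// Recursive case: multiply N by the result for N - 1.
template <unsigned int N>
struct product {
    static unsigned int const VALUE = N * product<N - 1>::VALUE;
};

// Specialization for N == 1 that ends the recursion.
template <>
struct product<1> {
    static unsigned int const VALUE = 1;
};

// Usage:
unsigned int const p5 = product<5>::VALUE; // 120, computed at compile time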

This code also uses the other distinguishing feature of C++ templates, namely template specialization. The code defines one class template, product, that has one value argument. It also defines a specialization for that template that is used whenever the argument evaluates to 1. This allows me to define a recursion over template definitions. I believe that this was first discovered by Erwin Unruh and later popularized by Andrei Alexandrescu, among others.

Template specialization is important for C++ because it allows for structural differences in data structures. Templates as a whole are a means of unifying an interface across types. However, although this is desirable, not all types can be treated equally inside the implementation. C++ templates take this into account. This is very much the same difference that OOP makes between interface and implementation with the overriding of virtual methods.

C++ templates are essential for its algorithmic programming paradigm. For example, almost all algorithms for containers are defined as functions that accept the container type as a template type and treat them uniformly. Actually, that's not quite right: C++ algorithms don't work on containers but rather on ranges that are defined by two iterators, pointing to the beginning and one past the end of the container. Thus, the whole content is circumscribed by the iterators: begin <= elements < end.

Using iterators instead of containers is useful because it allows you to operate on parts of a container instead of on the whole.

Another distinguishing feature of C++ is the possibility of partial specialization for class templates. This is somewhat related to pattern matching on arguments in Haskell and other functional languages. For example, let's consider a class that stores elements:

template <typename T>
class Store { … }; // (1)

This works for any element type. But let's say that we can store pointers more efficiently than other types by applying some special trick. We can do this by partially specializing for all pointer types:

template <typename T>
class Store<T*> { … }; // (2)

Now, whenever we instantiate the container template for one type, the appropriate definition is used:

Store<int> x; // Uses (1)
Store<int*> y; // Uses (2)
Store<string**> z; // Uses (2), with T = string*.

C++ templates are actually much more powerful than their C# and Java counterparts as they are evaluated at compile time and support specialization. This allows for Template Meta-Programming and makes the C++ template system Turing-complete (i.e. during the compilation process you can compute anything that is computable with a Turing machine).

I'll add my voice to the noise and take a stab at making things clear:

C# Generics allow you to declare something like this.

List<Person> foo = new List<Person>();

and then the compiler will prevent you from putting things that aren't Person into the list.
Behind the scenes the C# compiler is just putting List<Person> into the .NET dll file, but at runtime the JIT compiler goes and builds a new set of code, as if you had written a special list class just for containing people - something like ListOfPerson.

The benefit of this is that it makes it really fast. There's no casting or any other stuff, and because the dll contains the information that this is a List of Person, other code that looks at it later on using reflection can tell that it contains Person objects (so you get intellisense and so on).

The downside of this is that old C# 1.0 and 1.1 code (before they added generics) doesn't understand these new List<something>, so you have to manually convert things back to plain old List to interoperate with them. This is not that big of a problem, because C# 2.0 binary code is not backwards compatible. The only time this will ever happen is if you're upgrading some old C# 1.0/1.1 code to C# 2.0.

Java Generics allow you to declare something like this.

ArrayList<Person> foo = new ArrayList<Person>();

On the surface it looks the same, and it sort-of is. The compiler will also prevent you from putting things that aren't Person into the list.

The difference is what happens behind the scenes. Unlike C#, Java does not go and build a special ListOfPerson - it just uses the plain old ArrayList which has always been in Java. When you get things out of the list, the usual Person p = (Person)foo.get(1); casting-dance still has to be done. The compiler is saving you the key-presses, but the speed hit/casting is still incurred just like it always was.
When people mention "Type Erasure" this is what they're talking about. The compiler inserts the casts for you, and then 'erases' the fact that it's meant to be a list of Person and not just a list of Object.
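
As a rough illustration of the inserted casts, the Java sketch below puts a generic method next to a hand-written raw equivalent. The class ErasureCastDemo and its method names are hypothetical and not from the original answer. The last method shows a side effect of erasure: a wrong element can be slipped in through a raw reference, and the failure only surfaces later at the hidden cast.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; all names here are illustrative only.
class ErasureCastDemo {
    // What you write with generics.
    static String firstGeneric(List<String> list) {
        return list.get(0); // the compiler inserts a (String) cast here
    }

    // Roughly what it becomes after erasure: a raw List plus an explicit cast.
    @SuppressWarnings("rawtypes")
    static String firstErased(List list) {
        return (String) list.get(0);
    }

    // The element type is not checked when adding through a raw reference,
    // so the error only appears when the hidden cast runs.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean failsAtHiddenCast() {
        List<String> strings = new ArrayList<String>();
        List raw = strings;
        raw.add(Integer.valueOf(42)); // no error at this point
        try {
            firstGeneric(strings);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<String>();
        names.add("Ann");
        System.out.println(firstGeneric(names));  // prints Ann
        System.out.println(firstErased(names));   // prints Ann
        System.out.println(failsAtHiddenCast());  // prints true
    }
}
```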

The benefit of this approach is that old code which doesn't understand generics doesn't have to care. It's still dealing with the same old ArrayList as it always has. This is more important in the Java world because they wanted old pre-generics code and libraries to keep working, without recompilation, alongside new Java 5 code that uses generics, which Microsoft deliberately decided not to bother with.

The downside is the speed hit I mentioned previously. Also, because there is no ListOfPerson pseudo-class or anything like that going into the .class files, code that looks at the list later on (with reflection, or after pulling it out of another collection where it was stored as Object, and so on) has no way to tell that it's meant to be a list containing only Person and not just any other array list.
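
A small Java sketch of that limitation, with made-up names (ErasedTypeInfoDemo, sameRuntimeClass, describe): once a list is handed over as a plain Object, the most a runtime check can establish is that it is some kind of List.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; all names here are illustrative only.
class ErasedTypeInfoDemo {
    // True if both objects have exactly the same runtime class.
    static boolean sameRuntimeClass(Object a, Object b) {
        return a.getClass() == b.getClass();
    }

    // At runtime we can only ask "is this a List of something?".
    // The element type is not part of the object's class.
    static String describe(Object value) {
        if (value instanceof List<?>) {
            return "a List of unknown element type, class " + value.getClass().getName();
        }
        return "not a List";
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        System.out.println(sameRuntimeClass(names, numbers)); // prints true
        System.out.println(describe(names));
        System.out.println(describe("hello"));                // prints not a List
    }
}
```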

C++ Templates allow you to declare something like this

std::list<Person>* foo = new std::list<Person>();

It looks like C# and Java generics, and it will do what you think it should do, but behind the scenes different things are happening.

It has the most in common with C# generics in that it builds special pseudo-classes rather than just throwing the type information away like Java does, but it's a whole different kettle of fish.

Both C# and Java produce output which is designed for virtual machines. If you write some code which has a Person class in it, in both cases some information about a Person class will go into the .dll or .class file, and the JVM/CLR will do stuff with this.

C++ produces raw native machine code (x86, for example). Not everything is an object, and there's no underlying virtual machine which needs to know about a Person class. There's no boxing or unboxing, and functions don't have to belong to classes, or indeed anything.

Because of this, the C++ compiler places no restrictions on what you can do with templates - basically any code you could write manually, you can get templates to write for you.
The most obvious example is adding things:

In C# and Java, the generics system needs to know what methods are available for a class, and it needs to pass this down to the virtual machine. The only way to tell it this is by either hard-coding the actual class in, or using interfaces. For example:

string addNames<T>( T first, T second ) { return first.Name() + second.Name(); }

That code won't compile in C# or Java, because it doesn't know that the type T actually provides a method called Name(). You have to tell it - in C# like this:

interface IHasName{ string Name(); };
string addNames<T>( T first, T second ) where T : IHasName { .... }

And then you have to make sure the things you pass to addNames implement the IHasName interface and so on. The Java syntax is different (<T extends IHasName>), but it suffers from the same problems.
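
For comparison, a minimal Java version of the same idea might look like this. The names HasName, NamedPerson and BoundedDemo are hypothetical; only the <T extends ...> bound itself is the point.

```java
// Hypothetical names, loosely mirroring the C# snippet.
interface HasName {
    String name();
}

class NamedPerson implements HasName {
    private final String name;

    NamedPerson(String name) {
        this.name = name;
    }

    public String name() {
        return name;
    }
}

class BoundedDemo {
    // The bound "T extends HasName" tells the compiler that name() exists on T.
    static <T extends HasName> String addNames(T first, T second) {
        return first.name() + second.name();
    }

    public static void main(String[] args) {
        System.out.println(addNames(new NamedPerson("Ann"), new NamedPerson("Bob"))); // prints AnnBob
    }
}
```

Without the bound, the call to name() is rejected at compile time, which is exactly the restriction described above.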

The 'classic' case for this problem is trying to write a function which does this

string addNames<T>( T first, T second ) { return first + second; }

You can't actually write this code because there is no way to declare an interface with the + operator in it (this holds for Java, and for C# before version 11 added static abstract interface members). You fail.
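
A common workaround, which is not part of the original answer, is to pass the addition in as an extra argument. The Java sketch below assumes Java 8 or later for java.util.function.BinaryOperator; AdderDemo and add are made-up names.

```java
import java.util.function.BinaryOperator;

// Hypothetical demo class; all names here are illustrative only.
class AdderDemo {
    // The caller supplies the "+" for T, since T itself cannot promise one.
    static <T> T add(T first, T second, BinaryOperator<T> plus) {
        return plus.apply(first, second);
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3, Integer::sum));             // prints 5
        System.out.println(add("foo", "bar", String::concat));   // prints foobar
    }
}
```

This keeps the method generic, at the cost of making every caller spell out what "adding" means for its type.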

C++ suffers from none of these problems. The compiler doesn't care about passing types down to any VMs - if both your objects have a .Name() function, it will compile. If they don't, it won't. Simple.

So, there you have it :-)

In Java, generics are compiler level only, so you get:

ArrayList<String> a = new ArrayList<String>();
a.getClass(); // => class java.util.ArrayList

Note that the type of 'a' is an array list, not a list of strings. So the type of a list of bananas would equal() a list of monkeys.

So to speak.

NB: I don't have enough points to comment, so feel free to move this to a comment on the appropriate answer.

Contrary to popular belief (and I never understood where it came from), .NET implemented true generics without breaking backward compatibility, and they spent explicit effort on that. You don't have to change your non-generic .NET 1.0 code into generics just to use it in .NET 2.0. Both the generic and non-generic lists are still available in .NET Framework 2.0 and even up to 4.0, purely for backward compatibility. Therefore old code that still uses the non-generic ArrayList will still work, and uses the same ArrayList class as before. Backward code compatibility has been maintained from 1.0 until now, so even in .NET 4.0 you still have the option to use any non-generic class from the 1.0 BCL if you choose to do so.

So I don't think Java had to break backward compatibility to support true generics.

The biggest complaint is type erasure: generics are not enforced at runtime. The Sun docs on the subject put it this way:

Generics are implemented by type erasure: generic type information is present only at compile time, after which it is erased by the compiler.