Sunday, March 28, 2010

Runtime Complexity of .NET Generic Collection

I had to implement some data structures for my computational geometry class. Deciding whether to implement the data structures myself or using the build-in classes turned out to be a hard decision, as the runtime complexity information is located at the method itself, if present at all. So I went ahead to consolidate all the information in one table, then looked at the source code in Reflector and verified them. Below is my result.
Internal Implement-
ation
Add/insert Add beyond capacity Queue/Push Dequeue/
Pop/Peek
Remove/
RemoveAt
Item[index]/ElementAt(index) GetEnumerator Contains(value)/IndexOf/ContainsValue/Find
List Array O(1) to add, O(n) to insert O(n) - - O(n) O(1) O(1) O(n)
LinkedList Doubly linked list O(1), before/after given node O(1) O(1) O(1) O(1), before/after given node O(n) O(1) O(n)
Stack Array O(1) O(n) O(1) O(1) - - O(1) O(n)
Queue Array O(1) O(n) O(1) O(1) - - O(1) O(n)
Dictionary Hashtable with links to another array index for collision O(1), O(n) if collision O(n) - - O(1), O(n) if collision O(1), O(n) if collision O(1) O(n)
HashSet Hashtable with links to another array index for collision O(1), O(n) if collision O(n) - - O(1), O(n) if collision O(1), O(n) if collision O(1) -
SortedDictionary Red-black tree O(log n) O(log n) - - O(log n) O(log n) O(log n) O(n)
SortedList Array O(n), O(log n) if added to end of list O(n) - - O(n) O(log n) O(1) O(n)
SortedSet Red-black tree O(log n) O(log n) - - O(log n) O(log n) O(log n) -
Note:
Dictionary Add, remove and item[i] has expected O(1) running time
HashSet Add, remove and item[i] has expected O(1) running time
Update 25 April 2010: Added SortedSet

17 comments:

Anonymous said...

Nice work, very useful.

MSDN should display these information for every data structures they provide.

Jeow Li Huan said...

Thanks! MSDN did for some of the structures, but I did find that the SortedDictionary complexity that they stated is wrong... I guess wrong documentation is worse than no documentation, that is why MSDN didn't want to document it.

Shu said...

Big THANK YOU, for this table!

I think you have two mistakes:
1. LinkedList haven't Item[i], it has Find(T) = O(n)
2. SortedList Item[i] = O(log n)

Jeow Li Huan said...

Thanks Shuisky. Added Find(i) to the heading.
Why do you say SortedList Item[i] = O(log n)? Internally it's an Array, so shouldn't access be instantaneous?

Shu said...

Yes, the array. But when trying to get the Item, is a binary search inside TValue this[TKey key] {get} = O(log n)

Shu said...

You don't know index of item in SortedList, so you take Item by Key. If you want take Item by Index, you need
1) int IndexOfKey(TKey key);
2) TValue GetByIndex(int index);
And it's same O(log n) in sum :)

Chris Marisic said...

Would be worth add the .NET4 Concurrent Collections to this table!

Charles said...

Still super useful, 5 years later.

Naresh Singh Dhami said...

A million dollar worth entry!.

Unknown said...

it’s ok to show some appreciation and say ‘great post’
Asp .NET developer

Roberto said...

MSDN about List says: "This method performs a linear search; therefore, this method is an O(n) operation, where n is Count."

In your table is O(1).

Did I misunderstood?

Jeow Li Huan said...

@Roberto: MSDN stated that the IndexOf and Contains operations to be O(n). I'm missing those methods in my table...

Let me add them

Roberto said...

Sorry: I forgot to say which method! I was speaking about List.Find
https://msdn.microsoft.com/en-us/library/x0b5b5bc.aspx

Jeow Li Huan said...

@Roberto: turns out it's so hard to name the heading. The Find only applies to LinkedList, but turns out that I shouldn't put that in the same column as Item[i]. I've changed the last column MoveNext to finding a value, which would require you to enumerate through the entire collection in the general case

Unknown said...

I don't understend why a SortedList has O(log n) time complexity when getting an item by its index. The only drawback of them is adding and removing items (because we have to keep the sort), other than that, accessing items by index should have the same time complexity of List, for example.

Jeow Li Huan said...

@Alisson Reinaldo Silva: because for SortedList, the indexer is not retrieving by index, but by key. It does a binary search to look for the key in the array.

Alex said...

HashSet .Contains() exists and I believe it has a O(1) complexity.
Found it: https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.hashset-1.contains?view=net-6.0