Sunday, March 28, 2010

Runtime Complexity of .NET Generic Collection

I had to implement some data structures for my computational geometry class. Deciding whether to implement the data structures myself or using the build-in classes turned out to be a hard decision, as the runtime complexity information is located at the method itself, if present at all. So I went ahead to consolidate all the information in one table, then looked at the source code in Reflector and verified them. Below is my result.

  Internal Implement-
ation
Add/insert Add beyond capacity Queue/Push Dequeue/
Pop/Peek
Remove/
RemoveAt
Item[i]/Find(i) GetEnumerator MoveNext
List Array O(1) to add, O(n) to insert O(n) - - O(n) O(1) O(1) O(1)
LinkedList Doubly linked list O(1), before/after given node O(1) O(1) O(1) O(1), before/after given node O(n) O(1) O(1)
Stack Array O(1) O(n) O(1) O(1) - - O(1) O(1)
Queue Array O(1) O(n) O(1) O(1) - - O(1) O(1)
Dictionary Hashtable with links to another array index for collision O(1), O(n) if collision O(n) - - O(1), O(n) if collision O(1), O(n) if collision O(1) O(1)
HashSet Hashtable with links to another array index for collision O(1), O(n) if collision O(n) - - O(1), O(n) if collision O(1), O(n) if collision O(1) O(1)
SortedDictionary Red-black tree O(log n) O(log n) - - O(log n) O(log n) O(log n) O(1)
SortedList Array O(n) O(n) - - O(n) O(1) O(1) O(1)
SortedSet Red-black tree O(log n) O(log n) - - O(log n) O(log n) O(log n) O(1)

Note:

Dictionary Add, remove and item[i] has expected O(1) running time
HashSet Add, remove and item[i] has expected O(1) running time

Update 25 April 2010: Added SortedSet

6 comments:

Anonymous said...

Nice work, very useful.

MSDN should display these information for every data structures they provide.

Jeow Li Huan said...

Thanks! MSDN did for some of the structures, but I did find that the SortedDictionary complexity that they stated is wrong... I guess wrong documentation is worse than no documentation, that is why MSDN didn't want to document it.

Shuisky said...

Big THANK YOU, for this table!

I think you have two mistakes:
1. LinkedList haven't Item[i], it has Find(T) = O(n)
2. SortedList Item[i] = O(log n)

Jeow Li Huan said...

Thanks Shuisky. Added Find(i) to the heading.
Why do you say SortedList Item[i] = O(log n)? Internally it's an Array, so shouldn't access be instantaneous?

Shuisky said...

Yes, the array. But when trying to get the Item, is a binary search inside TValue this[TKey key] {get} = O(log n)

Shuisky said...

You don't know index of item in SortedList, so you take Item by Key. If you want take Item by Index, you need
1) int IndexOfKey(TKey key);
2) TValue GetByIndex(int index);
And it's same O(log n) in sum :)