If I have a large collection of elements and my goal is to check whether there exist two identical elements in the collection,
I used a hash function and a hash table in order to make the search easier (is this the best method?).
When searching within the table, should I sort the lists (using Python's sorted function) and then search for repeating elements, or is there a better way?
Date: 11 Jan 2015 08:14
Number of posts: 2
In general, a hash table is an efficient implementation of a dictionary (insert / delete / search). It performs each operation in O(1) time on average when elements are small, or more generally O(m) for elements of size m. You cannot expect to do better than that, since merely reading an element of size m already takes O(m) time, so in this sense hash tables are optimal.
However, note that this optimum holds only on average. The worst case is less efficient, and is beaten by more advanced data structures you'll probably encounter in future courses (such as red-black trees, skip lists, B+ trees, etc.).
So it depends on what you're trying to optimize - the worst case or the average case - and on additional features of the problem, which may help you choose among the alternatives.
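To make the trade-off concrete, here is a minimal sketch of the duplicate check from the question, done two ways. The function names has_duplicates and has_duplicates_sorted are made up for illustration, and the sketch assumes the elements are hashable (for the first version) and comparable (for the second). Python's built-in set is itself a hash table, so the first version is the hash-based approach; the second sorts first and costs O(n log n) comparisons even in the worst case.

```python
def has_duplicates(items):
    """Hash-based check: O(n) on average, assuming hashable elements."""
    seen = set()  # Python's set is implemented as a hash table
    for item in items:
        if item in seen:  # average O(1) lookup
            return True
        seen.add(item)
    return False


def has_duplicates_sorted(items):
    """Sort-based check: O(n log n) comparisons, assuming comparable elements."""
    ordered = sorted(items)
    # After sorting, identical elements are adjacent, so compare neighbours.
    return any(a == b for a, b in zip(ordered, ordered[1:]))
```

Both functions stop at the answer "is there any repeated element"; if you also need to know which elements repeat, you would collect them instead of returning early.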
Sorting the inner lists of a hash table may improve searching, but it also requires maintaining the sorted order, which hurts the efficiency of insert. Since we expect each such list to contain O(1) elements, simply searching it linearly is usually fast enough.
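As a rough illustration of the point about the inner lists, here is a small sketch of a hash table with chaining whose buckets are plain unsorted lists searched linearly. The class name ChainedHashSet, its methods, and the default of 64 buckets are assumptions made for this example, not the implementation from the course; it also never resizes, which a real table would do to keep the lists short.

```python
class ChainedHashSet:
    """Toy hash table with chaining; each bucket is an unsorted list."""

    def __init__(self, num_buckets=64):
        # One independent empty list per bucket.
        self._buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, item):
        # Map the item's hash to a bucket index.
        return self._buckets[hash(item) % len(self._buckets)]

    def contains(self, item):
        # Linear scan of a list that is expected to hold O(1) elements.
        return item in self._bucket(item)

    def insert(self, item):
        """Insert item; return False if it was already present."""
        bucket = self._bucket(item)
        if item in bucket:
            return False
        bucket.append(item)  # appending keeps insert cheap, no order to maintain
        return True
```

With this sketch, the duplicate check from the question becomes: insert the elements one by one and stop as soon as insert returns False.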