Difference between Set and List in Python

In Python, both sets and lists are used to store collections of items. However, they have some fundamental differences in terms of their properties, functionalities, and use cases.

Below is a table summarizing the differences between lists and sets in Python:


Property List Set
Order Ordered Unordered
Mutability Mutable Immutable
Duplicates Allowed Not allowed
Access Index-based Not index-based
Created with square brackets [] curly braces {} or set() function

Following is an example that demonstrates the difference between lists and sets:

color_list = ['red', 'green', 'blue', 'yello', 'brown'] color_set = {'red', 'green', 'blue', 'yello', 'brown'} print(color_list) print(color_set) #Output: ['red', 'green', 'blue', 'yello', 'brown'] {'brown', 'green', 'red', 'blue', 'yello'}
color_list[1] = 'black' print(color_list) #Output: ['red', 'black', 'blue', 'yello', 'brown']
color_set.add('black') print(color_set) #Output: {'green', 'yello', 'blue', 'black', 'red', 'brown'}

In the above example, create a list and a set with the same elements. When print the two, see that the list retains its order and allows duplicates, while the set is unordered and contains only unique elements. You can also see that you can change an element in the list using indexing, but cannot do the same with a set. Instead, can add elements to the set using the add() method.

Performance: list Vs set in Python

In terms of performance, there are some differences between lists and sets that you should be aware of. The performance of lists and sets depends on the specific use case and the size of the data. Lists are generally faster for adding and removing elements, while sets are faster for searching for elements. It's important to choose the appropriate data structure based on the specific requirements of your program.

Adding and Removing Elements

Adding or removing elements in a list involves shifting all the subsequent elements up or down the list, which can be time-consuming for large lists. In contrast, adding or removing elements in a set does not require any shifting of elements, as the set does not maintain any order.

Follwoing is an example to compare the performance of adding elements to a list versus a set:

import time # Adding 100,000 elements to a list start_time = time.time() my_list = [] for i in range(100000): my_list.append(i) end_time = time.time() print("Time taken to add 100,000 elements to a list: ", end_time - start_time) # Output: Time taken to add 100,000 elements to a list: 0.00587916374206543
# Adding 100,000 elements to a set start_time = time.time() my_set = set() for i in range(100000): my_set.add(i) end_time = time.time() print("Time taken to add 100,000 elements to a set: ", end_time - start_time) # Output: Time taken to add 100,000 elements to a set: 0.008870363235473633

In the above example, you can see that adding 100,000 elements to a list is slightly faster than adding the same number of elements to a set.

Searching for Elements

Searching for an element in a set is generally faster than searching for an element in a list, especially for large collections of data. This is because sets use a hash table to store elements, allowing for constant-time lookup (O(1)). Lists, on the other hand, require a linear search through each element in the list until the desired element is found, which can be slower for large lists.

Following is an example to compare the performance of searching for elements in a list versus a set:

import time # Creating a list of 100,000 elements my_list = [i for i in range(100000)] # Creating a set of 100,000 elements my_set = set(my_list) # Searching for an element in the list start_time = time.time() if 99999 in my_list: print("Element found in list") end_time = time.time() print("Time taken to search for an element in a list: ", end_time - start_time) #Output: Element found in list Time taken to search for an element in a list: 0.0007030963897705078
# Searching for an element in the set start_time = time.time() if 99999 in my_set: print("Element found in set") end_time = time.time() print("Time taken to search for an element in a set: ", end_time - start_time) #Output: Element found in set Time taken to search for an element in a set: 6.67572021484375e-06

In the above example, you can see that searching for an element in a set is much faster than searching for the same element in a list, even though both collections contain the same elements.

Memory Usage: list Vs set in Python

Memory use is another important consideration when choosing between lists and sets.

Memory Usage of Lists

Lists in Python are implemented as dynamic arrays, which means that the underlying memory allocation of a list is contiguous. When you add elements to a list and the list becomes full, the underlying array is reallocated to a larger size, and the elements are copied over to the new location.

The amount of memory used by a list depends on several factors, including the number of elements in the list, the size of each element, and the amount of memory allocated for the list beyond the current number of elements. Lists can use more memory than sets because they maintain the order of elements and can store duplicates.

Following is an example to demonstrate the memory usage of a list:

import sys # Creating a list of 100,000 integers my_list = [i for i in range(100000)] # Printing the size of the list in bytes print("Size of the list in bytes: ", sys.getsizeof(my_list)) #Output: Size of the list in bytes: 900120

In this example, you can see that the size of the list is around 900120.

Memory Usage of Sets

Sets in Python are implemented as hash tables, which means that the underlying memory allocation is not contiguous. When you add elements to a set, the hash table is resized dynamically as needed, and the elements are rehashed and distributed across the new table.

The amount of memory used by a set depends on several factors, including the number of elements in the set, the size of each element, and the load factor of the hash table. Sets typically use less memory than lists because they don't store duplicates and don't maintain order.

Following is an example to demonstrate the memory usage of a set:

import sys # Creating a set of 100,000 integers my_set = set([i for i in range(100000)]) # Printing the size of the set in bytes print("Size of the set in bytes: ", sys.getsizeof(my_set)) #Output: Size of the set in bytes: 524416

In the above example, you can see that the size of the set is around 524 KB, which is less than the size of the list in the previous example.

So, sets typically use less memory than lists because they don't store duplicates and don't maintain order. However, the actual memory usage of a list or a set depends on several factors, including the number and size of the elements, and the implementation of the Python interpreter.

Time Complexity: list vs set in Python

The time complexity of an operation is an important consideration when choosing between lists and sets. Sets have faster average-case time complexity for searching, adding, and removing elements compared to lists. However, lists have faster time complexity for accessing elements by index and concatenating two lists. The choice between lists and sets depends on the specific requirements of the task at hand, taking into account both time complexity and other factors such as memory usage and the need for duplicates and ordered elements.

Following is a comparison of the time complexities of common operations on lists and sets in Python:

Time Complexity of Lists

  1. Accessing an element by index: O(1)
  2. Searching for an element: O(n)
  3. Inserting an element at the end: O(1) amortized
  4. Inserting an element at a specific position: O(n)
  5. Deleting an element by index: O(n)
  6. Deleting an element by value: O(n)
  7. Concatenating two lists: O(k), where k is the length of the second list

Note that the time complexity of inserting an element at the end of a list is O(1) amortized, which means that on average, adding an element to the end of a list takes constant time. However, in the worst case, when the underlying array needs to be resized, adding an element to the end of a list can take O(n) time.

Time Complexity of Sets

  1. Searching for an element: O(1) on average
  2. Adding an element: O(1) on average
  3. Removing an element: O(1) on average
  4. Union of two sets: O(len(s1) + len(s2))
  5. Intersection of two sets: O(min(len(s1), len(s2)))
  6. Difference of two sets: O(len(s1))

Note that the time complexity of set operations depends on the size of the sets involved, but not on the size of the elements in the sets.

Python Lists

A list is an ordered collection of elements, which can be of any type (integers, strings, objects, etc.), and can be changed (mutable). Lists are created using square brackets [], and items within the list are separated by commas.

colors = ['red', 'green', 'blue', 'yellow']

Properties:

  1. Ordered: The items in the list are ordered and can be accessed using indexing.
  2. Mutable: Lists are mutable, which means you can add, remove or change the elements of the list after it is created.
  3. Duplicate elements: Lists can contain duplicate elements.

Python Sets

A set is an unordered collection of elements that are unique (no duplicates) and immutable (cannot be changed once created). Sets are created using curly brackets {} or the set() function.

colors = {'red', 'green', 'blue', 'yellow'}

Properties:

  1. Unordered: The items in a set are unordered and cannot be accessed using indexing.
  2. Immutable: Once a set is created, its elements cannot be changed. However, you can add or remove elements from the set.
  3. Unique elements: Sets cannot contain duplicate elements. If you try to add a duplicate element, it will be ignored.