Python Regex Split String using re.split()

In Python, the re.split() function from the re module allows you to split a string using regular expressions as the delimiter. This is particularly useful when you need more advanced splitting rules that are not achievable with the standard str.split() method.
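As a minimal sketch of a case that a single fixed separator cannot cover, the snippet below splits on either a comma or a semicolon by using a character class. The sample string and variable names are made up for illustration.

```python
import re

# Hypothetical sample data that mixes two different separators.
record = "red,green;blue,yellow"

# str.split() accepts only one literal separator, so the semicolon is missed.
by_comma = record.split(",")

# A character class matches either separator in a single pass.
by_either = re.split(r"[,;]", record)

print(by_comma)   # ['red', 'green;blue', 'yellow']
print(by_either)  # ['red', 'green', 'blue', 'yellow']
```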

How to use re.split()

    import re

    text = "red,green,blue,yellow"
    result = re.split(",", text)
    print(result)
    # Output: ['red', 'green', 'blue', 'yellow']

In this example, the string text is split into a list using a comma (,) as the delimiter. The resulting list contains the individual words.
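re.split() also accepts an optional maxsplit argument that limits how many splits are made. A short sketch, with an illustrative variable name:

```python
import re

colors = "red,green,blue,yellow"

# Stop after two splits; the rest of the string stays in the last item.
first_two = re.split(",", colors, maxsplit=2)

print(first_two)  # ['red', 'green', 'blue,yellow']
```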

You can use more complex regular expressions to split based on patterns. For instance, splitting a sentence into words:

    import re

    sentence = "This is a sample sentence."

    # Split by whitespace
    words = re.split(r"\s+", sentence)
    print(words)
    # Output: ['This', 'is', 'a', 'sample', 'sentence.']

In this example, the regular expression r"\s+" matches one or more whitespace characters (spaces, tabs, and newlines), so any run of consecutive whitespace counts as a single delimiter when splitting the sentence.
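One detail to watch for: if the string starts or ends with whitespace, the pattern matches there too, and the result contains empty strings. The sketch below uses a made-up padded string to show the effect and two common ways around it.

```python
import re

padded = "  This is   a sample  "

# Whitespace at the start and end matches the pattern too,
# so re.split() returns an empty string on each side.
with_edges = re.split(r"\s+", padded)

# Stripping first (or calling str.split() with no arguments) avoids that.
stripped = re.split(r"\s+", padded.strip())
builtin = padded.split()

print(with_edges)  # ['', 'This', 'is', 'a', 'sample', '']
print(stripped)    # ['This', 'is', 'a', 'sample']
print(builtin)     # ['This', 'is', 'a', 'sample']
```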

You can also split on a delimiter made up of several characters, such as a comma followed by whitespace:

    import re

    address = "123 Main St, City, Country"

    # Split by comma followed by a whitespace character
    parts = re.split(r",\s", address)
    print(parts)
    # Output: ['123 Main St', 'City', 'Country']

In this case, the regular expression r",\s" matches a comma followed by a single whitespace character, so the address string is split into its components and the comma and space are left out of the result. To keep the delimiters in the result instead, enclose the pattern in parentheses so that it forms a capturing group.
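A minimal sketch of how a capturing group changes the result, reusing the address string from the example above; the variable names are illustrative.

```python
import re

address = "123 Main St, City, Country"

# Parentheses make the delimiter a capturing group, so each match
# is kept in the result between the pieces it separated.
with_delims = re.split(r"(,\s)", address)

# A non-capturing group (?:...) groups without keeping the delimiter.
without_delims = re.split(r"(?:,\s)", address)

print(with_delims)     # ['123 Main St', ', ', 'City', ', ', 'Country']
print(without_delims)  # ['123 Main St', 'City', 'Country']
```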

Conclusion

re.split() splits strings on patterns rather than fixed delimiters, which lets you handle cases where a simple separator is not enough. Choose the pattern carefully: whether it can match at the start or end of the string, and whether it contains capturing groups, both change what the returned list contains.