Regular Expressions with the =~ Operator
Bash’s =~ operator, available inside [[ ]], matches a string against a POSIX extended regular expression. Regular expressions offer far greater flexibility than simple wildcard matching, allowing you to define complex patterns for string validation and extraction. The operator returns true if the string on the left matches the regular expression on the right. The pattern must be left unquoted: any quoted part of it is matched as a literal string, so regex metacharacters inside quotes lose their special meaning. Storing the pattern in a variable and expanding it unquoted sidesteps most quoting problems.
string="This is a test string with 123 digits"
if [[ "$string" =~ test ]]; then
    echo "The string contains 'test'"
fi
if [[ "$string" =~ digits$ ]]; then # $ anchors the match at the end of the string
    echo "The string ends with 'digits'"
fi
if [[ "$string" =~ [0-9]+ ]]; then # Matches one or more digits
    echo "The string contains digits"
fi
email_re='^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$' # \. matches a literal dot
if [[ "$string" =~ $email_re ]]; then
    echo "The string looks like an email address"
fi
The examples above demonstrate basic regular expression usage. For more complex patterns, consult a comprehensive regular expression tutorial or reference. The flexibility of regular expressions makes them ideal for tasks like validating email addresses, IP addresses, or other structured data within strings.
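As a rough illustration of validating structured data, the sketch below checks whether a string has the shape of a dotted-quad IPv4 address. The function name looks_like_ipv4 and the sample inputs are made up for this example, and the check covers only the shape (four groups of one to three digits), not the 0-255 range of each group. The pattern is kept in a variable and expanded unquoted, a common way to avoid quoting surprises.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeeds if $1 has the shape of a dotted-quad address.
# It does not verify that each group is in the range 0-255.
looks_like_ipv4() {
    local re='^([0-9]{1,3}\.){3}[0-9]{1,3}$'
    [[ "$1" =~ $re ]]
}

for candidate in "192.168.0.1" "192.168.0" "not.an.ip.addr"; do
    if looks_like_ipv4 "$candidate"; then
        echo "$candidate: looks like an IPv4 address"
    else
        echo "$candidate: does not look like an IPv4 address"
    fi
done
```

Because the function's exit status is simply the result of the [[ ]] test, it can be used directly in if statements or with && and ||.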
Wildcard Matching with the * Operator
The * wildcard, simpler than regular expressions, matches zero or more characters. It’s frequently used in file globbing and basic conditional checks. While less powerful, it’s efficient for straightforward scenarios.
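files=(*.txt) # Expands to the .txt files in the current directory (stays as the literal '*.txt' if none exist, unless nullglob is set)
if [[ "$filename" == *.log ]]; then # The pattern must be unquoted; "*.log" would be compared literally
    echo "This is a log file"
fi
if [[ "$variable" == pre*suf ]]; then
    echo "The variable starts with 'pre' and ends with 'suf'"
fi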
The first example showcases file globbing; the others demonstrate basic pattern matching within conditional statements. Note that ==, not =~, is used for wildcard matching.
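Quoting changes how the right-hand side of == is interpreted, so a small sketch contrasting the two forms may help. The function name describe_file and the file names are hypothetical. Inside [[ ]], an unquoted right-hand side is treated as a pattern, while a quoted one is compared character for character.

```bash
#!/usr/bin/env bash

# Hypothetical helper: classify a file name by its extension using glob patterns.
describe_file() {
    local name="$1"
    if [[ "$name" == *.log ]]; then
        echo "log"
    elif [[ "$name" == *.txt ]]; then
        echo "text"
    else
        echo "other"
    fi
}

describe_file "server.log"   # prints: log
describe_file "notes.txt"    # prints: text
describe_file "archive.tar"  # prints: other

# A quoted pattern is compared literally, so this test is false
# unless the name is exactly the five characters "*.log".
if [[ "server.log" == "*.log" ]]; then
    echo "quoted pattern matched"
else
    echo "quoted pattern did not match"
fi
```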
Extracting Subpatterns
Both approaches let you pull out part of a matching string, though the methods differ significantly: regular expressions capture it during the match, while wildcards need a separate string manipulation step.
Regular Expressions (with =~)
Regular expressions use capturing groups, defined with parentheses (), to isolate specific parts of the matched string. These captured groups are accessible via the BASH_REMATCH array.
string="My user ID is 12345"
re='ID is ([0-9]+)' # Kept in a variable so the spaces and parentheses need no escaping
if [[ "$string" =~ $re ]]; then
    user_id="${BASH_REMATCH[1]}"
    echo "User ID: $user_id"
fi
([0-9]+) captures one or more digits, stored in ${BASH_REMATCH[1]}.
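A pattern can hold several capturing groups, numbered from left to right. The sketch below, which assumes a YYYY-MM-DD input and uses a made-up function name split_date, pulls three fields out of a single match. It checks only the shape of the input, not whether the date is valid.

```bash
#!/usr/bin/env bash

# Hypothetical helper: split a YYYY-MM-DD date into its parts.
# Prints "year month day" on success; returns non-zero if the shape is wrong.
split_date() {
    local re='^([0-9]{4})-([0-9]{2})-([0-9]{2})$'
    if [[ "$1" =~ $re ]]; then
        # BASH_REMATCH[0] holds the entire match; groups start at index 1.
        echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]} ${BASH_REMATCH[3]}"
    else
        return 1
    fi
}

read -r year month day < <(split_date "2024-10-26")
echo "Year: $year, month: $month, day: $day"
```

BASH_REMATCH is overwritten by every =~ test, so copy out any values you need before running another match.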
Wildcards (with *)
Wildcard matching doesn’t directly support subpattern extraction. Instead, you need string manipulation techniques after a basic match.
filename="my_report_2024-10-26.txt"
if [[ "$filename" == my_report_*.txt ]]; then # Unquoted so that * acts as a wildcard
    date="${filename%.*}" # Remove the '.txt' extension
    date="${date##*_}" # Remove everything up to and including the last '_'
    echo "Report date: $date"
fi
This example uses parameter expansion to achieve subpattern extraction, demonstrating a less elegant but effective approach for simpler scenarios.
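For reference, the four prefix and suffix removal forms of parameter expansion can be compared side by side. The path value below is an arbitrary sample chosen for illustration. A single # or % removes the shortest match of the glob pattern, and the doubled form removes the longest.

```bash
#!/usr/bin/env bash

path="/var/log/app/server.2024.log"   # sample value, assumed for illustration

echo "${path#*/}"    # shortest prefix matching */ removed: var/log/app/server.2024.log
echo "${path##*/}"   # longest prefix matching */ removed:  server.2024.log
echo "${path%.*}"    # shortest suffix matching .* removed: /var/log/app/server.2024
echo "${path%%.*}"   # longest suffix matching .* removed:  /var/log/app/server
```

None of these forms modify the variable itself; each expansion produces a new string, so the results can be chained through intermediate variables.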
Bash offers versatile pattern-matching capabilities. Choose the method that best suits your needs, regular expressions or wildcards, balancing power and simplicity.