Sunday, December 9, 2018

Regular Expression in Swift 3

The string in the example below matches the pattern:


import Foundation

// Returns true if the pattern matches anywhere in the source string.
func match(source: String, pattern: String) throws -> Bool {
    let regex = try NSRegularExpression(pattern: pattern, options: [])
    // NSRange is measured in UTF-16 code units, so the length is utf16.count.
    let range = NSMakeRange(0, source.utf16.count)
    return regex.firstMatch(in: source, options: [], range: range) != nil
}

let str = "5.14152834920+3.14120395872="
let pattern = "5.14[0-9]*\\+{1}3.14[0-9]*\\={1}"
let isMatch = try match(source: str, pattern: pattern)
print("They matched \(isMatch)") // They matched true


Patterns


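- [0-9]* means zero or more digits.
- \\+{1} means one plus sign. In a Swift string literal, \\ produces a single backslash, which escapes the plus sign in the regular expression. The {1} quantifier is redundant, because an unquantified item already matches exactly once.
- \\={1} means one equals sign. The equals sign has no special meaning in this position, so neither the backslash nor {1} is required.
- The unescaped . in 5.14 and 3.14 matches any character, not only a period. Write \\. to match a literal period.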
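To make these points concrete, here is a minimal sketch that compares the pattern from the example with a tighter variant. The helper name matches(_:_:) and the constants loose, strict, good, and bad are made up for this illustration; only NSRegularExpression and firstMatch(in:options:range:) come from Foundation.

```swift
import Foundation

// Hypothetical helper: true if the pattern matches anywhere in the source.
func matches(_ source: String, _ pattern: String) throws -> Bool {
    let regex = try NSRegularExpression(pattern: pattern, options: [])
    let range = NSMakeRange(0, source.utf16.count)
    return regex.firstMatch(in: source, options: [], range: range) != nil
}

// The pattern from the example: the dots are unescaped and {1} is redundant.
let loose = "5.14[0-9]*\\+{1}3.14[0-9]*\\={1}"
// A tighter variant: literal periods, no {1}, no backslash before the equals sign.
let strict = "5\\.14[0-9]*\\+3\\.14[0-9]*="

let good = "5.14152834920+3.14120395872="
let bad = "5x14152834920+3y14120395872="

let looseGood = try matches(good, loose)    // true
let looseBad = try matches(bad, loose)      // true, because . matches any character
let strictGood = try matches(good, strict)  // true
let strictBad = try matches(bad, strict)    // false

print(looseGood, looseBad, strictGood, strictBad)
```

Both patterns accept the original string, but only the tighter one rejects input in which the periods have been replaced by other characters.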

Unicode


The firstMatch() method takes its range as an NSRange, which is measured in NSString terms, but the String's count property does not always return the same value as the NSString's length property. The count property is based on extended grapheme clusters, in which one or more Unicode scalars combine into a single character.

  let regionalIndicatorForUS: Character = "\u{1F1FA}\u{1F1F8}"
  // regionalIndicatorForUS is 🇺🇸

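NSString's length property, by contrast, counts 16-bit (UTF-16) code units. A Unicode scalar in the Basic Multilingual Plane takes one code unit; any other scalar takes two (a surrogate pair). In the case above, both scalars lie outside the Basic Multilingual Plane, so the String's count is 1 while the NSString's length is 4.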

Therefore, to make the firstMatch() method search from the start to the end of the string, the range length must come from the utf16.count property instead of the count property.
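The difference can be checked with a short sketch. The helper containsEnd(_:length:) and the sample string are invented for this illustration. The NSRange(_:in:) initializer used at the end is available from Swift 4 onward, so it is an assumption beyond the Swift 3 setting of this post.

```swift
import Foundation

let flag = "\u{1F1FA}\u{1F1F8}" // 🇺🇸

let characterCount = flag.count              // 1 extended grapheme cluster
let scalarCount = flag.unicodeScalars.count  // 2 Unicode scalars
let utf16Count = flag.utf16.count            // 4 UTF-16 code units
let nsLength = NSString(string: flag).length // 4, same as utf16.count

// Hypothetical helper: searches for "end" in the first `length` UTF-16 units.
func containsEnd(_ source: String, length: Int) throws -> Bool {
    let regex = try NSRegularExpression(pattern: "end", options: [])
    let range = NSRange(location: 0, length: length)
    return regex.firstMatch(in: source, options: [], range: range) != nil
}

let text = flag + " end"

// count is 5, so the range stops before "end" and the match is missed.
let foundWithCount = try containsEnd(text, length: text.count)
// utf16.count is 8, so the range covers the whole string.
let foundWithUTF16 = try containsEnd(text, length: text.utf16.count)

// Swift 4 and later: build the full range directly from String indices.
let fullRange = NSRange(text.startIndex..., in: text)

print(characterCount, scalarCount, utf16Count, nsLength)
print(foundWithCount, foundWithUTF16, fullRange.length)
```

With the count property the search range ends too early and the match at the end of the string is lost, whereas utf16.count and NSRange(_:in:) both cover all 8 code units.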


Reference
https://docs.swift.org/swift-book/LanguageGuide/StringsAndCharacters.html
