Using R for Regular Expressions [*RegEx]

R
Natural Language Processing
Regular Expression
Demonstration
Published

April 9, 2022

Meetup Description

R-Ladies Gaborone joined forces with R-Ladies Cologne on foostodon to co-host an event on Using R for Regular Expressions [*RegEx] on Saturday, April 9, 2022, at 6 PM CET/CAT.
Guest speaker Pavitra Chakravarty guided us through learning R using the stringr and stringi packages, essential and useful skills for programming 🚀. Pavitra Chakravarty is a data engineer with a PhD in Cancer Nanotechnology and a member of R-Ladies Dallas.

Note

You can get more from the slides, code, and notes in the Regular Expressions GitHub repo.

What are regular expressions?

  • A regular expression is a pattern that describes a specific set of strings with a common structure

  • Heavily used for string matching and replacing in most programming languages

  • Heart and soul for string operations
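To make the idea of "a pattern that describes a set of strings" concrete, here is a minimal sketch of matching and replacing with stringr. The input vector and the pattern are made up for illustration and are not taken from the talk.

```r
library(stringr)

# Hypothetical input: some strings share a common structure
# (three digits, a hyphen, four digits), one does not.
phones <- c("call 555-1234", "no number here", "fax 555-9876")

# One pattern describes every string with that structure.
pattern <- "[0-9]{3}-[0-9]{4}"

# Matching: which elements contain the pattern?
has_number <- str_detect(phones, pattern)

# Replacing: mask the first match in each string.
masked <- str_replace(phones, pattern, "XXX-XXXX")

print(has_number)
print(masked)
```

Strings that do not match the pattern pass through `str_replace()` unchanged, so the same call can be applied safely to a mixed vector.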

Regular expression syntax

6 basic canonical characteristics of regular expressions

  • basic pattern matching: Using functions from stringr package with exact sequence of characters

    • str_detect(), str_subset(), str_view(), str_view_all()
  • anchors: Indicate the start and end of a string

    • ^: indicates the start of a string, $: indicates the end of a string
  • escape characters: some special characters cannot be typed directly in a string

    • \: to find strings containing a single quote ', “escape” the single quote by preceding it with \
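The three ideas above can be sketched in a few lines of stringr. The example data is hypothetical and not from the talk. The last example (escaping a literal dot) goes slightly beyond the single-quote case described above and is included as an assumption about what "special characters" covers.

```r
library(stringr)

# Hypothetical example data.
fruits <- c("apple", "pineapple", "banana", "it's ripe")

# Basic pattern matching: the exact character sequence "apple".
contains_apple <- str_detect(fruits, "apple")   # one logical per string
apple_strings  <- str_subset(fruits, "apple")   # only the matching strings

# Anchors: ^ ties the match to the start of the string, $ to the end.
starts_with_apple <- str_subset(fruits, "^apple")
ends_with_a       <- str_subset(fruits, "a$")

# Escape characters: inside a single-quoted R string, a literal
# single quote is written as \'.
with_quote <- str_subset(fruits, 'it\'s')

# A regex metacharacter such as . is escaped with a backslash, which
# itself has to be doubled inside an R string.
prices   <- c("3.50", "350")
with_dot <- str_subset(prices, "\\.")
```

`str_view()` can be used with the same patterns to highlight the matches instead of returning them. In recent stringr releases `str_view_all()` is deprecated in favor of `str_view()`, so it is left out of this sketch.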

The tutorial followed this outline:

1. Canonical principle #1: Basic pattern-matching

2. Canonical principle #2: Anchors

3. Canonical principle #3: Escape characters

4. Canonical principle #4: Character Classes

5. Canonical principle #6: Character Clusters
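Character classes (principle #4) are not described further on this page, so the following is only a rough sketch of what they usually look like in stringr, using made-up data rather than the speaker's examples.

```r
library(stringr)

# Hypothetical example data.
words <- c("gray", "grey", "groy", "gr3y")

# A bracketed class matches any one of the listed characters.
gray_or_grey <- str_subset(words, "gr[ae]y")

# A range inside brackets, here any single digit.
has_digit <- str_detect(words, "[0-9]")

# A POSIX-style named class goes inside an outer pair of brackets;
# with anchors and + this keeps strings made only of letters.
all_letters <- str_subset(words, "^[[:alpha:]]+$")
```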

Material has been borrowed heavily from

Contact Speaker
