Using R for Regular Expressions *RegEx
Meetup Description
R-Ladies Gaborone joined forces with R-Ladies Cologne on foostodon to co-host an event on Using R for Regular Expressions [*RegEx] on Saturday, April 09, 2022, at 6 PM CET/CAT.
Guest speaker Pavitra Chakravarty guided us through working with regular expressions in R using the stringr and stringi packages, essential and useful skills for programming 🚀. Pavitra Chakravarty is a data engineer with a PhD in Cancer Nanotechnology and a member of R-Ladies Dallas.
Regular Expressions [*RegEx]
You can get more from the slides, code and notes in the Regular Expressions GitHub repo.
What are regular expressions?
A regular expression is a pattern that describes a specific set of strings with a common structure
Heavily used for string matching and replacing in most programming languages
The heart and soul of string operations
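As a small illustration of "a pattern that describes a set of strings", here is a minimal sketch using stringr. The file names and the pattern are made-up examples, not material from the talk.

```r
library(stringr)

# Hypothetical sample data (an assumption for illustration only)
files <- c("report_2021.csv", "report_2022.csv", "notes.txt")

# One pattern describes every string shaped like "report_<four digits>.csv"
pattern <- "report_[0-9]{4}\\.csv"

# Matching: which strings fit the pattern?
is_report <- str_detect(files, pattern)

# Replacing: swap the first run of four digits for a placeholder
masked <- str_replace(files, "[0-9]{4}", "YYYY")
```

Here `is_report` flags the two report files, and `masked` leaves `notes.txt` untouched because it contains no four-digit run.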
Regular expression syntax
6 basic canonical characteristics of regular expressions
basic pattern matching: using functions from the stringr package with an exact sequence of characters
str_detect(), str_subset(), str_view(), str_view_all()
anchors: indicate the start and end of the string
^: indicates the start of the string
$: indicates the end of the string
escape characters: special characters cannot be typed directly into a string
\: to find strings containing a single quote ', “escape” the single quote by preceding it with \
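The three characteristics above can be sketched in a few lines of stringr. The sample vectors are hypothetical and chosen only to make each behaviour visible; this is an illustration, not the code used in the session.

```r
library(stringr)

# Hypothetical sample data (an assumption for illustration only)
fruits <- c("apple", "pineapple", "banana", "it's an apple")

# 1. Basic pattern matching: an exact sequence of characters
has_apple  <- str_detect(fruits, "apple")   # one TRUE/FALSE per element
only_apple <- str_subset(fruits, "apple")   # keeps only the matching elements

# 2. Anchors: ^ pins the match to the start of the string, $ to the end
starts_with_apple <- str_subset(fruits, "^apple")
ends_with_apple   <- str_subset(fruits, "apple$")

# 3. Escape characters
# Inside a single-quoted R string, a single quote must be preceded by a backslash
with_quote <- str_subset(fruits, '\'')

# A regex metacharacter such as "." is escaped with a double backslash in R,
# because the backslash itself has to be escaped inside the string
numbers <- c("3.14", "314")
has_dot <- str_detect(numbers, "\\.")
```

In an interactive session, `str_view(fruits, "apple")` highlights where each match falls, which is handy for checking a pattern before using it. Recent stringr releases fold the behaviour of `str_view_all()` into `str_view()`.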
The tutorial followed this outline:
1. Canonical principle #1: Basic pattern-matching
2. Canonical principle #2: Anchors
3. Canonical principle #3: Escape characters
4. Canonical principle #4: Character Classes
5. Canonical principle #6: Character Clusters
Material has been borrowed heavily from:
The STAT 545 course, started by Jenny Bryan: https://stat545.stat.ubc.ca/notes/notes-b05/
More STAT 545 resources: https://stat545.com/character-vectors.html, https://youtu.be/I0dJ1zpxAtU
R for Data Science chapter on Strings: https://r4ds.had.co.nz/strings.html
Solution set for R4DS on Strings: https://brshallo.github.io/r4ds_solutions/14-strings.html#matching-patterns-w-regex
FUN TIME: Regex Puzzle Builder: https://regexcrossword.com/puzzlebuilder
In the talk Processing Textual Data with Python, part of the CODATA Connect Series on Research Skills Enhancement, Raphael Cobe shows how to use regular expressions to find patterns or remove useless data.