R Markdown

#1

Find:\s\{2, }
Replace: ,

Explanation: This looks for two or more consecutive white spaces or tabs and replaces it with a comma

#2

Find: ([A-Za-z]+), \s*([A-Za-z]+), \s*(.*)
Replace: \2\ \1\ (\3)

Explanation: This captures the last name, first name, and institution from a comma-separated format, allowing for flexible spaces. The replacement reorders the first name and last name while placing the institution inside parentheses

#3

Find: (\d{4}.*?\.mp3)
Replace: \1\n
Find:(?<=\n)\s+
Replace:(nothing)

The first find and replace removes the space at the begining and the second find replace splits the lines

#4

Find:(\d{4})\s(.*\.mp3)
Replace:\2_\1

Explanation: This captures the 4-digit number and the space after the number and then the rest of the file name and then rearranges the captured parts to place the title and .mp3 first followed by the number at the end

#5

Find: (\w)(\w+),(\w+),(.+),(\d+)
Replace:\1_\3,\5

Explanation: This captures the first letter of the first word, captures the rest of the first word, captures the entire second word, captures everything in third column, and captures only digits from 4th column. It replaces the first letter of the first word, inserts and underscore and then inserts the entire second word, inserta a comma then inserts the fourth column

#6

Find: (\w)(\w+),(\w{4})(\w+),(.+),(\d+)
Replace:\1_\3,\6

Explanation: Captures the first letter of the first word, the rest of the first word, the first 4 letters of the second word, the rest of the second word, captures everything in third column, only digits from fourth column and replaces with the first letter of first word,inserts an underscore, insets penn, inserts a comma, and then the fourth column

#7

Find: ^(\w{3})\w+?,(\w{3})\w+?,([\d.]+),(\d+)
Replace:\1\2, \4, \3

Explanation: This captures the first 3 letters of the first word, the first 3 letters of the second word, the third column,the fourth column, and then combines first 3 letters of first and second words, inserts the last column and then inserts comma and space and third column

#8

Find: \bNA\b
Replace: 0

This fixes the pathogenbinary column so instead of 1 or 0, it is all numberic values, 0 or 1

Find: [-@!^+_%()\*#\$]+
Replace: nothing

This removes special characters from bombus_spp and host_plant columns

Find:\bworker|male\b\s?=\t
Replace: \1

This removes the spaces after worker or male