Day 15: Reliably modify and replace text with the least amount of effort
To guarantee your analysis is streamlined and repeatable, get acquainted with the function regexprep.
On Day 13 and Day 14, we used regular expressions to locate text. In today’s post, we use regular expressions within the function regexprep, which is the key to making precise and effective text substitutions.
Here are a few simple use cases for regexprep:
Removing characters from filenames. In the interest of consistency, it’s great if you can take a set of data with a filename and turn that filename into a figure title. Here’s how you might do it.
% Let's suppose we have a file with some data %
file = 'C:\Files\Kaleemah\Mouse1\Images\Neuron_10_May_12_Dorsal.tif'
% We would load the file and show the image %
Now you want to create a title but you don’t want all that extra stuff. Here’s how to do it in two steps. First, make sure you feel comfortable with the topics I cover in Day 13 and Day 14. Next, we’ll just get the filename:
% Extract just the filename (the dot is escaped so it matches a literal '.') %
file = regexp( file, '(Neuron.*)(?=\.tif)', 'match')
file = file{1}
Aside: Yes, there’s a more general way of doing this that does not depend on the filename starting with ‘Neuron’. I have not covered it yet because, as I’ve said, regular expressions can become very complex; if you have the time and patience, I suggest spending a few weeks on them. Here is the alternative approach:
file = 'C:\Files\Kaleemah\Mouse1\Images\Neuron_10_May_12_Dorsal.tif'
file = regexp( file, '([^\\]+$)', 'match')
file = regexp( file, '(.*)(?=\.tif)', 'match')
% To recover the output as a string you'll need to use file{1}{1} %
In either case, at this point, you should have a variable file which is a string containing: ‘Neuron_10_May_12_Dorsal’
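As a cross-check on the two approaches above, here is a minimal sketch that gets the same result in a single call, assuming the file always ends in .tif. It relies on the 'once' option of regexp, which returns the match as a character vector rather than a cell array, so no {1} indexing is needed. The variable name name is my own and does not appear elsewhere in this post.

```matlab
file = 'C:\Files\Kaleemah\Mouse1\Images\Neuron_10_May_12_Dorsal.tif';

% [^\\]+      one or more characters that are not a backslash
% (?=\.tif$)  lookahead: a literal '.tif' must follow, at the end of the text
% 'once'      return the first match as a char vector instead of a cell array
name = regexp( file, '[^\\]+(?=\.tif$)', 'match', 'once' );
```

Because the pattern only excludes backslashes, it works for any folder depth and any filename, not just ones that start with ‘Neuron’.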
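Now if you try to title a plot with this string, whether through the axes or the title function, you will run into this issue: MATLAB’s default TeX interpreter treats each underscore as a subscript marker, so the character after every underscore is rendered as a subscript and the underscore itself is not shown.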
Here’s how to use regexprep to get around this:
file = regexprep( file, '_', ' ' )
If you then title the plot, you will see the text rendered without the subscripts.
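To make this concrete, here is a small sketch of the substitution followed by the title call. The variable name label and the dummy plot are placeholders of mine, and the figure is created hidden only so the snippet can run unattended. The last line shows an alternative that I am adding for comparison: the 'Interpreter' property of title can be set to 'none' if you would rather keep the underscores.

```matlab
file = 'Neuron_10_May_12_Dorsal';

% Swap every underscore for a space
label = regexprep( file, '_', ' ' );   % 'Neuron 10 May 12 Dorsal'

figure( 'Visible', 'off' );   % hidden figure, for illustration only
plot( 1:10 );                 % placeholder data
title( label );               % no underscores left to become subscripts

% Alternative: keep the underscores and turn off TeX interpretation
title( file, 'Interpreter', 'none' );
```

Which one to use is a matter of taste: the regexprep route gives a cleaner-looking title, while the interpreter route keeps the title identical to the filename.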
Another use case: loading, renaming, and saving files. If you are interested in reliable and repeatable analysis that won’t cause headaches for you or your collaborators later on, then picking files by hand in a file-selection dialog is your worst nightmare.
Accessing files through a graphical interface is not only tedious and manual, but the worst thing about it is that, in most cases, it leaves no trace of where the file you’re opening is located.
The alternative is to have written filenames which are kept in your analysis scripts or functions, so that the next person who goes to repeat your analysis can see exactly where the files are located.
Here’s what that could look like:
files{1} = 'C:\Files\Kaleemah\Mouse1\Images\Neuron_10_May_12_Dorsal.tif';
This would be the location of the loaded file. Next, suppose you want to read in the image and do some analysis of it.
% Suppose you want to load the images (For illustrative purposes only!) %
% image{1} = imread( files{1} );
% image{1} = some_analysis_fxn( image{1} );
You could write the output image to the original location (files{1}), but then you’d overwrite your original image. Here’s how you could replace the folder location and add ‘_modified’ to the filename:
% First change the folder location. A backslash is the escape character in regular expressions, so each literal backslash is written as \\ %
files_adjusted{1} = regexprep( files{1}, 'C:\\Files\\Kaleemah\\Mouse1\\Images\\', 'C:\\Files\\Kaleemah\\Mouse1\\Images\\Adjusted\\' )
% Next, change the file name (the dot is escaped and $ anchors the match to the end) %
files_adjusted{1} = regexprep( files_adjusted{1}, '\.tif$', '_modified.tif' )
The new filename should be:
‘C:\Files\Kaleemah\Mouse1\Images\Adjusted\Neuron_10_May_12_Dorsal_modified.tif’
You can now save your modified image to the new location contained in the variable files_adjusted{1}.
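One thing to watch with Windows paths: the backslash is special in both the pattern and the replacement arguments of regexprep, so a folder path has to be escaped before it is used in either. A way to avoid doing that by hand, sketched below, is to pass each path through regexptranslate with the 'escape' option. The variable names old_folder, new_folder, old_pat, and new_rep are mine, and the leading '^' is an extra safeguard I added so that only the start of the path can match.

```matlab
files{1} = 'C:\Files\Kaleemah\Mouse1\Images\Neuron_10_May_12_Dorsal.tif';

old_folder = 'C:\Files\Kaleemah\Mouse1\Images\';
new_folder = 'C:\Files\Kaleemah\Mouse1\Images\Adjusted\';

% Escape special characters so both paths are treated as literal text
old_pat = [ '^' regexptranslate( 'escape', old_folder ) ];
new_rep = regexptranslate( 'escape', new_folder );

% Swap the folder, then tag the filename
files_adjusted{1} = regexprep( files{1}, old_pat, new_rep );
files_adjusted{1} = regexprep( files_adjusted{1}, '\.tif$', '_modified.tif' );
```

With the paths held in variables, changing the input or output folder later means editing one line rather than hunting through a pattern.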