Rename – Perl regex program – Renaming Movies – example 1
###############################################

NOTE: these are not my movies. Just helping a friend out with renames, and thought it would be nice for an article on “rename”. This article is strictly educational.

NOTE: See summary at the bottom for the recap of the commands for folders containing movie files, and for folders containing movie folders.

NOTE: The “rename” tool I'm using is the Perl rename tool, sometimes called “prename”, “rename.pl”, or simply “rename”. (Sidenote: there is another, similar tool also called “rename”, from a package called util-linux, which isn't as good as the one we work with here. Google “rename util-linux”; I have more info at the provided link.) More information on Perl rename and the not-as-good util-linux rename: http://ram.kossboss.com/renameperl/


 

Good article to read on Perl regular expressions: http://perldoc.perl.org/perlre.html and another example: http://www.leancrew.com/all-this/2013/03/renaming-files/

Usually movies you download have names in this format:

Movie Name.Year.Garbage.extension

We want to remove the Garbage part, so that they look like this:

Movie Name.Year.extension

SIDENOTE: I will talk about renaming folders as well. They usually follow a similar pattern, except they lack the extension.


Making Rename.pl output prettier

So we use “rename”, which is a Perl program. It can be downloaded here: http://ram.kossboss.com/renameperl/

Then I made a small edit to its code so that, instead of describing each rename (pending or already done) with the words “rename as”, it prints “----------->”. That way it's easier to see what it's doing. Example:

It's much easier to see how much that helps when you have a huge list of renames.

The edit looks like this. First locate the rename program with “which rename”; for me it's at “/usr/bin/rename”. Then open it in an editor with “vi /usr/bin/rename” and go to line 65 (type 65, then press Shift+G).

Change line 65 from saying this:

To say this:

Note: there are some spaces or tabs leading up to the print statement; make sure to leave those there. They didn't get included when I pasted here.

Save and exit


So first I go into the directory where I have the movies (this folder must contain only movie files, no movie folders; the movie-folder situation is covered later), and I run this:

# rename -n 's/^(.*[0-9]{4}[\)\]\}]?)(.*)(\..*)$/$1$3/g' *

That's just a dry run, so nothing is actually renamed. The output is this:

# rename -n 's/^(.*[0-9]{4}[\)\]\}]?)(.*)(\..*)$/$1$3/g' *

The above translations look good, so I run it again without the -n, which performs the renames for real. I can also add -v if I want to see the same output as above; without -v the output is empty.

# rename 's/^(.*[0-9]{4}[\)\]\}]?)(.*)(\..*)$/$1$3/g' *

So what does this mean:

# rename -n 's/^(.*[0-9]{4}[\)\]\}]?)(.*)(\..*)$/$1$3/g' *

First off a movie is named like this:

Movie Name.Year.Garbage.extension

But we want

Movie Name.Year.extension

Note: the dot can be a dot, a space, or a dash; it doesn't matter. I'm just trying to show the separation.

So let's save each section into a variable:

$1=Movie Name.Year
$2=Garbage
$3=extension

I lumped the movie name and the year together as $1, since they are usually right next to each other.

Then I tell it to work on all files with * (that's outside the Perl regex; the shell expands it to the list of names), no matter their extension or anything.

So let's look into this Perl regex

's/^(.*[0-9]{4}[\)\]\}]?)(.*)(\..*)$/$1$3/g'

s/oldstuff/newstuff/g means replace oldstuff with newstuff (s means substitute, i.e. search & replace). The trailing g means "replace every match"; since this pattern is anchored to the whole name it can only match once, so the g is harmless here.

For us oldstuff is ^(.*[0-9]{4}[\)\]\}]?)(.*)(\..*)$ and newstuff is what we want the name to look like, which is simply $1$3 (variable $1 followed by variable $3).

Each variable is defined by a pair of parentheses. These are not escaped; with regular “sed” you would need to escape the parentheses, and then recall the variables with \1\3 instead of $1$3.

(.*[0-9]{4}[\)\]\}]?) is saved as variable $1 because it's the first to appear (it's the Movie Name.Year)
(.*) is saved as $2 because it's the second to appear (it's the garbage)
(\..*) is saved as $3 because it's the third to appear (it's the extension)

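Curious note: what would this look like with sed instead of Perl regex?
's/^\(.*[0-9]\{4\}[])}]\{0,1\}\)\(.*\)\(\..*\)$/\1\3/'
With sed's default (basic) regular expressions you put a \ before the ( and ) that define the variables, and also before the { and } of the repeat count. The ? has to be written as \{0,1\}, and in the newstuff section you change the $ to \ when recalling the variables. Also, inside a sed bracket expression the backslash is not an escape, so the closing brackets are written as [])}] with the ] first. With sed -E (extended regular expressions) the parentheses, {4} and ? can stay as they are in the Perl version.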
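Here is a minimal sketch of the sed side, using sed -E so that the parentheses, {4} and ? need no backslashes. The strip_garbage_sed helper and the sample names are hypothetical. The one real difference from the Perl version is the bracket expression: POSIX does not treat \ as an escape inside [...], so the closing brackets are written [])}] with ] first.

```shell
#!/bin/sh
# strip_garbage_sed NAME: the same substitution expressed with sed -E.
# Backreferences are \1 and \3 instead of $1 and $3.
# (Hypothetical helper; names are passed as text, no files are touched.)
strip_garbage_sed() {
    printf '%s\n' "$1" | sed -E 's/^(.*[0-9]{4}[])}]?)(.*)(\..*)$/\1\3/'
}

strip_garbage_sed 'Some Movie.2010.720p.x264.mkv'   # Some Movie.2010.mkv
strip_garbage_sed 'Other Film (1999) DVDRip.avi'    # Other Film (1999).avi
```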

^ matches the start of the string and $ matches the end. rename applies the expression to each file name on its own, so here ^ and $ are the start and the end of the file name, and the pattern has to account for the whole name.

Our first variable is saved like this

(.*[0-9]{4}[\)\]\}]?)

Now that looks complicated.
Originally it was like this:

(.*[0-9]{4})

So the whole expression looked like this:
's/^(.*[0-9]{4})(.*)(\..*)$/$1$3/g'

But then we would get renames like this:

Notice it's missing the ) at the end. So I added [\)\]\}]?
which should be read like this: [ )]} ]?
It matches one occurrence of ) or ] or } after [0-9]{4} (which matches the year, by matching exactly 4 digits in a row).
Inside a character class only the ] strictly needs a \ in front of it; I escape ) and } as well, which does no harm.
So the whole thing looks like this:
[\)\]\}]?
The ? means match 0 or 1 times, because sometimes there are none of those.

So now you can see that (.*[0-9]{4}[\)\]\}]?) matches the Movie Name.Year.
The .* matches any run of characters other than a newline, so it will match any movie name, with spaces, dots, or whatever.
After the .* it matches the year (4 digits).
Then after the year it looks for a single optional closing surround character: ) or ] or } (the question mark, ?, means 0 or 1, hence “single optional”).

Note: of course I could have allowed for other surround characters like <year> or -year-, but no one really uses those.

So after the year we want to match the garbage, and we know it can look like anything, so we say this:
(.*)

Note: if we just left the regex like this:
rename -n 's/^(.*[0-9]{4})(.*)$/$1/g' *
that is, without (\..*) in the oldstuff section of the search & replace and without $3 in the newstuff section,
we would get things like this:

So it's missing the extension. So let's save the extension as well. We know the extension starts with a dot followed by a number of characters (usually 3, but we will allow any number of characters).

Since a plain dot matches any character, we need to escape it to match a literal dot:

\. matches a dot
. matches any character
* matches the previous thing any number of times (including zero)

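Combine
\..* matches a dot followed by anything up to the end of the name.

In here
sadfkajsdfaskdhfas.dfa.sdf.asdfasdfasdf.asdf.123
in our full expression it would match that .123 at the end. On its own \..* would start at the first dot, but the greedy (.*) in front of it swallows as much as it can and leaves only the last dot and what follows.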

But we want that in a variable too, so we write (\..*); now it's saved into variable $3, as it's the third set of parentheses to appear.

Finally we get to say how we want the new name to look.

$1$3 would look good: it's the Movie Name.Year followed by the .extension

So how do I handle folders which have movie folders? See Summary.

Summary!
