Problems to solve with sed regex

daniel.m.tripp · July 14, 2023, 10:43am

OK - I swear by “perl” rename (from CPAN : the default rename command in Debian based distros - VS the shonky piece of crap Red Hat foist on users)…

Bought a new digital download album today - a bunch of mostly doom/stoner bands covering the songs from Seattle band Soundgarden’s 1994 album “Superunknown”.

But I now have fields in filenames I want to reposition :

'Beastwars - Superunknown (Redux) - 10 The Day I Tried to Live.flac'
'Darkher - Superunknown (Redux) - 15 Like Suicide.flac'
'Dozer - Superunknown (Redux) - 14 Half.flac'
'Frayle - Superunknown (Redux) - 06 Head Down.flac'
'High Priest - Superunknown (Redux) - 02 My Wave.flac'
'Horseburner - Superunknown (Redux) - 08 Spoonman.flac'
'Jack Harlon & The Dead Crows - Superunknown (Redux) - 11 Kickstand.flac'
'Marc Urselli'\''s SteppenDoom - Superunknown (Redux) - 13 4th of July.flac'
'Marissa Nadler - Superunknown (Redux) - 03 Fell on Black Days.flac'
'Somnuri - Superunknown (Redux) - 04 Mailman.flac'
'Spotlights - Superunknown (Redux) - 07 Black Hole Sun.flac'
'The Age of Truth - Superunknown (Redux) - 12 Fresh Tendrils.flac'
'Ufomammut - Superunknown (Redux) - 01 Let Me Drown.flac'
'Valley of the Sun - Superunknown (Redux) - 05 Superunknown.flac'
'Witch Mountain - Superunknown (Redux) - 09 Limo Wreck.flac'

I want the track number as the first field - at the very least… Just for the files - I’d be happy (maybe) with everything after "(Redux) " and omit all the rest…

But FFS - I don’t know enough regex (does anyone?) to figure that out… I can get rid of " - Superunknown (Redux) - " But I don’t want that first field, “first” either - that first field is the artist or band - and that’s already in the metadata in the digital music file!

I can’t even sort it properly… WTF?

nevj · July 14, 2023, 12:11pm

@daniel.m.tripp ,
The number of fields per record is not consistent
The only consistent feature I can see is the ‘-’ signs.
Could we somehow get everything before the first - into a set of quotes,
and
everything after the second - into a set of quotes.
I would try with awk… count fields up to each minus then concatenate them and print the concatenated value. awk will quote it.

Have fun
Neville

daniel.m.tripp · July 14, 2023, 12:27pm

I might have to settle for removing everything up to and including "(Redux) - " even if it leaves the trailing space after the dash - I can get rid of that later…

But I’m not even sure how to accomplish that… and I’d rather not use awk - because perl/CPAN rename is way more “sed” than “awk” (I think they both use the same obscure regex)… i.e. I can easily - with rename 's/\ //' - replace only the first space, and deal with the rest later… I could even later on replace the resulting file - e.g. “09 Limo Wreck.flac” with “09-Limo Wreck” (and later on replace all spaces - looks like a prime case for CamelCase)…

I think dash/hyphen is going to have to be my field separator…

nevj · July 14, 2023, 12:41pm

Yep, that's the only option
Permute each minus into two quotes

daniel.m.tripp · July 14, 2023, 1:13pm

Okay - here’s a start :

perl-rename 's/^[^-]+-//' *.flac

And at least I end up with :

' Superunknown (Redux) - 01 Let Me Drown.flac'
' Superunknown (Redux) - 02 My Wave.flac'
' Superunknown (Redux) - 03 Fell on Black Days.flac'
' Superunknown (Redux) - 04 Mailman.flac'
' Superunknown (Redux) - 05 Superunknown.flac'
' Superunknown (Redux) - 06 Head Down.flac'
' Superunknown (Redux) - 07 Black Hole Sun.flac'
' Superunknown (Redux) - 08 Spoonman.flac'
' Superunknown (Redux) - 09 Limo Wreck.flac'
' Superunknown (Redux) - 10 The Day I Tried to Live.flac'
' Superunknown (Redux) - 11 Kickstand.flac'
' Superunknown (Redux) - 12 Fresh Tendrils.flac'
' Superunknown (Redux) - 13 4th of July.flac'
' Superunknown (Redux) - 14 Half.flac'
' Superunknown (Redux) - 15 Like Suicide.flac'

Which is a huge headstart - I can sort - and I have a common string I can escape / replace with some simpler regex…

(I made “perl-rename” a symlink to /usr/bin/prename - most Debian based systems install prename, and create an alias “alias rename=prename” - I’m sure most Ubuntu versions shipped with prename pre-installed - but apparently not with Ubuntu 23.04 - had to install it - no drama - but an alarming sign of things to come - like needing “sudo” to run dmesg, and not installing f–cking cal/ncal by default)…
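
For anyone following along, here is a minimal sketch of what that first expression does, run through sed -E on a single string rather than on real files. The helper name strip_first_field is made up for illustration; the regex is the one from the post above.

```shell
# Hypothetical helper: apply the regex to a string instead of renaming a file.
#   ^[^-]+  one or more non-dash characters from the start (the artist field)
#   -       the first dash itself
strip_first_field() {
  printf '%s\n' "$1" | sed -E 's/^[^-]+-//'
}

strip_first_field 'Dozer - Superunknown (Redux) - 14 Half.flac'
# prints the name with the artist gone but the leading space still there:
#  Superunknown (Redux) - 14 Half.flac
```

One caveat with this approach: an artist name that itself contains a dash would be cut at that dash, not at the field separator.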

daniel.m.tripp · July 14, 2023, 1:56pm

My biggest problem now is that regex uses all sorts of brackets - all of them AFAIK : “([{” (and the other side) and I’ve had problems with sed before if a string contained any sort of bracket character… If you want to get rid of any of them - it makes things more complex than they should perhaps, be…
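
A small sketch of the bracket issue, using only the parentheses from the filenames in this thread. The part that tends to bite is that plain sed (basic regex) and sed -E or Perl (extended regex) treat ( and ) in opposite ways. The variable names here are just for illustration.

```shell
name='Dozer - Superunknown (Redux) - 14 Half.flac'

# Basic regex (plain sed): ( and ) are ordinary characters, no escaping needed.
bre=$(printf '%s\n' "$name" | sed 's/ (Redux)//')

# Extended regex (sed -E), same convention as Perl and therefore prename:
# ( and ) form a group, so a backslash is needed to match them literally.
ere=$(printf '%s\n' "$name" | sed -E 's/ \(Redux\)//')

printf '%s\n' "$bre" "$ere"
# both lines print: Dozer - Superunknown - 14 Half.flac
```

In Perl-style expressions, a backslash in front of (, [ or { makes that character literal, so when in doubt with prename, escaping the bracket is the safe choice.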

don.karon · July 14, 2023, 2:26pm

Maybe this will work:

sort -k6 -o output.txt input.txt

where

-k6 means sort using the sixth field (which is the track number)
-o means send output to a file rather than standard output.
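
One thing to watch with -k6: sort splits on whitespace by default, so the track number is the sixth field only when the artist is a single word ("Dozer"), and the seventh for "Witch Mountain". A possible variation, sketched here on three of the filenames from the first post, is to split on the dash instead so the third field always starts with the track number. LC_ALL=C is only there to make the ordering predictable.

```shell
# Three filenames from the first post, deliberately out of order.
sorted=$(printf '%s\n' \
  'Witch Mountain - Superunknown (Redux) - 09 Limo Wreck.flac' \
  'Dozer - Superunknown (Redux) - 14 Half.flac' \
  'Ufomammut - Superunknown (Redux) - 01 Let Me Drown.flac' |
  LC_ALL=C sort -t- -k3)   # -t- : dash is the field separator, -k3 : key starts at field 3

printf '%s\n' "$sorted"
# order printed: Ufomammut (01), Witch Mountain (09), Dozer (14)
```

This only sorts a listing; it does not rename anything, and it assumes no artist name contains a dash.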

pdecker · July 14, 2023, 2:41pm

Maybe you could find a Python library to read the information from the metadata and rename the file that way.

daniel.m.tripp · July 15, 2023, 12:19am

This did it :
rename 's/\ Superunknown\ \(Redux\)\ -\ //' *

then
rename 's/\ /-/' *
replace first space " " with a dash

then
rename 's/\ //g' *
then subsequently remove ALL (/g) spaces

And got this :

01-LetMeDrown.flac
02-MyWave.flac
03-FellonBlackDays.flac
04-Mailman.flac
05-Superunknown.flac
06-HeadDown.flac
07-BlackHoleSun.flac
08-Spoonman.flac
09-LimoWreck.flac
10-TheDayITriedtoLive.flac
11-Kickstand.flac
12-FreshTendrils.flac
13-4thofJuly.flac
14-Half.flac
15-LikeSuicide.flac
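
The steps above can also be chained in one go. This is only a sketch that runs the expressions through sed -E on a string, so it is safe to try without touching any files; new_name is a made-up helper, and the first expression folds the earlier artist-stripping step and the album-stripping step into one.

```shell
# Hypothetical helper: print the name a file would end up with.
new_name() {
  printf '%s\n' "$1" | sed -E \
    -e 's/^.* - Superunknown \(Redux\) - //' \
    -e 's/ /-/' \
    -e 's/ //g'
  # 1st -e: drop artist and album, 2nd: first space becomes a dash, 3rd: remove remaining spaces
}

new_name 'Witch Mountain - Superunknown (Redux) - 09 Limo Wreck.flac'
# prints: 09-LimoWreck.flac
```

Before pointing the same expressions at real files with prename, its -n option (if your version has it) shows what would be renamed without actually doing it.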

daniel.m.tripp · September 19, 2023, 11:50am

Ran into an issue over the weekend where I had a bunch of folders with the year name as the last field… Easy enough to identify using awk '{print $NF}' - problem I ran into was it wasn’t easy to do in sed (and thus perl rename “prename”) regex and I gave up and did it manually…

nevj · September 19, 2023, 12:41pm

Would *$ match the last field in sed?
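
A possible answer, as a sketch: the star needs something in front of it to repeat, so a pattern like [^ ]*$ (a run of non-space characters up to the end of the line) is closer to what awk's $NF gives. The folder name below is invented for illustration, since the real ones are not shown in the thread.

```shell
# Hypothetical folder name with the year as the last space-separated field.
dir='Soundgarden Superunknown 1994'

# Rough equivalent of awk '{print $NF}': delete everything up to the last space.
last=$(printf '%s\n' "$dir" | sed 's/.* //')

# Move a trailing four-digit year to the front (extended regex, as in Perl).
moved=$(printf '%s\n' "$dir" | sed -E 's/^(.*) ([0-9]{4})$/\2 \1/')

printf '%s\n' "$last" "$moved"
# prints:
# 1994
# 1994 Soundgarden Superunknown
```

Both expressions rely on .* being greedy, so it swallows everything up to the last space on the line.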

Clivegg12 · September 30, 2023, 4:55pm

I have only just read this thread so late in replying. I copied the sample text into file ‘DanTripp.txt’, then assuming all text lines have 3 fields separated by ‘-’ as in Dan’s original post then
cut -d- -f3 DanTripp.txt | sed 's/^ //' | sed 's/ /-/' | sort
produces the output shown in the thread.
I hope this is of interest since it uses ‘cut’ which is different from solutions shown.
Cheers
Clive

daniel.m.tripp · October 1, 2023, 12:43am

Interesting - I actually used cut before I ever used sed or awk…