curious onloooker: December 2013

Question:

I have a large archive of recordings on a well-used mythbox and I know I have accumulated a lot of duplicates. I need to free some space, but going through and manually getting rid of the duplicates would be a monumental task.

Is there any mechanism or script available to track down these duplicates, and preferably delete all but the newest recording? I've searched the web and poked at my interface with no luck.

Answer:

You're welcome to use the script I wrote; I've been running it regularly to remove the duplicates.

$ wget http://evuraan.info/evuraan/stuff/myth-remove-duplicates.sh.txt -O myth-remove-duplicates.sh
$ chmod +x myth-remove-duplicates.sh

Edit the script to replace RECORDINGDIR, user_name and pass_word with the values for your setup, then run myth-remove-duplicates.sh.
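
If you'd rather not open an editor, the three values can also be set from the command line. This is only a rough sketch: set_script_config is a hypothetical helper (it is not part of the script), the example values are made up, and it assumes GNU sed and values that contain no `|`, `&` or backslash characters.

```shell
# Hypothetical helper: rewrite the three assignment lines in a copy of the script.
# Arguments: script path, recordings directory, MySQL user, MySQL password.
set_script_config() {
    local script="$1" dir="$2" user="$3" pass="$4"
    # Each expression replaces a whole "NAME=..." line with the new quoted value.
    sed -i \
        -e "s|^RECORDINGDIR=.*|RECORDINGDIR=\"$dir\"|" \
        -e "s|^user_name=.*|user_name=\"$user\"|" \
        -e "s|^pass_word=.*|pass_word=\"$pass\"|" \
        "$script"
}
```

For example, `set_script_config myth-remove-duplicates.sh /srv/recordings myth s3cret` would point the script at /srv/recordings. Check the result with `head -n 15 myth-remove-duplicates.sh` before running anything.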

Running it:

$ ./myth-remove-duplicates.sh
1091_20131221090100.mpg is duplicate
removed `/var/lib/mythtv/recordings/1091_20131221090100.mpg'
1091_20131221070100.mpg is duplicate
removed `/var/lib/mythtv/recordings/1091_20131221070100.mpg'

Here's the script, myth-remove-duplicates.sh:

#!/bin/bash 
# Authored by Evuraan_AT_gmail_DOT_com
# ABSOLUTELY NO WARRANTY, to the extent permitted by
# applicable law.
# YMMV.
# Use at your own risk.

user_name="mythtv"
pass_word="yourpassword" 
RECORDINGDIR="/var/lib/mythtv/recordings"
list="/tmp/recordings.txt-$RANDOM-$RANDOM"
SQLSCRIPT="/tmp/recordings.sql-txt-$RANDOM-$RANDOM"

gen_lists(){
# Dump starttime, basename, title and description, newest recording first.
mysql -u "$user_name" -p"$pass_word" -e "select starttime,basename,title,description from recorded order by starttime" mythconverg | tac > "$list"
}

verify_duplicate(){
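# Succeeds if the title/description text repeats verbatim: compares the
# md5sum of that text on the matching lines after the first with the
# md5sum of the same text on the last matching line.
sum_a=$(grep -E "${a:0:16}" "$list" | awk -F".mpg\t" 'NR>1 {print $NF}' | md5sum)
[ -n "$sum_a" ] && grep -E "${a:0:16}" "$list" | awk -F".mpg\t" 'END {print $NF}' | md5sum | grep -qF "${sum_a%% *}"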
}

remove_duplicates(){
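# Remove the recording file if it exists and is non-empty, and write the
# matching DELETE statement for the recorded table.
[ -s "${RECORDINGDIR}/${a}" ] && ( rm -v "${RECORDINGDIR}/${a}" ;
	echo "DELETE FROM mythconverg.recorded WHERE basename = '$a';" > "$SQLSCRIPT"
	)
# Run the DELETE against the database.
[ -s "$SQLSCRIPT" ] && mysql -u "$user_name" -p"$pass_word" mythconverg < "$SQLSCRIPT"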
}




gen_lists
# A basename is a candidate when its first 16 characters occur on at least
# two lines of the list.
awk '{print $3}' "$list" | grep 'mpg$' | while read -r a ; do
	[[ $(grep -Ec "${a:0:16}" "$list") -ge 2 ]] && verify_duplicate && sed -i "/${a:0:16}/d" "$list" && echo "$a is duplicate" && remove_duplicates
done

rm -f "$SQLSCRIPT" "$list"
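
Since the script deletes files, it may be worth previewing which recordings share a prefix before running it. The sketch below only mirrors the first check in the main loop (at least two basenames sharing their first 16 characters). It does not reproduce verify_duplicate, and it anchors the match at the start of the name, which the script's grep does not. list_prefix_candidates is a hypothetical name, and nothing is deleted.

```shell
# Dry run: read basenames on stdin and print each 16-character prefix that
# is shared by two or more .mpg names, followed by those names.
list_prefix_candidates() {
    awk '
        /mpg$/ {
            prefix = substr($0, 1, 16)          # same span as ${a:0:16}
            count[prefix]++
            names[prefix] = names[prefix] " " $0
        }
        END {
            for (p in count)
                if (count[p] >= 2)
                    print p ":" names[p]
        }
    ' | sort
}
```

One way to try it, assuming the default recordings directory from the script: `ls /var/lib/mythtv/recordings | list_prefix_candidates`.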

Sunday, December 22, 2013

mythtv - Auto-delete duplicate recordings with myth-remove-duplicates.sh
