One of my side projects has been to transcode my entire media library from my preferred format (Matroska or MKV) to MP4. As much as I don't like it (MP4 doesn't support multiple soft-subtitles in the same file), the Roku doesn't support anything but MP4 files.
I have a pretty good number of video files, mostly because I break individual episodes out (so 11-22 per season) and I'm fond of anime (Bleach and Naruto are both over a hundred episodes), but I thought I was doing pretty well on the transcoding. However, since "I think" isn't really useful, I wrote up a fairly decent Bash script that tells me how many files I have of each type and lets me break the counts down by directory.
This is mainly to see where I am in the transcode process but also to figure out which directory or sub-directory I need to convert next. When run, I get output like this:
$ ./convert-status A-H/?
      Count  MP4  MKV  AVI  MOV  MPG
      ----- ---- ---- ---- ---- ----
A-H/A   250   56  133   61    0    0
A-H/B   731  443   38  250    0    0
A-H/C   305  188   81   36    0    0
A-H/D   224  157   59    8    0    0
A-H/E   305  258   27    8    0   12
A-H/F   589  325  123  141    0    0
A-H/G   435    5  183  247    0    0
A-H/H   659   12   61  586    0    0
      ----- ---- ---- ---- ---- ----
       3498 1444  705 1337    0   12
$
From the output above, you can see that I've gotten about 1.4k files converted into the proper format but still have quite a few left to convert. Most of these are episodes (Bleach and Hercules) but at least it gives me a sense of progress.
The Bash script itself looks like this:
#!/bin/bash

# Wipe out our temporary file, if we have one. This isn't likely
# since we are using $$ to get the PID of the process.
rm -f /tmp/convert-status-$$

# Figure out the width of the names. We do this so the columns line up
# nicely; it has absolutely no impact on the functionality.
for dir in "$@"
do
    # Ignore non-directories.
    if [ ! -d "$dir" ]
    then
        # Create a generic placeholder for all non-directories.
        echo "-FILES-" >> /tmp/convert-status-$$
        continue
    fi

    # Include the directory name.
    echo "$dir" >> /tmp/convert-status-$$
done

# This fancy little bit of AWK (which is from the Internet and I don't
# exactly grok) figures out the maximum length string in the file we
# just created. After this runs, $m will contain the longest string
# length (as an integer).
m=$(awk '{ if (length > L) { L = length } } END { print L }' /tmp/convert-status-$$)

# Write out the column headers. We use printf even though we could
# use echo just so all the output calls are identical.
printf "%-${m}s Count  MP4  MKV  AVI  MOV  MPG\n" ""
printf "%-${m}s ----- ---- ---- ---- ---- ----\n" ""

# These are the counters for the grand totals (max) and the
# non-directory counts (files).
max=0 max_mkv=0 max_mp4=0 max_avi=0 max_mov=0 max_mpg=0
files=0 files_mkv=0 files_mp4=0 files_avi=0 files_mov=0 files_mpg=0

# Go through a list of all the directories in the parameters.
for dir in "$@"
do
    # Ignore non-directories.
    if [ ! -d "$dir" ]
    then
        # If this is a file, we just add to the counters. The ##
        # strips everything up to the last dot, so names or paths
        # with more than one dot still yield the real extension.
        case ${dir##*.} in
            "mp4") files_mp4=$(expr $files_mp4 + 1);;
            "mkv") files_mkv=$(expr $files_mkv + 1);;
            "avi") files_avi=$(expr $files_avi + 1);;
            "mov") files_mov=$(expr $files_mov + 1);;
            "mpg") files_mpg=$(expr $files_mpg + 1);;
            *) continue;;
        esac

        # Increment the general file counter.
        files=$(expr $files + 1)

        # Don't bother doing anything else.
        continue
    fi

    # Count the number of files of a given type inside that
    # directory. Since we are using `find`, this will recursively get
    # all the files inside subdirectories also. We don't care about
    # the file names, just how many we find. This does have a slight
    # bug if you have a .filename.extension file (which I use for
    # temporary files), but usually that is okay.
    mkv=$(find "$dir" -name "*.mkv" | wc -l)
    mp4=$(find "$dir" -name "*.mp4" | wc -l)
    avi=$(find "$dir" -name "*.avi" | wc -l)
    mov=$(find "$dir" -name "*.mov" | wc -l)
    mpg=$(find "$dir" -name "*.mpg" | wc -l)

    # Add up all the counts above so we have a "total files per
    # directory" variable.
    count=$(expr $mkv + $mp4 + $avi + $mov + $mpg)

    # Increment the grand totals for the bottom line.
    max_mp4=$(expr $max_mp4 + $mp4)
    max_mkv=$(expr $max_mkv + $mkv)
    max_avi=$(expr $max_avi + $avi)
    max_mov=$(expr $max_mov + $mov)
    max_mpg=$(expr $max_mpg + $mpg)
    max=$(expr $max + $count)

    # Write out a single record for everything, but only if we have
    # something.
    if [ $count -gt 0 ]
    then
        printf "%-${m}s %5d %4d %4d %4d %4d %4d\n" \
            "$dir" \
            $count $mp4 $mkv $avi $mov $mpg
    fi
done

# Write out the file totals, but only if we have at least one file.
if [ $files -gt 0 ]
then
    printf "%-${m}s %5d %4d %4d %4d %4d %4d\n" \
        "-FILES-" \
        $files $files_mp4 $files_mkv $files_avi $files_mov $files_mpg
fi

# Write out the grand totals.
max=$(expr $max + $files)
max_mp4=$(expr $max_mp4 + $files_mp4)
max_mkv=$(expr $max_mkv + $files_mkv)
max_avi=$(expr $max_avi + $files_avi)
max_mov=$(expr $max_mov + $files_mov)
max_mpg=$(expr $max_mpg + $files_mpg)

printf "%-${m}s ----- ---- ---- ---- ---- ----\n" ""
printf "%-${m}s %5d %4d %4d %4d %4d %4d\n" \
    "" \
    $max $max_mp4 $max_mkv $max_avi $max_mov $max_mpg
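
Since the AWK step is the part of the script I said I don't fully grok, here is a rough sketch of the same idea pulled out into a function and spelled out line by line. The name `longest_line_length` is made up for this example and isn't part of the script; it reads from standard input instead of the temporary file.

```shell
#!/bin/bash

# longest_line_length (hypothetical helper, not in the original script)
# Reads lines from standard input and prints the length of the longest one.
longest_line_length() {
    awk '
        # Runs once per input line: remember the length if it is a new record.
        length($0) > max { max = length($0) }
        # Runs after the last line. Adding 0 forces a number, so empty
        # input prints 0 instead of an empty string.
        END { print max + 0 }
    '
}
```

Something like `printf '%s\n' "$@" | longest_line_length` should then give the width of the widest argument without needing a temporary file at all, though it doesn't swap in the `-FILES-` placeholder for non-directories the way the script does.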
It should be pretty easy to convert it to count other file formats (say text, OpenOffice.org, StarOffice, and Word) or just to get an idea of the file types in a directory.
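
As a rough sketch of what that conversion might look like, the per-type variables can be replaced with a list of extensions passed on the command line. The function names `count_ext` and `report_dir` are invented for this example, and skipping dot-files is my assumption about how to sidestep the temporary-file bug mentioned in the script's comments. It is not a drop-in replacement: there are no headers, grand totals, or column alignment for the directory name.

```shell
#!/bin/bash

# count_ext DIR EXT (hypothetical helper)
# Prints the number of regular files under DIR whose names end in .EXT.
# Hidden files (names starting with a dot) are skipped, on the assumption
# that they are temporary files.
count_ext() {
    local dir=$1
    local ext=$2
    # wc -l counts one line per path; tr strips the padding that some wc
    # implementations add. A file name containing a newline is overcounted.
    find "$dir" -type f -name "*.$ext" ! -name ".*" | wc -l | tr -d ' '
}

# report_dir DIR EXT... (hypothetical helper)
# Prints one row: DIR, the total, then one count per extension in order.
report_dir() {
    local dir=$1
    shift
    local ext n total=0 cells=""
    for ext in "$@"
    do
        n=$(count_ext "$dir" "$ext")
        total=$((total + n))
        cells="$cells$(printf ' %4d' "$n")"
    done
    printf '%s %5d%s\n' "$dir" "$total" "$cells"
}
```

Calling `report_dir Documents txt odt doc` would print one line for a `Documents` directory with the total followed by the three per-extension counts; the directory name and the extensions are only examples.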