One of my side projects has been to transcode my entire media library from my preferred format (Matroska or MKV) to MP4. As much as I don't like it (MP4 doesn't support multiple soft-subtitles in the same file), the Roku doesn't support anything but MP4 files.

I have a pretty good number of video files, mostly because I break individual episodes out (so 11-22 per season) and I'm fond of anime (Bleach and Naruto are both over a hundred episodes), but I thought I was doing pretty well on the transcoding. However, since "I think" isn't really useful, I wrote up a fairly decent Bash script that tells me how many files I have of each type and lets me break it up by directory.

This is mainly to see where I am in the transcode process but also to figure out which directory or sub-directory I need to convert next. When run, I get output like this:

$ ./convert-status A-H/?
       Count  MP4  MKV  AVI  MOV  MPG
       ----- ---- ---- ---- ---- ----
A-H/A    250   56  133   61    0    0
A-H/B    731  443   38  250    0    0
A-H/C    305  188   81   36    0    0
A-H/D    224  157   59    8    0    0
A-H/E    305  258   27    8    0   12
A-H/F    589  325  123  141    0    0
A-H/G    435    5  183  247    0    0
A-H/H    659   12   61  586    0    0
       ----- ---- ---- ---- ---- ----
        3498 1444  705 1337    0   12

From the above output, you can see that I've gotten about 1.4k files converted into the proper format but still have quite a few left to convert. Most of these are episodes (Bleach and Hercules) but at least it gives me a sense of progress.
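
To turn the totals row into a single progress number, a small helper along these lines would do. This is only a sketch: `percent_done` is a made-up name that is not part of the script, and it uses integer arithmetic, so the result is rounded down.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): integer percentage
# of files that are already MP4.
percent_done() {
    local done_count=$1 total=$2

    # Avoid dividing by zero when nothing was counted.
    if [ "$total" -eq 0 ]
    then
        echo 0
    else
        echo $(( done_count * 100 / total ))
    fi
}

# MP4 count and grand total taken from the sample run above.
percent_done 1444 3498
```

With the numbers from the sample run, that prints 41.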

The Bash script itself looks like this:


#!/bin/bash

# Wipe out our temporary file, if we have one. This isn't likely
# since we are using $$ to get the PID of the process.
rm -f /tmp/convert-status-$$

# Figure out the width of the names. We do this so the columns line up
# nicely; it has absolutely no impact on the functionality.
for dir in "$@"
do
    # Ignore non-directories.
    if [ ! -d "$dir" ]
    then
        # Create a generic placeholder for all non-directories.
        echo "-FILES-" >> /tmp/convert-status-$$
        continue
    fi

    # Include the directory name.
    echo "$dir" >> /tmp/convert-status-$$
done

# This fancy little bit of AWK (which is from the Internet and I don't
# exactly grok) figures out the maximum length string in the file we
# just created. After this run, $m will contain the longest string
# length (as an integer).
m=$(awk ' { if ( length > L ) { L=length} }END{ print L}' /tmp/convert-status-$$)

# Write out the header. We use printf even though we could
# use echo just so all the output calls are identical.
printf "%-${m}s  Count  MP4  MKV  AVI  MOV  MPG\n" ""
printf "%-${m}s  ----- ---- ---- ---- ---- ----\n" ""

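# These are the counters for the grand totals (max) and the
# non-directory counts (files).
max=0 max_mp4=0 max_mkv=0 max_avi=0 max_mov=0 max_mpg=0
files=0 files_mp4=0 files_mkv=0 files_avi=0 files_mov=0 files_mpg=0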

# Go through a list of all the directories in the parameters.
for dir in "$@"
do
    # Ignore non-directories.
    if [ ! -d "$dir" ]
    then
        # If this is a file, we just add to the counters. The ${dir##*.}
        # expansion strips everything up to the last dot, leaving the
        # extension.
        case ${dir##*.} in
            "mp4") files_mp4=$(expr $files_mp4 + 1);;
            "mkv") files_mkv=$(expr $files_mkv + 1);;
            "avi") files_avi=$(expr $files_avi + 1);;
            "mov") files_mov=$(expr $files_mov + 1);;
            "mpg") files_mpg=$(expr $files_mpg + 1);;
            *) continue;;
        esac

        # Increment the general file counter.
        files=$(expr $files + 1)

        # Don't bother doing anything else.
        continue
    fi

    # Count the number of files of a given type inside that
    # directory. Since we are using find, this will recursively get
    # all the files inside subdirectories also. We don't care about
    # the file names, just how many we find. This does have a slight
    # bug if you have a .filename.extension file (which I use for
    # temporary files), but usually that is okay.
    mkv=$(find "$dir" -name "*.mkv" | wc -l)
    mp4=$(find "$dir" -name "*.mp4" | wc -l)
    avi=$(find "$dir" -name "*.avi" | wc -l)
    mov=$(find "$dir" -name "*.mov" | wc -l)
    mpg=$(find "$dir" -name "*.mpg" | wc -l)

    # Add up all the counts above so we have a "total files per
    # directory" variable.
    count=$(expr $mkv + $mp4 + $avi + $mov + $mpg)

    # Increment the grand totals for the bottom line.
    max_mp4=$(expr $max_mp4 + $mp4)
    max_mkv=$(expr $max_mkv + $mkv)
    max_avi=$(expr $max_avi + $avi)
    max_mov=$(expr $max_mov + $mov)
    max_mpg=$(expr $max_mpg + $mpg)
    max=$(expr $max + $count)

    # Write out a single record for everything, but only if we have
    # something.
    if [ $count -gt 0 ]
    then
        printf "%-${m}s  %5d %4d %4d %4d %4d %4d\n" \
            "$dir" \
            $count $mp4 $mkv $avi $mov $mpg
    fi
done
# Write out the file totals, but only if we have at least one file.
if [ $files -gt 0 ]
then
    printf "%-${m}s  %5d %4d %4d %4d %4d %4d\n" \
        "-FILES-" \
        $files $files_mp4 $files_mkv $files_avi $files_mov $files_mpg
fi

# Write out the grand totals.
max=$(expr $max + $files)
max_mp4=$(expr $max_mp4 + $files_mp4)
max_mkv=$(expr $max_mkv + $files_mkv)
max_avi=$(expr $max_avi + $files_avi)
max_mov=$(expr $max_mov + $files_mov)
max_mpg=$(expr $max_mpg + $files_mpg)

printf "%-${m}s  ----- ---- ---- ---- ---- ----\n" ""
printf "%-${m}s  %5d %4d %4d %4d %4d %4d\n" \
    "" \
    $max $max_mp4 $max_mkv $max_avi $max_mov $max_mpg
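
The width calculation near the top of the script goes through a temporary file and AWK. As a rough alternative, the same number can be worked out in plain Bash using the `${#var}` length expansion. This is a sketch, not what the script above does: `longest_label` is a hypothetical name, and it assumes, like the script, that every non-directory argument is shown as "-FILES-".

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): print the length of
# the longest row label without a temporary file or AWK.
longest_label() {
    local max=0 label

    for label in "$@"
    do
        # Non-directories all share one placeholder row.
        [ -d "$label" ] || label="-FILES-"

        # ${#label} expands to the number of characters in $label.
        if [ "${#label}" -gt "$max" ]
        then
            max=${#label}
        fi
    done

    echo "$max"
}

# Stand-in for the awk line: $m holds the column width.
m=$(longest_label "$@")
```

Since nothing is written to /tmp, the `rm -f` at the top and the `$$` trick are no longer needed with this approach.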

It should be pretty easy to convert it to fit other file formats (say text, StarOffice, and Word) or just to get an idea of the file types.
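
As a rough idea of what that conversion could look like, here is a stripped-down variation where the extensions live in one variable instead of being hard-coded per column. Everything here is an assumption for illustration: the `extensions` list (I'm guessing `.sdw` for StarOffice documents), the `count_ext` and `report` names, and the simplified one-line output. It also skips dot-files, which sidesteps the .filename.extension bug mentioned in the script's comments.

```shell
#!/bin/bash
# Hypothetical variation: the extension list is data, so supporting other
# file types means editing one line.
extensions="txt sdw doc"

# count_ext DIR EXT: number of regular files under DIR ending in .EXT,
# ignoring dot-files such as .scratch.txt.
count_ext() {
    find "$1" -type f -name "*.$2" ! -name ".*" | wc -l | tr -d ' '
}

# report DIR: print one line with a count per extension and a total.
report() {
    local dir=$1 ext n total=0

    printf "%s" "$dir"
    for ext in $extensions
    do
        n=$(count_ext "$dir" "$ext")
        total=$((total + n))
        printf " %s=%d" "$ext" "$n"
    done
    printf " total=%d\n" "$total"
}

for dir in "$@"
do
    if [ -d "$dir" ]
    then
        report "$dir"
    fi
done
```

Like the original, this counts lines of `find` output, so a file name containing a newline would be counted more than once.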