Getting number of different files
One of my side projects has been to transcode my entire media library from my preferred format (Matroska or MKV) to MP4. As much as I don't like it (MP4 doesn't support multiple soft-subtitles in the same file), the Roku doesn't support anything but MP4 files.
I have a pretty good number of video files, mostly because I break individual episodes out (so 11-22 per season) and I'm fond of anime (Bleach and Naruto are both over a hundred episodes), but I thought I was doing pretty well on the transcoding. However, since "I think" isn't really useful, I wrote up a fairly decent Bash script that tells me how many files I have of each type and lets me break it up by directory.
This is mainly to see where I am in the transcode process but also to figure out which directory or sub-directory I need to convert next. When run, I get output like this:
$ ./convert-status A-H/?
      Count  MP4  MKV  AVI  MOV  MPG
      ----- ---- ---- ---- ---- ----
A-H/A   250   56  133   61    0    0
A-H/B   731  443   38  250    0    0
A-H/C   305  188   81   36    0    0
A-H/D   224  157   59    8    0    0
A-H/E   305  258   27    8    0   12
A-H/F   589  325  123  141    0    0
A-H/G   435    5  183  247    0    0
A-H/H   659   12   61  586    0    0
      ----- ---- ---- ---- ---- ----
       3498 1444  705 1337    0   12
$
From the above output, you can see that I've gotten about 1.4k files converted into the proper format but still have quite a few left to convert. Most of these are episodes (Bleach and Hercules), but at least it gives me a sense of progress.
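If you want a single progress number out of the totals line, a small helper can do the division. This is only a sketch and not part of the script; `percent_done` is a made-up name, and the two arguments below are the MP4 and Count totals from the sample run.

```shell
#!/bin/bash

# percent_done CONVERTED TOTAL
# Prints the whole-number percentage of files already converted.
# (Hypothetical helper; it is not part of convert-status.)
percent_done() {
    local converted=$1
    local total=$2

    # Avoid dividing by zero when nothing was counted.
    if [ "$total" -eq 0 ]
    then
        echo 0
        return
    fi

    # Shell arithmetic is integer-only, so the result rounds down.
    echo $(( converted * 100 / total ))
}

# Totals from the sample output: 1444 MP4 files out of 3498.
percent_done 1444 3498
```

For the run above this prints 41, so a bit over two fifths of the library is done.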
The Bash script itself looks like this:
#!/bin/bash

# Wipe out our temporary file, if we have one. This isn't likely
# since we are using $$ to get the PID of the process.
rm -f /tmp/convert-status-$$

# Figure out the width of the directory names. We do this so the
# columns line up pretty; it has absolutely no impact on the
# functionality.
for dir in "$@"
do
    # Ignore non-directories.
    if [ ! -d "$dir" ]
    then
        # Create a generic placeholder for all non-directories.
        echo "-FILES-" >> /tmp/convert-status-$$
        continue
    fi

    # Include the directory name.
    echo "$dir" >> /tmp/convert-status-$$
done

# This fancy little bit of AWK (which is from the Internet and I don't
# exactly grok) figures out the maximum length string in the file we
# just created. After this run, $m will contain the longest string
# length (as an integer).
m=$(awk '{ if (length > L) { L = length } } END { print L }' /tmp/convert-status-$$)

# Write out the column headers. We use `printf` even though we could
# use `echo` just so all the output calls are identical.
printf "%-${m}s Count  MP4  MKV  AVI  MOV  MPG\n"
printf "%-${m}s ----- ---- ---- ---- ---- ----\n"

# These are the counters for the grand totals (max) and the
# non-directory counts (files).
max=0
max_mkv=0
max_mp4=0
max_avi=0
max_mov=0
max_mpg=0

files=0
files_mkv=0
files_mp4=0
files_avi=0
files_mov=0
files_mpg=0

# Go through a list of all the directories in the parameters.
for dir in "$@"
do
    # Ignore non-directories.
    if [ ! -d "$dir" ]
    then
        # If this is a file, we just add to the counters. Using ##
        # strips everything up to the last dot, so a name with more
        # than one dot still yields just the extension.
        case "${dir##*.}" in
            "mp4") files_mp4=$(expr $files_mp4 + 1);;
            "mkv") files_mkv=$(expr $files_mkv + 1);;
            "avi") files_avi=$(expr $files_avi + 1);;
            "mov") files_mov=$(expr $files_mov + 1);;
            "mpg") files_mpg=$(expr $files_mpg + 1);;
            *) continue;;
        esac

        # Increment the general file counter.
        files=$(expr $files + 1)

        # Don't bother doing anything else.
        continue
    fi

    # Count the number of files of a given type inside that
    # directory. Since we are using `find`, this will recursively get
    # all the files inside subdirectories also. We don't care about
    # the file names, just how many we find. This does have a slight
    # bug if you have a .filename.extension file (which I use for
    # temporary files), but usually that is okay.
    mkv=$(find "$dir" -name "*.mkv" | wc -l)
    mp4=$(find "$dir" -name "*.mp4" | wc -l)
    avi=$(find "$dir" -name "*.avi" | wc -l)
    mov=$(find "$dir" -name "*.mov" | wc -l)
    mpg=$(find "$dir" -name "*.mpg" | wc -l)

    # Add up all the counts above so we have a "total files per
    # directory" variable.
    count=$(expr $mkv + $mp4 + $avi + $mov + $mpg)

    # Increment the grand totals for the bottom line.
    max_mp4=$(expr $max_mp4 + $mp4)
    max_mkv=$(expr $max_mkv + $mkv)
    max_avi=$(expr $max_avi + $avi)
    max_mov=$(expr $max_mov + $mov)
    max_mpg=$(expr $max_mpg + $mpg)
    max=$(expr $max + $count)

    # Write out a single record for everything, but only if we have
    # something.
    if [ $count -gt 0 ]
    then
        printf "%-${m}s %5d %4d %4d %4d %4d %4d\n" \
            "$dir" \
            $count $mp4 $mkv $avi $mov $mpg
    fi
done

# Write out the file totals, but only if we have at least one file.
if [ $files -gt 0 ]
then
    printf "%-${m}s %5d %4d %4d %4d %4d %4d\n" \
        "-FILES-" \
        $files $files_mp4 $files_mkv $files_avi $files_mov $files_mpg
fi

# Write out the grand totals.
max=$(expr $max + $files)
max_mp4=$(expr $max_mp4 + $files_mp4)
max_mkv=$(expr $max_mkv + $files_mkv)
max_avi=$(expr $max_avi + $files_avi)
max_mov=$(expr $max_mov + $files_mov)
max_mpg=$(expr $max_mpg + $files_mpg)

printf "%-${m}s ----- ---- ---- ---- ---- ----\n"
printf "%-${m}s %5d %4d %4d %4d %4d %4d\n" \
    "" \
    $max $max_mp4 $max_mkv $max_avi $max_mov $max_mpg

# Clean up the temporary file now that we are done with it.
rm -f /tmp/convert-status-$$
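
Since the AWK step is the least obvious part of the script, here is a commented sketch of the common "longest line" idiom it appears to be based on. The wrapper function `longest_line_length` and the sample names are made up for illustration, and the `+ 0` is an extra that is not in the script.

```shell
#!/bin/bash

# longest_line_length
# Reads lines on standard input and prints the length of the longest one.
# (Hypothetical wrapper; the script runs awk on its temporary file instead.)
longest_line_length() {
    awk '
        # Runs once per input line. A bare "length" is the length of
        # the current line. L starts out unset, which awk treats as 0
        # in a numeric comparison.
        { if (length > L) { L = length } }

        # Runs once after the last line. Adding 0 forces a number, so
        # empty input prints 0 rather than a blank line.
        END { print L + 0 }
    '
}

# Three sample names; the longest is 10 characters.
printf '%s\n' "A-H/A" "-FILES-" "A-H/Bleach" | longest_line_length
```

The result is what ends up in `$m`, which the `printf` calls then use as the width of the left-aligned first column.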
It should be pretty easy to adapt it to other file formats (say text, OpenOffice.org, StarOffice, and Word) or just to get an idea of the file types.
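
As a rough idea of what that adaptation could look like, the sketch below takes the extensions as arguments instead of hard-coding one variable per type. The function names `count_by_ext` and `report_dir` are made up, and skipping hidden dot-files is my own addition to sidestep the temporary-file quirk mentioned in the script's comments.

```shell
#!/bin/bash

# count_by_ext DIR EXT
# Prints how many regular files under DIR (recursively) end in .EXT,
# skipping hidden dot-files such as .scratch.txt.
# (Hypothetical helper; not part of convert-status.)
count_by_ext() {
    local dir=$1
    local ext=$2

    # Some versions of wc pad the count with spaces, so strip them.
    find "$dir" -type f -name "*.$ext" ! -name ".*" | wc -l | tr -d ' '
}

# report_dir DIR EXT...
# Prints DIR followed by one right-aligned count per extension,
# in the order the extensions were given.
report_dir() {
    local dir=$1
    local ext
    shift

    printf '%s' "$dir"
    for ext in "$@"
    do
        printf ' %4d' "$(count_by_ext "$dir" "$ext")"
    done
    printf '\n'
}
```

Calling something like `report_dir Documents txt odt doc` would then print one line for that directory, with a count for each of the three extensions.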