I have been working my way through about 300 GB of drone and GoPro videos from this summer. One thought that kept coming back to me throughout this process was the idea of breaking videos up into individual images. Once I stumbled across Algorithmia's Deep Learning, I started thinking about how I could break videos up into separate images, apply algorithmic texture filters to each individual image, and then reassemble the images into a video.
FFMPEG stood out as the solution I needed to accomplish this. It was pretty easy to export each video as separate .jpg files, and just as easy to assemble any set of images into a video. What did take me some time was getting the frame rate, size, and other finer aspects of the video to work as I desired, and in sync with my drone videos. To achieve an acceptable video I needed 60 individual images for each second of video, potentially making the work a compute- and storage-intensive endeavor.
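To make the split and reassemble steps concrete, here is a minimal sketch of how the two FFMPEG calls might be built from Python. The file names, the `frame_%06d.jpg` naming pattern, and the helper functions are my own illustrative assumptions, not the exact commands I ran, and the H.264 encoder (`libx264`) is assumed to be available in your FFMPEG build.

```python
import subprocess

FPS = 60  # images per second of video, as described above


def split_command(video, frames_dir, fps=FPS):
    """Build the ffmpeg arguments that export a video as numbered .jpg files."""
    # The frame_%06d.jpg pattern is a hypothetical naming scheme.
    pattern = f"{frames_dir}/frame_%06d.jpg"
    return ["ffmpeg", "-i", video, "-vf", f"fps={fps}", "-qscale:v", "2", pattern]


def assemble_command(frames_dir, output, fps=FPS):
    """Build the ffmpeg arguments that stitch numbered .jpg files into a video."""
    pattern = f"{frames_dir}/frame_%06d.jpg"
    return ["ffmpeg", "-framerate", str(fps), "-i", pattern,
            "-c:v", "libx264", "-pix_fmt", "yuv420p", output]


def frame_count(duration_seconds, fps=FPS):
    """Estimate how many images a clip of the given length will produce."""
    return round(duration_seconds * fps)


def run(command):
    """Execute a command built above; raises if ffmpeg exits with an error."""
    subprocess.run(command, check=True)
```

Calling `run(split_command("flight.mp4", "frames"))` would write the images into an existing `frames` directory, and `frame_count(90)` shows why storage adds up quickly: a 90-second clip at 60 images per second is 5,400 files.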
Using Amazon EC2 and S3, it isn't too much work to leverage FFMPEG to break up videos into separate images. Once I did this, I started publishing a JSON representation of each video, along with each individual image, to GitHub, leveraging version control, forking, and the other benefits of the platform, but for managing a video instead of code. This process has given me a machine-readable, Git-managed representation of each video to work with, manipulate, and evolve independently.
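A JSON representation of a video could be as simple as a manifest listing its images in order. The sketch below is one possible shape; the field names (`video`, `fps`, `frame_count`, `duration_seconds`, `frames`) and the helper functions are assumptions for illustration, not the schema I actually published.

```python
import json
from pathlib import Path


def build_manifest(video_name, frames_dir, fps=60):
    """Describe a video as an ordered list of its exported .jpg files."""
    # Sorting by name keeps zero-padded frame numbers in playback order.
    names = sorted(p.name for p in Path(frames_dir).glob("*.jpg"))
    return {
        "video": video_name,
        "fps": fps,
        "frame_count": len(names),
        "duration_seconds": len(names) / fps,
        "frames": [{"index": i, "file": name}
                   for i, name in enumerate(names, start=1)],
    }


def write_manifest(manifest, path):
    """Save the manifest as indented JSON so Git diffs stay readable."""
    Path(path).write_text(json.dumps(manifest, indent=2) + "\n")
```

Committing a file like this next to the images means a change to any single image, or to the order of images, shows up as an ordinary diff in the repository.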
When you combine the separation of video in this way with GitHub, and with an API-driven approach to separation, assembly, and individual and overall image manipulation, the possibilities are pretty limitless. After I separate each video into images, I apply textures and filters to each individual image using Algorithmia's deep filters. That process has numerous opportunities of its own, but I think the wider approach holds more potential than just fun filters, and it is the primary reason I did this work. Playing with the videos, filters, and images was a welcome distraction, but I think the opportunity around breaking up videos into separate images is much bigger than algorithmic rotoscope.
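The per-image step can be sketched as a loop that runs some filter over every image and writes the result under the same file name, so the order survives for reassembly. The `apply_filter` argument here is a hypothetical placeholder for whatever image manipulation you plug in; it is not Algorithmia's API, and the function name is my own.

```python
from pathlib import Path
from typing import Callable, List


def filter_frames(src_dir, dst_dir,
                  apply_filter: Callable[[bytes], bytes]) -> List[str]:
    """Run apply_filter over every .jpg in src_dir and save results in dst_dir.

    Output files keep their original names, so the filtered images can be
    reassembled into a video in the same order as the source.
    """
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for frame in sorted(Path(src_dir).glob("*.jpg")):
        # apply_filter is a stand-in: it takes image bytes and returns image bytes.
        (dst / frame.name).write_bytes(apply_filter(frame.read_bytes()))
        written.append(frame.name)
    return written
```

Because the filter is just a function from bytes to bytes, the same loop works whether the manipulation happens locally or through a call to a remote API.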