
algorithmic rotoscope

Experimentation in applying texture to video using machine learning via APIs.

Updates

Algorithmic Rotoscope runs on Github, and I use Github Pages to power this work, publishing regular updates using a Jekyll blog. I share my regular work and thoughts on what I'm doing in this area and with this type of approach to video rotoscoping. You can subscribe to the atom feed above to get updates, or tune in via Twitter, where I publish updates regularly as well.


The Russian Propaganda Distortion Field Around The White House

I am having a difficult time reconciling what is going on with the White House right now. The distortion field around the administration feels like some bad acid trip from the 1980s, before I learned how to find the good LSD. After losing their shit over her emails and Benghazi, they are willing to overlook Russia fucking with our election on so many levels and infiltrating the White House. Wacky. Just fucking wacky!

The way Russia has fed the poor folk in this country a steady diet of bullshit is pretty crafty, as well as disturbing. Their approach to disinformation has dovetailed nicely with the approach of the GOP in this country. As usual, I am trying to understand and visualize the algorithmic distortion in this conversation, and how our current administration could be so heavily under the influence of Russian propaganda.

I'm going through Russian propaganda archives looking for the right colors and textures to shine a light on the algorithmic distortion raining down on the White House as part of this ongoing Russian cyber attack. I'm using the posters I've found to train some machine learning models, and the first one has come off the cloud pipeline and is ready to apply to some images to see what the effect might be. I started with a couple of photos I've taken of the White House: one from the lawn, and one from inside the Eisenhower Executive Office Building (EEOB) while I was working there.

I like the results. It makes it look like the distortion field around the White House is just dense pamphlets raining down from above, dense like Internet packets aggregating on a wireless network fighting to get in. It's fascinating to watch people be so willfully ignorant of the algorithmic distortion around them, even with all the talk of wireless, mobile, the web, and cyber warfare. They don't see how they are under assault from information and disinformation--something the Russians seem to excel at.

There are 3 other posters being used to train machine learning models right now. I can only do one at a time, and each one takes about 12 hours. Then it will take me another week or so of applying them to images to find what works and doesn't work with the filters. I have about 40 individual filters currently, and I focused heavily on dystopian textures in the previous round. I am thinking that this round I'll focus on colors and textures I can use to highlight the effects of the cyber on our reality -- I hear it is going to be huge.

Here is the look from next door....

Here is the view from the front lawn...


Algorithmic Reflections On The Immigration Debate

We are increasingly looking through an algorithmic lens when it comes to politics in our everyday lives. I spend a significant portion of my days trying to understand how algorithms are being used to shift how we view and discuss politics. One of the ongoing themes in my research is machine learning, an aspect of technology currently being applied to everything from news curation and identifying fake news, to how we monitor and see the world online with images and video.

Algorithms are painting a real-time picture that colors how we see the physical world around us--something that is increasingly occurring online for many of us. Because many of the creators of algorithms are white men, they are often blind and even willfully ignorant of how their algorithms and technological tools are used for evil purposes. With a focus on revenue and the interests of their investors, Twitter, Facebook, Reddit and other platforms often do not see (or are willing to turn a blind eye to) how hateful groups are using their platforms to spread misinformation and hate. When you combine this with a lack of awareness when it comes to history, we end up in the current situation we find ourselves in with the Trump administration.

As part of my work to understand how algorithms are shaping our world views, I am playing with different ways of applying machine learning to my images and videos for use across my storytelling -- something I am calling @algorotoscope. It's helping me understand how machine learning works (or doesn't), while also giving me an artistic distraction from the increasingly inhuman world of technology. Taking photos and videos, as well as the process of training and applying the filters, gives me relief, allowing me to find some balance in the very toxic digital environment I find myself in today.

I feel that we are allowing algorithms to amplify some very hateful views of the world right now, something that is being leveraged to produce some very damaging outcomes in the immigration debate. To help paint a picture of what I'm seeing from my vantage point, I took an old World War II Nazi propaganda poster and used it to train a machine learning model, which I could then apply to any image or video using a platform called Algorithmia. Here is the resulting image....

The image is a photo I took from the waiting area at Ellis Island, with sunlight reflecting through the windows, lighting up the tiles in the room where millions of immigrants waited to be admitted into this country. I feel like we are allowing our willful ignorance of history as Americans to paint the immigration debate today, something that is being accelerated and fueled by a small, hateful portion of our society with the assistance of algorithms. Facebook, Twitter, Reddit, and other platforms are allowing their algorithms to be gamed by this very vocal minority in a way that is shaping the views of the larger population--making for a very destructive and divisive debate about something very core to our country's origin -- immigration.

If we are going to get to the bottom of this recent shift in how we operate as a society, we are going to have to work to shine a light on how these algorithms operate, and how advertising incentivizes platforms to be blind to their damaging effects. We are allowing algorithms and digital technology to reflect and amplify the worst within us, pushing us to be more polarized. I'm hoping to continue stimulating a more constructive conversation about how technology is being deployed, one that is NOT fueled by greed or hate, through my storytelling, programming, and imagery.


Algorithmia's Multi-Platform Data Storage Solution For Machine Learning Workflows

I've been working with Algorithmia to manage a large number of images as part of my algorithmic rotoscope side project, and they have a really nice omni-platform approach to letting me manage the images and other files I am using in my machine learning workflows. Images, files, and the input and output of heavy objects are an essential part of almost any machine learning task, and Algorithmia makes this easy to do across the storage platforms we use the most (hopefully).

Algorithmia provides you with local data storage--pretty standard stuff, but they also allow you to connect your Amazon S3 account or your Dropbox account, and connect to specific folders and buckets, while helping you handle all of your permissions. Maybe I have my blinders on with this because I heavily use Amazon S3 as my default online storage, and Dropbox is my secondary store, but I think the concept is still worth sharing.

This allows me to seamlessly manage the objects, documents, files, and other images I store across my operation as part of my machine learning workflow. Algorithmia even provides an intuitive way of referencing files, with each Data URI uniquely identifying files and directories. Each is composed of a protocol and a path, with each service having its own unique protocol:

  • data:// Algorithmia hosted data
  • dropbox:// Dropbox default connected accounts
  • s3:// Amazon S3 default connected account
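To make that concrete, here is a minimal sketch of referencing files across those three protocols with the Algorithmia Python client. The bucket, folder, and file names are just placeholders for illustration, not the actual paths used in this project.

```python
import Algorithmia

client = Algorithmia.client("YOUR_API_KEY")

# Algorithmia hosted data -- upload a local frame to a data:// collection
client.file("data://.my/rotoscope/frame_00001.jpg").putFile("frames/frame_00001.jpg")

# Amazon S3 default connected account -- pull down a source video
video_file = client.file("s3://my-bucket/videos/drone-flight.mp4").getFile()

# Dropbox default connected account -- publish a finished render for humans to review
client.file("dropbox://renders/drone-flight-filtered.mp4").putFile("output/drone-flight-filtered.mp4")
```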

This approach dramatically simplifies my operations when working with files, and allows me to leverage the API-driven storage services I am already putting to work, while also taking advantage of the growing number of algorithms available in Algorithmia's catalog. In my algorithmic rotoscope project I am breaking videos into individual images, producing 60 images per second of video, and uploading them to Amazon S3. Once the images are uploaded, I can then run Algorithmia's Deep Filter algorithm against all of them, sometimes thousands of images, using their stock models, or any of the 25+ I've trained myself.

This approach is not limited to video and images--it is generic to any sort of API-driven machine learning orchestration. Just swap out video and images for mapping, content, or another resource, find the relevant machine learning workflow you need to apply, and get to work. While I am having fun playing with my drone videos and texture filters, the approach can just as easily be applied to streamline any sort of machine learning workflow.

One additional benefit of storing data this way is that I've found Dropbox to be a really amazing layer for including humans in the workflow. I leverage Amazon S3 for my wholesale, compute-grade storage, but Dropbox is where I publish the images, videos, and documents that I need to put in front of humans, or use to include them in the machine learning workflow. I find this gives them a role in the process, in a way that gives them control over the data, images, videos, and other objects, on a platform they are already comfortable with. I'd encourage Algorithmia, and other providers, to also consider including Google Drive as part of this--it would go a long way toward logically connecting the human portion of these workflows.

Anyways, I thought Algorithmia's approach to storage was interesting, worth highlighting, and something that other providers might consider implementing themselves.


Learning About Machine Learning APIs With My Algorithmic Rotoscope Work

I was playing around with Algorithmia for a story about their business model back in December when I got sucked into playing with their DeepFilter service, resulting in a 4-week-long distraction which ultimately became what I am calling my algorithmic rotoscope work. After weeks of playing around, I have a good grasp of what it takes to separate videos into individual images, apply the Algorithmia machine learning filters, and reassemble them as videos. I also have several of my own texture filters created now using the AWS AMI and process provided by Algorithmia--you can learn more about algorithmic rotoscope, and the details of what I did, via the Github project updates.

The project has been a great distraction from what I should be doing. After the election, I just did not feel like doing my regular writing, scheduling of Tweets, processing of press releases, and the other things I do on a regular basis. Algorithmic Rotoscope provided a creative, yet still very API-focused project to take my mind off things during the holidays. It was a concept I couldn't get out of my head, which is always a sign for me that I should be working on a project. The work was more involved than I anticipated, but after a couple weeks of tinkering, I have the core process for applying filters to videos working well, allowing me to easily apply the algorithmic textures.

Other than just being a distraction, this project has been a great learning experience for me, with several aspects keeping me engaged:

  • Algorithmia's Image Filters - Their very cool DeepFilter service, which allows you to apply artistic and stylish filters to your images using their API or CLI, providing over 30 filters you can use right away.
  • Training Style Transfer Models - Firing up an Amazon GPU instance, looking through art books, and finding interesting pieces that can be used to train the machine learning models, so you can define your own filters.
  • Applying Filters To Images - I spent hours playing with Algorithmia's filters, applying them to my photo library, experimenting, and playing around with what looks good and what is possible.
  • Applying Filters To Videos - Applying Algorithmia's filters, as well as my own, to video I have laying around, especially exploring what is possible when applied to the GBs of drone video I have on hand, something that is only going to grow.

Why is this an API story? Well, first of all, it uses the Algorithmia API, but I also developed the separation of the videos, the application of filters to images, and the reassembly of the videos as an API. It isn't anything that is production stable, but I've processed thousands of images, many minutes of video, and made over 100K API calls to Algorithmia. Next, I am going to write up Algorithmia's business model, using my algorithmic rotoscope work as a hypothetical API-driven business--helping me think through the economics of building a SaaS or retail API solution on top of Algorithmia.

Beyond being an API story, it has been a lot of fun to engineer and play with. I still have a GPU instance fired up, training filters, as well as recording more drone and other video footage specifically so I can apply some of the new filters I've produced. I have no intention of turning it into a business. Algorithmic rotoscope is just a side project that I hope will continue to be a creative distraction for me, and give me another reason to keep flying drones and getting away from the computer when I can. In the end I am learning a lot about drones, videography, and machine learning, but best of all it has helped me regain my writing mojo--with this being the first post I've written on API Evangelist since LAST YEAR! ;-)


Finding The Right Dystopian Filter To Represent The World Unfolding Around Us

I got sucked into a project over the holidays, partly because it was an interesting technical challenge, but mostly because it provided me with a creative distraction after the election. I started playing with image filters from Algorithmia, using their Deep Filter service, which some may recognize as being similar to services like Prisma. The difference with Algorithmia is you can use their 30+ filters, or if you want, you can train your own image filters using their AWS machine learning AMI.

As I was playing with Algorithmia after the election, I had many images in my head of the dystopian landscape that is unfolding around us. Many of these images were reminiscent of my childhood in the 70's and 80's, during the cold war, when the future perpetually seemed very bleak to me. I wanted a way to take these images from my head and apply them to the photos I was taking, and even better, to the video, and more specifically, the drone videos I am making. Four weeks later, I have arrived at a first set of filters that, when applied to my photography, get me closer to the visions I had in my head.

Here is an original photo taken by me on January 2nd, 2017 in East Los Angeles:

Next, I wanted to reduce the world around us to something less than real, comic, or drawn. I wanted a way to algorithmically reduce the outlines of the world into something that resembled our real world, to make things as familiar as possible, but then quickly bend and skew it, so that I could help us see how dark things are becoming.

To borrow a phrase from my partner in crime, I wanted to be able to reduce everything I captured in my photography and videos down to a transaction. I wanted to show us how the world around us is being digitized, de-humanized, and rendered into an even more hostile landscape that has very little concern for the humans living in it.

I wanted to be able to go even further and visualize how noisy the world has become, not because of cars and airplanes, but because of the bits and bytes flowing around us every day. I want to help us visualize the constant assault on us and the people we love, and that increasingly there is no escape from this constant assault--it is in our homes, cars, businesses, and public spaces.

I want to paint a dark dystopian digital landscape, but ultimately I want as wide a palette as possible. I needed an algorithmic palette of colors and textures born from the true artists who came before us, making the colors and textures familiar, and even soft, before I took things to a much darker level. I didn't want to just shock; I wanted to slowly shift the world around us down a dystopian road.

I wanted to transform our world into a cartoon or painting in a way that didn't make you feel completely uncomfortable. As if you were slowly slipping into a dream, falling asleep, and things haven't gotten too scary yet. The world is still familiar, with bright and colorful elements that keep you smiling, and hopeful that things will get better--believing in the story I was telling.

Then slowly I want to be able to turn up the heat. Allowing the sun to set on the world you once knew. Then begin to reveal some of the darker outlines of the shadows and some of the darker aspects of our reality. Bringing some of the scarier aspects of the unfolding world around us out of the dark, and into the open.

Then I want to make you feel like you just dropped a substantial hit of LSD, allowing the sun to set on reality, where we let all the demons out to play and roam the streets. If you still live on the 10th floor or above, the world might still look beautiful, but if you live on the ground floor, the world is a very scary place, where nobody is safe.

Something that will eventually affect everyone. The rich, the poor. Nobody will be safe, and nobody is immune from the dystopian effects technology and politics are having on our world. Just because you do not see the negative effects of the surveillance economy on your floor doesn't mean it won't eventually reach you--at some point, we'll all be impacted.

I wanted to play with ways of taking us back in time. Take modern images, and make them feel like we were in the 50's, 40's, or any other decade or time period the conservatives want us to live in. I needed filters to apply to the current photos and videos I was taking, and shift them in time, keeping some of the context, while also allowing me to tell other stories that take us anywhere in the past.

I wanted a variety of ways to visualize the impending doom on the horizon. I wanted to be able to force the sun to set on the current day and paint an ominous picture of what will happen once the sun goes down, and tomorrow begins. What did we do today that will impact us tomorrow? How can I paint a picture that grabs our attention and potentially avoids a darker tomorrow?

And as the world begins to bend out of control, and we begin to lose our grip on reality, how can I point out how dark things are on the landscape, and show you that things are slipping? What is the right color palette, and texture, for showing us that we are slipping into a darker reality, and potentially going down a road from which there is no return?

In the same way, how do I paint a hopeful picture of tomorrow, either as the sun is setting, or right before it is coming up? Things might be a little dark, but this is a new day, and there is a little hope out there if we do the right thing today, or maybe don't make the same mistakes today that we made yesterday.

How do I articulate depression, and the mental illness around us, which we are in denial of? How do I take the color out of everything that matters to us, and suck all the hot air out of the marketing hype and advertising polish that exists everywhere? How do I limit the color palette we have access to, making it more realistic, and allowing us to have an honest conversation about what the fuck is really going on?

Most importantly, how do I avoid us heading down the darkest, most dystopian landscapes we can imagine? How do I make textures, colors, and filters that show the bombed out landscape ahead of us if we do not pull our shit together? How can I take the buildings, streets, and roads around us, and make the main street look like Syria, reminding us of what is just around the corner?

This is just the beginning. I have trained 25 separate filters using Algorithmia's style transfer model machine learning process. I have another week or so of training these filters. I'm also working to gather more video and image footage that I can apply these filters to. At each step of the process, 1) capturing images, 2) training models, and 3) applying filters, I am learning a lot, something I hope never really ends. Training filters is costly, so I won't be able to continue indefinitely, but I wanted to mark the point on the calendar where I had achieved the results I had envisioned early on in my head.

Now I just need to rinse and repeat. I am going to the US / Mexico border next week to gather some footage, and I will be going to DC later this month for some work, where I will also be gathering some valuable footage. By then I am hoping to have a palette of about 50 separate machine learning filters I can apply to images and videos using my algorithmic rotoscope process. Then I should have enough footage to begin telling more stories about the world around us, and help quantify the uneasy feelings we are all having about the world unfolding around us.


Exploring The Economics of Wholesale and Retail Algorithmic APIs

I got sucked into a month-long project applying machine learning filters to video over the holidays. The project began with me doing research on the economics behind Algorithmia's machine learning services, specifically the DeepFilter algorithm in their catalog. My algorithmic rotoscope work applying Algorithmia's Deep Filters to images and drone videos has given me a hands-on view of Algorithmia's approach to algorithms and APIs, and the opportunity to think pretty deeply about the economics of all of this. I think Algorithmia's vision has a lot of potential, not just for image filters, but for any sort of algorithmic and machine learning API.

Retail Algorithmic and Machine Learning APIs
Using Algorithmia is pretty straightforward. With their API or CLI you can make calls to a variety of algorithms in their catalog, in this case their DeepFilter solution. All I do is pass them the URL of an image, what I want the new filtered image to be called, and the name of the filter that I want applied. Algorithmia provides an API explorer you can copy & paste the required JSON into, and they also provide a demo application for you to use--no JSON required.
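As a rough illustration, this is what that call looks like from the Python client. The payload field names, the filter name, and the data:// paths are assumptions based on my reading of the DeepFilter docs, so double-check them against Algorithmia's API explorer before relying on this sketch.

```python
import Algorithmia

client = Algorithmia.client("YOUR_API_KEY")

# The three inputs described above: image location, where to save the filtered
# image, and which filter to apply. Field names follow the DeepFilter docs as
# I understand them; verify them in the API explorer.
payload = {
    "images": ["data://.my/rotoscope/whitehouse_lawn.jpg"],
    "savePaths": ["data://.my/rotoscope/whitehouse_lawn_filtered.jpg"],
    "filterName": "far_away"  # a stock filter name, or one you trained yourself
}

result = client.algo("deeplearning/DeepFilter").pipe(payload).result
print(result)
```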

Training Your Own Style Transfer Models Using Their AWS AMI
The first "rabbit hole" concept I fell into when doing the research on Algorithmia's model was their story on creating your own style transfer models, providing you step by step details on how to train them, including a ready to go AWS AMI that you can run as a GPU instance. At first, I thought they were just cannibalizing their own service, but then I realized it was much more savvier than that. They were offloading much of the costly compute resources needed to create the models, but the end product still resulted in using their Deep Filter APIs. 

Developing My Own API Layer For Working With Images and Videos
Once I had experience using Algorithmia's Deep Filter via their API, and had produced a handful of my own style transfer models, I got to work designing my own process for uploading and applying the filters to images, then eventually separating videos into individual images, applying the filters, and reassembling them into videos. The entire process, start to finish, is a set of APIs, with a couple of them simply acting as a facade for Algorithmia's file upload, download, and DeepFilter APIs. It provided me with a perfect hypothetical business for thinking through the economics of building on top of Algorithmia's platform.

Defining My Hard Costs of Algorithmia's Service and the AWS Compute Needed
Algorithmia provides a pricing calculator along with each of their algorithms, allowing you to easily predict your costs. They charge you per API call, plus compute usage by the second. Each API has its own calculator and average runtime duration costs, so I'm easily able to calculate a per-image cost for applying filters--something that multiplies quickly when you are applying them to 60 frames (images) per second of video. Similarly, when it comes to training filter models using an AWS EC2 GPU instance, I have a per-hour charge for compute, storage costs, and (now) a pretty good idea of how many hours it takes to make a single filter.
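As a back-of-the-envelope illustration of how those numbers stack up, here is a quick sketch. The per-image and per-hour rates are placeholders, not Algorithmia's or AWS's actual pricing; plug in the figures from their calculator and your own bill. The 12-hour training time comes from my own experience noted earlier.

```python
# All rates below are placeholders for illustration, not actual pricing.
COST_PER_IMAGE = 0.005   # hypothetical cost of filtering one frame (API call + compute)
FRAMES_PER_SECOND = 60   # frame rate used when separating video into images

GPU_HOURLY_RATE = 0.90   # hypothetical per-hour cost of the EC2 GPU instance
HOURS_PER_FILTER = 12    # roughly how long one style transfer model takes to train

def video_filtering_cost(duration_seconds):
    """Estimated cost of applying a filter to every frame of a video."""
    return duration_seconds * FRAMES_PER_SECOND * COST_PER_IMAGE

def filter_training_cost():
    """Estimated compute cost of training a single filter model."""
    return GPU_HOURLY_RATE * HOURS_PER_FILTER

print(video_filtering_cost(120))   # a two-minute clip: 7,200 frames -> $36.00
print(filter_training_cost())      # one new filter: ~$10.80 of GPU time
```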

All of this gives me some pretty solid numbers to work with when trying to build a viable business on top of Algorithmia. In theory, when my customers use my algorithmic rotoscope image or video interface, as well as the API, I can cover my operating costs and generate a healthy profit by charging a per-image cost for applying a machine learning texture filter. What I really think is innovative about Algorithmia's approach is that they are providing an AWS AMI to offload much of the "heavy compute lifting", with all roads still leading back to using their service. It is a model that could quickly shift algorithmic API consumers from being just retail-level API consumers to being wholesale / volume consumers.

My example focuses on images and video, but this model can be applied to any type of algorithmically fueled API. It provides a model for how you can safely open source the process behind your algorithms as an AWS AMI and actually drive more business to your APIs by evolving your API consumers into wholesale API consumers. In my experience, many API providers are very concerned with malicious users reverse engineering their algorithms via their APIs, when in reality, in true API fashion, there are ways you can actually open up your algorithms, making them more accessible and deployable, while still contributing significantly to your bottom line.


Make It An API Driven Publishing Solution

Once I had established a sort of proof of concept for my algorithmic rotoscope process, and was able to manually execute each step of the process, from separating a video and applying filters to reassembling the video, I quickly refactored my prototype code to be API-first. I did this even before I built any sort of interface for executing and managing the process, as this would allow me not just to execute the process, but also to manage, extend, and scale as many algorithmic rotoscopes as I wanted.

I'm not particularly proud of the API design, and think it is something that will evolve and change as I push forward what is possible with my algorithmic rotoscope. I'm learning a lot along the way, and my focus in having an API is not to open up access to 3rd parties, but to allow me to scale my process and run it efficiently using Amazon Web Services, Algorithmia, and a handful of other APIs. Currently, I have about 25 separate paths for my API, which allow me to accomplish every step of the algorithmic rotoscope process.
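Those 25 paths aren't published, but to give a feel for the shape of an API-first facade like this, here is a purely hypothetical sketch of a few routes using Flask. None of these are the actual API; they only illustrate how the separate / filter / assemble steps might be exposed as endpoints.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical routes illustrating an API-first rotoscope process; the real
# API's ~25 paths are not published, so this is only a guess at their shape.

@app.route("/videos", methods=["POST"])
def register_video():
    """Register a new source video (e.g. an S3 location) for processing."""
    return jsonify({"status": "queued", "video": request.json.get("url")}), 201

@app.route("/videos/<video_id>/separate", methods=["POST"])
def separate(video_id):
    """Kick off frame extraction for a registered video."""
    return jsonify({"status": "separating", "video_id": video_id})

@app.route("/videos/<video_id>/filter", methods=["POST"])
def apply_filter(video_id):
    """Apply a named machine learning filter to every extracted frame."""
    return jsonify({"status": "filtering", "filter": request.json.get("filter")})

@app.route("/videos/<video_id>/assemble", methods=["POST"])
def assemble(video_id):
    """Reassemble filtered frames back into a finished video."""
    return jsonify({"status": "assembling", "video_id": video_id})

if __name__ == "__main__":
    app.run()
```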

Currently my algorithmic rotoscope API runs on a single Amazon EC2 instance, which I am scaling vertically, meaning I just increase the size of the server instance when I want more to get done. However, having the entire process be API-first will allow me to easily scale horizontally, across multiple servers. This should allow me to isolate specific steps of the process, or several of them together, letting me scale the server separately for assembling and disassembling the videos--which can be pretty intensive.

Once I have some time I will publish an OpenAPI Spec for the API. I don't have any intention of opening up the API for 3rd party usage, but I may be open to deploying Amazon instances for wholesale use, and partner access, at some point in the future. I am not looking to do this as a full-blown startup idea, but it is something I'd like to evolve on the side for my own uses, and potentially as a service for select clients. I will keep adding paths to my algorithmic rotoscope API, but for now I'll work to document and refine the existing paths I have, and tell the story of the API--you know, practice what I preach and all. Even if my API isn't fully public, there is no reason the documentation, story, and results can't be public.


Unique Algorithmic Filters Are Where It Is At

I am having a blast with the image texture filters that Algorithmia includes as part of their DeepFilter service. I've been playing with how each filter behaves when applied to different images. Some work better for the desert images, while others work best for water, which I'll apply to my waterfalls, lakes, and rivers. The process has been a welcome distraction, but where I really get lost thinking about the possibilities is when it comes to creating your own filters--an area I am just getting started with.

Algorithmia provides a pretty straightforward guide to creating your own filters. Once you fire up the EC2 GPU instance and follow their setup process, creating filters is pretty easy, but it is also pretty addictive, depending on what kind of habit you can afford. ;-) I have created six filters so far, but plan on creating more as soon as I get some money and more time. Just like applying the filters, training filters takes some practice, and experience training them against a variety of images, colors, textures, etc.

Experience applying filters to a variety of images is important and valuable, but experience training and creating filters is, I think, where it is at. Being able to find just the right filter to apply to an image or the images used in a video is valuable, but being able to identify, create, and train exactly the right set of textures, colors, and filters could provide some really unique experiences. I'm not a big fan of the concept of intellectual property, but I could see knowledge of training your texture and filter algorithms against specific pieces of art, photographs, and elements from our physical world being a potentially pretty unique offering--something you'd want to keep secret.

Some of the Algorithmia filters are loud and intense, which I like for some applications, but I'm finding their lighter-touch, more artistic, and subtle filters have a wider range of uses. I'm applying these findings to the filters I'm training, but I need more experience applying existing filters, as well as training new ones. All of this takes a significant amount of compute and storage power--which costs money. I have made my algorithmic rotoscope framework API-centric so that I can scale it and increase the number of videos I am able to process, as well as the number of filters I am able to create and add to the process.

I am going to create around 10-15 more filters, then spend time just applying them to see what I can produce. Then I'm hoping to have enough experience in applying and training to know what works best, what I like, and what complements my approach to drone video capture. Eventually, I am hoping to establish my own unique set of filters and my own unique style in applying them to video using my algorithmic rotoscope process.


Opportunity In Breaking Up Videos Into Separate Images

I have been working my way through about 300 GB of drone and GoPro videos from this summer. One of the lingering thoughts I had throughout this process centered around the concept of breaking videos up into individual frames. Once I stumbled across Algorithmia's deep learning filters, I found myself thinking about how I could break up videos into separate images, apply algorithmic texture filters to each individual image, and then reassemble them back into a video.

FFMPEG stood out as the solution I needed to accomplish this. It was pretty easy to export each video as separate .jpg files, as well as to assemble any set of images back into a video. What did take me some time was getting the frame rate, size, and other finer aspects of video to work as I desired, and in sync with my drone videos. To achieve an acceptable video I needed 60 individual images for each second of video, potentially making the work a compute and storage intensive endeavor.
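For reference, here is a minimal sketch of the two FFMPEG calls involved, wrapped in Python. The 60 fps rate matches what is described above; the file names and the H.264 settings are just reasonable defaults for illustration, not the exact commands used here.

```python
import subprocess

def separate_video(video_path, frames_dir):
    """Export a video as individual .jpg frames at 60 frames per second."""
    subprocess.run([
        "ffmpeg", "-i", video_path,
        "-vf", "fps=60",
        f"{frames_dir}/frame_%05d.jpg"
    ], check=True)

def reassemble_video(frames_dir, output_path):
    """Reassemble numbered .jpg frames back into a 60 fps H.264 video."""
    subprocess.run([
        "ffmpeg", "-framerate", "60",
        "-i", f"{frames_dir}/frame_%05d.jpg",
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        output_path
    ], check=True)

separate_video("drone-flight.mp4", "frames")
reassemble_video("frames_filtered", "drone-flight-filtered.mp4")
```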

Using Amazon EC2 and S3, it isn't too much work to leverage FFMPEG to break up videos into separate images. Once I did this, I started publishing a JSON representation of each video, along with each individual image, to Github, leveraging version control, forking, and the other benefits of the platform, but for managing a video instead of code. This process has given me a machine-readable, Git-managed representation of each video to work with, manipulate, and evolve independently.
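The actual schema isn't published here, but a Git-managed manifest along these lines is one way to picture that machine-readable representation; every field name below is hypothetical.

```python
import json

# Hypothetical shape of a machine-readable video manifest committed to Github;
# the real schema used for this project isn't published, so this is a guess.
manifest = {
    "video": "drone-flight.mp4",
    "fps": 60,
    "duration_seconds": 120,
    "filter": "far_away",
    "frames": [
        {
            "index": 1,
            "original": "frames/frame_00001.jpg",
            "filtered": "frames_filtered/frame_00001.jpg",
        },
        # ...one entry per frame
    ],
}

with open("drone-flight.json", "w") as manifest_file:
    json.dump(manifest, manifest_file, indent=2)
```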

When you combine the separation of video with Github in this way, along with an API-driven approach to separation, assembly, and individual and overall image manipulation, the possibilities are pretty limitless. After I separate out each video into images, I am then applying textures and filters to each individual image using Algorithmia's Deep Filters, a process that has numerous opportunities, but I think the wider approach holds more potential than just fun filters. This approach is the primary reason I did this work. Playing with the videos, filters, and images was a welcome distraction, but I think the opportunity around breaking up videos into separate images is much bigger than algorithmic rotoscope.


Algorithmic Rotoscope

I had come across Texture Networks: Feed-forward Synthesis of Textures and Stylized Images from Cornell University a while back in my regular monitoring of the API space, so I was pleased to see Algorithmia building on this work with their Deep Filter project. The artistic styles and textures you can apply to images are fun to play with, and even more fun when you apply them at scale using their API. While this was fun, what really caught my attention was their post on their open source AWS AMI for training style transfer models--where you can develop your own image filter.

I fired up an AWS instance, and within 48 hours I had my first image filter. For some reason, I went to bed that night thinking about drone video footage and began wondering if I could apply Algorithmia's filters, as well as the custom filters I create, to drone footage. The next day I started playing with a prototype, to see what was possible when applying this type of image filtering to videos.

My approach has three distinct features:

  • Separate Video Into Images - Using FFMPEG, I created a function that takes any video I give it, separates it into individual images, and writes them all to a folder.
  • Apply Filters to Each Image - After uploading all the images for each video to Algorithmia using their API, I select one of their 30+ image filters or any of the filters I created.
  • Reassemble Video From Images - Once I've applied one of the algorithmic filters to all of the images, I reassemble them back into a video (a rough sketch of the middle step follows this list).
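Here is a rough sketch of that filtering loop, running each extracted frame through DeepFilter with the Algorithmia Python client. As before, the payload field names, the filter name, and the data:// paths are assumptions to be checked against the DeepFilter docs; the separate/reassemble helpers referenced at the end are the FFMPEG sketch from earlier.

```python
import glob
import os
import Algorithmia

client = Algorithmia.client("YOUR_API_KEY")

def filter_frames(frames_dir, filtered_dir, filter_name):
    """Upload each frame, apply a DeepFilter style, and download the result.

    Payload field names follow the DeepFilter docs as I understand them;
    verify them against the algorithm's page before relying on this.
    """
    os.makedirs(filtered_dir, exist_ok=True)
    for frame in sorted(glob.glob(os.path.join(frames_dir, "*.jpg"))):
        name = os.path.basename(frame)
        remote_in = f"data://.my/rotoscope/in/{name}"
        remote_out = f"data://.my/rotoscope/out/{name}"

        client.file(remote_in).putFile(frame)
        client.algo("deeplearning/DeepFilter").pipe({
            "images": [remote_in],
            "savePaths": [remote_out],
            "filterName": filter_name,
        })
        with open(os.path.join(filtered_dir, name), "wb") as out:
            out.write(client.file(remote_out).getBytes())

# Usage, with the FFMPEG helpers from the earlier sketch:
# separate_video("drone-flight.mp4", "frames")
# filter_frames("frames", "frames_filtered", "far_away")
# reassemble_video("frames_filtered", "drone-flight-filtered.mp4")
```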

While the videos I've been able to produce so far are interesting enough, I am intrigued by the process of defining and applying the filters. There are a handful of things going on here for me:

  • Defining The Algorithms Behind Each Image Filter - I'm having a blast with the image filters that Algorithmia provides, but their work got me thinking about how I can train image filters specifically for being applied to video in this way. I'm learning a lot about the training process--something I want to keep working on.
  • Applying Algorithmic Filters To Images At Scale (Videos) - The videos I am working with currently come out to between 500 and 2,000 individual images after I separate the video. I am learning a lot about working with video at the individual frame level, and have a lot more work ahead of me.
  • Changing How I Take Videos Using My Drone - I have been applying this work to 4K video I either took myself or was present when it was taken. As I'm applying these filters I am hyper aware of where the sun was positioned, and how the shadows of rocks, trees, and buildings play in, making me think deeply about how I can design future drone shots.

Algorithmia has enough filters to keep me interested for a while, but this is where I think the real value will exist--assembling well-trained filters. If you can craft exactly the right filter, I think it can be used to define an entirely new virtual landscape derived from the physical world around us. Some of the filters offer a pretty extreme shift in the video landscape, but some of them provide just enough polish to make you think you are in an alternate reality.

I'm not sure where I'm going with this work. I'm finding it oddly soothing to define new filters and apply them to my drone videos from this summer. For now, I'm just thinking of interesting filters to build and applying them to the videos I already have. It also has me thinking about how I can take my drone out to get more shots--from both natural and urban landscapes.

For the near-term future, I'm just getting to know what the filters produce, and how certain drone shots (lighting, color, terrain) respond to each filter. It has me thinking a lot about what would make a good filter, and what types of images I can use to train image filter algorithms that can be applied in this way. If you want to learn more about what I'm doing, you can submit a comment via the Github issues for this project, or you can contact me directly.