AI-generated videos with voice instead of documentation.

Steve Froehlich
Jan 4, 2023


For most of my career I didn’t enjoy reading or writing documentation. It constantly went out of date. Javadocs, Markdown and generators helped, but images and diagrams were a pain. Then PlantUML came along, which changed the game. You could now fully version all docs, text and diagrams with source code! I loved that idea, but there were still problems. For complex systems I had to set up meetings to walk people through the docs or create videos to explain things in more detail. Thanks to improvements in AI, I was able to hack together some Python code that generates a complete video with spoken explanations. And it’s all in code:

  • the components
  • how they interact
  • the vocals (thanks to text-to-speech AI)
  • visual effects

Here is how it works. In this example, let’s explain how a simplified financial exchange architecture could work for the use case of a Bitcoin trade. First, create the components of your system.

trader1 = create_component("Trader1")
trader2 = create_component("Trader2")
gwy = create_component("GWY")
eng = create_component("ENG")
mkd = create_component("MKD")
world = create_component("world")

Next, add the connections between the components:

self.add_connection_right(trader1, gwy)
self.add_connection_right(trader2, gwy)
self.add_connection_below(gwy, eng)
self.add_connection_below(eng, mkd)
self.add_connection_right(world, mkd)
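
The helper functions used above belong to the author's tool, and their implementation is not shown in this post. As a rough sketch of the idea, assuming a component is just a named node and each connection records a direction, the calls could feed a small data model like the one below. The `Diagram` class, its fields and the `describe` helper are hypothetical; only the method names mirror the snippets above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Component:
    name: str


@dataclass
class Diagram:
    # Hypothetical stand-in for the scene object; not the author's implementation.
    components: list = field(default_factory=list)
    connections: list = field(default_factory=list)  # (source, target, direction)

    def create_component(self, name):
        component = Component(name)
        self.components.append(component)
        return component

    def add_connection_right(self, source, target):
        # Assumption: "right" means the target is drawn to the right of the source.
        self.connections.append((source, target, "right"))

    def add_connection_below(self, source, target):
        # Assumption: "below" means the target is drawn underneath the source.
        self.connections.append((source, target, "below"))

    def describe(self):
        # One human-readable line per connection, in insertion order.
        return [f"{s.name} -> {t.name} ({d})" for s, t, d in self.connections]


diagram = Diagram()
trader1 = diagram.create_component("Trader1")
gwy = diagram.create_component("GWY")
eng = diagram.create_component("ENG")
diagram.add_connection_right(trader1, gwy)
diagram.add_connection_below(gwy, eng)

for line in diagram.describe():
    print(line)
```

Keeping the diagram as plain data like this is what makes it easy to version and review alongside the source code; the rendering step can then be swapped out without touching the description of the system.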

After all the components and connections are added, the code renders them as a diagram on screen.

Now we can add the text-to-speech code to do the walkthrough for us:

# the text below defines what the AI will say
with self.voiceover(text="There are two traders using the system") as tracker:
    # the trader1 call is an assumed reconstruction of a truncated line
    self.highlight(talking_pt, trader1, run_time=0.2)
    self.highlight(talking_pt, trader2)

# the text below defines what the AI will say
with self.voiceover(text="There is a gateway that validates messages and routes them to the engine") as tracker:
    self.highlight(talking_pt, gwy, highlight_duration)
    # wait out the rest of the narration before moving on
    self.wait(tracker.duration - highlight_duration)

The code above highlights the trader components and at the same time verbalizes “There are two traders using the system.” The next block highlights the gateway component (GWY) and verbalizes “There is a gateway that validates messages and routes them to the engine.” The rest of the video covers sending messages, describes the processing logic and walks viewers through the details. You can view the full video on YouTube and the video source code on GitHub.
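
The timing pattern in the gateway block (highlight for a fixed time, then wait for whatever is left of the narration) can be tried out without rendering any video. The sketch below is only an illustration: `TimelineScene`, `Tracker` and the words-per-second estimate are hypothetical stand-ins, and `highlight` takes a simplified argument list compared with the snippets above. A real text-to-speech service would report the actual audio length instead of an estimate.

```python
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class Tracker:
    duration: float  # seconds the narration is expected to last


class TimelineScene:
    """Hypothetical stand-in for the video scene; it only records a timeline."""

    WORDS_PER_SECOND = 2.5  # assumed speaking rate, for illustration only

    def __init__(self):
        self.clock = 0.0
        self.events = []

    @contextmanager
    def voiceover(self, text):
        # Estimate the narration length from the word count.
        duration = len(text.split()) / self.WORDS_PER_SECOND
        self.events.append((self.clock, "speak", text))
        yield Tracker(duration)

    def highlight(self, component, duration):
        # Record the highlight and advance the clock by its length.
        self.events.append((self.clock, "highlight", component))
        self.clock += duration

    def wait(self, seconds):
        # Clamp at zero so a short narration never produces a negative wait.
        self.clock += max(0.0, seconds)


scene = TimelineScene()
highlight_duration = 1.0

with scene.voiceover(
    text="There is a gateway that validates messages and routes them to the engine"
) as tracker:
    scene.highlight("GWY", highlight_duration)
    scene.wait(tracker.duration - highlight_duration)

print(scene.events)
print(scene.clock)
```

The clamp in `wait` is a design choice worth considering: if a narration line is shorter than the highlight animation, `tracker.duration - highlight_duration` goes negative, so guarding against that keeps the timeline from breaking on short sentences.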

I just finished this proof of concept and deployed it for beta testing. I’m excited that I won’t have to schedule meetings and walk people through architecture diagrams anymore. I can just code up a video and send the link! To be honest, there will still be a need for traditional documents, but I think tools like this can carry the explanatory burden instead of static images and the written word. I am also interested to see how my team and colleagues can collaborate on coding the videos together. Previously there wasn’t a good way to iterate on a video, but now these videos can move at the speed of code. As they should!

I have a bunch more features already working locally like:

  • add custom images
  • move and animate images
  • add components as images, send and receive messages from them
  • many more animation effects

Long term, I think I could make these as engaging as the good technical creators on YouTube and TikTok, but we will see. For now I’m happy with fewer meetings. Lastly, I’m looking for beta testers, so if you are interested in trying it free of charge, head to the landing page and join the wait list. I will send you your own browser-based instance (Jupyter notebook) to play around with when I deploy it publicly. No installation needed. Hope you found this interesting, and thanks for reading.



Steve Froehlich

I like to speculate about the future and help engineering teams build great software in e-commerce and digital finance.