VideoFlow: A Framework for Building Visual Analysis Pipelines

VideoFlow is an easy-to-use development and application toolkit for efficiently building end-to-end visual analysis pipelines. In most cases, alongside algorithm development, developers have to spend a huge amount of time handling data input and output, optimizing computation efficiency, or even debugging exhausting memory leaks. VideoFlow aims to overcome these challenges by providing a flexible, efficient, extensible, and secure visual analysis framework for both academia and industry. With VideoFlow, developers can focus on improving the algorithms themselves and on constructing a complete visual analysis workflow. VideoFlow has been incubated in smart city innovation practice for more than five years and has been widely used in tens of intelligent visual analysis systems.

Main Features

  • Flexibility. VideoFlow is designed around two abstractions: the stateful Computation Graph and the stateless Resource. A Computation Graph abstracts the visual processing workflow into a stateful directed acyclic graph. Each visual analysis task can have its own computation graph or share a graph with other tasks. This gives developers the flexibility to focus on implementing processing units (graph nodes) and on constructing the overall workflow. A Resource is a stateless computation module shared among computation graphs. The most typical resource is deep learning model inference. Resources decouple the stateless visual processing components from the rest of a complicated visual analysis pipeline, helping developers focus on optimizing these computation-intensive or input/output (IO) intensive implementations.

  • Efficiency. VideoFlow is designed for efficiency at four levels: resource, video, frame, and operator.

    • Resource-level: resources can aggregate the scattered computation requests from computation graph instances into batched processing for better efficiency. For example, VideoFlow dynamically batches the input images from different operators that share the same deep learning model, so that inference runs more efficiently.
    • Video-level: all videos are analyzed in parallel. The nodes of all computation graphs are executed in a shared execution engine.
    • Frame-level: video frames can be processed in parallel for two reasons. First, many operations, such as object detection, do not depend on frame order. Second, when the whole pipeline is complicated, analyzing one frame may take a long time (from milliseconds to seconds). Waiting for the previous frame to finish the whole pipeline is not only unnecessary but also unacceptable for real-time processing. In VideoFlow, waiting happens only when an operation depends on the sequential order of frames, as in object tracking.
    • Operator-level: visual analysis is a multi-branch pipeline in most cases. As the example figure shows, crowd analysis and pedestrian analysis can run in parallel. Inside pedestrian analysis, feature recognition and attribute recognition can also run in parallel.
  • Extensibility. VideoFlow is designed from the beginning to be as modular as possible, allowing easy extension of almost all its components, including video decoding, pixel format conversion, deep model inference, and of course visual processing operators. It can be extended to different hardware devices such as Graphics Processing Units (GPUs), Neural Processing Units (NPUs), and Machine Learning Units (MLUs). It can be hosted on either x86 or ARM platforms. Developers can build their own implementations with VideoFlow as a library dependency and register them with VideoFlow as plugins at runtime. In addition, although VideoFlow was originally designed for visual processing tasks, it can easily be extended to other stream analysis tasks.

  • Security. Model protection is an important problem in industry. VideoFlow addresses it by encoding model files into encrypted binary code that is compiled into the library. The secret key can either be obfuscated inside the same library or exported to a separate key management service (KMS). At runtime, VideoFlow periodically checks authorization with a remote service. If authorization succeeds, it decrypts the model data.
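
The Computation Graph idea under Flexibility can be illustrated with a small sketch. This is not VideoFlow's API: the node names, the `GRAPH` layout, and `run_graph` are hypothetical, and Python is used only for brevity. It shows a workflow expressed as a directed acyclic graph whose nodes run in an order that respects their dependencies, with the crowd and pedestrian branches independent of each other.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical processing units (graph nodes); not VideoFlow's real operators.
def decode(frame_id, inputs):
    return f"frame-{frame_id}"

def crowd(frame_id, inputs):
    return f"crowd({inputs['decode']})"

def pedestrian(frame_id, inputs):
    return f"pedestrian({inputs['decode']})"

def merge(frame_id, inputs):
    return [inputs["crowd"], inputs["pedestrian"]]

# Each node name maps to (function, names of the nodes it depends on).
GRAPH = {
    "decode": (decode, []),
    "crowd": (crowd, ["decode"]),
    "pedestrian": (pedestrian, ["decode"]),
    "merge": (merge, ["crowd", "pedestrian"]),
}

def run_graph(graph, frame_id):
    """Run every node once, in an order that respects the DAG edges."""
    sorter = TopologicalSorter({name: deps for name, (_, deps) in graph.items()})
    results = {}
    for name in sorter.static_order():  # raises CycleError if the graph has a cycle
        func, deps = graph[name]
        results[name] = func(frame_id, {dep: results[dep] for dep in deps})
    return results

if __name__ == "__main__":
    print(run_graph(GRAPH, 7)["merge"])
```

The sketch runs nodes sequentially for clarity. Because "crowd" and "pedestrian" depend only on "decode", a real engine could schedule them concurrently, which is the operator-level parallelism described above.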

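Resource-level dynamic batching can be sketched in the same spirit. The class below is an assumption-laden illustration, not VideoFlow's implementation: `BatchingResource`, `max_batch`, and `max_wait_s` are hypothetical names. Callers submit single items; a worker thread groups whatever arrives within a short window into one batch and runs the batch function once.

```python
import queue
import threading
import time
from concurrent.futures import Future

class BatchingResource:
    """Hypothetical shared resource that batches requests from many callers."""

    def __init__(self, batch_fn, max_batch=4, max_wait_s=0.05):
        self._batch_fn = batch_fn        # maps a list of inputs to a list of outputs
        self._max_batch = max_batch
        self._max_wait_s = max_wait_s
        self._requests = queue.Queue()
        self.batch_sizes = []            # recorded only to make batching observable
        self._worker = threading.Thread(target=self._loop, daemon=True)
        self._worker.start()

    def submit(self, item):
        """Queue one item and return a Future for its result."""
        future = Future()
        self._requests.put((item, future))
        return future

    def close(self):
        """Ask the worker to stop after the requests already queued."""
        self._requests.put(None)
        self._worker.join()

    def _loop(self):
        while True:
            first = self._requests.get()
            if first is None:
                return
            batch, stop = [first], False
            deadline = time.monotonic() + self._max_wait_s
            # Keep collecting until the batch is full or the window closes.
            while len(batch) < self._max_batch:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    nxt = self._requests.get(timeout=remaining)
                except queue.Empty:
                    break
                if nxt is None:
                    stop = True
                    break
                batch.append(nxt)
            self.batch_sizes.append(len(batch))
            try:
                outputs = self._batch_fn([item for item, _ in batch])
                for (_, future), output in zip(batch, outputs):
                    future.set_result(output)
            except Exception as exc:
                for _, future in batch:
                    future.set_exception(exc)
            if stop:
                return

if __name__ == "__main__":
    resource = BatchingResource(lambda xs: [x * 2 for x in xs])
    futures = [resource.submit(i) for i in range(8)]
    print([f.result() for f in futures])
    resource.close()
```

The wait window trades a little latency for larger batches; the right values depend on the model and hardware and are not specified by the text above.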

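Frame-level parallelism can be sketched as follows, again with hypothetical names (`detect`, `Tracker`, `process_video`) rather than VideoFlow's API. Detection, which does not depend on frame order, runs concurrently in a thread pool; the tracker, which does depend on order, consumes results strictly in frame order and is the only step that waits.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

def detect(frame_id):
    """Stand-in for an order-independent operation such as object detection."""
    time.sleep(random.uniform(0, 0.01))  # simulate variable per-frame latency
    return {"frame": frame_id, "boxes": [frame_id % 3]}

class Tracker:
    """Stand-in for an order-dependent operation such as object tracking."""

    def __init__(self):
        self.seen = []

    def update(self, detection):
        self.seen.append(detection["frame"])

def process_video(frame_ids, workers=4):
    tracker = Tracker()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Detections may finish in any order across the worker threads.
        futures = [pool.submit(detect, frame_id) for frame_id in frame_ids]
        # Waiting happens only here, where frame order matters.
        for future in futures:
            tracker.update(future.result())
    return tracker.seen

if __name__ == "__main__":
    print(process_video(range(10)))
```
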
We recommend building VideoFlow from source so that it is compiled for your development environment:

git clone
cd videoflow && mkdir build && cd build
cmake ..
make -j all
make install

Refer to Installation for more detailed instructions on dependencies and build options.

