Luckily, I also provided a link; if you visit it, you will find that it mentions a paper. It might help.
We also have an entire website describing the technology stack built on top of the basic algorithm. That stack allows the algorithm to be extended to entire videos for large-scale data processing and large-scale reasoning. We explain most of the pieces of the puzzle there.