Add a sampler that samples only keyframes (I-Frames) #474
Comments
@NicolasHug and I have also talked about a similar thing. We'll want to do some research to better understand the use-cases people want to cover. The biggest question to me is, should "key_frame_only" be a kind of seek mode, and then we use the samplers as-is? Or should the sampler API itself be aware of the concept of key frames? If the latter, what do we need to expose in the decoder to enable it?
@scotts - I'm interested in something like this as well. Currently, I'm processing long videos using IterableDataset in PyTorch, where each worker handles a chunk of the video, and those chunks always start at an I-frame. To achieve this, I first identify the I-frames, then seek each worker to the I-frame that starts its assigned chunk. Seeking to a keyframe is fast, and from there each worker samples frames in sequential order. Right now, I'm using PyAV for that, but I would like to see how I can achieve the same with torchcodec. I'm sharing this in case it's useful, but I'm also open to hearing if there are alternative ways to parallelize the processing of long videos that I might not be considering.
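For readers following along, the chunking scheme described above can be sketched in plain Python. This is only an illustration, not the poster's actual code: the function name `chunk_by_keyframes` and its parameters are hypothetical, and it assumes the keyframe indices have already been obtained from some decoder (PyAV, torchcodec, or otherwise) and that the first frame of the video is a keyframe.

```python
from typing import List, Sequence, Tuple


def chunk_by_keyframes(
    key_frame_indices: Sequence[int],
    num_frames: int,
    num_workers: int,
) -> List[Tuple[int, int]]:
    """Split a video into contiguous [start, end) frame ranges.

    Every range starts at a keyframe, so each worker can seek straight to
    its start and then decode sequentially. Hypothetical helper: the
    keyframe list is supplied by the caller.
    """
    if not key_frame_indices or num_workers < 1:
        return []
    keys = sorted(set(key_frame_indices))
    # There cannot be more chunks than keyframes.
    n = min(num_workers, len(keys))
    # Pick n chunk starts spread roughly evenly over the keyframe list.
    starts = [keys[(i * len(keys)) // n] for i in range(n)]
    # Each chunk ends where the next one starts; the last ends at num_frames.
    ends = starts[1:] + [num_frames]
    return list(zip(starts, ends))


if __name__ == "__main__":
    for worker_id, (start, end) in enumerate(
        chunk_by_keyframes([0, 30, 60, 90], num_frames=120, num_workers=2)
    ):
        print(f"worker {worker_id}: frames [{start}, {end})")
```

Spreading the starts over the keyframe list, rather than over frame numbers, keeps chunks balanced only when keyframes are roughly evenly spaced; with irregular GOP sizes, picking the keyframe nearest to each ideal frame boundary may balance the work better.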
@mnc537, we recently exposed an internal, undocumented API that returns the key frame indices: VideoDecoder._get_key_frame_indices(). We exposed it as a way to get feedback on how people want to use keyframes to inform our future public API for both the decoder and samplers. Based on your description, I think it fits your current needs. Note that using this function means you must also use the exact seek mode; see our related tutorial for more on that. Regarding other methods for parallelizing the decoding of long videos, it may be more efficient to ask the underlying decoding libraries to use more threads rather than trying to parallelize it yourself. Our Python
Starting clips on keyframes could make for a fast and useful sampler: seeking to a keyframe and decoding it directly can be faster than decoding multiple P-frames to reach an arbitrary sample point.
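One way to picture the idea is a helper that moves a requested clip start back to the nearest preceding keyframe, so that decoding can begin there with no intermediate frames to decode. This is a minimal sketch and not part of any existing sampler API: `snap_to_keyframe` is a hypothetical name, and the keyframe indices are assumed to be given as a sorted list.

```python
from bisect import bisect_right
from typing import Sequence


def snap_to_keyframe(frame_index: int, key_frame_indices: Sequence[int]) -> int:
    """Return the last keyframe at or before frame_index.

    key_frame_indices must be sorted in ascending order (an assumption of
    this sketch).
    """
    # Number of keyframes whose index is <= frame_index.
    pos = bisect_right(key_frame_indices, frame_index)
    if pos == 0:
        raise ValueError("no keyframe at or before the requested frame")
    return key_frame_indices[pos - 1]


if __name__ == "__main__":
    keys = [0, 30, 60, 90]
    # A clip requested at frame 45 would start at keyframe 30 instead.
    print(snap_to_keyframe(45, keys))
```

Snapping backwards rather than to the closest keyframe guarantees the clip still covers the originally requested frame, at the cost of a slightly earlier start.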
Ideas: