GeneratorVideo does not work #4301

Open
BielStela opened this issue Nov 5, 2024 · 2 comments
Labels
Help Wanted 🙏 Contribution task, outside help would be appreciated!

Comments

BielStela commented Nov 5, 2024

Description

VideoDataset using a GeneratorVideo does not work

Context

I'm trying to create a video using GeneratorVideo to see if I can free up some memory. I already did this successfully with SequenceVideo (which works quite well, btw), then refactored the node to use a generator that yields frames, wrapped in a GeneratorVideo.

Steps to Reproduce

# catalog.yml
test_video:
  type: video.VideoDataset
  filepath: data/03_primary/test.mp4

# nodes.py
from collections.abc import Generator

from PIL import Image
from kedro_datasets.video.video_dataset import GeneratorVideo


def make_video() -> GeneratorVideo:
    """Makes a video with three frames: one red, one green and one blue at 1 fps"""
    def frames() -> Generator[Image.Image, None, None]:
        w, h = 256, 256
        red_frame = Image.new("RGB", (w, h), (255, 0, 0))
        green_frame = Image.new("RGB", (w, h), (0, 255, 0))
        blue_frame = Image.new("RGB", (w, h), (0, 0, 255))
        frames = [red_frame, green_frame, blue_frame]
        yield from frames

    return GeneratorVideo(frames(), length=None, fps=1)

# pipeline.py
from kedro.pipeline import Pipeline, pipeline, node

from .nodes import make_video


def create_pipeline(**kwargs) -> Pipeline:
    return pipeline([node(make_video, inputs=None, outputs="test_video")])

Expected Result

A colorful video similar to this one (the embed does not work in the preview; hopefully it does once published):

test.mp4

Actual Result

This error!

kedro.io.core.DatasetError: Failed while saving data to dataset VideoDataset(filepath=<removed>, protocol=file).
'Image' object has no attribute 'fps'

If one changes the node to use a SequenceVideo like so:

from PIL import Image
from kedro_datasets.video.video_dataset import SequenceVideo


def make_video() -> SequenceVideo:
    """Makes a video with three frames: one red, one green and one blue at 1 fps"""
    def frames() -> list:
        w, h = 256, 256
        red_frame = Image.new("RGB", (w, h), (254, 0, 0))
        green_frame = Image.new("RGB", (w, h), (0, 254, 0))
        blue_frame = Image.new("RGB", (w, h), (0, 0, 254))
        frames = [red_frame, green_frame, blue_frame, blue_frame]
        return frames

    return SequenceVideo(frames(), fps=1)

It works well.

Here is my debugging report.
While the pipeline runs, execution reaches kedro.runner._run_node_sequential:528, where the code does

        items = zip(it.cycle(keys), interleave(*streams))

where streams is a list containing my GeneratorVideo, which gets iterated by the interleaving. The problem is that GeneratorVideo is itself an Iterator, so this operation consumes it as a stream of Image.Image objects, and each frame is passed on its own to catalog.save(name, data). VideoDataset then takes control and fails instantly because its input is no longer a GeneratorVideo or a SequenceVideo; it is a single Image, which has no fps attribute.
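
The behaviour described above can be modelled without Kedro. The sketch below is a simplified model, not the runner's actual code: `FakeGeneratorVideo`, `interleave` and `simulate_save_calls` are hypothetical stand-ins, strings take the place of PIL images, and the iterator check is an assumption about how the runner decides that an output should be streamed.

```python
import itertools as it
from collections.abc import Iterator


class FakeGeneratorVideo:
    """Hypothetical stand-in for GeneratorVideo: an iterator that carries fps."""

    def __init__(self, frames, fps):
        self._gen = frames
        self.fps = fps

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._gen)


def interleave(*streams):
    """Simplified stand-in: yield one item from each stream in turn.

    Assumes all streams have the same length, which is enough for this demo.
    """
    return it.chain.from_iterable(zip(*streams))


def simulate_save_calls(outputs):
    """Return the (name, data) pairs that would be handed to catalog.save."""
    items = outputs.items()
    # Assumed behaviour: if every output is an iterator, treat it as a stream
    # of chunks and save each chunk separately.
    if all(isinstance(d, Iterator) for d in outputs.values()):
        keys = list(outputs.keys())
        streams = list(outputs.values())
        items = zip(it.cycle(keys), interleave(*streams))
    return list(items)


# Strings are placeholders for the red, green and blue PIL frames.
video = FakeGeneratorVideo(iter(["red", "green", "blue"]), fps=1)
calls = simulate_save_calls({"test_video": video})
print(calls)
```

In this model the dataset receives three separate saves, one per frame, and none of the saved objects has an `fps` attribute, which matches the shape of the error message above.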

From here I have no idea how this could be fixed, though :_)

Your Environment

  • Kedro version used (pip show kedro or kedro -V): 0.19.9
  • Python version used (python -V): Python 3.11.9
  • Operating system and version: Linux 6.8.0-48-generic 22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Oct 7 11:24:13 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
@merelcht merelcht added the Community Issue/PR opened by the open-source community label Nov 5, 2024
DimedS (Member) commented Nov 11, 2024

Thanks for raising this issue, @BielStela. It appears that there are inconsistencies between how GeneratorVideo handles iteration and the VideoDataset save method. We may need to modify GeneratorVideo to support iteration in a way that aligns with VideoDataset. Would you be interested in proposing a PR to address this?
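
One possible direction, sketched here only as an illustration: a container that is iterable but not an iterator would not be mistaken for a stream by an `isinstance(..., Iterator)` check. `IterableOnlyVideo` and `frame_source` are hypothetical names, not part of kedro-datasets, and whether VideoDataset's save method would accept such an object is untested; that the runner uses this kind of check is also an assumption.

```python
from collections.abc import Iterable, Iterator


class IterableOnlyVideo:
    """Hypothetical video container: iterable, but deliberately not an iterator."""

    def __init__(self, frames, fps):
        self._frames = frames  # any iterable of frames, e.g. a generator
        self.fps = fps

    def __iter__(self):
        # Hand out an iterator over the frames instead of returning self,
        # and do not define __next__ on the container itself.
        return iter(self._frames)


def frame_source():
    # Strings are placeholders for PIL images.
    yield from ["red", "green", "blue"]


video = IterableOnlyVideo(frame_source(), fps=1)
is_iterator = isinstance(video, Iterator)  # no __next__ on the container
is_iterable = isinstance(video, Iterable)  # __iter__ is defined
frames = list(video)
```

The frames are still produced lazily by the underlying generator, so the memory benefit of a generator-based video would be kept; the trade-off is that a generator-backed container like this can only be iterated once.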

@dundermain

Hey @DimedS, let us wait for @BielStela's response. If they are unable to raise a PR, I would like to work on this issue, if that is okay with both of you.

@merelcht merelcht added Help Wanted 🙏 Contribution task, outside help would be appreciated! and removed Community Issue/PR opened by the open-source community labels Nov 25, 2024