Comments (20)

LinB203 avatar LinB203 commented on August 16, 2024 1

I checked the program that was uploading data in the background, and it had been interrupted, possibly by some unknown error. I'm trying to fix it. Maybe we should upload zip files instead of individual video files.

from open-sora-plan.

quantumiracle avatar quantumiracle commented on August 16, 2024

Another error I got from t2v training is:

/opensora/dataset/t2v_datasets.py", line 76, in get_video
    frame_idx = self.vid_cap_list[idx]['frame_idx']
KeyError: 'frame_idx'

where frame_idx does not exist in the json file.

LinB203 avatar LinB203 commented on August 16, 2024

Hi,

When launching t2v training, it also requires specifying an image data path, as here. However, there is no image-text dataset in the HuggingFace dataset repo, which leads to an error when launching training:

FileNotFoundError: [Errno 2] No such file or directory: '/dxyl_data02/anno_jsons/human_images_162094.json'

How to fix this?

https://huggingface.co/datasets/LanguageBind/Open-Sora-Plan-v1.1.0/blob/main/anno_jsons/human_images_162094.json

LinB203 avatar LinB203 commented on August 16, 2024

Another error I got from t2v training is:

/opensora/dataset/t2v_datasets.py", line 76, in get_video
    frame_idx = self.vid_cap_list[idx]['frame_idx']
KeyError: 'frame_idx'

where frame_idx does not exist in the json file.

Are you using the v1.1 code? The v1.1 code should use the annotations from here.
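A quick way to tell whether an annotation file matches what the v1.1 loader expects is to count the entries that lack `frame_idx` before starting training. This is only a minimal sketch: it assumes the annotation file is a JSON list of dicts (which is what the `self.vid_cap_list[idx]['frame_idx']` lookup in the traceback suggests), and the helper names are hypothetical, not part of the repo.

```python
import json
from collections import Counter


def missing_key_report(entries, required_keys=("frame_idx",)):
    """Count how many annotation entries lack each required key."""
    missing = Counter()
    for entry in entries:
        for key in required_keys:
            if key not in entry:
                missing[key] += 1
    return dict(missing)


def check_annotation_file(json_path, required_keys=("frame_idx",)):
    """Load a JSON annotation file (assumed: a list of dicts) and report missing keys."""
    with open(json_path, "r", encoding="utf-8") as f:
        entries = json.load(f)
    return missing_key_report(entries, required_keys)
```

An empty dict from `check_annotation_file(path)` means every entry has the key. A non-empty result, for example every entry missing `frame_idx`, would point to an annotation file from a different release than the code.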

quantumiracle avatar quantumiracle commented on August 16, 2024

Thanks for the quick reply.

It seems I'm using the v1.0 dataset.

quantumiracle avatar quantumiracle commented on August 16, 2024

Hi,

When I try to download the v1.1 dataset with:

from huggingface_hub import snapshot_download
snapshot_download(repo_id="LanguageBind/Open-Sora-Plan-v1.1.0", repo_type="dataset", local_dir=data_dir)

I got this error:

...
Fetching 117685 files:  12%|████████████▉                                                                                               | 14039/117685 [14:29<1:46:57, 16.15it/s]
4173980_resize1080p.mp4: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4.55M/4.55M [00:00<00:00, 247MB/s]
4173972_resize1080p.mp4: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5.49M/5.49M [00:00<00:00, 40.7MB/s]
4173976_resize1080p.mp4: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8.94M/8.94M [00:00<00:00, 31.9MB/s]
4173975_resize1080p.mp4: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10.7M/10.7M [00:00<00:00, 45.5MB/s]
4173977_resize1080p.mp4: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12.3M/12.3M [00:00<00:00, 66.7MB/s]
4173973_resize1080p.mp4: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 26.1M/26.1M [00:00<00:00, 83.2MB/s]
4173981_resize1080p.mp4: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18.8M/18.8M [00:00<00:00, 126MB/s]
4173982_resize1080p.mp4: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 24.6M/24.6M [00:00<00:00, 298MB/s]
Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
    response.raise_for_status()
    raise HfHubHTTPError(message, response=response) from e
huggingface_hub.utils._errors.HfHubHTTPError:

403 Forbidden: None.
Cannot access content at: https://cdn-lfs-us-1.huggingface.co/repos/d1/a4/d1a47faaa1475f32c7e503cebcd6029bdf94c4a148ceb23e2f5e052d50d3f02a/dc4d652445209b5ad6ad292bc6755cc067abf187c739ee2bf8e8b75b3b2a9d90?response-content-disposition=inline%3B+filename*%3DUTF-8%27%274173971_resize1080p.mp4%3B+filename%3D%224173971_resize1080p.mp4%22%3B&response-content-type=video%2Fmp4&Expires=1718648001&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcxODY0ODAwMX19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmh1Z2dpbmdmYWNlLmNvL3JlcG9zL2QxL2E0L2QxYTQ3ZmFhYTE0NzVmMzJjN2U1MDNjZWJjZDYwMjliZGY5NGM0YTE0OGNlYjIzZTJmNWUwNTJkNTBkM2YwMmEvZGM0ZDY1MjQ0NTIwOWI1YWQ2YWQyOTJiYzY3NTVjYzA2N2FiZjE4N2M3MzllZTJiZjhlOGI3NWIzYjJhOWQ5MD9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoifV19&Signature=dnYc6UEjLcqCarh~mWlzSybLf505FdK8ClHTvIKnnY4Pc2nMLnsp5fxAUSLz3u24xOSQoykAxOG2h2kKCgMG-yKe4bUGRkLrLNJwn75Xl1C5L2iza3-wE6LlnDAre6Ju81QWolv1Wy6fIK0OHWJVMhIHquUqKyMHiaOXl7CLktLQg0POb-wga8HB9HFLDdsUm~1a2uH2mSAOcdAQz9teTMOJ4HCIOfwuuPaJiYK0g0NPeiddWMP4U~8R3cgghVLzq67YFrmmdcpT6Rv-K1F4LE4nLIo9LwQmATHzbI2y1Xgmzs9wFN4U7aGJ6Hq7avfaFplKLOK7nvV-enaJ-t0EOA__&Key-Pair-Id=K2FPYV99P2N66Q.
If you are trying to create or update content,make sure you have a token with the `write` role.

LinB203 avatar LinB203 commented on August 16, 2024

It seems to be a network error? Btw, the full pexel dataset has not finished uploading yet.

quantumiracle avatar quantumiracle commented on August 16, 2024

Hi,

I think this is an access issue rather than a network problem, since it reports:

If you are trying to create or update content,make sure you have a token with the `write` role.

I tried both with snapshot_download and git clone directly, and both give this error. Any idea on why this happens?
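One thing that might be worth ruling out (an assumption, not a confirmed diagnosis): the failing URL in the traceback carries an `Expires=1718648001` query parameter, so the signed CDN link has a fixed lifetime. The sketch below only decodes that parameter and compares it with the current time. The function names are hypothetical, and it uses nothing beyond the Python standard library.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlparse


def signed_url_expiry(url):
    """Return the `Expires` query parameter as a UTC datetime, or None if absent."""
    values = parse_qs(urlparse(url).query).get("Expires")
    if not values:
        return None
    return datetime.fromtimestamp(int(values[0]), tz=timezone.utc)


def is_expired(url, now=None):
    """True if the URL carries an `Expires` timestamp that is not after `now`."""
    expiry = signed_url_expiry(url)
    if expiry is None:
        return False
    now = now or datetime.now(timezone.utc)
    return now >= expiry
```

Comparing `signed_url_expiry(failing_url)` with the time the 403 was raised would show whether the link had already expired when the file was requested. If it had not, the cause lies elsewhere.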

quantumiracle avatar quantumiracle commented on August 16, 2024

Yes, compressed tar.gz files would be good.

It may also be good to host each dataset under a separate URL and provide a download script. Downloading the entire dataset and getting interrupted partway through costs a lot of time.
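Until something like that exists, a rough client-side workaround could be to fetch one subfolder at a time and retry on failure. This is a sketch, not an official script: `with_retries` and `download_subset` are hypothetical helpers, while `allow_patterns` is an existing `snapshot_download` argument that restricts which files are fetched. The `anno_jsons/*` pattern in the usage note is taken from the folder visible in the repo URL earlier in this thread.

```python
import time


def with_retries(fn, attempts=5, delay_s=10.0):
    """Call fn() until it succeeds or attempts run out; re-raise the last error."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as err:  # broad on purpose: this is only a sketch
            if attempt == attempts:
                raise
            print(f"attempt {attempt} failed ({err!r}); retrying in {delay_s}s")
            time.sleep(delay_s)


def download_subset(local_dir, patterns):
    """Download only the files matching `patterns` from the v1.1 dataset repo."""
    # Imported here so the retry helper can be used without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="LanguageBind/Open-Sora-Plan-v1.1.0",
        repo_type="dataset",
        local_dir=local_dir,
        allow_patterns=patterns,
    )
```

For example, `with_retries(lambda: download_subset("./data", ["anno_jsons/*"]))` would fetch only the annotation files. As far as I know, re-running `snapshot_download` skips files that are already complete, so a retry should not start from zero.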

physercoe avatar physercoe commented on August 16, 2024

Hi,

I think this is an access issue rather than a network problem, since it reports:

If you are trying to create or update content,make sure you have a token with the `write` role.

I tried both with snapshot_download and git clone directly, and both give this error. Any idea on why this happens?

I ran into the same problem. Please provide compressed tar.gz files instead of a lot of separate small files.

quantumiracle avatar quantumiracle commented on August 16, 2024

@LinB203 Hi, when will the dataset be ready? I could help with curating the data if you need.

LinB203 avatar LinB203 commented on August 16, 2024

Hi all, because of the pexel data upload failure, we decided to package the data and upload it again. This process will take about a week.
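For anyone who wants to repackage a local copy in the meantime, grouping many small video files into a few archives could look roughly like the sketch below. It is not the maintainers' actual packaging script: the function name, shard naming, and shard size are all assumptions made for illustration.

```python
import tarfile
from pathlib import Path


def pack_into_shards(src_dir, out_dir, files_per_shard=1000, suffix=".mp4"):
    """Pack files under src_dir into numbered .tar.gz shards; return the shard paths."""
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Sort so the shard contents are the same on every run.
    files = sorted(p for p in src_dir.rglob(f"*{suffix}") if p.is_file())
    shards = []
    for start in range(0, len(files), files_per_shard):
        shard_path = out_dir / f"shard_{start // files_per_shard:05d}.tar.gz"
        with tarfile.open(shard_path, "w:gz") as tar:
            for path in files[start:start + files_per_shard]:
                # Store paths relative to src_dir so archives extract cleanly.
                tar.add(path, arcname=path.relative_to(src_dir).as_posix())
        shards.append(shard_path)
    return shards
```

Since mp4 is already compressed, gzip saves little space here. The main gain is fewer, larger files to transfer, so plain `"w"` (uncompressed tar) would be a reasonable alternative.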
