
Comments (5)

SA-j00u commented on June 10, 2024

I checked it on a dataset with missing files,
and it looks like it doesn't work.
The random shuffling works,
but the starting random seed is the same on every run (usually it is initialized from the current time),
so on every --auto_resume it processes the same files!

I removed the 1st file,
and over several runs it "crashed" on the 6th iteration.
Then I saved at iteration 2,
and over several resumes it "crashed" on the 8th iteration.

from real-esrgan.

SA-j00u commented on June 10, 2024

I added print(filepath) to def get(self, filepath): and def get_text(self, filepath):
in basicsr\utils\file_client.py.

Output of 3 runs, including 1 resume, side by side:

HQ\0004.png     HQ\0004.png     HQ\0004.png
LQ\0004.png     LQ\0004.png     LQ\0004.png
HQ\0001.png     HQ\0001.png     HQ\0001.png
LQ\0001.png     LQ\0001.png     LQ\0001.png
HQ\0007.png     HQ\0007.png     HQ\0007.png
LQ\0007.png     LQ\0007.png     LQ\0007.png
iter:       3   iter:       3   iter:      11
HQ\0005.png     HQ\0005.png     HQ\0005.png
LQ\0005.png     LQ\0005.png     LQ\0005.png
iter:       4   iter:       4   iter:      12
HQ\0003.png     HQ\0003.png     HQ\0003.png
LQ\0003.png     LQ\0003.png     LQ\0003.png
iter:       5   iter:       5   iter:      13
                INFO: Saving models and training states.
HQ\0009.png     HQ\0009.png     HQ\0009.png
LQ\0009.png     LQ\0009.png     LQ\0009.png
iter:       6   iter:       6   iter:      14
HQ\0000.png     HQ\0000.png     HQ\0000.png
LQ\0000.png     LQ\0000.png     LQ\0000.png
iter:       7   iter:       7   iter:      15
                                INFO: Saving models and training states.
HQ\0008.png     HQ\0008.png     HQ\0008.png
LQ\0008.png     LQ\0008.png     LQ\0008.png
iter:       8   iter:       8   iter:      16
HQ\0006.png     HQ\0006.png     HQ\0006.png
LQ\0006.png     LQ\0006.png     LQ\0006.png
iter:       9   iter:       9   iter:      17
HQ\0002.png     HQ\0002.png     HQ\0002.png
LQ\0002.png     LQ\0002.png     LQ\0002.png
iter:      10   iter:      10
                INFO: Saving models and training states.
                iter:      11
                iter:      12


SA-j00u commented on June 10, 2024

There are only 2 places in basicsr that call random.seed(seed),
and the seed itself is picked with random 🤦‍♂️

    # random seed
    seed = opt.get('manual_seed')
    if seed is None:
        seed = random.randint(1, 10000)
        opt['manual_seed'] = seed
    set_random_seed(seed + opt['rank'])

    def set_random_seed(seed):
        """Set random seeds."""
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)
        torch.cuda.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)

    def worker_init_fn(worker_id, num_workers, rank, seed):
        # Set the worker seed to num_workers * rank + worker_id + seed
        worker_seed = num_workers * rank + worker_id + seed
        np.random.seed(worker_seed)
        random.seed(worker_seed)
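
To make the worker seed formula concrete, here is a small worked example in plain Python. The values for num_workers and base_seed are made up for illustration, and the helper name worker_seed_for is hypothetical; only the arithmetic is taken from worker_init_fn.

```python
def worker_seed_for(worker_id, num_workers, rank, seed):
    # Same arithmetic as in worker_init_fn: num_workers * rank + worker_id + seed
    return num_workers * rank + worker_id + seed

num_workers = 4   # assumed number of dataloader workers per process
base_seed = 100   # assumed base seed, chosen only for this example

for rank in range(2):
    seeds = [worker_seed_for(w, num_workers, rank, base_seed) for w in range(num_workers)]
    print(f"rank {rank}: {seeds}")
# rank 0: [100, 101, 102, 103]
# rank 1: [104, 105, 106, 107]
```

Each worker on each rank gets its own seed, but every one of them is a fixed offset from the base seed, so the whole set is determined by that single value.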


SA-j00u commented on June 10, 2024

I tried putting the random.seed initialization in different places,
but I can't change the file order at all...
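
One possible explanation, as a minimal sketch: if the file order comes from a dedicated generator that has its own fixed seed, then reseeding the global random module cannot affect it. The function shuffled_order below is hypothetical, and random.Random is only a stdlib stand-in for whatever generator the sampler really uses.

```python
import random

def shuffled_order(num_files, fixed_seed):
    # Private generator: its state is independent of the global random module.
    rng = random.Random(fixed_seed)
    order = list(range(num_files))
    rng.shuffle(order)
    return order

random.seed(1)                     # reseed the global generator
order_a = shuffled_order(10, fixed_seed=0)

random.seed(9999)                  # reseed it again with a different value
order_b = shuffled_order(10, fixed_seed=0)

print(order_a == order_b)  # True: the global seed never reaches the private generator
```

If this is what happens, the only way to change the order is to change the value passed to that private generator.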


SA-j00u commented on June 10, 2024

Fix?
In basicsr\data\data_sampler.py:

    def __iter__(self):
        # deterministically shuffle based on epoch
        g = torch.Generator()
        # EPIC FAIL
        # g.manual_seed(self.epoch)
        import random
        random.seed(a=None, version=2)
        g.manual_seed(random.randint(1, 2147483647))
        indices = torch.randperm(self.total_size, generator=g).tolist()

So random clip may not work correctly either.

So I spent 10 days
training on the same first files...
(and got strange results)
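
A note on the patch above: random.seed(a=None, version=2) reseeds the global random module from system entropy, so it also throws away any manual_seed set earlier, and the order is no longer reproducible. A possible alternative, sketched here with the stdlib only, is to seed the shuffle from a base seed plus the epoch. The class SeededEpochSampler and its seed argument are hypothetical and not part of basicsr; with torch the same idea would be g.manual_seed(seed + epoch).

```python
import random

class SeededEpochSampler:
    """Hypothetical sampler whose order depends on a base seed and the epoch."""

    def __init__(self, num_samples, seed=0):
        self.num_samples = num_samples
        self.seed = seed      # assumed to come from the run's manual seed
        self.epoch = 0

    def set_epoch(self, epoch):
        self.epoch = epoch

    def __iter__(self):
        # Reproducible for a given (seed, epoch), different when either changes
        # (note that only the sum matters in this simple sketch).
        rng = random.Random(self.seed + self.epoch)
        indices = list(range(self.num_samples))
        rng.shuffle(indices)
        return iter(indices)

    def __len__(self):
        return self.num_samples

sampler = SeededEpochSampler(num_samples=100, seed=1234)
same_again = SeededEpochSampler(num_samples=100, seed=1234)
other_seed = SeededEpochSampler(num_samples=100, seed=4321)

print(list(sampler) == list(same_again))   # True: same seed and epoch, same order
print(list(sampler) == list(other_seed))   # a different seed gives a different order
```

This does not by itself solve resuming in the middle of an epoch: the training loop would still need to skip the samples already consumed, or use a different seed per resume.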

