pytorch 中 DataLoader 的问题

Travis2022/01/25
275 字, 阅读需要 2 分钟

复现问题

RuntimeError: DataLoader worker (pid (s) 34132) exited unexpectedly,但是如果将 num_workers 设置为 0,则不会出现这个问题。只用单线程跑的话不会出问题。

import torch
import numpy as np
from torch.utils.data import Dataset, DataLoader


class WineDataset (Dataset):
    def __init__(self):
        # data loading
        xy = np.loadtxt (
            "./pytorchTutorial/data/wine/wine.csv", delimiter=",", dtype=np.float32, skiprows=1,
        )
        self.x = torch.from_numpy (xy [:, 1:])
        self.y = torch.from_numpy (xy [:, [0]])
        self.n_samples = xy.shape [0]  # number of samples

    def __getitem__(self, index):
        return self.x [index], self.y [index]

    def __len__(self):
        return self.n_samples



dataset = WineDataset ()
# first_data = dataset [0]
# features, labels = first_data
dataloader = DataLoader (dataset=dataset, batch_size=4, shuffle=True, num_workers=2)
dataiter = iter (dataloader)
data = dataiter.next ()
features, labels = data
print (features, labels)

解决

发现有很多人都有这个问题,Runtime Error with DataLoader: exited unexpectedly。但是貌似还没有人解释为什么这个问题会出现。

I had a similar error with my datasets. The problem was that I had incorrect dimensions at some point which made some of the tensors become huge, so they were overfilling the memory.

Correcting the dimensions solved this problem.

解决方法很简单,把迭代部分放在 __main__ 中就可以了

if __name__ == "__main__":
    dataset = WineDataset ()
    dataloader = DataLoader (dataset=dataset, batch_size=4, shuffle=True, num_workers=2)
    dataiter = iter (dataloader)
    data = dataiter.next ()
    features, labels = data
© LICENSED UNDER CC BY-NC-SA 4.0