Tutorial 2: Train MoCo on CIFAR-10

In this tutorial, we will train a model based on the MoCo paper, "Momentum Contrast for Unsupervised Visual Representation Learning".

When training self-supervised models with a contrastive loss, we usually face one big problem: the loss needs many negative examples to work well, which normally means a large batch size. However, not everyone has access to a cluster full of GPUs or TPUs. To solve this problem, alternative approaches have been developed. Some of them use a memory bank that stores old negative examples, which we can query to compensate for the smaller batch size. MoCo takes this approach one step further by including a momentum encoder.
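
To make these two ideas concrete, here is a minimal sketch in plain Python (no torch, so it runs anywhere). It is not lightly's implementation: `momentum_update`, the toy weight lists, and the string keys are hypothetical stand-ins. The momentum encoder's weights follow the trained encoder as an exponential moving average, and the memory bank behaves like a fixed-size first-in, first-out queue.

```python
from collections import deque


def momentum_update(query_weights, key_weights, m=0.99):
    """Return new key weights: m * key + (1 - m) * query, element-wise."""
    return [m * k + (1.0 - m) * q for q, k in zip(query_weights, key_weights)]


# Toy "encoders" represented by flat lists of weights (illustrative values).
query_weights = [1.0, 2.0, 3.0]
key_weights = [0.0, 0.0, 0.0]

# One update step moves the key weights 1% of the way towards the query weights.
key_weights = momentum_update(query_weights, key_weights, m=0.99)

# Memory bank as a FIFO queue: once it is full, the oldest keys are dropped.
memory_bank = deque(maxlen=4)
for step in range(3):
    batch_keys = [f"key_{step}_{i}" for i in range(2)]
    memory_bank.extend(batch_keys)

print(key_weights)
print(list(memory_bank))
```

With m close to 1, the key encoder changes slowly, which keeps the keys stored in the memory bank consistent with each other even though they were computed at different training steps.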

We use the CIFAR-10 dataset for this tutorial.

In this tutorial you will learn:

  • How to use lightly to load a dataset and train a model

  • How to create a MoCo model with a memory bank

  • How to use the pre-trained model after self-supervised learning for a transfer learning task

Imports

Import the Python frameworks we need for this tutorial. Make sure you have lightly installed.

pip install lightly

import torch
import torch.nn as nn
import torchvision
import pytorch_lightning as pl
import lightly

Configuration

We set some configuration parameters for our experiment. Feel free to change them and analyze the effect.

The default configuration uses a batch size of 512, which requires around 6.4 GB of GPU memory. When training for 100 epochs, you should achieve around 73% test set accuracy. When training for 200 epochs, accuracy increases to about 80%.

num_workers = 8
batch_size = 512
memory_bank_size = 4096
seed = 1
max_epochs = 100

Replace the path with the location of your CIFAR-10 dataset. We assume we have a train folder with subfolders for each class and .png images inside.

You can download CIFAR-10 in folders from Kaggle.

# The dataset structure should be like this:
# cifar10/train/
#  L airplane/
#    L 10008_airplane.png
#    L ...
#  L automobile/
#  L bird/
#  L cat/
#  L deer/
#  L dog/
#  L frog/
#  L horse/
#  L ship/
#  L truck/
path_to_train = '/datasets/cifar10/train/'
path_to_test = '/datasets/cifar10/test/'
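
Before training, it can help to verify that the folder layout matches what we expect. The helper below is a small sketch using only the standard library. `count_images_per_class` is a hypothetical name, not part of lightly, and it assumes the class-per-subfolder layout with .png files described above.

```python
from pathlib import Path


def count_images_per_class(root):
    """Return {class_name: number of .png files} for each subfolder of root."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = len(list(class_dir.glob("*.png")))
    return counts


# Only run the check if the folder exists on this machine.
train_dir = Path("/datasets/cifar10/train/")
if train_dir.is_dir():
    for class_name, n_images in count_images_per_class(train_dir).items():
        print(f"{class_name}: {n_images} images")
```

A missing class folder or a class with zero images usually points to a wrong path or an incomplete download.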

Let’s set the seed to ensure reproducibility of the experiments.

pl.seed_everything(seed)

Out:

Global seed set to 1

1

Setup data augmentations and loaders

We start with our data preprocessing pipeline. We can implement the augmentations from the MoCo paper using the collate functions provided by lightly. For MoCo v2, we can use the same augmentations as SimCLR but override the input size and blur. Images from the CIFAR-10 dataset have a resolution of 32x32 pixels, so we use this resolution to train our model.

Note

We could use a higher input resolution to train our model. However, since the original resolution of CIFAR-10 images is low there is no real value in increasing the resolution. A higher resolution results in higher memory consumption and to compensate for that we would need to reduce the batch size.

# MoCo v2 uses the SimCLR augmentations; here we additionally disable the Gaussian blur
collate_fn = lightly.data.SimCLRCollateFunction(
    input_size=32,
    gaussian_blur=0.,
)
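
The collate function turns each batch of images into two independently augmented views, which is why the training step later unpacks a batch as (x0, x1), labels, filenames. The sketch below mimics that output structure in plain Python. It is a toy stand-in, not lightly's code: `toy_augment` and `two_view_collate` are hypothetical names, and the only augmentation is a random horizontal flip on nested lists.

```python
import random


def toy_augment(image, rng):
    """Hypothetical augmentation: flip a 2D list horizontally with probability 0.5."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return [row[:] for row in image]


def two_view_collate(samples, rng):
    """Build two augmented views per image from (image, label, filename) samples."""
    views0 = [toy_augment(image, rng) for image, _, _ in samples]
    views1 = [toy_augment(image, rng) for image, _, _ in samples]
    labels = [label for _, label, _ in samples]
    fnames = [fname for _, _, fname in samples]
    return (views0, views1), labels, fnames


rng = random.Random(0)
samples = [
    ([[1, 2], [3, 4]], 0, "a.png"),
    ([[5, 6], [7, 8]], 1, "b.png"),
]
(x0, x1), labels, fnames = two_view_collate(samples, rng)
print(len(x0), len(x1), labels, fnames)
```

The two views of the same image form the positive pair for the contrastive loss. The labels are carried along but are not used during self-supervised training.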

We don’t want any augmentation for our test data. Therefore, we create custom, torchvision-based data transformations. We make sure the size is correct and normalize the data in the same way as the training data.

# Augmentations typically used to train on cifar-10
train_classifier_transforms = torchvision.transforms.Compose([
    torchvision.transforms.RandomCrop(32, padding=4),
    torchvision.transforms.RandomHorizontalFlip(),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(
        mean=lightly.data.collate.imagenet_normalize['mean'],
        std=lightly.data.collate.imagenet_normalize['std'],
    )
])

# No additional augmentations for the test set
test_transforms = torchvision.transforms.Compose([
    torchvision.transforms.Resize((32, 32)),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(
        mean=lightly.data.collate.imagenet_normalize['mean'],
        std=lightly.data.collate.imagenet_normalize['std'],
    )
])

# No transform here: the MoCo augmentations are applied by the collate function
dataset_train_moco = lightly.data.LightlyDataset(
    input_dir=path_to_train
)

# Since we also train a linear classifier on the pre-trained MoCo model, we
# use light augmentations for its training set. (MoCo augmentations are very
# strong and usually reduce the accuracy of models that are not trained with
# contrastive learning. Our linear layer will be trained using cross entropy
# loss and labels provided by the dataset.)
dataset_train_classifier = lightly.data.LightlyDataset(
    input_dir=path_to_train,
    transform=train_classifier_transforms
)

dataset_test = lightly.data.LightlyDataset(
    input_dir=path_to_test,
    transform=test_transforms
)
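
Normalize subtracts a per-channel mean and divides by a per-channel standard deviation. The sketch below works this out for a single RGB pixel. It assumes that `imagenet_normalize` holds the commonly used ImageNet statistics (mean 0.485, 0.456, 0.406 and std 0.229, 0.224, 0.225), so check the values in your installed version. `normalize_pixel` is a hypothetical helper.

```python
# Assumed values: the widely used ImageNet channel statistics.
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Apply (value - mean) / std to each channel of one pixel in [0, 1]."""
    return [(v - m) / s for v, m, s in zip(rgb, mean, std)]


# A mid-gray pixel after ToTensor() has the value 0.5 in every channel.
mid_gray = normalize_pixel([0.5, 0.5, 0.5])
print(mid_gray)
```

Using the same statistics for the training and test transforms matters: the linear classifier only sees features computed from inputs on this scale.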

Create the dataloaders to load and preprocess the data in the background.

dataloader_train_moco = torch.utils.data.DataLoader(
    dataset_train_moco,
    batch_size=batch_size,
    shuffle=True,
    collate_fn=collate_fn,
    drop_last=True,
    num_workers=num_workers
)

dataloader_train_classifier = torch.utils.data.DataLoader(
    dataset_train_classifier,
    batch_size=batch_size,
    shuffle=True,
    drop_last=True,
    num_workers=num_workers
)

dataloader_test = torch.utils.data.DataLoader(
    dataset_test,
    batch_size=batch_size,
    shuffle=False,
    drop_last=False,
    num_workers=num_workers
)
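
The drop_last flag changes how many steps one epoch has. The sketch below works through the arithmetic, assuming the standard CIFAR-10 split of 50,000 training and 10,000 test images. `steps_per_epoch` is a hypothetical helper, not a PyTorch function.

```python
import math


def steps_per_epoch(num_samples, batch_size, drop_last):
    """Number of batches a DataLoader yields per epoch."""
    if drop_last:
        # The last, incomplete batch is discarded.
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


train_steps = steps_per_epoch(50_000, 512, drop_last=True)
test_steps = steps_per_epoch(10_000, 512, drop_last=False)
print(train_steps, test_steps)  # 97 20
```

The 97 training steps match the step count shown in the progress bar during training. Dropping the last training batch keeps every batch at the full size of 512. For the test set, every image should be evaluated, so the incomplete batch is kept.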

Create the MoCo Lightning Module

Now we create our MoCo model. We use PyTorch Lightning to train it and follow the specification of the Lightning module. In this example, we set the number of features for the hidden dimension to 512. The momentum for the momentum encoder is set to 0.99 (default is 0.999) since other reports show that this works better for CIFAR-10.

For the backbone, we use the lightly variant of a ResNet-18. To use a different model, follow our playground on custom backbones.

Note

We use a split batch norm to simulate multi-GPU behaviour. Combined with batch shuffling, this prevents the model from communicating through the batch norm layers.
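
As a rough illustration of the two tricks, the plain-Python sketch below shuffles a batch before it would pass through the momentum encoder and restores the original order afterwards. It also splits a batch into groups, the way a split batch norm would normalize each group separately. All function names are hypothetical and this is not lightly's implementation.

```python
import random


def shuffle_batch(batch, rng):
    """Return the batch in random order together with the permutation used."""
    perm = list(range(len(batch)))
    rng.shuffle(perm)
    return [batch[i] for i in perm], perm


def unshuffle_batch(shuffled, perm):
    """Undo shuffle_batch so that outputs line up with the original order."""
    restored = [None] * len(shuffled)
    for position, source_index in enumerate(perm):
        restored[source_index] = shuffled[position]
    return restored


def split_into_groups(batch, num_splits):
    """Split a batch into equally sized groups (assumes an even division)."""
    group_size = len(batch) // num_splits
    return [batch[i * group_size:(i + 1) * group_size] for i in range(num_splits)]


rng = random.Random(0)
batch = list(range(16))
shuffled, perm = shuffle_batch(batch, rng)
restored = unshuffle_batch(shuffled, perm)
groups = split_into_groups(shuffled, num_splits=8)
print(restored == batch, [len(g) for g in groups])
```

With a batch size of 512 and 8 splits, each group would contain 64 samples. Because the key batch is shuffled, the two views of an image generally end up in different groups, so batch statistics cannot leak which key belongs to which query.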

class MocoModel(pl.LightningModule):
    def __init__(self):
        super().__init__()

        # create a ResNet backbone and remove the classification head
        resnet = lightly.models.ResNetGenerator('resnet-18', 1, num_splits=8)
        backbone = nn.Sequential(
            *list(resnet.children())[:-1],
            nn.AdaptiveAvgPool2d(1),
        )

        # create a moco based on ResNet
        self.resnet_moco = \
            lightly.models.MoCo(backbone, num_ftrs=512, m=0.99, batch_shuffle=True)

        # create our loss with the optional memory bank
        self.criterion = lightly.loss.NTXentLoss(
            temperature=0.1,
            memory_bank_size=memory_bank_size)

    def forward(self, x):
        return self.resnet_moco(x)

    # We provide a helper method to log weights in tensorboard
    # which is useful for debugging.
    def custom_histogram_weights(self):
        for name, params in self.named_parameters():
            self.logger.experiment.add_histogram(
                name, params, self.current_epoch)

    def training_step(self, batch, batch_idx):
        (x0, x1), _, _ = batch
        y0, y1 = self.resnet_moco(x0, x1)
        loss = self.criterion(y0, y1)
        self.log('train_loss_ssl', loss)
        return loss

    def training_epoch_end(self, outputs):
        self.custom_histogram_weights()


    def configure_optimizers(self):
        optim = torch.optim.SGD(self.resnet_moco.parameters(), lr=6e-2,
                                momentum=0.9, weight_decay=5e-4)
        scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optim, max_epochs)
        return [optim], [scheduler]
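
To build some intuition for the loss values, the sketch below computes a contrastive (InfoNCE-style) loss for one query in plain Python: a cross entropy over the similarity to the positive key and the similarities to a set of negative keys. It assumes L2-normalized embeddings and that the negatives come only from the memory bank. `info_nce` is a hypothetical helper, not lightly's NTXentLoss.

```python
import math


def info_nce(query, positive_key, negative_keys, temperature=0.1):
    """Cross entropy over [positive, negatives] similarity logits for one query."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    logits = [dot(query, positive_key) / temperature]
    logits += [dot(query, key) / temperature for key in negative_keys]

    # Log-sum-exp with the maximum subtracted for numerical stability.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))

    # The positive key sits at index 0, so the loss is -log softmax(logits)[0].
    return log_sum_exp - logits[0]


# A query that matches its positive and differs from the negatives: low loss.
low_loss = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])

# Chance level with one positive and 4096 negatives from the memory bank.
chance_level = math.log(1 + 4096)
print(low_loss, chance_level)
```

Under these assumptions, a model that cannot tell the positive from 4096 negatives scores ln(4097) ≈ 8.32, so training losses below that value indicate that the model has started to separate positives from negatives.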

Create the Classifier Lightning Module

We create a linear classifier on top of the features extracted by MoCo and train it on the dataset.

class Classifier(pl.LightningModule):
    def __init__(self, model):
        super().__init__()
        # keep a reference to the pre-trained MoCo model
        self.resnet_moco = model

        # freeze the layers of moco
        for p in self.resnet_moco.parameters():
            p.requires_grad = False

        # we create a linear layer for our downstream classification
        # model
        self.fc = nn.Linear(512, 10)

        self.accuracy = pl.metrics.Accuracy()

    def forward(self, x):
        with torch.no_grad():
            y_hat = self.resnet_moco.backbone(x).flatten(start_dim=1)
            y_hat = nn.functional.normalize(y_hat, dim=1)
        y_hat = self.fc(y_hat)
        return y_hat

    # We provide a helper method to log weights in tensorboard
    # which is useful for debugging.
    def custom_histogram_weights(self):
        for name, params in self.named_parameters():
            self.logger.experiment.add_histogram(
                name, params, self.current_epoch)

    def training_step(self, batch, batch_idx):
        x, y, _ = batch
        y_hat = self.forward(x)
        loss = nn.functional.cross_entropy(y_hat, y)
        self.log('train_loss_fc', loss)
        return loss

    def training_epoch_end(self, outputs):
        self.custom_histogram_weights()

    def validation_step(self, batch, batch_idx):
        x, y, _ = batch
        y_hat = self.forward(x)
        y_hat = torch.nn.functional.softmax(y_hat, dim=1)
        self.accuracy(y_hat, y)
        self.log('val_acc', self.accuracy.compute(),
                 on_epoch=True, prog_bar=True)

    def configure_optimizers(self):
        optim = torch.optim.SGD(self.fc.parameters(), lr=30.)
        scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optim, max_epochs)
        return [optim], [scheduler]
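
Both modules use a cosine annealing schedule for the learning rate. The sketch below evaluates the closed-form cosine curve in plain Python, assuming a minimum learning rate of 0 and one scheduler step per epoch. `cosine_annealing_lr` is a hypothetical helper that mirrors the formula, not the PyTorch scheduler class itself.

```python
import math


def cosine_annealing_lr(base_lr, epoch, t_max, eta_min=0.0):
    """Learning rate after `epoch` steps of a cosine schedule of length t_max."""
    cosine = math.cos(math.pi * epoch / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + cosine)


# MoCo pre-training starts at 6e-2, the linear classifier at 30.
for name, base_lr in [("moco", 6e-2), ("classifier", 30.0)]:
    schedule = [cosine_annealing_lr(base_lr, e, 100) for e in (0, 50, 100)]
    print(name, schedule)
```

The learning rate starts at the base value, is about half of it at the midpoint, and decays to the minimum at the last epoch. If you change max_epochs, the schedule length changes with it.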

Train the MoCo model

We can instantiate the model and train it using the lightning trainer.

# use a GPU if available
gpus = 1 if torch.cuda.is_available() else 0

model = MocoModel()
trainer = pl.Trainer(max_epochs=max_epochs, gpus=gpus,
                     progress_bar_refresh_rate=100)
trainer.fit(
    model,
    dataloader_train_moco
)

Out:

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name        | Type       | Params
-------------------------------------------
0 | resnet_moco | MoCo       | 23.0 M
1 | criterion   | NTXentLoss | 0
-------------------------------------------
11.5 M    Trainable params
11.5 M    Non-trainable params
23.0 M    Total params
91.977    Total estimated model params size (MB)

Epoch 0: 100%|##########| 97/97 [00:25<00:00,  3.82it/s, loss=7.12, v_num=85]
Epoch 1: 100%|##########| 97/97 [00:24<00:00,  3.93it/s, loss=7.33, v_num=85]
Epoch 2: 100%|##########| 97/97 [00:25<00:00,  3.82it/s, loss=7.19, v_num=85]
Epoch 3: 100%|##########| 97/97 [00:25<00:00,  3.79it/s, loss=6.89, v_num=85]
Epoch 4: 100%|##########| 97/97 [00:25<00:00,  3.78it/s, loss=6.69, v_num=85]
Epoch 5: 100%|##########| 97/97 [00:24<00:00,  3.89it/s, loss=6.52, v_num=85]
Epoch 6: 100%|##########| 97/97 [00:25<00:00,  3.82it/s, loss=6.35, v_num=85]
Epoch 7: 100%|##########| 97/97 [00:25<00:00,  3.76it/s, loss=6.18, v_num=85]
Epoch 8: 100%|##########| 97/97 [00:25<00:00,  3.75it/s, loss=6.05, v_num=85]
Epoch 9: 100%|##########| 97/97 [00:26<00:00,  3.69it/s, loss=5.87, v_num=85]
Epoch 10: 100%|##########| 97/97 [00:25<00:00,  3.79it/s, loss=5.75, v_num=85]
Epoch 11: 100%|##########| 97/97 [00:25<00:00,  3.80it/s, loss=5.64, v_num=85]
Epoch 12: 100%|##########| 97/97 [00:25<00:00,  3.81it/s, loss=5.51, v_num=85]
Epoch 13: 100%|##########| 97/97 [00:25<00:00,  3.86it/s, loss=5.39, v_num=85]
Epoch 14: 100%|##########| 97/97 [00:25<00:00,  3.76it/s, loss=5.3, v_num=85]
Epoch 15: 100%|##########| 97/97 [00:25<00:00,  3.77it/s, loss=5.18, v_num=85]
Epoch 16: 100%|##########| 97/97 [00:25<00:00,  3.84it/s, loss=5.11, v_num=85]
Epoch 17: 100%|##########| 97/97 [00:25<00:00,  3.80it/s, loss=5.07, v_num=85]
Epoch 18: 100%|##########| 97/97 [00:25<00:00,  3.81it/s, loss=4.99, v_num=85]
Epoch 19: 100%|##########| 97/97 [00:25<00:00,  3.79it/s, loss=4.9, v_num=85]
Epoch 20: 100%|##########| 97/97 [00:26<00:00,  3.71it/s, loss=4.89, v_num=85]
Epoch 21: 100%|##########| 97/97 [00:25<00:00,  3.80it/s, loss=4.8, v_num=85]
Epoch 22: 100%|##########| 97/97 [00:26<00:00,  3.71it/s, loss=4.75, v_num=85]
Epoch 23: 100%|##########| 97/97 [00:24<00:00,  3.89it/s, loss=4.65, v_num=85]
Epoch 24: 100%|##########| 97/97 [00:25<00:00,  3.85it/s, loss=4.59, v_num=85]
Epoch 25: 100%|##########| 97/97 [00:25<00:00,  3.82it/s, loss=4.49, v_num=85]
Epoch 26: 100%|##########| 97/97 [00:25<00:00,  3.79it/s, loss=4.45, v_num=85]
Epoch 27: 100%|##########| 97/97 [00:25<00:00,  3.80it/s, loss=4.43, v_num=85]
Epoch 28: 100%|##########| 97/97 [00:25<00:00,  3.78it/s, loss=4.3, v_num=85]
Epoch 29: 100%|##########| 97/97 [00:24<00:00,  3.89it/s, loss=4.31, v_num=85]
Epoch 30: 100%|##########| 97/97 [00:25<00:00,  3.79it/s, loss=4.27, v_num=85]
Epoch 31: 100%|##########| 97/97 [00:25<00:00,  3.74it/s, loss=4.24, v_num=85]
Epoch 32: 100%|##########| 97/97 [00:25<00:00,  3.74it/s, loss=4.21, v_num=85]
Epoch 33: 100%|##########| 97/97 [00:25<00:00,  3.77it/s, loss=4.13, v_num=85]
Epoch 34: 100%|##########| 97/97 [00:25<00:00,  3.84it/s, loss=4.11, v_num=85]
Epoch 35: 100%|##########| 97/97 [00:25<00:00,  3.77it/s, loss=4.12, v_num=85]
Epoch 36: 100%|##########| 97/97 [00:25<00:00,  3.78it/s, loss=4.05, v_num=85]
Epoch 37: 100%|##########| 97/97 [00:24<00:00,  3.88it/s, loss=4.03, v_num=85]
Epoch 38: 100%|##########| 97/97 [00:25<00:00,  3.84it/s, loss=4.02, v_num=85]
Epoch 39: 100%|##########| 97/97 [00:25<00:00,  3.78it/s, loss=3.96, v_num=85]
Epoch 40: 100%|##########| 97/97 [00:25<00:00,  3.81it/s, loss=3.93, v_num=85]
Epoch 41: 100%|##########| 97/97 [00:25<00:00,  3.87it/s, loss=3.93, v_num=85]
Epoch 42: 100%|##########| 97/97 [00:25<00:00,  3.84it/s, loss=3.9, v_num=85]
Epoch 43: 100%|##########| 97/97 [00:24<00:00,  3.89it/s, loss=3.83, v_num=85]
Epoch 44: 100%|##########| 97/97 [00:25<00:00,  3.88it/s, loss=3.87, v_num=85]
Epoch 45: 100%|##########| 97/97 [00:24<00:00,  3.92it/s, loss=3.82, v_num=85]
Epoch 46: 100%|##########| 97/97 [00:24<00:00,  3.93it/s, loss=3.78, v_num=85]
Epoch 47: 100%|##########| 97/97 [00:25<00:00,  3.85it/s, loss=3.76, v_num=85]
Epoch 48: 100%|##########| 97/97 [00:25<00:00,  3.79it/s, loss=3.75, v_num=85]
Epoch 49: 100%|##########| 97/97 [00:25<00:00,  3.85it/s, loss=3.68, v_num=85]
Epoch 50: 100%|##########| 97/97 [00:25<00:00,  3.83it/s, loss=3.67, v_num=85]
Epoch 51: 100%|##########| 97/97 [00:25<00:00,  3.81it/s, loss=3.61, v_num=85]
Epoch 52: 100%|##########| 97/97 [00:25<00:00,  3.87it/s, loss=3.62, v_num=85]
Epoch 53: 100%|##########| 97/97 [00:25<00:00,  3.83it/s, loss=3.61, v_num=85]
Epoch 54: 100%|##########| 97/97 [00:25<00:00,  3.80it/s, loss=3.61, v_num=85]
Epoch 55: 100%|##########| 97/97 [00:24<00:00,  3.92it/s, loss=3.55, v_num=85]
Epoch 56: 100%|##########| 97/97 [00:24<00:00,  3.89it/s, loss=3.53, v_num=85]
Epoch 57: 100%|##########| 97/97 [00:25<00:00,  3.88it/s, loss=3.55, v_num=85]
Epoch 58: 100%|##########| 97/97 [00:24<00:00,  3.88it/s, loss=3.49, v_num=85]
Epoch 59: 100%|##########| 97/97 [00:24<00:00,  3.98it/s, loss=3.45, v_num=85]
Epoch 60: 100%|##########| 97/97 [00:24<00:00,  3.93it/s, loss=3.45, v_num=85]
Epoch 61: 100%|##########| 97/97 [00:24<00:00,  3.91it/s, loss=3.45, v_num=85]
Epoch 62: 100%|##########| 97/97 [00:24<00:00,  3.94it/s, loss=3.42, v_num=85]
Epoch 63: 100%|##########| 97/97 [00:24<00:00,  3.95it/s, loss=3.38, v_num=85]
Epoch 64: 100%|##########| 97/97 [00:23<00:00,  4.05it/s, loss=3.36, v_num=85]
Epoch 65: 100%|##########| 97/97 [00:24<00:00,  3.97it/s, loss=3.31, v_num=85]
Epoch 66: 100%|##########| 97/97 [00:24<00:00,  4.01it/s, loss=3.3, v_num=85]
Epoch 67: 100%|##########| 97/97 [00:24<00:00,  4.00it/s, loss=3.3, v_num=85]
Epoch 68: 100%|##########| 97/97 [00:25<00:00,  3.88it/s, loss=3.26, v_num=85]
Epoch 69: 100%|##########| 97/97 [00:24<00:00,  3.96it/s, loss=3.3, v_num=85]
Epoch 70: 100%|##########| 97/97 [00:24<00:00,  4.01it/s, loss=3.23, v_num=85]
Epoch 71: 100%|##########| 97/97 [00:24<00:00,  3.95it/s, loss=3.24, v_num=85]
Epoch 72: 100%|##########| 97/97 [00:23<00:00,  4.05it/s, loss=3.27, v_num=85]
Epoch 72:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.27, v_num=85]
Epoch 73:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.27, v_num=85]
Epoch 73: 100%|##########| 97/97 [00:24<00:00,  3.98it/s, loss=3.27, v_num=85]
Epoch 73: 100%|##########| 97/97 [00:24<00:00,  3.98it/s, loss=3.19, v_num=85]
Epoch 73:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.19, v_num=85]
Epoch 74:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.19, v_num=85]
Epoch 74: 100%|##########| 97/97 [00:24<00:00,  3.94it/s, loss=3.19, v_num=85]
Epoch 74: 100%|##########| 97/97 [00:24<00:00,  3.94it/s, loss=3.22, v_num=85]
Epoch 74:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.22, v_num=85]
Epoch 75:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.22, v_num=85]
Epoch 75: 100%|##########| 97/97 [00:24<00:00,  3.98it/s, loss=3.22, v_num=85]
Epoch 75: 100%|##########| 97/97 [00:24<00:00,  3.98it/s, loss=3.19, v_num=85]
Epoch 75:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.19, v_num=85]
Epoch 76:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.19, v_num=85]
Epoch 76: 100%|##########| 97/97 [00:24<00:00,  3.93it/s, loss=3.19, v_num=85]
Epoch 76: 100%|##########| 97/97 [00:24<00:00,  3.93it/s, loss=3.18, v_num=85]
Epoch 76:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.18, v_num=85]
Epoch 77:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.18, v_num=85]
Epoch 77: 100%|##########| 97/97 [00:26<00:00,  3.72it/s, loss=3.18, v_num=85]
Epoch 77: 100%|##########| 97/97 [00:26<00:00,  3.72it/s, loss=3.17, v_num=85]
Epoch 77:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.17, v_num=85]
Epoch 78:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.17, v_num=85]
Epoch 78: 100%|##########| 97/97 [00:26<00:00,  3.66it/s, loss=3.17, v_num=85]
Epoch 78: 100%|##########| 97/97 [00:26<00:00,  3.66it/s, loss=3.18, v_num=85]
Epoch 78:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.18, v_num=85]
Epoch 79:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.18, v_num=85]
Epoch 79: 100%|##########| 97/97 [00:28<00:00,  3.35it/s, loss=3.18, v_num=85]
Epoch 79: 100%|##########| 97/97 [00:28<00:00,  3.35it/s, loss=3.14, v_num=85]
Epoch 79:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.14, v_num=85]
Epoch 80:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.14, v_num=85]
Epoch 80: 100%|##########| 97/97 [00:34<00:00,  2.79it/s, loss=3.14, v_num=85]
Epoch 80: 100%|##########| 97/97 [00:34<00:00,  2.79it/s, loss=3.16, v_num=85]
Epoch 80:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.16, v_num=85]
Epoch 81:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.16, v_num=85]
Epoch 81: 100%|##########| 97/97 [00:33<00:00,  2.92it/s, loss=3.16, v_num=85]
Epoch 81: 100%|##########| 97/97 [00:33<00:00,  2.92it/s, loss=3.1, v_num=85]
Epoch 81:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.1, v_num=85]
Epoch 82:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.1, v_num=85]
Epoch 82: 100%|##########| 97/97 [00:34<00:00,  2.82it/s, loss=3.1, v_num=85]
Epoch 82: 100%|##########| 97/97 [00:34<00:00,  2.82it/s, loss=3.11, v_num=85]
Epoch 82:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.11, v_num=85]
Epoch 83:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.11, v_num=85]
Epoch 83: 100%|##########| 97/97 [00:33<00:00,  2.87it/s, loss=3.11, v_num=85]
Epoch 83: 100%|##########| 97/97 [00:33<00:00,  2.86it/s, loss=3.09, v_num=85]
Epoch 83:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.09, v_num=85]
Epoch 84:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.09, v_num=85]
Epoch 84: 100%|##########| 97/97 [00:32<00:00,  2.95it/s, loss=3.09, v_num=85]
Epoch 84: 100%|##########| 97/97 [00:32<00:00,  2.95it/s, loss=3.12, v_num=85]
Epoch 84:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.12, v_num=85]
Epoch 85:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.12, v_num=85]
Epoch 85: 100%|##########| 97/97 [00:32<00:00,  2.95it/s, loss=3.12, v_num=85]
Epoch 85: 100%|##########| 97/97 [00:32<00:00,  2.95it/s, loss=3.1, v_num=85]
Epoch 85:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.1, v_num=85]
Epoch 86:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.1, v_num=85]
Epoch 86: 100%|##########| 97/97 [00:31<00:00,  3.12it/s, loss=3.1, v_num=85]
Epoch 86: 100%|##########| 97/97 [00:31<00:00,  3.12it/s, loss=3.03, v_num=85]
Epoch 86:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.03, v_num=85]
Epoch 87:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.03, v_num=85]
Epoch 87: 100%|##########| 97/97 [00:33<00:00,  2.85it/s, loss=3.03, v_num=85]
Epoch 87: 100%|##########| 97/97 [00:33<00:00,  2.85it/s, loss=3.05, v_num=85]
Epoch 87:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.05, v_num=85]
Epoch 88:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.05, v_num=85]
Epoch 88: 100%|##########| 97/97 [00:33<00:00,  2.91it/s, loss=3.05, v_num=85]
Epoch 88: 100%|##########| 97/97 [00:33<00:00,  2.91it/s, loss=3.06, v_num=85]
Epoch 88:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.06, v_num=85]
Epoch 89:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.06, v_num=85]
Epoch 89: 100%|##########| 97/97 [00:34<00:00,  2.83it/s, loss=3.06, v_num=85]
Epoch 89: 100%|##########| 97/97 [00:34<00:00,  2.83it/s, loss=3.1, v_num=85]
Epoch 89:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.1, v_num=85]
Epoch 90:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.1, v_num=85]
Epoch 90: 100%|##########| 97/97 [00:35<00:00,  2.75it/s, loss=3.1, v_num=85]
Epoch 90: 100%|##########| 97/97 [00:35<00:00,  2.75it/s, loss=3.05, v_num=85]
Epoch 90:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.05, v_num=85]
Epoch 91:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.05, v_num=85]
Epoch 91: 100%|##########| 97/97 [00:31<00:00,  3.13it/s, loss=3.05, v_num=85]
Epoch 91: 100%|##########| 97/97 [00:31<00:00,  3.13it/s, loss=3.09, v_num=85]
Epoch 91:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.09, v_num=85]
Epoch 92:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.09, v_num=85]
Epoch 92: 100%|##########| 97/97 [00:24<00:00,  3.91it/s, loss=3.09, v_num=85]
Epoch 92: 100%|##########| 97/97 [00:24<00:00,  3.91it/s, loss=3.02, v_num=85]
Epoch 92:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.02, v_num=85]
Epoch 93:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.02, v_num=85]
Epoch 93: 100%|##########| 97/97 [00:25<00:00,  3.77it/s, loss=3.02, v_num=85]
Epoch 93: 100%|##########| 97/97 [00:25<00:00,  3.77it/s, loss=3.03, v_num=85]
Epoch 93:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.03, v_num=85]
Epoch 94:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.03, v_num=85]
Epoch 94: 100%|##########| 97/97 [00:24<00:00,  3.99it/s, loss=3.03, v_num=85]
Epoch 94: 100%|##########| 97/97 [00:24<00:00,  3.99it/s, loss=3.06, v_num=85]
Epoch 94:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.06, v_num=85]
Epoch 95:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.06, v_num=85]
Epoch 95: 100%|##########| 97/97 [00:25<00:00,  3.85it/s, loss=3.06, v_num=85]
Epoch 95: 100%|##########| 97/97 [00:25<00:00,  3.85it/s, loss=3.03, v_num=85]
Epoch 95:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.03, v_num=85]
Epoch 96:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.03, v_num=85]
Epoch 96: 100%|##########| 97/97 [00:26<00:00,  3.72it/s, loss=3.03, v_num=85]
Epoch 96: 100%|##########| 97/97 [00:26<00:00,  3.71it/s, loss=3.05, v_num=85]
Epoch 96:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.05, v_num=85]
Epoch 97:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.05, v_num=85]
Epoch 97: 100%|##########| 97/97 [00:35<00:00,  2.73it/s, loss=3.05, v_num=85]
Epoch 97: 100%|##########| 97/97 [00:35<00:00,  2.73it/s, loss=3.03, v_num=85]
Epoch 97:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.03, v_num=85]
Epoch 98:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.03, v_num=85]
Epoch 98: 100%|##########| 97/97 [00:33<00:00,  2.89it/s, loss=3.03, v_num=85]
Epoch 98: 100%|##########| 97/97 [00:33<00:00,  2.89it/s, loss=3.03, v_num=85]
Epoch 98:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.03, v_num=85]
Epoch 99:   0%|          | 0/97 [00:00<?, ?it/s, loss=3.03, v_num=85]
Epoch 99: 100%|##########| 97/97 [00:32<00:00,  2.99it/s, loss=3.03, v_num=85]
Epoch 99: 100%|##########| 97/97 [00:32<00:00,  2.99it/s, loss=3.05, v_num=85]
Epoch 99: 100%|##########| 97/97 [00:36<00:00,  2.63it/s, loss=3.05, v_num=85]
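
The step counts in the progress bars can be sanity-checked with a little arithmetic. CIFAR-10 has 50,000 training and 10,000 test images, and the configuration uses a batch size of 512. The sketch below assumes the training data loader drops the last incomplete batch while the test data loader keeps it; under that assumption the numbers match the 97 steps per epoch shown here and the 117 (97 training plus 20 validation) steps per epoch in the classifier run. The helper name `batches_per_epoch` is hypothetical and not part of lightly or PyTorch Lightning.

```python
import math


def batches_per_epoch(num_samples, batch_size, drop_last):
    """Number of batches a data loader yields in one pass over the data."""
    if drop_last:
        # The incomplete final batch is discarded.
        return num_samples // batch_size
    # The incomplete final batch is kept.
    return math.ceil(num_samples / batch_size)


batch_size = 512
# Assumption: drop_last=True for training, drop_last=False for testing.
train_batches = batches_per_epoch(50_000, batch_size, drop_last=True)
test_batches = batches_per_epoch(10_000, batch_size, drop_last=False)

print(train_batches)                 # 97
print(test_batches)                  # 20
print(train_batches + test_batches)  # 117
```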

Train the Classifier

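# Switch the pre-trained MoCo model to evaluation mode before reusing it.
model.eval()
# Wrap the pre-trained backbone in the classifier.
classifier = Classifier(model.resnet_moco)
trainer = pl.Trainer(max_epochs=max_epochs, gpus=gpus,
                     progress_bar_refresh_rate=100)
# Train on the labeled training set and validate on the test set every epoch.
trainer.fit(
    classifier,
    dataloader_train_classifier,
    dataloader_test
)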
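
In this step only the linear layer on top of the pre-trained backbone is trained, which is why the model summary printed by the run reports about 5.1 K trainable parameters against 23.0 M non-trainable ones. As a rough check, the sketch below counts the parameters of a fully connected layer. It assumes 512-dimensional backbone features and the 10 CIFAR-10 classes; the feature width is an assumption chosen because it is consistent with the reported count, and `linear_param_count` is a hypothetical helper.

```python
def linear_param_count(in_features, out_features, bias=True):
    """Weights plus optional bias terms of a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)


# Assumption: 512 input features, 10 output classes (CIFAR-10).
fc_params = linear_param_count(512, 10)
print(fc_params)  # 5130, reported as 5.1 K in the model summary
```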

Out:

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name        | Type     | Params
-----------------------------------------
0 | resnet_moco | MoCo     | 23.0 M
1 | fc          | Linear   | 5.1 K
2 | accuracy    | Accuracy | 0
-----------------------------------------
5.1 K     Trainable params
23.0 M    Non-trainable params
23.0 M    Total params
91.998    Total estimated model params size (MB)

Global seed set to 1

Epoch 0: 100%|##########| 117/117 [00:14<00:00,  8.34it/s, loss=2.12, v_num=86, val_acc=0.519]
Epoch 1: 100%|##########| 117/117 [00:13<00:00,  8.74it/s, loss=1.7, v_num=86, val_acc=0.554]
Epoch 2: 100%|##########| 117/117 [00:14<00:00,  8.25it/s, loss=1.59, v_num=86, val_acc=0.560]
Epoch 3: 100%|##########| 117/117 [00:13<00:00,  8.43it/s, loss=1.52, v_num=86, val_acc=0.570]
Epoch 4: 100%|##########| 117/117 [00:14<00:00,  8.32it/s, loss=1.43, v_num=86, val_acc=0.570]
Epoch 5: 100%|##########| 117/117 [00:13<00:00,  8.57it/s, loss=1.37, v_num=86, val_acc=0.578]
Epoch 6: 100%|##########| 117/117 [00:13<00:00,  8.75it/s, loss=1.44, v_num=86, val_acc=0.584]
Epoch 7: 100%|##########| 117/117 [00:14<00:00,  8.19it/s, loss=1.25, v_num=86, val_acc=0.586]
Epoch 8: 100%|##########| 117/117 [00:14<00:00,  8.20it/s, loss=1.24, v_num=86, val_acc=0.591]
Epoch 9: 100%|##########| 117/117 [00:14<00:00,  8.03it/s, loss=1.36, v_num=86, val_acc=0.597]
Epoch 10: 100%|##########| 117/117 [00:14<00:00,  8.05it/s, loss=1.37, v_num=86, val_acc=0.602]
Epoch 11: 100%|##########| 117/117 [00:14<00:00,  8.11it/s, loss=1.39, v_num=86, val_acc=0.605]
Epoch 12: 100%|##########| 117/117 [00:14<00:00,  8.29it/s, loss=1.16, v_num=86, val_acc=0.608]
Epoch 13: 100%|##########| 117/117 [00:14<00:00,  7.90it/s, loss=1.27, v_num=86, val_acc=0.607]
Epoch 14: 100%|##########| 117/117 [00:14<00:00,  8.31it/s, loss=1.12, v_num=86, val_acc=0.609]
Epoch 15: 100%|##########| 117/117 [00:14<00:00,  7.95it/s, loss=1.27, v_num=86, val_acc=0.608]
Epoch 16: 100%|##########| 117/117 [00:15<00:00,  7.77it/s, loss=1.12, v_num=86, val_acc=0.613]
Epoch 17: 100%|##########| 117/117 [00:14<00:00,  8.15it/s, loss=1.22, v_num=86, val_acc=0.613]
Epoch 18: 100%|##########| 117/117 [00:15<00:00,  7.72it/s, loss=1.1, v_num=86, val_acc=0.615]
Epoch 19: 100%|##########| 117/117 [00:14<00:00,  8.11it/s, loss=1.04, v_num=86, val_acc=0.620]
Epoch 20: 100%|##########| 117/117 [00:14<00:00,  7.93it/s, loss=1.13, v_num=86, val_acc=0.621]
Epoch 21: 100%|##########| 117/117 [00:13<00:00,  8.51it/s, loss=1.13, v_num=86, val_acc=0.624]
Epoch 22: 100%|##########| 117/117 [00:11<00:00, 10.18it/s, loss=1.1, v_num=86, val_acc=0.626]
Epoch 23: 100%|##########| 117/117 [00:11<00:00, 10.07it/s, loss=1.04, v_num=86, val_acc=0.628]
Epoch 24: 100%|##########| 117/117 [00:11<00:00, 10.21it/s, loss=1.07, v_num=86, val_acc=0.631]
Epoch 25: 100%|##########| 117/117 [00:11<00:00, 10.35it/s, loss=1.01, v_num=86, val_acc=0.633]
Epoch 26: 100%|##########| 117/117 [00:11<00:00, 10.28it/s, loss=1.02, v_num=86, val_acc=0.634]
Epoch 27: 100%|##########| 117/117 [00:11<00:00, 10.07it/s, loss=0.93, v_num=86, val_acc=0.635]
Epoch 28: 100%|##########| 117/117 [00:11<00:00, 10.31it/s, loss=0.952, v_num=86, val_acc=0.638]
Epoch 29: 100%|##########| 117/117 [00:11<00:00, 10.54it/s, loss=0.97, v_num=86, val_acc=0.640]
Epoch 30: 100%|##########| 117/117 [00:11<00:00, 10.28it/s, loss=0.894, v_num=86, val_acc=0.641]
Epoch 31: 100%|##########| 117/117 [00:11<00:00, 10.28it/s, loss=0.968, v_num=86, val_acc=0.644]
Epoch 32: 100%|##########| 117/117 [00:11<00:00, 10.00it/s, loss=0.951, v_num=86, val_acc=0.645]
Epoch 33: 100%|##########| 117/117 [00:11<00:00, 10.13it/s, loss=0.892, v_num=86, val_acc=0.647]
Epoch 34: 100%|##########| 117/117 [00:11<00:00, 10.09it/s, loss=0.944, v_num=86, val_acc=0.649]
Epoch 35: 100%|##########| 117/117 [00:11<00:00, 10.04it/s, loss=0.906, v_num=86, val_acc=0.651]

                                                           
Epoch 35:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.906, v_num=86, val_acc=0.651]
Epoch 36:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.906, v_num=86, val_acc=0.651]
Epoch 36:  85%|########5 | 100/117 [00:09<00:01, 10.40it/s, loss=0.906, v_num=86, val_acc=0.651]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.90it/s]
Epoch 36: 100%|##########| 117/117 [00:11<00:00, 10.28it/s, loss=0.858, v_num=86, val_acc=0.653]

                                                           
Epoch 36:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.858, v_num=86, val_acc=0.653]
Epoch 37:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.858, v_num=86, val_acc=0.653]
Epoch 37:  85%|########5 | 100/117 [00:09<00:01, 10.14it/s, loss=0.858, v_num=86, val_acc=0.653]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.90it/s]
Epoch 37: 100%|##########| 117/117 [00:11<00:00, 10.06it/s, loss=0.861, v_num=86, val_acc=0.655]

                                                           
Epoch 37:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.861, v_num=86, val_acc=0.655]
Epoch 38:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.861, v_num=86, val_acc=0.655]
Epoch 38:  85%|########5 | 100/117 [00:09<00:01, 10.12it/s, loss=0.861, v_num=86, val_acc=0.655]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.96it/s]
Epoch 38: 100%|##########| 117/117 [00:11<00:00, 10.05it/s, loss=0.798, v_num=86, val_acc=0.657]

                                                           
Epoch 38:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.798, v_num=86, val_acc=0.657]
Epoch 39:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.798, v_num=86, val_acc=0.657]
Epoch 39:  85%|########5 | 100/117 [00:09<00:01, 10.35it/s, loss=0.798, v_num=86, val_acc=0.657]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.31it/s]
Epoch 39: 100%|##########| 117/117 [00:11<00:00, 10.29it/s, loss=0.846, v_num=86, val_acc=0.659]

                                                           
Epoch 39:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.846, v_num=86, val_acc=0.659]
Epoch 40:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.846, v_num=86, val_acc=0.659]
Epoch 40:  85%|########5 | 100/117 [00:09<00:01, 10.09it/s, loss=0.846, v_num=86, val_acc=0.659]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.31it/s]
Epoch 40: 100%|##########| 117/117 [00:11<00:00, 10.07it/s, loss=0.832, v_num=86, val_acc=0.660]

                                                           
Epoch 40:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.832, v_num=86, val_acc=0.660]
Epoch 41:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.832, v_num=86, val_acc=0.660]
Epoch 41:  85%|########5 | 100/117 [00:09<00:01, 10.21it/s, loss=0.832, v_num=86, val_acc=0.660]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.59it/s]
Epoch 41: 100%|##########| 117/117 [00:11<00:00, 10.08it/s, loss=0.851, v_num=86, val_acc=0.661]

                                                           
Epoch 41:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.851, v_num=86, val_acc=0.661]
Epoch 42:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.851, v_num=86, val_acc=0.661]
Epoch 42:  85%|########5 | 100/117 [00:09<00:01, 10.31it/s, loss=0.851, v_num=86, val_acc=0.661]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.07it/s]
Epoch 42: 100%|##########| 117/117 [00:11<00:00, 10.23it/s, loss=0.799, v_num=86, val_acc=0.662]

                                                           
Epoch 42:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.799, v_num=86, val_acc=0.662]
Epoch 43:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.799, v_num=86, val_acc=0.662]
Epoch 43:  85%|########5 | 100/117 [00:09<00:01, 10.58it/s, loss=0.799, v_num=86, val_acc=0.662]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.07it/s]
Epoch 43: 100%|##########| 117/117 [00:11<00:00, 10.45it/s, loss=0.799, v_num=86, val_acc=0.664]

                                                           
Epoch 43:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.799, v_num=86, val_acc=0.664]
Epoch 44:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.799, v_num=86, val_acc=0.664]
Epoch 44:  85%|########5 | 100/117 [00:09<00:01, 10.31it/s, loss=0.799, v_num=86, val_acc=0.664]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.11it/s]
Epoch 44: 100%|##########| 117/117 [00:11<00:00, 10.23it/s, loss=0.748, v_num=86, val_acc=0.666]

                                                           
Epoch 44:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.748, v_num=86, val_acc=0.666]
Epoch 45:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.748, v_num=86, val_acc=0.666]
Epoch 45:  85%|########5 | 100/117 [00:09<00:01, 10.39it/s, loss=0.748, v_num=86, val_acc=0.666]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.89it/s]
Epoch 45: 100%|##########| 117/117 [00:11<00:00, 10.27it/s, loss=0.824, v_num=86, val_acc=0.667]

                                                           
Epoch 45:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.824, v_num=86, val_acc=0.667]
Epoch 46:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.824, v_num=86, val_acc=0.667]
Epoch 46:  85%|########5 | 100/117 [00:09<00:01, 10.13it/s, loss=0.824, v_num=86, val_acc=0.667]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.67it/s]
Epoch 46: 100%|##########| 117/117 [00:11<00:00, 10.03it/s, loss=0.772, v_num=86, val_acc=0.668]

                                                           
Epoch 46:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.772, v_num=86, val_acc=0.668]
Epoch 47:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.772, v_num=86, val_acc=0.668]
Epoch 47:  85%|########5 | 100/117 [00:10<00:01,  9.99it/s, loss=0.772, v_num=86, val_acc=0.668]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 10.13it/s]
Epoch 47: 100%|##########| 117/117 [00:11<00:00,  9.76it/s, loss=0.772, v_num=86, val_acc=0.668]
Epoch 47: 100%|##########| 117/117 [00:12<00:00,  9.68it/s, loss=0.784, v_num=86, val_acc=0.670]

                                                           
Epoch 47:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.784, v_num=86, val_acc=0.670]
Epoch 48:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.784, v_num=86, val_acc=0.670]
Epoch 48:  85%|########5 | 100/117 [00:09<00:01, 10.47it/s, loss=0.784, v_num=86, val_acc=0.670]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.33it/s]
Epoch 48: 100%|##########| 117/117 [00:11<00:00, 10.38it/s, loss=0.775, v_num=86, val_acc=0.671]

                                                           
Epoch 48:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.775, v_num=86, val_acc=0.671]
Epoch 49:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.775, v_num=86, val_acc=0.671]
Epoch 49:  85%|########5 | 100/117 [00:09<00:01, 10.14it/s, loss=0.775, v_num=86, val_acc=0.671]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.00it/s]
Epoch 49: 100%|##########| 117/117 [00:11<00:00, 10.08it/s, loss=0.764, v_num=86, val_acc=0.672]

                                                           
Epoch 49:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.764, v_num=86, val_acc=0.672]
Epoch 50:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.764, v_num=86, val_acc=0.672]
Epoch 50:  85%|########5 | 100/117 [00:09<00:01, 10.17it/s, loss=0.764, v_num=86, val_acc=0.672]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.02it/s]
Epoch 50: 100%|##########| 117/117 [00:11<00:00, 10.09it/s, loss=0.75, v_num=86, val_acc=0.673]

                                                           
Epoch 50:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.75, v_num=86, val_acc=0.673]
Epoch 51:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.75, v_num=86, val_acc=0.673]
Epoch 51:  85%|########5 | 100/117 [00:09<00:01, 10.29it/s, loss=0.75, v_num=86, val_acc=0.673]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.09it/s]
Epoch 51: 100%|##########| 117/117 [00:11<00:00, 10.21it/s, loss=0.767, v_num=86, val_acc=0.675]

                                                           
Epoch 51:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.767, v_num=86, val_acc=0.675]
Epoch 52:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.767, v_num=86, val_acc=0.675]
Epoch 52:  85%|########5 | 100/117 [00:09<00:01, 10.50it/s, loss=0.767, v_num=86, val_acc=0.675]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.83it/s]
Epoch 52: 100%|##########| 117/117 [00:11<00:00, 10.36it/s, loss=0.748, v_num=86, val_acc=0.676]

                                                           
Epoch 52:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.748, v_num=86, val_acc=0.676]
Epoch 53:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.748, v_num=86, val_acc=0.676]
Epoch 53:  85%|########5 | 100/117 [00:09<00:01, 10.24it/s, loss=0.748, v_num=86, val_acc=0.676]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.97it/s]
Epoch 53: 100%|##########| 117/117 [00:11<00:00, 10.16it/s, loss=0.744, v_num=86, val_acc=0.677]

                                                           
Epoch 53:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.744, v_num=86, val_acc=0.677]
Epoch 54:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.744, v_num=86, val_acc=0.677]
Epoch 54:  85%|########5 | 100/117 [00:09<00:01, 10.38it/s, loss=0.744, v_num=86, val_acc=0.677]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.48it/s]
Epoch 54: 100%|##########| 117/117 [00:11<00:00, 10.21it/s, loss=0.759, v_num=86, val_acc=0.678]

                                                           
Epoch 54:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.759, v_num=86, val_acc=0.678]
Epoch 55:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.759, v_num=86, val_acc=0.678]
Epoch 55:  85%|########5 | 100/117 [00:09<00:01, 10.47it/s, loss=0.759, v_num=86, val_acc=0.678]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.64it/s]
Epoch 55: 100%|##########| 117/117 [00:11<00:00, 10.31it/s, loss=0.755, v_num=86, val_acc=0.679]

                                                           
Epoch 55:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.755, v_num=86, val_acc=0.679]
Epoch 56:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.755, v_num=86, val_acc=0.679]
Epoch 56:  85%|########5 | 100/117 [00:09<00:01, 10.33it/s, loss=0.755, v_num=86, val_acc=0.679]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.75it/s]
Epoch 56: 100%|##########| 117/117 [00:11<00:00, 10.20it/s, loss=0.763, v_num=86, val_acc=0.680]

                                                           
Epoch 56:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.763, v_num=86, val_acc=0.680]
Epoch 57:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.763, v_num=86, val_acc=0.680]
Epoch 57:  85%|########5 | 100/117 [00:09<00:01, 10.37it/s, loss=0.763, v_num=86, val_acc=0.680]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.99it/s]
Epoch 57: 100%|##########| 117/117 [00:11<00:00, 10.26it/s, loss=0.76, v_num=86, val_acc=0.681]

                                                           
Epoch 57:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.76, v_num=86, val_acc=0.681]
Epoch 58:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.76, v_num=86, val_acc=0.681]
Epoch 58:  85%|########5 | 100/117 [00:09<00:01, 10.43it/s, loss=0.76, v_num=86, val_acc=0.681]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.62it/s]
Epoch 58: 100%|##########| 117/117 [00:11<00:00, 10.27it/s, loss=0.744, v_num=86, val_acc=0.682]

                                                           
Epoch 58:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.744, v_num=86, val_acc=0.682]
Epoch 59:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.744, v_num=86, val_acc=0.682]
Epoch 59:  85%|########5 | 100/117 [00:09<00:01, 10.11it/s, loss=0.744, v_num=86, val_acc=0.682]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.36it/s]
Epoch 59: 100%|##########| 117/117 [00:11<00:00, 10.10it/s, loss=0.751, v_num=86, val_acc=0.683]

                                                           
Epoch 59:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.751, v_num=86, val_acc=0.683]
Epoch 60:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.751, v_num=86, val_acc=0.683]
Epoch 60:  85%|########5 | 100/117 [00:09<00:01, 10.36it/s, loss=0.751, v_num=86, val_acc=0.683]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.18it/s]
Epoch 60: 100%|##########| 117/117 [00:11<00:00, 10.27it/s, loss=0.762, v_num=86, val_acc=0.684]

                                                           
Epoch 60:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.762, v_num=86, val_acc=0.684]
Epoch 61:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.762, v_num=86, val_acc=0.684]
Epoch 61:  85%|########5 | 100/117 [00:09<00:01, 10.40it/s, loss=0.762, v_num=86, val_acc=0.684]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.79it/s]
Epoch 61: 100%|##########| 117/117 [00:11<00:00, 10.27it/s, loss=0.747, v_num=86, val_acc=0.685]

                                                           
Epoch 61:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.747, v_num=86, val_acc=0.685]
Epoch 62:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.747, v_num=86, val_acc=0.685]
Epoch 62:  85%|########5 | 100/117 [00:09<00:01, 10.34it/s, loss=0.747, v_num=86, val_acc=0.685]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.95it/s]
Epoch 62: 100%|##########| 117/117 [00:11<00:00, 10.25it/s, loss=0.752, v_num=86, val_acc=0.686]

                                                           
Epoch 62:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.752, v_num=86, val_acc=0.686]
Epoch 63:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.752, v_num=86, val_acc=0.686]
Epoch 63:  85%|########5 | 100/117 [00:09<00:01, 10.39it/s, loss=0.752, v_num=86, val_acc=0.686]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.03it/s]
Epoch 63: 100%|##########| 117/117 [00:11<00:00, 10.29it/s, loss=0.728, v_num=86, val_acc=0.687]

                                                           
Epoch 63:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.728, v_num=86, val_acc=0.687]
Epoch 64:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.728, v_num=86, val_acc=0.687]
Epoch 64:  85%|########5 | 100/117 [00:09<00:01, 10.56it/s, loss=0.728, v_num=86, val_acc=0.687]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.16it/s]
Epoch 64: 100%|##########| 117/117 [00:11<00:00, 10.45it/s, loss=0.742, v_num=86, val_acc=0.688]

                                                           
Epoch 64:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.742, v_num=86, val_acc=0.688]
Epoch 65:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.742, v_num=86, val_acc=0.688]
Epoch 65:  85%|########5 | 100/117 [00:09<00:01, 10.49it/s, loss=0.742, v_num=86, val_acc=0.688]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.39it/s]
Epoch 65: 100%|##########| 117/117 [00:11<00:00, 10.40it/s, loss=0.739, v_num=86, val_acc=0.688]

                                                           
Epoch 65:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.739, v_num=86, val_acc=0.688]
Epoch 66:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.739, v_num=86, val_acc=0.688]
Epoch 66:  85%|########5 | 100/117 [00:09<00:01, 10.52it/s, loss=0.739, v_num=86, val_acc=0.688]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.97it/s]
Epoch 66: 100%|##########| 117/117 [00:11<00:00, 10.39it/s, loss=0.754, v_num=86, val_acc=0.689]

                                                           
Epoch 66:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.754, v_num=86, val_acc=0.689]
Epoch 67:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.754, v_num=86, val_acc=0.689]
Epoch 67:  85%|########5 | 100/117 [00:09<00:01, 10.64it/s, loss=0.754, v_num=86, val_acc=0.689]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.03it/s]
Epoch 67: 100%|##########| 117/117 [00:11<00:00, 10.50it/s, loss=0.747, v_num=86, val_acc=0.690]

                                                           
Epoch 67:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.747, v_num=86, val_acc=0.690]
Epoch 68:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.747, v_num=86, val_acc=0.690]
Epoch 68:  85%|########5 | 100/117 [00:09<00:01, 10.57it/s, loss=0.747, v_num=86, val_acc=0.690]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.05it/s]
Epoch 68: 100%|##########| 117/117 [00:11<00:00, 10.44it/s, loss=0.737, v_num=86, val_acc=0.691]

                                                           
Epoch 68:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.737, v_num=86, val_acc=0.691]
Epoch 69:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.737, v_num=86, val_acc=0.691]
Epoch 69:  85%|########5 | 100/117 [00:09<00:01, 10.48it/s, loss=0.737, v_num=86, val_acc=0.691]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.36it/s]
Epoch 69: 100%|##########| 117/117 [00:11<00:00, 10.40it/s, loss=0.733, v_num=86, val_acc=0.691]

                                                           
Epoch 69:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.733, v_num=86, val_acc=0.691]
Epoch 70:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.733, v_num=86, val_acc=0.691]
Epoch 70:  85%|########5 | 100/117 [00:09<00:01, 10.38it/s, loss=0.733, v_num=86, val_acc=0.691]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.75it/s]
Epoch 70: 100%|##########| 117/117 [00:11<00:00, 10.23it/s, loss=0.736, v_num=86, val_acc=0.692]

                                                           
Epoch 70:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.736, v_num=86, val_acc=0.692]
Epoch 71:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.736, v_num=86, val_acc=0.692]
Epoch 71:  85%|########5 | 100/117 [00:09<00:01, 10.58it/s, loss=0.736, v_num=86, val_acc=0.692]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.74it/s]
Epoch 71: 100%|##########| 117/117 [00:11<00:00, 10.40it/s, loss=0.739, v_num=86, val_acc=0.693]

                                                           
Epoch 71:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.739, v_num=86, val_acc=0.693]
Epoch 72:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.739, v_num=86, val_acc=0.693]
Epoch 72:  85%|########5 | 100/117 [00:09<00:01, 10.56it/s, loss=0.739, v_num=86, val_acc=0.693]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.45it/s]
Epoch 72: 100%|##########| 117/117 [00:11<00:00, 10.49it/s, loss=0.742, v_num=86, val_acc=0.693]

                                                           
Epoch 72:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.742, v_num=86, val_acc=0.693]
Epoch 73:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.742, v_num=86, val_acc=0.693]
Epoch 73:  85%|########5 | 100/117 [00:09<00:01, 10.56it/s, loss=0.742, v_num=86, val_acc=0.693]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.01it/s]
Epoch 73: 100%|##########| 117/117 [00:11<00:00, 10.42it/s, loss=0.745, v_num=86, val_acc=0.694]

                                                           
Epoch 73:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.745, v_num=86, val_acc=0.694]
Epoch 74:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.745, v_num=86, val_acc=0.694]
Epoch 74:  85%|########5 | 100/117 [00:09<00:01, 10.51it/s, loss=0.745, v_num=86, val_acc=0.694]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.83it/s]
Epoch 74: 100%|##########| 117/117 [00:11<00:00, 10.37it/s, loss=0.742, v_num=86, val_acc=0.695]

                                                           
Epoch 74:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.742, v_num=86, val_acc=0.695]
Epoch 75:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.742, v_num=86, val_acc=0.695]
Epoch 75:  85%|########5 | 100/117 [00:09<00:01, 10.53it/s, loss=0.742, v_num=86, val_acc=0.695]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.06it/s]
Epoch 75: 100%|##########| 117/117 [00:11<00:00, 10.42it/s, loss=0.746, v_num=86, val_acc=0.695]

                                                           
Epoch 75:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.746, v_num=86, val_acc=0.695]
Epoch 76:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.746, v_num=86, val_acc=0.695]
Epoch 76:  85%|########5 | 100/117 [00:09<00:01, 10.61it/s, loss=0.746, v_num=86, val_acc=0.695]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.23it/s]
Epoch 76: 100%|##########| 117/117 [00:11<00:00, 10.49it/s, loss=0.749, v_num=86, val_acc=0.696]

                                                           
Epoch 76:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.749, v_num=86, val_acc=0.696]
Epoch 77:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.749, v_num=86, val_acc=0.696]
Epoch 77:  85%|########5 | 100/117 [00:09<00:01, 10.68it/s, loss=0.749, v_num=86, val_acc=0.696]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.35it/s]
Epoch 77: 100%|##########| 117/117 [00:11<00:00, 10.57it/s, loss=0.732, v_num=86, val_acc=0.697]

                                                           
Epoch 77:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.732, v_num=86, val_acc=0.697]
Epoch 78:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.732, v_num=86, val_acc=0.697]
Epoch 78:  85%|########5 | 100/117 [00:09<00:01, 10.31it/s, loss=0.732, v_num=86, val_acc=0.697]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.89it/s]
Epoch 78: 100%|##########| 117/117 [00:11<00:00, 10.20it/s, loss=0.726, v_num=86, val_acc=0.697]

                                                           
Epoch 78:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.726, v_num=86, val_acc=0.697]
Epoch 79:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.726, v_num=86, val_acc=0.697]
Epoch 79:  85%|########5 | 100/117 [00:09<00:01, 10.60it/s, loss=0.726, v_num=86, val_acc=0.697]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 11.85it/s]
Epoch 79: 100%|##########| 117/117 [00:11<00:00, 10.44it/s, loss=0.732, v_num=86, val_acc=0.698]

                                                           
Epoch 79:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.732, v_num=86, val_acc=0.698]
Epoch 80:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.732, v_num=86, val_acc=0.698]
Epoch 80:  85%|########5 | 100/117 [00:09<00:01, 10.31it/s, loss=0.732, v_num=86, val_acc=0.698]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.16it/s]
Epoch 80: 100%|##########| 117/117 [00:11<00:00, 10.23it/s, loss=0.736, v_num=86, val_acc=0.698]

                                                           
Epoch 80:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.736, v_num=86, val_acc=0.698]
Epoch 81:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.736, v_num=86, val_acc=0.698]
Epoch 81:  85%|########5 | 100/117 [00:09<00:01, 10.40it/s, loss=0.736, v_num=86, val_acc=0.698]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.02it/s]
Epoch 81: 100%|##########| 117/117 [00:11<00:00, 10.31it/s, loss=0.743, v_num=86, val_acc=0.699]

                                                           
Epoch 81:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.743, v_num=86, val_acc=0.699]
Epoch 82:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.743, v_num=86, val_acc=0.699]
Epoch 82:  85%|########5 | 100/117 [00:09<00:01, 10.50it/s, loss=0.743, v_num=86, val_acc=0.699]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.83it/s]
Epoch 82: 100%|##########| 117/117 [00:11<00:00, 10.47it/s, loss=0.744, v_num=86, val_acc=0.700]

                                                           
Epoch 82:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.744, v_num=86, val_acc=0.700]
Epoch 83:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.744, v_num=86, val_acc=0.700]
Epoch 83:  85%|########5 | 100/117 [00:09<00:01, 10.63it/s, loss=0.744, v_num=86, val_acc=0.700]

Validating: 0it [00:00, ?it/s]

Validating:   0%|          | 0/20 [00:00<?, ?it/s]

Validating: 100%|##########| 20/20 [00:01<00:00, 12.22it/s]
Epoch 83: 100%|##########| 117/117 [00:11<00:00, 10.51it/s, loss=0.734, v_num=86, val_acc=0.700]

                                                           
Epoch 83:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.734, v_num=86, val_acc=0.700]
Epoch 84:   0%|          | 0/117 [00:00<?, ?it/s, loss=0.734, v_num=86, val_acc=0.700]
Epoch 84:  85%|########5 | 100/117 [00:09<00:01, 10.46it/s, loss=0.734, v_num=86, val_acc=0.700]

Epoch 84: 100%|##########| 117/117 [00:11<00:00, 10.34it/s, loss=0.736, v_num=86, val_acc=0.701]
Epoch 85: 100%|##########| 117/117 [00:10<00:00, 10.66it/s, loss=0.727, v_num=86, val_acc=0.701]
Epoch 86: 100%|##########| 117/117 [00:11<00:00, 10.48it/s, loss=0.731, v_num=86, val_acc=0.702]
Epoch 87: 100%|##########| 117/117 [00:11<00:00, 10.32it/s, loss=0.741, v_num=86, val_acc=0.702]
Epoch 88: 100%|##########| 117/117 [00:11<00:00, 10.21it/s, loss=0.74, v_num=86, val_acc=0.703]
Epoch 89: 100%|##########| 117/117 [00:11<00:00, 10.42it/s, loss=0.729, v_num=86, val_acc=0.703]
Epoch 90: 100%|##########| 117/117 [00:11<00:00, 10.39it/s, loss=0.724, v_num=86, val_acc=0.704]
Epoch 91: 100%|##########| 117/117 [00:11<00:00, 10.45it/s, loss=0.753, v_num=86, val_acc=0.704]
Epoch 92: 100%|##########| 117/117 [00:11<00:00, 10.46it/s, loss=0.739, v_num=86, val_acc=0.705]
Epoch 93: 100%|##########| 117/117 [00:11<00:00, 10.50it/s, loss=0.744, v_num=86, val_acc=0.705]
Epoch 94: 100%|##########| 117/117 [00:11<00:00, 10.38it/s, loss=0.726, v_num=86, val_acc=0.705]
Epoch 95: 100%|##########| 117/117 [00:11<00:00, 10.62it/s, loss=0.726, v_num=86, val_acc=0.706]
Epoch 96: 100%|##########| 117/117 [00:11<00:00, 10.60it/s, loss=0.738, v_num=86, val_acc=0.706]
Epoch 97: 100%|##########| 117/117 [00:11<00:00, 10.48it/s, loss=0.749, v_num=86, val_acc=0.707]
Epoch 98: 100%|##########| 117/117 [00:11<00:00, 10.43it/s, loss=0.737, v_num=86, val_acc=0.707]
Epoch 99: 100%|##########| 117/117 [00:11<00:00, 10.37it/s, loss=0.738, v_num=86, val_acc=0.707]

Check out the TensorBoard logs while the model is training.

Run tensorboard --logdir lightning_logs/ to start TensorBoard.
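
If you prefer to start TensorBoard from a Python script instead of a terminal, the command above can be wrapped in a small helper. This is only a sketch: the function names tensorboard_command and launch_tensorboard are hypothetical and not part of lightly or TensorBoard, and port 6006 is assumed because it is the port forwarded later in this tutorial.

```python
import shutil
import subprocess


def tensorboard_command(logdir="lightning_logs/", port=6006):
    """Build the argument list for the TensorBoard command line tool."""
    return ["tensorboard", "--logdir", logdir, "--port", str(port)]


def launch_tensorboard(logdir="lightning_logs/", port=6006):
    """Start TensorBoard in the background.

    Returns the subprocess handle, or None if the tensorboard
    executable is not found on the PATH.
    """
    if shutil.which("tensorboard") is None:
        return None
    return subprocess.Popen(tensorboard_command(logdir, port))


# Show the command that would be executed.
print(" ".join(tensorboard_command()))
```

Calling launch_tensorboard() returns a process handle; call terminate() on it when you are done looking at the logs.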

Note

If you run the code on a remote machine, you cannot open the TensorBoard page directly in your local browser. You need to forward the port first. One option is an editor such as Visual Studio Code, which has built-in port forwarding (make sure the remote extensions are installed and connected to your machine).

Or you can use a shell command similar to this one to forward port 6006 from your remote machine to your local machine:

ssh username@host -N -L localhost:6006:localhost:6006
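
Once the tunnel is running, a quick way to confirm that something is listening on the forwarded port is to try a TCP connection from your local machine. The helper below is a minimal sketch using only the Python standard library. The name port_is_open is hypothetical, and the check only tells you that a connection can be opened, not that the service behind it is TensorBoard.

```python
import socket


def port_is_open(host="localhost", port=6006, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


if port_is_open():
    print("Port 6006 is reachable; try opening http://localhost:6006")
else:
    print("Nothing is listening on localhost:6006; check the SSH tunnel.")
```

If the check fails, make sure the ssh command is still running and that TensorBoard was started on the remote machine.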