Let's start with the code:

from torchvision import transforms

transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

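This applies a series of transformations to an image and chains them together with transforms.Compose, which runs them in the order listed.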
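To make the chaining idea concrete, here is a minimal sketch of what a Compose-style wrapper does. `SimpleCompose` is a hypothetical stand-in written for illustration, not torchvision's class, and the "transforms" are toy functions on a number rather than an image.

```python
class SimpleCompose:
    """Hypothetical stand-in for a Compose-style wrapper: applies callables in order."""

    def __init__(self, steps):
        self.steps = list(steps)

    def __call__(self, x):
        # Feed the output of each step into the next one.
        for step in self.steps:
            x = step(x)
        return x


# Toy "transforms" on a number instead of an image, just to show ordering.
pipeline = SimpleCompose([lambda x: x + 1, lambda x: x * 10])
result = pipeline(2)  # (2 + 1) * 10
print(result)  # 30
```

Order matters: swapping the two steps would give 2 * 10 + 1 = 21. For the same reason, ToTensor has to come before Normalize in the pipeline above, since Normalize operates on tensors.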

Now for a concrete example.
First, load an image:

import matplotlib.pyplot as plt  # used by the plotting snippets below
from PIL import Image
img = Image.open("./tulip.jpg")

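transforms.RandomResizedCrop(224) crops a random region of the image, with a random size and aspect ratio, and then resizes that crop to the specified size. In other words, it crops at random first and then scales every crop to the same size.
The default is scale=(0.08, 1.0), so the crop covers between 8% and 100% of the original image area.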

img = Image.open("./demo.jpg")
print("Original size:", img.size)  # PIL reports (width, height)
data1 = transforms.RandomResizedCrop(224)(img)
print("Size after random crop:", data1.size)
data2 = transforms.RandomResizedCrop(224)(img)
data3 = transforms.RandomResizedCrop(224)(img)

plt.subplot(2, 2, 1), plt.imshow(img), plt.title("Original")
plt.subplot(2, 2, 2), plt.imshow(data1), plt.title("Transformed 1")
plt.subplot(2, 2, 3), plt.imshow(data2), plt.title("Transformed 2")
plt.subplot(2, 2, 4), plt.imshow(data3), plt.title("Transformed 3")
plt.show()

Result:

Original size: (500, 721)
Size after random crop: (224, 224)

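The point of this operation: even when only part of an object is visible, we still treat the image as belonging to that object's class.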
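The "crop at random, then resize" step can be sketched in plain Python. This is a simplified illustration of the idea, not torchvision's actual code: `sample_crop_box` is a hypothetical helper, the `ratio` default of (3/4, 4/3) is the library's documented default rather than something stated in this post, and the fallback to the full image is a simplification.

```python
import math
import random


def sample_crop_box(width, height, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3), rng=random):
    """Return (left, top, crop_w, crop_h) for one random crop (illustrative only)."""
    area = width * height
    log_lo, log_hi = math.log(ratio[0]), math.log(ratio[1])
    for _ in range(10):  # retry a few times until the box fits inside the image
        target_area = area * rng.uniform(scale[0], scale[1])
        aspect = math.exp(rng.uniform(log_lo, log_hi))  # width / height
        crop_w = round(math.sqrt(target_area * aspect))
        crop_h = round(math.sqrt(target_area / aspect))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            left = rng.randint(0, width - crop_w)
            top = rng.randint(0, height - crop_h)
            return left, top, crop_w, crop_h
    # Simplified fallback: use the whole image if no sampled box fit.
    return 0, 0, width, height


rng = random.Random(0)  # seeded only so repeated runs print the same boxes
for _ in range(3):
    left, top, crop_w, crop_h = sample_crop_box(500, 721, rng=rng)
    print(f"crop box: left={left}, top={top}, size={crop_w}x{crop_h} -> resized to 224x224")
```

Each call yields a box of a different size and shape, which is why the three transformed images in the demo differ even though all of them end up 224 x 224.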
transforms.RandomHorizontalFlip() flips the given PIL image horizontally (left to right) with a given probability; the default probability is 0.5.

img = Image.open("./demo.jpg")
img1 = transforms.RandomHorizontalFlip()(img)
img2 = transforms.RandomHorizontalFlip()(img)
img3 = transforms.RandomHorizontalFlip()(img)

plt.subplot(2, 2, 1), plt.imshow(img), plt.title("Original")
plt.subplot(2, 2, 2), plt.imshow(img1), plt.title("Transformed 1")
plt.subplot(2, 2, 3), plt.imshow(img2), plt.title("Transformed 2")
plt.subplot(2, 2, 4), plt.imshow(img3), plt.title("Transformed 3")
plt.show()
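As a rough sketch of the probability logic, the following plain-Python function mirrors a tiny 2D grid with probability p. `random_hflip` is a hypothetical helper for illustration and works on nested lists, not on PIL images.

```python
import random


def random_hflip(rows, p=0.5, rng=random):
    """Mirror a 2D grid left-to-right with probability p; otherwise return it unchanged."""
    if rng.random() < p:
        return [list(reversed(row)) for row in rows]
    return rows


tiny = [[1, 2, 3],
        [4, 5, 6]]
print(random_hflip(tiny, p=1.0))  # always flipped: [[3, 2, 1], [6, 5, 4]]
print(random_hflip(tiny, p=0.0))  # never flipped:  [[1, 2, 3], [4, 5, 6]]
```

With the default p=0.5, roughly half of the calls return the input untouched, so some of the three transformed images in the demo may look identical to the original.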

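transforms.ToTensor() converts the given image to a Tensor. For an 8-bit PIL image, the H x W x C pixel values in [0, 255] become a float tensor of shape C x H x W with values in [0.0, 1.0].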

img = Image.open("./demo.jpg")
img = transforms.ToTensor()(img)
print(img)

Output:

tensor([[[0.4549, 0.4549, 0.4471,  ..., 0.5216, 0.5294, 0.5294],
         [0.4510, 0.4510, 0.4431,  ..., 0.5216, 0.5255, 0.5255],
         [0.4471, 0.4431, 0.4392,  ..., 0.5176, 0.5255, 0.5216],
         ...,
         [0.5529, 0.5333, 0.5059,  ..., 0.7922, 0.7922, 0.7922],
         [0.5647, 0.5451, 0.5176,  ..., 0.7922, 0.7922, 0.7922],
         [0.5882, 0.5725, 0.5451,  ..., 0.7843, 0.7843, 0.7843]],

        [[0.4980, 0.4980, 0.4902,  ..., 0.5059, 0.5137, 0.5137],
         [0.4941, 0.4941, 0.4863,  ..., 0.5059, 0.5098, 0.5098],
         [0.4902, 0.4863, 0.4824,  ..., 0.5020, 0.5098, 0.5059],
         ...,
         [0.5059, 0.4863, 0.4588,  ..., 0.7373, 0.7373, 0.7373],
         [0.5176, 0.4980, 0.4706,  ..., 0.7373, 0.7373, 0.7373],
         [0.5412, 0.5255, 0.4980,  ..., 0.7451, 0.7451, 0.7451]],

        [[0.5137, 0.5137, 0.5059,  ..., 0.5020, 0.5098, 0.5098],
         [0.5098, 0.5098, 0.5020,  ..., 0.5020, 0.5059, 0.5059],
         [0.5059, 0.5020, 0.4980,  ..., 0.4980, 0.5059, 0.5020],
         ...,
         [0.4431, 0.4235, 0.3961,  ..., 0.7373, 0.7373, 0.7373],
         [0.4549, 0.4353, 0.4078,  ..., 0.7373, 0.7373, 0.7373],
         [0.4941, 0.4784, 0.4510,  ..., 0.7490, 0.7490, 0.7490]]])
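The values in the printed tensor are 8-bit pixel values divided by 255, rearranged so the channel axis comes first. Below is a small sketch on nested lists; `to_tensor_like` is a hypothetical helper, and the pixel (116, 127, 131) is an assumption inferred from the first value of each channel printed above.

```python
def to_tensor_like(pixels_hwc):
    """Turn an H x W x C grid of 0-255 ints into C x H x W floats in [0.0, 1.0]."""
    height = len(pixels_hwc)
    width = len(pixels_hwc[0])
    channels = len(pixels_hwc[0][0])
    return [
        [[pixels_hwc[y][x][c] / 255 for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]


# A hypothetical 1 x 2 RGB image; the first pixel is the assumed top-left pixel.
image = [[(116, 127, 131), (0, 255, 128)]]
chw = to_tensor_like(image)
print(len(chw), len(chw[0]), len(chw[0][0]))  # channels, height, width
print([round(channel[0][0], 4) for channel in chw])  # first value of each channel
```

The second print shows 116 / 255, 127 / 255, and 131 / 255 rounded to four decimals, which line up with 0.4549, 0.4980, and 0.5137 at the start of the three channels.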

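transforms.Normalize() normalizes a tensor image channel by channel using the given mean and standard deviation: output = (input - mean) / std. With mean=0.5 and std=0.5 for every channel, values in [0, 1] are mapped to [-1, 1].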

img = Image.open("./demo.jpg")
img = transforms.ToTensor()(img)
img = transforms.Normalize(mean=[0.5,0.5,0.5], std=[0.5,0.5,0.5])(img)
print(img)

Output:

tensor([[[-0.0902, -0.0902, -0.1059,  ...,  0.0431,  0.0588,  0.0588],
         [-0.0980, -0.0980, -0.1137,  ...,  0.0431,  0.0510,  0.0510],
         [-0.1059, -0.1137, -0.1216,  ...,  0.0353,  0.0510,  0.0431],
         ...,
         [ 0.1059,  0.0667,  0.0118,  ...,  0.5843,  0.5843,  0.5843],
         [ 0.1294,  0.0902,  0.0353,  ...,  0.5843,  0.5843,  0.5843],
         [ 0.1765,  0.1451,  0.0902,  ...,  0.5686,  0.5686,  0.5686]],

        [[-0.0039, -0.0039, -0.0196,  ...,  0.0118,  0.0275,  0.0275],
         [-0.0118, -0.0118, -0.0275,  ...,  0.0118,  0.0196,  0.0196],
         [-0.0196, -0.0275, -0.0353,  ...,  0.0039,  0.0196,  0.0118],
         ...,
         [ 0.0118, -0.0275, -0.0824,  ...,  0.4745,  0.4745,  0.4745],
         [ 0.0353, -0.0039, -0.0588,  ...,  0.4745,  0.4745,  0.4745],
         [ 0.0824,  0.0510, -0.0039,  ...,  0.4902,  0.4902,  0.4902]],

        [[ 0.0275,  0.0275,  0.0118,  ...,  0.0039,  0.0196,  0.0196],
         [ 0.0196,  0.0196,  0.0039,  ...,  0.0039,  0.0118,  0.0118],
         [ 0.0118,  0.0039, -0.0039,  ..., -0.0039,  0.0118,  0.0039],
         ...,
         [-0.1137, -0.1529, -0.2078,  ...,  0.4745,  0.4745,  0.4745],
         [-0.0902, -0.1294, -0.1843,  ...,  0.4745,  0.4745,  0.4745],
         [-0.0118, -0.0431, -0.0980,  ...,  0.4980,  0.4980,  0.4980]]])
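The arithmetic can be checked by hand. The sketch below applies (value - mean) / std to the first value of each channel, assuming the underlying 8-bit pixel was (116, 127, 131), as inferred from the ToTensor output. `normalize_value` is a hypothetical helper, not a torchvision function.

```python
def normalize_value(value, mean, std):
    """Apply (value - mean) / std to a single channel value."""
    return (value - mean) / std


# Assumed top-left pixel, scaled to [0, 1] the way ToTensor does.
before = [116 / 255, 127 / 255, 131 / 255]
after = [round(normalize_value(v, 0.5, 0.5), 4) for v in before]
print(after)  # [-0.0902, -0.0039, 0.0275]

# The end points of [0, 1] land on -1 and 1.
print(normalize_value(0.0, 0.5, 0.5), normalize_value(1.0, 0.5, 0.5))  # -1.0 1.0
```

These match the first entries of the three channels in the output above. The pipeline at the top of the post uses a different mean and std per channel, so its output is centered differently from this example.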