---
language:
- zh
- en
license: mit
library_name: pytorch
tags:
- image-classification
- cifar10
- cnn
- lite-model
- educational
datasets:
- cifar10
metrics:
- accuracy
model-index:
- name: SimpleConvNetLite
  results:
  - task:
      type: image-classification
      name: Image Classification
    dataset:
      name: CIFAR-10 (20% subset)
      type: cifar10
    metrics:
      - name: Accuracy
        type: accuracy
        value: 0.52
---

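# SimpleConvNetLite: A Lightweight CIFAR-10 Image Classification Model

A lightweight convolutional neural network designed for fast training and deployment. It is trained on a subset of the CIFAR-10 dataset, and training finishes in under 10 minutes on a CPU.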

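## Model Description

SimpleConvNetLite is a deliberately minimal CNN: one convolutional layer followed by two fully connected layers. Its small parameter count lets it run in resource-constrained environments.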

### Model Architecture

```
SimpleConvNetLite(
  (conv1): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (fc1): Linear(in_features=4096, out_features=64, bias=True)
  (fc2): Linear(in_features=64, out_features=10, bias=True)
)
```

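- 1 convolutional layer (16 filters, 3x3 kernel, padding 1)
- 1 max-pooling layer (2x2, stride 2)
- 2 fully connected layers (64 hidden units, 10 outputs)

Total parameters: ~263K (263,306)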
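
The flattened feature size of 4096 and the total parameter count both follow from the layer shapes printed above. The sketch below rederives them with plain arithmetic; the helper names (`conv2d_out`, `conv2d_params`, `linear_params`) are illustrative and not part of the model code.

```python
# Sketch: derive the flattened feature size and parameter count by hand.

def conv2d_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1

def conv2d_params(in_ch, out_ch, kernel):
    """Kernel weights plus one bias per output channel."""
    return in_ch * out_ch * kernel * kernel + out_ch

def linear_params(in_features, out_features):
    """Weight matrix plus one bias per output unit."""
    return in_features * out_features + out_features

side = conv2d_out(32, kernel=3, stride=1, padding=1)  # conv1 keeps 32x32
side = conv2d_out(side, kernel=2, stride=2)           # pool halves it to 16x16
flat_features = 16 * side * side                      # 16 channels x 16 x 16

total_params = (
    conv2d_params(3, 16, 3)              # conv1
    + linear_params(flat_features, 64)   # fc1
    + linear_params(64, 10)              # fc2
)
print(flat_features, total_params)
```

Nearly all of the parameters sit in `fc1`, which is why the model stays small only as long as the hidden layer stays narrow.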

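## Training Data

The model is trained on a subset of the CIFAR-10 dataset:

- Only **20%** of the original CIFAR-10 dataset is used
- Training samples: 10,000 images (20% of the original 50,000)
- Test samples: 2,000 images (20% of the original 10,000)
- Image size: 32x32 pixels, 3-channel RGB
- Classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck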
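
This card does not say how the 20% subset was drawn. As one possible approach, the sketch below assumes a uniform random sample with a fixed seed; `subset_indices` and the seed value are hypothetical and only illustrate the sizes listed above.

```python
import random

def subset_indices(n_total, fraction=0.2, seed=0):
    """Return a sorted, reproducible random sample of dataset indices.

    Assumption: the subset is a uniform random sample; the original
    selection method is not documented.
    """
    n_keep = round(n_total * fraction)
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_total), n_keep))

train_idx = subset_indices(50_000)  # indices into the CIFAR-10 training split
test_idx = subset_indices(10_000)   # indices into the CIFAR-10 test split
print(len(train_idx), len(test_idx))
```

With PyTorch, index lists like these could be passed to `torch.utils.data.Subset(dataset, train_idx)` to wrap the full dataset.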

## Training Procedure

- **Optimizer**: Adam (lr=0.001)
- **Batch size**: 128
- **Epochs**: 2
- **Loss function**: CrossEntropyLoss
- **Preprocessing**:
  - Resize to 32x32
  - Normalize (mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
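
Two consequences of these settings can be checked with a few lines of arithmetic: normalizing with mean 0.5 and standard deviation 0.5 maps pixel values from [0, 1] to [-1, 1], and 2 epochs over 10,000 images at batch size 128 amount to very few optimizer steps. The sketch assumes the final partial batch of each epoch is kept rather than dropped, which the card does not state.

```python
import math

def normalize(pixel, mean=0.5, std=0.5):
    """Apply (x - mean) / std to a pixel value already scaled to [0, 1]."""
    return (pixel - mean) / std

# Darkest, middle, and brightest pixel values after normalization.
print(normalize(0.0), normalize(0.5), normalize(1.0))

# Assumption: the last, smaller batch of each epoch is kept.
steps_per_epoch = math.ceil(10_000 / 128)
total_steps = steps_per_epoch * 2  # 2 epochs
print(steps_per_epoch, total_steps)
```

The small number of updates is consistent with the short training times and modest accuracy reported in this card.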

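## Training Time

Approximate wall-clock time for the full 2-epoch run:

- **CPU (Intel i5 or equivalent)**: about 5-10 minutes
- **CPU (Intel i7 or equivalent)**: about 3-5 minutes
- **GPU (any)**: under 1 minute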

## Performance

Accuracy on the CIFAR-10 test subset is approximately **50-55%**.
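
For context, random guessing across 10 balanced classes would score roughly 10%. The reported metric is plain top-1 accuracy, which can be sketched as follows; `top1_accuracy` is an illustrative helper and the labels are made-up values, not results from this model.

```python
def top1_accuracy(predictions, labels):
    """Fraction of samples whose predicted class index equals the true one."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

# Made-up class indices purely for illustration.
example_preds = [3, 5, 1, 0, 8]
example_labels = [3, 5, 2, 0, 9]
print(top1_accuracy(example_preds, example_labels))
```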

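## Usage

### With the Transformers library

```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image

# Note: the Auto* classes need a Transformers-compatible config.json and
# preprocessor config in the repository. A plain SimpleConvNetLite state dict
# on its own cannot be loaded this way; use the PyTorch path in that case.

# Load the model and image processor (replace the placeholder repo ID)
processor = AutoImageProcessor.from_pretrained("your-username/simple-cnn-cifar10-lite")
model = AutoModelForImageClassification.from_pretrained("your-username/simple-cnn-cifar10-lite")

# Load an image and preprocess it
image = Image.open("path_to_image.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Predict
outputs = model(**inputs)
predicted_class_idx = outputs.logits.argmax(-1).item()
print(f"Predicted class: {model.config.id2label[predicted_class_idx]}")
```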

### With PyTorch

```python
import torch
from PIL import Image
import torchvision.transforms as transforms

# Define the model structure (must match the saved checkpoint)
class SimpleConvNetLite(torch.nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = torch.nn.Conv2d(3, 16, 3, padding=1)
        self.pool = torch.nn.MaxPool2d(2, 2)
        # After conv1 + pool: 16 channels x 16 x 16 = 4096 features
        self.fc1 = torch.nn.Linear(16 * 16 * 16, 64)
        self.fc2 = torch.nn.Linear(64, num_classes)

    def forward(self, x):
        x = self.pool(torch.nn.functional.relu(self.conv1(x)))
        x = x.view(-1, 16 * 16 * 16)  # flatten to (batch, 4096)
        x = torch.nn.functional.relu(self.fc1(x))
        x = self.fc2(x)  # raw logits, no softmax
        return x

# Load the model weights on CPU and switch to inference mode
model = SimpleConvNetLite()
model.load_state_dict(torch.load("pytorch_model.bin", map_location=torch.device('cpu')))
model.eval()

# Image preprocessing (same as during training)
transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# Class index to name mapping (CIFAR-10 order)
classes = ('airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

# Load an image and predict
image = Image.open("path_to_image.jpg").convert('RGB')
image_tensor = transform(image).unsqueeze(0)  # add batch dimension
with torch.no_grad():
    outputs = model(image_tensor)
    _, predicted = torch.max(outputs, 1)
    print(f"Predicted class: {classes[predicted.item()]}")
```
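
The model returns raw logits, so the PyTorch example reports only the top class. When per-class probabilities are useful, a softmax over the logits gives them. The following is a minimal pure-Python sketch; `softmax`, `top_k`, and the logit values are illustrative assumptions, not output from this model.

```python
import math

CLASSES = ("airplane", "automobile", "bird", "cat", "deer",
           "dog", "frog", "horse", "ship", "truck")

def softmax(logits):
    """Convert logits to probabilities, subtracting the max for stability."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(logits, k=3):
    """Return the k most probable (class name, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(CLASSES[i], probs[i]) for i in ranked[:k]]

# Made-up logits for one image, purely for illustration.
example_logits = [0.1, -1.2, 0.3, 2.0, 0.0, 1.5, -0.4, 0.2, -2.0, 0.5]
for name, prob in top_k(example_logits):
    print(f"{name}: {prob:.3f}")
```

With the PyTorch example, a list like `example_logits` can be obtained from `outputs[0].tolist()`.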

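## Strengths and Limitations

### Strengths

- **Fast training**: completes in under 10 minutes on a CPU
- **Lightweight**: small model size, suitable for resource-constrained environments
- **Easy to understand**: a simple architecture, well suited to learning and teaching

### Limitations

- **Lower accuracy**: about 50-55%, well below what a full-size model trained on the complete dataset reaches
- **Limited feature extraction**: a single convolutional layer cannot learn deep hierarchical features
- **Demonstration only**: intended for quick demos and teaching, not for production use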

## Project Links

- Source code: [GitHub repository link]
- Hugging Face Space demo: [your-username/simple-cnn-cifar10-lite-demo]

## License

MIT

---

*This model was created by [your name] for Hugging Face learning and demonstration purposes.*