CAPTCHA Recognition with Keras and a CNN

news/2025/9/17 21:50:34/Source: https://www.cnblogs.com/ocr12/p/19097689

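In this tutorial we use Keras and a convolutional neural network (CNN) to build a CAPTCHA recognition system. Keras is a high-level neural network API that has run on top of TensorFlow, Microsoft Cognitive Toolkit (CNTK), or Theano; this tutorial uses it with TensorFlow, and it lets us build deep learning models quickly. A CNN is a deep learning architecture widely used for image recognition that learns to extract features from images automatically.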

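Environment Setup

First, make sure Python is installed, then install the following dependencies:

pip install keras tensorflow opencv-python numpy matplotlib pillow scikit-learn

keras: high-level neural network API for building and training deep learning models.

tensorflow: provides the underlying backend that runs training and inference.

opencv-python: image processing.

numpy: data handling and matrix operations.

matplotlib: visualizing training results.

pillow: image loading (used by the Keras load_img helper).

scikit-learn: splitting the data into training and test sets (train_test_split).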

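Dataset Preparation and Image Preprocessing

A CAPTCHA usually contains several characters, and noise and distortion make it hard to recognize them directly. We therefore preprocess the image first: grayscale conversion, binarization, and denoising.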

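(1) Image preprocessing

import cv2
import numpy as np

def preprocess_image(img_path):
    # Read the image (cv2.imread returns None if the file cannot be read)
    img = cv2.imread(img_path)
    if img is None:
        raise FileNotFoundError(f"Cannot read image: {img_path}")

    # Convert to grayscale
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Binarize with Otsu's method (inverted, so dark strokes on a light background become white)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Gaussian blur to reduce noise
    blurred = cv2.GaussianBlur(binary, (5, 5), 0)
    return blurred

# Example image path
img_path = 'captcha_images/test1.png'
processed_img = preprocess_image(img_path)

# Show the processed image
cv2.imshow('Processed Image', processed_img)
cv2.waitKey(0)
cv2.destroyAllWindows()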
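To make the binarization step less of a black box, here is a minimal NumPy sketch of what Otsu's method computes: the threshold that maximizes the between-class variance of the grayscale histogram. The function names otsu_threshold and binarize_inverted are hypothetical helpers written for illustration; they are not part of OpenCV, and the tutorial itself keeps using cv2.threshold.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    `gray` is expected to be a uint8 array.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)

    weight_low = np.cumsum(prob)          # probability of values <= t
    mean_cum = np.cumsum(prob * levels)   # cumulative mean up to t
    mean_total = mean_cum[-1]
    weight_high = 1.0 - weight_low

    # Between-class variance; thresholds that leave one class empty get 0.
    denom = weight_low * weight_high
    numer = (mean_total * weight_low - mean_cum) ** 2
    variance = np.zeros(256)
    valid = denom > 0
    variance[valid] = numer[valid] / denom[valid]
    return int(np.argmax(variance))


def binarize_inverted(gray):
    """Dark pixels (<= threshold) become 255, bright pixels become 0."""
    t = otsu_threshold(gray)
    return np.where(gray > t, 0, 255).astype(np.uint8)


# Synthetic example: a dark stroke (30) on a light background (200).
gray = np.full((10, 10), 200, dtype=np.uint8)
gray[3:7, 4:6] = 30
binary = binarize_inverted(gray)
```

On this synthetic image the stroke pixels end up white and the background black, which is the polarity that contour detection expects.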

(2) Extracting character regions

To split the CAPTCHA into individual characters, we use contour detection to obtain the bounding box of each character.

def extract_characters(processed_img):
    contours, _ = cv2.findContours(processed_img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    boxes = []
    for contour in contours:
        x, y, w, h = cv2.boundingRect(contour)
        if w > 10 and h > 10:  # ignore small noise specks
            # Keep x alongside the crop so the characters can be ordered later
            boxes.append((x, processed_img[y:y+h, x:x+w]))

    # Sort left to right by the x coordinate of each bounding box
    boxes.sort(key=lambda box: box[0])
    return [char_img for _, char_img in boxes]

# Extract the character regions
char_images = extract_characters(processed_img)

# Show the extracted characters
for i, char_img in enumerate(char_images):
    cv2.imshow(f'Character {i+1}', char_img)
    cv2.waitKey(0)

cv2.destroyAllWindows()
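The extracted crops have different sizes, while the network defined later expects 28x28 inputs. One possible way to bridge that gap, sketched here under the assumption that padding to a square before resizing is acceptable for your data, is shown below. The helper name to_square_patch is hypothetical, and nearest-neighbor sampling is used only so the sketch needs nothing beyond NumPy.

```python
import numpy as np


def to_square_patch(char_img, size=28):
    """Pad a 2-D crop to a square with zeros, then resize to size x size.

    Uses nearest-neighbor sampling so that only NumPy is required.
    """
    h, w = char_img.shape
    side = max(h, w)
    square = np.zeros((side, side), dtype=char_img.dtype)
    top = (side - h) // 2
    left = (side - w) // 2
    square[top:top + h, left:left + w] = char_img

    # Map each output pixel to the nearest source pixel.
    idx = np.arange(size) * side // size
    return square[np.ix_(idx, idx)]


# A tall, narrow white crop (20 rows x 10 columns) as a stand-in character.
crop = np.full((20, 10), 255, dtype=np.uint8)
patch = to_square_patch(crop)
print(patch.shape)
```

Padding first keeps the aspect ratio of the character, so a narrow "1" is not stretched to fill the whole patch. With OpenCV available, cv2.resize on the padded square would be the more usual choice.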

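Building the Convolutional Neural Network (CNN)

Next, we use Keras to build a simple CNN. CNNs are well suited to images: stacked convolution and pooling layers learn to extract image features automatically.

(1) Defining the CNN model

The CNN has two convolution layers, each followed by a pooling layer, and then a fully connected layer and a softmax output layer.

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

def build_model(input_shape=(28, 28, 1), num_classes=36):
    model = Sequential()

    # Convolution block 1
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=input_shape))
    model.add(MaxPooling2D((2, 2)))

    # Convolution block 2
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2, 2)))

    # Flatten the feature maps into a vector
    model.add(Flatten())

    # Fully connected layer
    model.add(Dense(128, activation='relu'))

    # Output layer with softmax activation, one unit per class
    model.add(Dense(num_classes, activation='softmax'))

    # Compile the model; labels are integer class indices, so we use
    # sparse categorical cross-entropy
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

# Initialize the model
model = build_model()
model.summary()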
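The shapes and parameter counts that model.summary() reports can be checked by hand. The sketch below does the arithmetic for the architecture above, assuming the Keras defaults used there ('valid' padding and stride 1 for Conv2D, non-overlapping 2x2 windows for MaxPooling2D). The helper functions are hypothetical and exist only for this calculation.

```python
def conv_out(size, kernel=3):
    """Output side length of a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Output side length of non-overlapping max pooling (floor division)."""
    return size // pool


def conv_params(kernel, in_ch, out_ch):
    """Kernel weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch


def dense_params(in_units, out_units):
    """Weight matrix plus one bias per output unit."""
    return in_units * out_units + out_units


side = 28
side = pool_out(conv_out(side))   # 28 -> 26 -> 13
side = pool_out(conv_out(side))   # 13 -> 11 -> 5
flat_units = side * side * 64     # 5 * 5 * 64 = 1600

total_params = (
    conv_params(3, 1, 32)              # 320
    + conv_params(3, 32, 64)           # 18496
    + dense_params(flat_units, 128)    # 204928
    + dense_params(128, 36)            # 4644
)
print(flat_units, total_params)   # 1600 228388
```

Most of the parameters sit in the first dense layer, which is typical for small CNNs that flatten directly into a fully connected layer.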

Loading and Preprocessing the Dataset

We need to convert the CAPTCHA images into a format the CNN can consume. Each image is resized to 28x28 pixels and its pixel values are normalized to the [0, 1] range.

(1) Loading the dataset

import numpy as np
from sklearn.model_selection import train_test_split
from keras.preprocessing.image import load_img, img_to_array

def load_data(image_paths, labels, img_size=(28, 28)):
    images = []
    label_list = []

    for img_path, label in zip(image_paths, labels):
        # Load as a single-channel image resized to img_size
        img = load_img(img_path, color_mode='grayscale', target_size=img_size)
        img = img_to_array(img)  # shape: (height, width, 1)
        images.append(img)
        label_list.append(label)

    images = np.array(images, dtype=np.float32) / 255.0  # normalize to [0, 1]
    label_list = np.array(label_list)
    return images, label_list

# Suppose the image paths and labels are as follows
# (placeholders; a real dataset needs many more labelled images)
image_paths = ['captcha_images/train1.png', 'captcha_images/train2.png']
labels = [1, 2]  # integer class labels (adjust to your data)

# Load the data and split it into training and test sets
X, y = load_data(image_paths, labels)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Print the data shapes
print(f"Training data shape: {X_train.shape}, Testing data shape: {X_test.shape}")
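The labels passed to the model must be integer class indices in the range 0 to num_classes - 1. The original text does not say which characters the 36 classes stand for; the sketch below assumes they are the digits 0-9 followed by the uppercase letters A-Z, and the names ALPHABET, CHAR_TO_INDEX, and encode_labels are hypothetical.

```python
import string

# Assumption: the 36 classes are the digits 0-9 followed by A-Z.
ALPHABET = string.digits + string.ascii_uppercase
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def encode_labels(chars):
    """Convert characters to integer class indices (case-insensitive)."""
    return [CHAR_TO_INDEX[ch.upper()] for ch in chars]


print(encode_labels("7AZ"))  # [7, 10, 35]
```

Integer indices like these are what the sparse_categorical_crossentropy loss expects, so no one-hot encoding is needed.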

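Training the Model

We can now train the CNN on the training data, using the Adam optimizer and the cross-entropy loss configured in build_model.

(1) Training

# Train the model
history = model.fit(X_train, y_train, epochs=5, batch_size=32, validation_data=(X_test, y_test))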
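To help read the loss values printed during training, here is a small NumPy sketch of what sparse categorical cross-entropy measures: the mean negative log-probability the model assigns to the correct class. It is an illustration of the formula, not the Keras implementation; the clipping constant is an assumption chosen to avoid log(0).

```python
import numpy as np


def sparse_categorical_crossentropy(probs, labels):
    """Mean negative log-probability assigned to the true class.

    probs:  (n_samples, n_classes) array of softmax outputs
    labels: (n_samples,) sequence of integer class indices
    """
    probs = np.asarray(probs, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.int64)
    picked = probs[np.arange(len(labels)), labels]
    # Clip to avoid log(0); the exact epsilon is an assumption.
    return float(-np.mean(np.log(np.clip(picked, 1e-7, 1.0))))


probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
loss = sparse_categorical_crossentropy(probs, [0, 1])
print(round(loss, 4))
```

A model that outputs uniform probabilities over 36 classes has a loss of ln 36, roughly 3.58, which is a useful baseline: a training loss that stays near that value means the network has not learned anything yet.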

# Visualize the training process
import matplotlib.pyplot as plt

# Plot the accuracy and loss curves recorded during training
plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Model Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()

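Testing and Prediction

Once training is complete, we can evaluate the model on the test set and run predictions on new CAPTCHA images.

(1) Evaluating the model

# Evaluate the model on the test set
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print(f"Test Accuracy: {test_accuracy * 100:.2f}%")

(2) Predicting a new CAPTCHA

The function below returns a single class index for the image it is given. For a multi-character CAPTCHA, the same steps can be applied to each character crop produced by extract_characters.

def predict(model, img_path):
    # Preprocess the image
    img = preprocess_image(img_path)
    img = cv2.resize(img, (28, 28))       # resize to 28x28
    img = np.expand_dims(img, axis=-1)    # add the channel dimension
    img = np.expand_dims(img, axis=0)     # add the batch dimension
    img = img.astype(np.float32) / 255.0  # normalize to [0, 1]

    # Run the model; the output holds one probability per class
    predictions = model.predict(img)
    predicted_label = int(np.argmax(predictions))
    return predicted_label

# Predict a new CAPTCHA image
predicted_label = predict(model, 'captcha_images/test1.png')
print(f"Predicted label: {predicted_label}")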
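The predicted label is a class index, not a character. A minimal sketch of turning per-character softmax outputs back into a string follows, again assuming (this is not stated in the original) that index i maps to the i-th character of the digits 0-9 followed by A-Z. The names ALPHABET and decode_captcha are hypothetical.

```python
import string

import numpy as np

# Assumption: class index i corresponds to ALPHABET[i] (digits, then A-Z).
ALPHABET = string.digits + string.ascii_uppercase


def decode_captcha(prob_rows):
    """Join the most likely character of each row into one string.

    prob_rows: (n_chars, 36) array, one softmax output per character,
               ordered left to right.
    """
    indices = np.argmax(np.asarray(prob_rows), axis=1)
    return "".join(ALPHABET[i] for i in indices)


# Stand-in probabilities for a three-character CAPTCHA.
probs = np.zeros((3, 36))
probs[0, 3] = 1.0
probs[1, 10] = 1.0
probs[2, 35] = 1.0
print(decode_captcha(probs))  # 3AZ
```

The rows would come from calling model.predict on a batch of character crops kept in left-to-right order.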
