From 'Siamese Twins' to AI Image Matching: A Hands-On Guide to Building a Siamese Network Image Similarity Model with Python and Keras

news 2026/6/30 12:43:44

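The term "Siamese twins" goes back to a famous pair of conjoined twins born in Siam (now Thailand) in the 19th century. Today the same image lends its name to an elegant idea in computer vision: the Siamese network. When you need to decide quickly whether two ID photos show the same person, or when an e-commerce platform wants to know whether an uploaded picture matches one already in its catalog, this architecture is a natural fit.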

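Unlike a conventional network, a Siamese network behaves like conjoined twins sharing one "brain" (a single set of weights): both input images pass through the same sub-network and are mapped into the same feature space, where they can be compared directly. This article builds a complete image similarity system from scratch with Python 3.8 and TensorFlow 2.x, covering data preparation, model construction, and web deployment. Whether you are laying the groundwork for a face recognition system or deduplicating product images, the project provides code templates you can adapt.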

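1. Environment Setup and Data Preparation

1.1 Setting Up the Python Deep Learning Environment

We recommend creating an isolated Python environment with Anaconda to avoid package version conflicts. The following commands create and activate a virtual environment named siamese, then install the dependencies:

    conda create -n siamese python=3.8
    conda activate siamese
    pip install tensorflow==2.6 keras==2.6 pillow matplotlib opencv-python flask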

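For GPU acceleration you also need CUDA 11.2 and cuDNN 8.1, together with an NVIDIA driver that supports them. Run the nvidia-smi command to check that the driver can see your GPU.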
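The same check can be scripted, which is handy in setup scripts. The sketch below is only an illustration using the standard library: the helper names `nvidia_smi_available` and `list_gpus` are made up for this example, and it reports what the driver sees, not whether TensorFlow itself can use the GPU.

```python
import shutil
import subprocess


def nvidia_smi_available():
    """Return True if the nvidia-smi executable can be found on PATH."""
    return shutil.which("nvidia-smi") is not None


def list_gpus():
    """Return the GPU names reported by nvidia-smi, or [] if none are visible."""
    if not nvidia_smi_available():
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=10, check=True,
        )
    except (subprocess.SubprocessError, OSError):
        # Driver installed but not working, or the call timed out
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = list_gpus()
    print("GPUs visible to the driver:", gpus if gpus else "none")
```

Inside the siamese environment, `tf.config.list_physical_devices('GPU')` answers the complementary question of whether TensorFlow has picked up the device.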

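1.2 Getting the Signature Verification Dataset

We use the ICDAR 2011 signature verification competition dataset as the example. It contains two kinds of signatures, genuine and forged, which makes it a good fit for a similarity task. After downloading and extracting it, the directory structure should look like this:

    /signature_data
        /train
            /genuine
                user1_1.png
                user1_2.png
                ...
            /forged
                user1_f1.png
                user1_f2.png
                ...
        /test
            /genuine
            /forged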
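Given that layout, collecting file paths and labels is a short directory walk. The helper below is a minimal sketch: the name `list_signature_files` and the label convention (1 = genuine, 0 = forged) are assumptions for illustration, chosen to match the pair-generation code later in the article.

```python
from pathlib import Path


def list_signature_files(split_dir):
    """Return (paths, labels) for one split; label 1 = genuine, 0 = forged."""
    split_dir = Path(split_dir)
    paths, labels = [], []
    for folder, label in (("genuine", 1), ("forged", 0)):
        # sorted() makes the order reproducible across runs and platforms
        for png in sorted((split_dir / folder).glob("*.png")):
            paths.append(png)
            labels.append(label)
    return paths, labels
```

A call such as `list_signature_files("signature_data/train")` would then return the training paths with their labels, ready to be read and resized.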

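Tip: if you do not have a curated dataset, you can make a simple one yourself. For example, photos of the same object from different angles can serve as positive pairs, and photos of different objects as negative pairs.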

1.3 Data Preprocessing Pipeline

All images need to be resized to 105x105 pixels and normalized. The create_pairs function below builds the training pairs, and Keras's ImageDataGenerator provides a convenient way to add data augmentation:

    import numpy as np
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    def create_pairs(images, labels, pair_count=1000):
        """Build positive and negative pairs (labels: 1 = genuine, 0 = forged)."""
        genuine_idx = np.where(labels == 1)[0]
        forged_idx = np.where(labels == 0)[0]
        pairs = []
        pair_labels = []
        # Positive pairs (same class): two different genuine signatures
        for _ in range(pair_count // 2):
            idx1, idx2 = np.random.choice(genuine_idx, 2, replace=False)
            pairs.append([images[idx1], images[idx2]])
            pair_labels.append(1)
        # Negative pairs (different classes): one genuine, one forged
        for _ in range(pair_count // 2):
            # No size argument, so each call returns a scalar index
            idx1 = np.random.choice(genuine_idx)
            idx2 = np.random.choice(forged_idx)
            pairs.append([images[idx1], images[idx2]])
            pair_labels.append(0)
        return np.array(pairs), np.array(pair_labels)

    # Data augmentation settings
    train_datagen = ImageDataGenerator(
        rescale=1./255,
        rotation_range=10,
        width_shift_range=0.1,
        height_shift_range=0.1,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)

    val_datagen = ImageDataGenerator(rescale=1./255)
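Note that create_pairs treats every genuine signature as one class, regardless of who signed. For signature verification it is common to build pairs per writer instead, so that a positive pair is two genuine signatures by the same person and a negative pair is a genuine signature paired with a forgery of that person. The sketch below illustrates that variant on file names only. It assumes the `user1_2.png` / `user1_f1.png` naming shown in the directory listing, and the helper names `writer_id` and `writer_aware_pairs` are hypothetical.

```python
import itertools
import random
import re


def writer_id(filename):
    """Extract the writer prefix, e.g. 'user1' from 'user1_2.png' (assumed naming)."""
    match = re.match(r"([A-Za-z]+\d+)_", filename)
    if match is None:
        raise ValueError(f"unexpected filename: {filename}")
    return match.group(1)


def writer_aware_pairs(genuine_files, forged_files, seed=0):
    """Return shuffled (file_a, file_b, label) triples built separately per writer."""
    rng = random.Random(seed)
    groups = {}
    for kind, names in (("genuine", genuine_files), ("forged", forged_files)):
        for name in names:
            groups.setdefault(writer_id(name), {"genuine": [], "forged": []})[kind].append(name)

    pairs = []
    for files in groups.values():
        positives = list(itertools.combinations(files["genuine"], 2))
        negatives = list(itertools.product(files["genuine"], files["forged"]))
        # Keep the two classes balanced by sampling the larger set down
        n = min(len(positives), len(negatives))
        pairs += [(a, b, 1) for a, b in rng.sample(positives, n)]
        pairs += [(a, b, 0) for a, b in rng.sample(negatives, n)]
    rng.shuffle(pairs)
    return pairs
```

The returned triples can then be turned into image arrays with whatever loading code the project already uses.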

2. Building the Siamese Network Model

2.1 Designing the Feature Extraction Network

We build the shared-weight feature extractor as a small VGG-style network, a heavily simplified take on VGG16. The same network processes both input images:

    from tensorflow.keras.models import Model
    from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense

    def create_base_network(input_shape):
        """Build the base network whose weights are shared by both branches."""
        inputs = Input(shape=input_shape)
        x = Conv2D(64, (3, 3), activation='relu', padding='same')(inputs)
        x = MaxPooling2D((2, 2))(x)
        x = Conv2D(128, (3, 3), activation='relu', padding='same')(x)
        x = MaxPooling2D((2, 2))(x)
        x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
        x = MaxPooling2D((2, 2))(x)
        x = Flatten()(x)
        x = Dense(1024, activation='relu')(x)  # 1024-dimensional embedding
        return Model(inputs, x)
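It helps to trace the tensor sizes through this network by hand, because the final Dense layer dominates the parameter count. The following plain-Python sketch does the arithmetic for a 105x105 input. The function names are made up for illustration, and it assumes the Keras defaults used above ('same' convolutions, 2x2 pooling with stride 2 and 'valid' padding).

```python
def pooled_size(size, pool=2):
    """Output size of a 2x2 max pool with stride 2 and 'valid' padding."""
    return (size - pool) // pool + 1


def trace_base_network(height=105, width=105, filters=(64, 128, 256), embedding_dim=1024):
    """Return (flattened feature count, Dense layer parameter count)."""
    h, w = height, width
    for _ in filters:
        # 'same' convolutions keep h and w; each pooling layer roughly halves them
        h, w = pooled_size(h), pooled_size(w)
    flat = h * w * filters[-1]
    dense_params = flat * embedding_dim + embedding_dim  # weights + biases
    return flat, dense_params


flat, dense_params = trace_base_network()
print(flat, dense_params)  # 43264 44303360
```

The spatial size goes 105 → 52 → 26 → 13, so the Dense layer alone holds about 44 million parameters. If memory is tight, a smaller embedding or an extra pooling stage reduces this sharply.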

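2.2 Implementing the Contrastive Loss

At the heart of the Siamese network is the contrastive loss, which pulls similar pairs closer together and pushes dissimilar pairs apart until they are at least a margin away:

    from tensorflow.keras import backend as K

    def contrastive_loss(y_true, y_pred, margin=1):
        """Contrastive loss: y_pred is the distance, y_true is 1 for similar pairs."""
        y_true = K.cast(y_true, y_pred.dtype)
        # Similar pairs are penalized by their squared distance
        square_pred = K.square(y_pred)
        # Dissimilar pairs are penalized only while closer than the margin
        margin_square = K.square(K.maximum(margin - y_pred, 0))
        return K.mean(y_true * square_pred + (1 - y_true) * margin_square)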
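To see what the loss rewards, it can be evaluated by hand on a few pairs. The sketch below recomputes the same formula, mean(y * d^2 + (1 - y) * max(margin - d, 0)^2), in plain Python. The function name and the sample numbers are made up for illustration.

```python
def contrastive_loss_value(labels, distances, margin=1.0):
    """Mean contrastive loss; label 1 = similar pair, 0 = dissimilar pair."""
    total = 0.0
    for y, d in zip(labels, distances):
        similar_term = y * d ** 2
        dissimilar_term = (1 - y) * max(margin - d, 0.0) ** 2
        total += similar_term + dissimilar_term
    return total / len(labels)


labels = [1, 1, 0, 0]
distances = [0.2, 0.9, 0.4, 1.5]
# Per-pair terms: 0.04, 0.81, 0.36, 0.0, so the mean is about 0.3025
print(contrastive_loss_value(labels, distances))
```

The last pair contributes nothing: a dissimilar pair already farther apart than the margin is not pushed any further, which is what keeps the embedding space from expanding without bound.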

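2.3 Assembling the Complete Model

Combine the base network with a distance layer to form the complete Siamese network, then compile it with the contrastive loss:

    from tensorflow.keras.layers import Lambda

    def build_siamese_model(input_shape):
        # Two inputs, one per image of the pair
        input_a = Input(shape=input_shape)
        input_b = Input(shape=input_shape)
        # One base network applied to both inputs, so the weights are shared
        base_network = create_base_network(input_shape)
        processed_a = base_network(input_a)
        processed_b = base_network(input_b)
        # Euclidean distance between the two embeddings; the epsilon keeps the
        # gradient of sqrt finite when the embeddings coincide
        distance = Lambda(lambda x: K.sqrt(K.maximum(
            K.sum(K.square(x[0] - x[1]), axis=1, keepdims=True),
            K.epsilon())))([processed_a, processed_b])
        # Build the model
        return Model([input_a, input_b], distance)

    def accuracy(y_true, y_pred):
        """Pair accuracy: a distance below 0.5 counts as 'similar' (label 1)."""
        y_true = K.cast(y_true, y_pred.dtype)
        predicted = K.cast(y_pred < 0.5, y_pred.dtype)
        return K.mean(K.cast(K.equal(y_true, predicted), y_pred.dtype))

    # Compile the model. The built-in 'accuracy' string would treat a distance
    # above 0.5 as the positive class, which is the opposite of what we want,
    # so a distance-aware metric is passed instead.
    siamese_model = build_siamese_model((105, 105, 1))
    siamese_model.compile(loss=contrastive_loss, optimizer='adam', metrics=[accuracy])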

3. Model Training and Tuning

3.1 Training Configuration

Use EarlyStopping to guard against overfitting and ModelCheckpoint to save the best model:

    from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

    callbacks = [
        EarlyStopping(monitor='val_loss', patience=10, verbose=1),
        ModelCheckpoint('best_model.h5', monitor='val_loss', save_best_only=True)
    ]

    # train_pairs and val_pairs are pair arrays as returned by create_pairs;
    # index 0 / 1 on axis 1 selects the first / second image of each pair
    history = siamese_model.fit(
        [train_pairs[:, 0], train_pairs[:, 1]], train_labels,
        validation_data=([val_pairs[:, 0], val_pairs[:, 1]], val_labels),
        batch_size=32,
        epochs=50,
        callbacks=callbacks)

3.2 Visualizing the Training Process

Plot the loss and accuracy curves to see how the model is learning:

    import matplotlib.pyplot as plt

    plt.figure(figsize=(12, 4))

    # Left panel: training and validation loss
    plt.subplot(1, 2, 1)
    plt.plot(history.history['loss'], label='Train Loss')
    plt.plot(history.history['val_loss'], label='Val Loss')
    plt.legend()
    plt.title('Loss Evolution')

    # Right panel: training and validation accuracy
    plt.subplot(1, 2, 2)
    plt.plot(history.history['accuracy'], label='Train Acc')
    plt.plot(history.history['val_accuracy'], label='Val Acc')
    plt.legend()
    plt.title('Accuracy Evolution')

    plt.show()
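Because the network outputs a distance, not a probability, any accuracy figure depends on the distance threshold used to call a pair "similar". One way to choose it is to scan candidate thresholds on validation pairs. The sketch below is a plain-Python illustration with hypothetical helper names and made-up numbers; in practice the distances would come from the model's predictions on the validation pairs.

```python
def accuracy_at(threshold, distances, labels):
    """Fraction of pairs classified correctly when distance < threshold means 'similar'."""
    correct = sum(int(d < threshold) == y for d, y in zip(distances, labels))
    return correct / len(labels)


def best_threshold(distances, labels):
    """Try each midpoint between neighboring distances and keep the most accurate."""
    points = sorted(set(distances))
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    if not candidates:
        raise ValueError("need at least two distinct distances")
    return max(candidates, key=lambda t: accuracy_at(t, distances, labels))


val_distances = [0.1, 0.3, 0.35, 0.8, 1.2, 0.6]
val_labels = [1, 1, 1, 0, 0, 0]
threshold = best_threshold(val_distances, val_labels)
print(threshold, accuracy_at(threshold, val_distances, val_labels))
```

The threshold should be chosen on validation data and then kept fixed when reporting test results.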

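3.3 Key Tuning Techniques

Dynamic margin adjustment: gradually increase the margin in the contrastive loss as training progresses
Hard example mining: after each epoch, find the pairs the model gets wrong and give them more weight in the next epoch
Embedding dimension: try different embedding sizes (256/512/1024) and compare validation performance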
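The first two techniques can be sketched in a few lines. Everything here is an assumption for illustration: the linear schedule, its start and end values, and the helper names are not prescribed by the article, and the mining step simply ranks pairs by their individual contrastive loss term.

```python
def margin_for_epoch(epoch, total_epochs, start=0.5, end=1.0):
    """Linearly grow the margin from `start` to `end` over training."""
    if total_epochs <= 1:
        return end
    fraction = min(epoch / (total_epochs - 1), 1.0)
    return start + fraction * (end - start)


def hardest_pairs(distances, labels, margin, top_k):
    """Return indices of the top_k pairs with the largest contrastive loss term."""
    def pair_loss(i):
        d, y = distances[i], labels[i]
        # Similar pairs are hard when far apart, dissimilar pairs when too close
        return d ** 2 if y == 1 else max(margin - d, 0.0) ** 2

    ranked = sorted(range(len(labels)), key=pair_loss, reverse=True)
    return ranked[:top_k]


print(margin_for_epoch(0, 50), margin_for_epoch(49, 50))       # 0.5 1.0
print(hardest_pairs([0.1, 0.9, 0.2, 1.5], [1, 1, 0, 0], 1.0, 2))  # [1, 2]
```

The selected indices could be oversampled or passed to training as larger sample weights in the following epoch.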

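Note: Siamese networks are sensitive to class balance, so keep the ratio of positive to negative pairs close to 1:1. If the dataset is imbalanced, you can add class weights to the loss function.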
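One possible form of such a weighted loss is shown below in plain Python so the arithmetic is easy to follow. This is a sketch under assumptions: the inverse-frequency weighting and both function names are illustrative choices, not something the article specifies.

```python
def balanced_weights(labels):
    """Inverse-frequency weights (pos_weight, neg_weight); needs both classes present."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return len(labels) / (2 * n_pos), len(labels) / (2 * n_neg)


def weighted_contrastive_loss(labels, distances, margin=1.0, pos_weight=1.0, neg_weight=1.0):
    """Weighted mean of the per-pair contrastive loss terms."""
    total, weight_sum = 0.0, 0.0
    for y, d in zip(labels, distances):
        if y == 1:
            total += pos_weight * d ** 2
            weight_sum += pos_weight
        else:
            total += neg_weight * max(margin - d, 0.0) ** 2
            weight_sum += neg_weight
    return total / weight_sum


labels = [1, 0, 0, 0]            # one positive pair, three negative pairs
pos_w, neg_w = balanced_weights(labels)
print(pos_w, neg_w)              # the single positive pair gets the larger weight
print(weighted_contrastive_loss(labels, [0.5, 0.5, 0.5, 0.5], 1.0, pos_w, neg_w))
```

With these weights the positive and negative classes contribute the same total weight to the loss, even though there are three times as many negative pairs.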

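4. Model Deployment and Applications

4.1 Building a Flask Web Interface

Create a simple web application that lets users upload two images and returns a similarity score. The route expects an upload.html template containing a form with two file fields named file1 and file2:

    import os

    import cv2
    import numpy as np
    from flask import Flask, request, render_template
    from tensorflow.keras.models import load_model
    from werkzeug.utils import secure_filename

    app = Flask(__name__)
    app.config['UPLOAD_FOLDER'] = 'static/uploads/'
    os.makedirs(app.config['UPLOAD_FOLDER'], exist_ok=True)

    # Load the trained model once at startup; compile=False skips the custom
    # loss, which is only needed for training
    model = load_model('best_model.h5', compile=False)

    def preprocess_image(file):
        """Save an uploaded file and convert it to a (105, 105, 1) float array."""
        filename = secure_filename(file.filename)
        filepath = os.path.join(app.config['UPLOAD_FOLDER'], filename)
        file.save(filepath)
        img = cv2.imread(filepath, cv2.IMREAD_GRAYSCALE)
        img = cv2.resize(img, (105, 105))
        img = img.astype('float32') / 255.0
        img = np.expand_dims(img, axis=-1)
        return img

    @app.route('/', methods=['GET', 'POST'])
    def upload_file():
        if request.method == 'POST':
            # Read the two uploaded images
            file1 = request.files['file1']
            file2 = request.files['file2']
            # Save and preprocess them
            img1 = preprocess_image(file1)
            img2 = preprocess_image(file2)
            # Predict the distance between the two images
            distance = model.predict([np.array([img1]), np.array([img2])])[0][0]
            # Heuristic mapping to a 0-1 score; the distance has no upper
            # bound, so the result is clipped
            similarity = float(np.clip(1 - distance / 2, 0.0, 1.0))
            return f'Similarity Score: {similarity:.2f}'
        return render_template('upload.html')

    if __name__ == '__main__':
        app.run(debug=True)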
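The mapping from distance to score deserves a second look. The expression 1 - distance / 2 only stays within 0 to 1 while the distance is at most 2, and the Euclidean distance between unnormalized embeddings has no such bound. The sketch below contrasts two simple alternatives; both function names and the choice of an exponential mapping are illustrative assumptions, not the article's method.

```python
import math


def clipped_similarity(distance, scale=2.0):
    """Linear mapping 1 - distance / scale, clamped to the range [0, 1]."""
    return min(max(1.0 - distance / scale, 0.0), 1.0)


def exp_similarity(distance):
    """Smooth mapping exp(-distance): 1 at distance 0, approaching 0 for large distances."""
    return math.exp(-distance)


for d in (0.0, 1.0, 3.0):
    print(d, clipped_similarity(d), round(exp_similarity(d), 3))
```

Either score is only a monotone rescaling of the distance, so the accept/reject decision is still governed by the distance threshold chosen on validation data.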

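4.2 Performance Optimization Tips

Model quantization: convert the model with TensorFlow Lite to reduce inference time and memory usage
Asynchronous processing: for large batches of comparisons, use Celery to run them through an asynchronous task queue
Caching: cache the results for image pairs that are compared frequently to avoid repeated computation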
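The caching idea can be sketched with the standard library alone. The class below is a hypothetical in-memory example: it keys each result on the SHA-256 digests of the two images' raw bytes and sorts the digests, which assumes the score is symmetric (true for a Euclidean distance). A production service would more likely use an external store such as Redis.

```python
import hashlib


class PairCache:
    """In-memory cache of similarity scores keyed by image content."""

    def __init__(self):
        self._scores = {}

    @staticmethod
    def _key(image_a, image_b):
        # Sort the two digests so (a, b) and (b, a) map to the same entry
        digests = sorted(hashlib.sha256(img).hexdigest() for img in (image_a, image_b))
        return "|".join(digests)

    def get_or_compute(self, image_a, image_b, compute):
        """Return the cached score, calling compute(image_a, image_b) on a miss."""
        key = self._key(image_a, image_b)
        if key not in self._scores:
            self._scores[key] = compute(image_a, image_b)
        return self._scores[key]
```

Here `compute` would wrap the model call, and `image_a` / `image_b` are the uploaded files' bytes.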

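4.3 Extending to Real-World Applications

Siamese networks are useful well beyond signature verification. Here are suggested adjustments for some typical applications:

Face verification:
- Use a pretrained FaceNet model as the base network
- Replace the contrastive loss with Triplet Loss
- Add a liveness detection module for better security
Product image deduplication:
- Fine-tune the network architecture for e-commerce images
- Introduce an attention mechanism to emphasize the product itself
- Build a large-scale feature database for fast retrieval
Medical image analysis:
- Use 3D convolutions to process CT/MRI sequences
- Incorporate lesion-region annotations
- Develop an interactive annotation tool for clinicians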

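    # Example: feature extraction and similarity search for product images
    import cv2
    import numpy as np

    def load_and_preprocess(path):
        """Read an image from disk (assumed to mirror preprocess_image)."""
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        img = cv2.resize(img, (105, 105)).astype('float32') / 255.0
        return np.expand_dims(img, axis=-1)

    def extract_features(base_network, image_paths):
        """Extract embeddings in batch; base_network is the shared sub-model
        built by create_base_network."""
        features = []
        for path in image_paths:
            img = load_and_preprocess(path)
            feature = base_network.predict(np.array([img]))[0]
            features.append(feature)
        return np.array(features)

    def find_similar_products(query_feature, product_features, top_k=5):
        """Find the most similar products by Euclidean distance."""
        distances = np.linalg.norm(product_features - query_feature, axis=1)
        nearest_indices = np.argsort(distances)[:top_k]
        return nearest_indices, distances[nearest_indices]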
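For the face verification scenario, Triplet Loss compares an anchor image with one positive and one negative example at the same time. The sketch below computes it for a single triplet in plain Python. It uses non-squared Euclidean distances and a margin of 0.2 as illustrative assumptions (FaceNet's formulation uses squared distances), and the function names are made up for this example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0) for one triplet."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)


anchor, positive = [0.0, 0.0], [0.3, 0.4]       # distance 0.5
print(triplet_loss(anchor, positive, [3.0, 4.0]))  # easy negative: loss is 0
print(triplet_loss(anchor, positive, [0.3, 0.0]))  # hard negative: loss is about 0.4
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so training effort concentrates on triplets that still violate that ordering.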
