Day 37 (2026/4/23)

Author: 张小明 (front-end development engineer)
# DAY 37 Review: Early Stopping and Saving Model Weights

1. Detecting overfitting: print metrics for the training set and the test set side by side

2. Saving and loading models

a. Save the weights only

b. Save the weights together with the model

c. Save a full checkpoint, which also includes the training state

3. Early stopping

Assignment: train on the credit dataset and save the weights; then load the weights, continue training for 50 more epochs, and apply early stopping.
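The three save modes in point 2 can be sketched with `torch.save`/`torch.load`. This is a minimal illustration: the tiny `nn.Sequential` model and the file names are stand-ins, not the credit model from this post.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 2))  # tiny stand-in model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# (a) Weights only: just the state_dict. Smallest file, but you must
#     rebuild the model object yourself before loading.
torch.save(model.state_dict(), "weights_only.pth")

# (b) Weights + model: pickles the whole module object
#     (convenient, but brittle if the class definition later changes).
torch.save(model, "model_and_weights.pth")

# (c) Full checkpoint: weights + optimizer state + training progress,
#     which is what resuming training requires.
torch.save({
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
    "epoch": 100,
}, "checkpoint.pth")

# Loading each variant back:
m_a = nn.Sequential(nn.Linear(4, 2))
m_a.load_state_dict(torch.load("weights_only.pth"))

m_b = torch.load("model_and_weights.pth", weights_only=False)  # whole module

ckpt = torch.load("checkpoint.pth", weights_only=False)
m_c = nn.Sequential(nn.Linear(4, 2))
m_c.load_state_dict(ckpt["model_state_dict"])
```

Note `weights_only=False` for variants (b) and (c): recent PyTorch versions default to `weights_only=True` in `torch.load`, which only permits plain tensors and basic containers.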

Early stopping:

Early stopping is a common technique in deep-learning training for preventing overfitting and cutting training time. The core logic is simple:

1. Goal

Prevent the model from "over-learning" the details (and even the noise) of the training data: stop training early and keep the model that generalizes best to unseen test data.

2. Basic logic

During training, continuously monitor a performance metric on the test set (usually the test loss):

While the test metric (e.g. loss) keeps improving, the model is still learning useful structure, so training continues;

Once the test metric stops improving (or starts getting worse), the model has begun to overfit (memorizing the training data rather than generalizing), so training is terminated early.

3. Key parameters

Patience: the maximum number of evaluations the test metric is allowed to go without improving (e.g. with patience set to 100, early stopping triggers after 100 consecutive evaluations in which the test loss does not drop);

Best-model saving: during training, the model with the best test performance so far is saved, because by the time early stopping triggers the current model may already have degraded, so you roll back to the best state.

4. Benefits

Prevents overfitting: the model does not drift off track late in training;

Saves time: training does not have to run to the preset maximum number of epochs;

Keeps the best model automatically: no need to hand-pick the number of training epochs.
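The bookkeeping described above (monitor a metric, count non-improving checks, remember the best state) fits in a small helper class. This is a hypothetical framework-independent sketch; the name `EarlyStopper` and the `min_delta` tolerance are my additions, not code from these notes.

```python
class EarlyStopper:
    """Minimal early-stopping tracker.

    Call update() after each evaluation; it returns True when training
    should stop, i.e. after `patience` consecutive checks without improvement.
    """

    def __init__(self, patience=100, min_delta=0.0):
        self.patience = patience    # max checks allowed without improvement
        self.min_delta = min_delta  # an improvement smaller than this doesn't count
        self.best_loss = float('inf')
        self.best_step = 0          # when the best loss was seen (for rollback)
        self.counter = 0

    def update(self, loss, step):
        if loss < self.best_loss - self.min_delta:
            self.best_loss = loss   # new best: remember it and reset the counter
            self.best_step = step
            self.counter = 0
            return False
        self.counter += 1
        return self.counter >= self.patience
```

In a training loop you would call `stopper.update(test_loss.item(), epoch)` at each evaluation, save the weights whenever it records a new best, and `break` when it returns `True`, then reload the weights saved at `stopper.best_step`.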

## I. Detecting Overfitting

This exercise uses the credit dataset.

```python
import pandas as pd              # tabular data handling
import numpy as np               # numerical computing
import matplotlib.pyplot as plt  # plotting
import warnings
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import torch
import torch.nn as nn
import torch.optim as optim
import time
from tqdm import tqdm            # progress bar (used in the training loop below)

warnings.filterwarnings('ignore')  # suppress warnings for clean output

# Chinese font settings (fixes CJK rendering in matplotlib)
plt.rcParams['font.sans-serif'] = ['SimHei']  # common Windows sans-serif CJK font
plt.rcParams['axes.unicode_minus'] = False    # render minus signs correctly

data = pd.read_csv('data.csv')  # load the data

# Identify string (object) columns
discrete_features = data.select_dtypes(include=['object']).columns.tolist()

# Label-encode Home Ownership
home_ownership_mapping = {
    'Own Home': 1, 'Rent': 2, 'Have Mortgage': 3, 'Home Mortgage': 4
}
data['Home Ownership'] = data['Home Ownership'].map(home_ownership_mapping)

# Label-encode Years in current job
years_in_job_mapping = {
    '< 1 year': 1, '1 year': 2, '2 years': 3, '3 years': 4,
    '4 years': 5, '5 years': 6, '6 years': 7, '7 years': 8,
    '8 years': 9, '9 years': 10, '10+ years': 11
}
data['Years in current job'] = data['Years in current job'].map(years_in_job_mapping)

# One-hot encode Purpose; remember to cast the resulting bool columns to int
data = pd.get_dummies(data, columns=['Purpose'])
data2 = pd.read_csv("data.csv")  # re-read the raw data to diff column names
list_final = []                  # columns added by one-hot encoding
for i in data.columns:
    if i not in data2.columns:
        list_final.append(i)
for i in list_final:
    data[i] = data[i].astype(int)

# Map Term to 0/1 and rename the column
term_mapping = {'Short Term': 0, 'Long Term': 1}
data['Term'] = data['Term'].map(term_mapping)
data.rename(columns={'Term': 'Long Term'}, inplace=True)

# Fill missing values in continuous features with the mode
continuous_features = data.select_dtypes(include=['int64', 'float64']).columns.tolist()
for feature in continuous_features:
    mode_value = data[feature].mode()[0]                # mode of the column
    data[feature].fillna(mode_value, inplace=True)      # fill in place

X = data.drop(['Credit Default'], axis=1)  # features (axis=1 drops a column)
y = data['Credit Default']                 # labels

# Train/test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Normalize features
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# ===================== 3. Convert to PyTorch tensors (GPU/CPU) =====================
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Print types and shapes (helps debug conversion issues)
print("X_train type:", type(X_train), "| shape:", X_train.shape)
print("y_train type:", type(y_train), "| shape:", y_train.shape)

# Features: already numpy arrays after MinMaxScaler
X_train = torch.FloatTensor(X_train).to(device)
X_test = torch.FloatTensor(X_test).to(device)

# Labels: accept either a pandas Series or a numpy array
y_train_np = y_train.values if isinstance(y_train, pd.Series) else y_train
y_train = torch.LongTensor(y_train_np).to(device)
y_test_np = y_test.values if isinstance(y_test, pd.Series) else y_test
y_test = torch.LongTensor(y_test_np).to(device)

# Verify dimensions (feature count must be 31)
print(f"X_train shape: {X_train.shape}")  # (N, 31), N = number of training samples
print(f"y_train shape: {y_train.shape}")  # (N,)
print(f"Feature count (model input dim): {X_train.shape[1]}")

# ===================== 4. MLP for the 31-dimensional features =====================
class MLP(nn.Module):
    def __init__(self, input_dim=31, hidden_dim=64, output_dim=2):
        super(MLP, self).__init__()
        # Input layer: 31 features -> 64-unit hidden layer (tunable)
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.2)  # dropout to curb overfitting
        # Hidden layer -> output layer (binary classification: 2 logits)
        self.fc2 = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.dropout(out)  # randomly drop 20% of activations
        out = self.fc2(out)
        return out

model = MLP(input_dim=X_train.shape[1]).to(device)
print("Model structure:")
print(model)

# ===================== 5. Training configuration =====================
criterion = nn.CrossEntropyLoss()  # standard choice for 2-class classification
optimizer = optim.Adam(model.parameters(), lr=0.001)  # Adam converges faster than SGD here
num_epochs = 1000  # this dataset does not need 20000 epochs; 1000 is enough

# ===================== 6. Training loop =====================
train_losses = []
test_losses = []
epochs_list = []
start_time = time.time()

with tqdm(total=num_epochs, desc="Training", unit="epoch") as pbar:
    for epoch in range(num_epochs):
        model.train()
        # Forward pass
        outputs = model(X_train)
        train_loss = criterion(outputs, y_train)
        # Backward pass + update
        optimizer.zero_grad()
        train_loss.backward()
        optimizer.step()

        # Every 20 epochs, record the test loss to monitor overfitting
        if (epoch + 1) % 20 == 0:
            model.eval()
            with torch.no_grad():
                test_outputs = model(X_test)
                test_loss = criterion(test_outputs, y_test)
            train_losses.append(train_loss.item())
            test_losses.append(test_loss.item())
            epochs_list.append(epoch + 1)
            pbar.set_postfix({
                'Train Loss': f'{train_loss.item():.4f}',
                'Test Loss': f'{test_loss.item():.4f}'
            })
        pbar.update(1)

train_time = time.time() - start_time
print(f"\nTraining time: {train_time:.2f} s")

# ===================== 7. Evaluation =====================
model.eval()
with torch.no_grad():
    test_outputs = model(X_test)
    _, predicted = torch.max(test_outputs, 1)
    correct = (predicted == y_test).sum().item()
    accuracy = correct / y_test.size(0)
    print(f"Test accuracy: {accuracy * 100:.2f}%")

# ===================== 8. Loss curves =====================
plt.figure(figsize=(10, 6))
plt.plot(epochs_list, train_losses, label='Train loss')
plt.plot(epochs_list, test_losses, label='Test loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Train/test loss over training')
plt.legend()
plt.grid(True)
plt.show()
```

1. Curve trend and fitting phases

Early phase (epochs 0-200): the training loss (blue) and test loss (orange) drop quickly and in step, meaning the model is effectively learning the core structure of the data. This is the normal learning phase.

Middle and late phase (epochs 200-1000): the training loss keeps decreasing slowly and then flattens, while the test loss settles around 0.46. The gap between the two stays small throughout (final training loss about 0.44, test loss about 0.46), with no sign of a widening gap.

2. The core overfitting test (not met)

The classic signature of overfitting is: the training loss keeps falling (even toward 0) while the test loss bottoms out and then starts rising, or the training loss ends up far below the test loss.

In this plot, however:

the test loss never shows a "drop then rise" pattern and stays stable;

the gap between training and test loss is small, with no disconnect between excellent training performance and poor test performance.

3. Conclusion

The model's fit is normal and it generalizes well; there is no overfitting.
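The two criteria above (a widening train/test gap, a rebounding test loss) can be expressed as a rough sanity check. The function name `overfit_report` and the 0.05 tolerance are illustrative choices, not part of the original analysis:

```python
def overfit_report(train_losses, test_losses, tol=0.05):
    """Rough overfitting verdict from recorded loss histories.

    tol is an arbitrary tolerance: how much the test loss may exceed its
    minimum (rebound) or the train loss (gap) before we flag trouble.
    """
    gap = test_losses[-1] - train_losses[-1]          # final train/test gap
    rising = test_losses[-1] > min(test_losses) + tol  # test loss rebounded?
    if rising or gap > tol:
        return "possible overfitting"
    return "fit looks normal"
```

With the final values read off this plot (train about 0.44, test about 0.46) the gap is only 0.02 and the test loss never rebounds, matching the conclusion above.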

## II. Resuming Training for 50 Epochs from Saved Weights

```python
import torch
import torch.nn as nn
import torch.optim as optim
import pandas as pd
import numpy as np
import time
import matplotlib.pyplot as plt
from tqdm import tqdm
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
import warnings

warnings.filterwarnings("ignore")

# ===================== 1. Global config + preprocessing (credit dataset) =====================
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

def preprocess_data():
    # Chinese font settings (optional, for plots)
    plt.rcParams['font.sans-serif'] = ['SimHei']
    plt.rcParams['axes.unicode_minus'] = False

    # Raw string avoids path-escape issues
    data = pd.read_csv(r'data.csv')

    # 1. Encode string features
    home_ownership_mapping = {'Own Home': 1, 'Rent': 2, 'Have Mortgage': 3, 'Home Mortgage': 4}
    data['Home Ownership'] = data['Home Ownership'].map(home_ownership_mapping)

    years_in_job_mapping = {
        '< 1 year': 1, '1 year': 2, '2 years': 3, '3 years': 4,
        '4 years': 5, '5 years': 6, '6 years': 7, '7 years': 8,
        '8 years': 9, '9 years': 10, '10+ years': 11
    }
    data['Years in current job'] = data['Years in current job'].map(years_in_job_mapping)

    # One-hot encode Purpose and cast the new bool columns to int
    data = pd.get_dummies(data, columns=['Purpose'])
    data2 = pd.read_csv(r'data.csv')
    list_final = [col for col in data.columns if col not in data2.columns]
    for col in list_final:
        data[col] = data[col].astype(int)

    # Term mapping + rename
    term_mapping = {'Short Term': 0, 'Long Term': 1}
    data['Term'] = data['Term'].map(term_mapping)
    data.rename(columns={'Term': 'Long Term'}, inplace=True)

    # 2. Fill missing continuous features with the mode
    continuous_features = data.select_dtypes(include=['int64', 'float64']).columns.tolist()
    for feature in continuous_features:
        mode_value = data[feature].mode()[0]
        data[feature].fillna(mode_value, inplace=True)

    # 3. Split features / labels
    X = data.drop(['Credit Default'], axis=1)  # 31 features
    y = data['Credit Default']                 # binary label (0/1)

    # 4. 80/20 train/test split
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # 5. Normalize features for training stability
    scaler = MinMaxScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)

    # 6. Tensor conversion (handles pandas Series and numpy arrays)
    def to_tensor(data, dtype=torch.float32):
        """Convert to a tensor on `device`, accepting Series/ndarray."""
        if isinstance(data, pd.Series):
            arr = data.values
        elif isinstance(data, np.ndarray):
            arr = data
        else:
            arr = np.array(data)
        return torch.tensor(arr, dtype=dtype).to(device)

    X_train = to_tensor(X_train, torch.float32)
    X_test = to_tensor(X_test, torch.float32)
    y_train = to_tensor(y_train, torch.long)  # classification labels must be long
    y_test = to_tensor(y_test, torch.long)

    print(f"Preprocessing done | X_train: {X_train.shape} | y_train: {y_train.shape}")
    print(f"Model input dim: {X_train.shape[1]} | output dim: 2 (binary)")
    return X_train, X_test, y_train, y_test

X_train, X_test, y_train, y_test = preprocess_data()

# ===================== 2. MLP for the credit data =====================
class MLP(nn.Module):
    def __init__(self, input_dim=31, hidden_dim=64, output_dim=2):
        super(MLP, self).__init__()
        self.fc1 = nn.Linear(input_dim, hidden_dim)  # 31 -> 64
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.2)               # curb overfitting
        self.fc2 = nn.Linear(hidden_dim, output_dim) # 64 -> 2 (binary)

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.dropout(out)
        out = self.fc2(out)
        return out

# ===================== 3. Base training (produces the checkpoint) =====================
def train_base_model():
    model = MLP(input_dim=X_train.shape[1]).to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.01)
    num_epochs = 20000  # base training length (adjust as needed)

    train_losses, test_losses, epochs = [], [], []
    start_time = time.time()

    with tqdm(total=num_epochs, desc="Base training", unit="epoch") as pbar:
        for epoch in range(num_epochs):
            model.train()
            outputs = model(X_train)
            train_loss = criterion(outputs, y_train)
            optimizer.zero_grad()
            train_loss.backward()
            optimizer.step()

            # Record test loss every 200 epochs
            if (epoch + 1) % 200 == 0:
                model.eval()
                with torch.no_grad():
                    test_outputs = model(X_test)
                    test_loss = criterion(test_outputs, y_test)
                model.train()
                train_losses.append(train_loss.item())
                test_losses.append(test_loss.item())
                epochs.append(epoch + 1)
                pbar.set_postfix({'Train Loss': f'{train_loss.item():.4f}',
                                  'Test Loss': f'{test_loss.item():.4f}'})
            # Advance the progress bar every 1000 epochs
            if (epoch + 1) % 1000 == 0:
                pbar.update(1000)
        # Top up the progress bar if needed
        if pbar.n < num_epochs:
            pbar.update(num_epochs - pbar.n)

    # Checkpoint: model + optimizer + losses + epoch
    checkpoint = {
        "model_state_dict": model.state_dict(),
        "optimizer_state_dict": optimizer.state_dict(),
        "epoch": num_epochs,
        "train_losses": train_losses,
        "test_losses": test_losses,
        "epochs": epochs,
        "best_loss": min(test_losses) if test_losses else float('inf')
    }
    torch.save(checkpoint, "credit_checkpoint.pth")
    print(f"\nBase training done | {time.time() - start_time:.2f} s | checkpoint saved to credit_checkpoint.pth")
    return model

base_model = train_base_model()

# ===================== 4. Load the checkpoint and train 50 more epochs =====================
def continue_train_50_epochs():
    # Step 1: rebuild model/optimizer exactly as in base training
    model = MLP(input_dim=X_train.shape[1]).to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.01)  # must match base training

    # Step 2: load the checkpoint
    checkpoint = torch.load("credit_checkpoint.pth", map_location=device)
    model.load_state_dict(checkpoint["model_state_dict"])
    optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
    start_epoch = checkpoint["epoch"]  # where base training ended (20000)
    end_epoch = start_epoch + 50       # resume for 50 epochs

    # Load the loss history
    train_losses = checkpoint["train_losses"]
    test_losses = checkpoint["test_losses"]
    epochs = checkpoint["epochs"]

    print(f"\nResuming for 50 epochs | epoch {start_epoch + 1} to {end_epoch}")
    start_time = time.time()

    # Step 3: train 50 more epochs
    with tqdm(total=50, desc="Resumed training", unit="epoch") as pbar:
        for epoch in range(start_epoch, end_epoch):
            model.train()  # force train mode (important!)
            outputs = model(X_train)
            train_loss = criterion(outputs, y_train)
            optimizer.zero_grad()
            train_loss.backward()
            optimizer.step()

            # Record the test loss every epoch
            model.eval()
            with torch.no_grad():
                test_outputs = model(X_test)
                test_loss = criterion(test_outputs, y_test)
            model.train()

            train_losses.append(train_loss.item())
            test_losses.append(test_loss.item())
            epochs.append(epoch + 1)
            pbar.set_postfix({'Train Loss': f'{train_loss.item():.4f}',
                              'Test Loss': f'{test_loss.item():.4f}'})
            pbar.update(1)

    # Step 4: save the updated checkpoint
    new_checkpoint = {
        "model_state_dict": model.state_dict(),
        "optimizer_state_dict": optimizer.state_dict(),
        "epoch": end_epoch,
        "train_losses": train_losses,
        "test_losses": test_losses,
        "epochs": epochs,
        "best_loss": min(test_losses)
    }
    torch.save(new_checkpoint, "credit_checkpoint_continued.pth")
    print(f"Resumed training done | {time.time() - start_time:.2f} s | saved to credit_checkpoint_continued.pth")

    # Step 5: evaluate the resumed model
    model.eval()
    with torch.no_grad():
        # Test accuracy
        test_outputs = model(X_test)
        _, predicted = torch.max(test_outputs, 1)
        correct = (predicted == y_test).sum().item()
        test_accuracy = correct / y_test.size(0)

        # Training accuracy (to compare against, for overfitting)
        train_outputs = model(X_train)
        _, train_pred = torch.max(train_outputs, 1)
        train_correct = (train_pred == y_train).sum().item()
        train_accuracy = train_correct / y_train.size(0)
    print(f"\nAfter resuming | train accuracy: {train_accuracy * 100:.2f}% | test accuracy: {test_accuracy * 100:.2f}%")

    # Step 6: plot the full loss curve
    plt.figure(figsize=(12, 6))
    plt.plot(epochs, train_losses, label='Train loss')
    plt.plot(epochs, test_losses, label='Test loss')
    plt.axvline(x=start_epoch, color='red', linestyle='--', label=f'Resume point (epoch {start_epoch})')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.title('Credit default model: base + resumed training loss')
    plt.legend()
    plt.grid(True)
    plt.show()
    return model

continued_model = continue_train_50_epochs()
```

## III. Early Stopping

```python
import torch
import torch.nn as nn
import torch.optim as optim
import pandas as pd
import numpy as np
import time
import matplotlib.pyplot as plt
from tqdm import tqdm
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
import warnings

warnings.filterwarnings("ignore")

# ===================== 1. Basic configuration =====================
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Chinese font settings (fixes CJK rendering in plots)
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

# ===================== 2. Preprocessing (credit dataset) =====================
def preprocess_credit_data():
    """Credit-dataset preprocessing: encoding, imputation, split, scaling."""
    data = pd.read_csv(r'data.csv')

    # 1. Encode string features
    home_mapping = {'Own Home': 1, 'Rent': 2, 'Have Mortgage': 3, 'Home Mortgage': 4}
    data['Home Ownership'] = data['Home Ownership'].map(home_mapping)

    job_years_mapping = {
        '< 1 year': 1, '1 year': 2, '2 years': 3, '3 years': 4,
        '4 years': 5, '5 years': 6, '6 years': 7, '7 years': 8,
        '8 years': 9, '9 years': 10, '10+ years': 11
    }
    data['Years in current job'] = data['Years in current job'].map(job_years_mapping)

    # One-hot encode Purpose (cast new bool columns to int)
    data = pd.get_dummies(data, columns=['Purpose'])
    data2 = pd.read_csv(r'data.csv')
    new_cols = [col for col in data.columns if col not in data2.columns]
    for col in new_cols:
        data[col] = data[col].astype(int)

    # Term mapping + rename
    term_mapping = {'Short Term': 0, 'Long Term': 1}
    data['Term'] = data['Term'].map(term_mapping)
    data.rename(columns={'Term': 'Long Term'}, inplace=True)

    # 2. Fill missing continuous features with the mode
    continuous_cols = data.select_dtypes(include=['int64', 'float64']).columns.tolist()
    for col in continuous_cols:
        mode_val = data[col].mode()[0]
        data[col].fillna(mode_val, inplace=True)

    # 3. Split features / labels (X: 31 features, y: binary label)
    X = data.drop(['Credit Default'], axis=1)
    y = data['Credit Default']

    # 4. 80/20 train/test split
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # 5. Normalize features for training stability
    scaler = MinMaxScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)

    # 6. Tensor conversion (handles pandas Series and numpy arrays)
    def to_tensor(data, dtype=torch.float32):
        """Convert to a tensor on `device`, accepting Series/ndarray."""
        if isinstance(data, pd.Series):
            arr = data.values
        elif isinstance(data, np.ndarray):
            arr = data
        else:
            arr = np.array(data)
        return torch.tensor(arr, dtype=dtype).to(device)

    # Features as float32; labels as long (required by CrossEntropyLoss)
    X_train = to_tensor(X_train, torch.float32)
    X_test = to_tensor(X_test, torch.float32)
    y_train = to_tensor(y_train, torch.long)
    y_test = to_tensor(y_test, torch.long)

    print(f"Preprocessing done | X_train: {X_train.shape} | y_train: {y_train.shape}")
    print(f"Model input dim: {X_train.shape[1]} | output dim: 2 (binary)")
    return X_train, X_test, y_train, y_test

X_train, X_test, y_train, y_test = preprocess_credit_data()

# ===================== 3. MLP for the credit data =====================
class MLP(nn.Module):
    def __init__(self, input_dim=31, hidden_dim=64, output_dim=2):
        super(MLP, self).__init__()
        self.fc1 = nn.Linear(input_dim, hidden_dim)  # 31 -> 64
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.2)               # curb overfitting
        self.fc2 = nn.Linear(hidden_dim, output_dim) # 64 -> 2 (binary)

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.dropout(out)
        out = self.fc2(out)
        return out

model = MLP(input_dim=X_train.shape[1]).to(device)

# ===================== 4. Training configuration (with early stopping) =====================
criterion = nn.CrossEntropyLoss()
# SGD here; Adam is a drop-in alternative: optim.Adam(model.parameters(), lr=0.001)
optimizer = optim.SGD(model.parameters(), lr=0.01)

num_epochs = 20000   # maximum number of epochs
train_losses = []
test_losses = []
epochs_list = []

# Early-stopping state
best_test_loss = float('inf')
best_epoch = 0
patience = 100       # note: counted in 200-epoch checks below, not single epochs
counter = 0
early_stopped = False

# ===================== 5. Training loop (progress bar + early stopping) =====================
start_time = time.time()
with tqdm(total=num_epochs, desc="Training", unit="epoch") as pbar:
    for epoch in range(num_epochs):
        model.train()  # enable dropout
        outputs = model(X_train)
        train_loss = criterion(outputs, y_train)
        optimizer.zero_grad()
        train_loss.backward()
        optimizer.step()

        # Every 200 epochs: record losses and run the early-stopping check
        if (epoch + 1) % 200 == 0:
            model.eval()
            with torch.no_grad():
                test_outputs = model(X_test)
                test_loss = criterion(test_outputs, y_test)
            train_losses.append(train_loss.item())
            test_losses.append(test_loss.item())
            epochs_list.append(epoch + 1)
            pbar.set_postfix({'Train Loss': f'{train_loss.item():.4f}',
                              'Test Loss': f'{test_loss.item():.4f}'})

            # Early-stopping logic
            if test_loss.item() < best_test_loss:
                best_test_loss = test_loss.item()
                best_epoch = epoch + 1
                counter = 0
                # Save the best model so far
                torch.save(model.state_dict(), 'best_credit_model.pth')
            else:
                counter += 1
                if counter >= patience:
                    print(f"\nEarly stopping at epoch {epoch + 1}: "
                          f"test loss has not improved for {patience} checks")
                    print(f"Best test loss: {best_test_loss:.4f} (epoch {best_epoch})")
                    early_stopped = True
                    break
            model.train()  # back to train mode

        # Advance the progress bar every 1000 epochs
        if (epoch + 1) % 1000 == 0:
            pbar.update(1000)
    # Top up the progress bar if needed
    if pbar.n < num_epochs:
        pbar.update(num_epochs - pbar.n)

train_time = time.time() - start_time
print(f'\nTotal training time: {train_time:.2f} seconds')

# ===================== 6. Load the best model and evaluate =====================
if early_stopped:
    print(f"\nLoading the best model (epoch {best_epoch}) for evaluation...")
    model.load_state_dict(torch.load('best_credit_model.pth', map_location=device))

# Loss curves
plt.figure(figsize=(10, 6))
plt.plot(epochs_list, train_losses, label='Train loss')
plt.plot(epochs_list, test_losses, label='Test loss')
plt.axvline(x=best_epoch, color='red', linestyle='--', label=f'Best model (epoch {best_epoch})')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Credit default model: train/test loss')
plt.legend()
plt.grid(True)
plt.show()

# Test accuracy (binary classification)
model.eval()
with torch.no_grad():
    outputs = model(X_test)
    _, predicted = torch.max(outputs, 1)
    correct = (predicted == y_test).sum().item()
    accuracy = correct / y_test.size(0)
    print(f'Test accuracy: {accuracy * 100:.2f}%')
```

The curves show a stable training run: the model neither underfits (the loss does not stay high) nor overfits (the train/test loss gap stays small), and it reaches a good fit by epoch 20000.
