# P02_errlens_training - Training Pipeline

## Training Pipeline

### Stage 1: Data Preparation

```bash
# 1. Download the raw data
python src/data/download.py

# 2. Clean the data
python src/data/clean.py

# 3. Label the data (optional)
python src/data/label.py
```
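
The three steps above depend on each other's output, so it can help to run them from one driver that stops at the first failure. The following is a minimal sketch, not part of the repository: the script paths come from the commands above, while the function names and the `include_labeling` switch are hypothetical.

```python
import subprocess
import sys

# Script paths copied from the commands above.
DOWNLOAD = "src/data/download.py"
CLEAN = "src/data/clean.py"
LABEL = "src/data/label.py"  # optional step


def build_data_prep_commands(include_labeling=False, python=sys.executable):
    """Return the data-preparation commands in the order they should run.

    Hypothetical helper: labeling is appended only when requested,
    because the document marks that step as optional.
    """
    scripts = [DOWNLOAD, CLEAN]
    if include_labeling:
        scripts.append(LABEL)
    return [[python, script] for script in scripts]


def run_data_prep(include_labeling=False):
    """Run each step in order; check=True raises on the first failing step."""
    for command in build_data_prep_commands(include_labeling):
        subprocess.run(command, check=True)
```

Calling `run_data_prep(include_labeling=True)` from the repository root would run all three scripts; with the default argument, the labeling step is skipped.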

### Stage 2: Model Training

```bash
# Training command
python src/training/train.py \
  --data data/train \
  --model-base microsoft/codebert-base \
  --epochs 10 \
  --batch-size 32 \
  --lr 2e-5
```
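
This document does not show how `train.py` declares its flags. As an illustration only, the sketch below shows one way a parser for these five flags might look with `argparse`. The defaults simply mirror the values in the command above and are assumptions, as are the help strings.

```python
import argparse


def build_arg_parser():
    """Hypothetical parser for the flags used in the training command."""
    parser = argparse.ArgumentParser(description="Sketch of a train.py CLI")
    parser.add_argument("--data", required=True,
                        help="directory containing the training data")
    # Defaults below are assumptions copied from the example command.
    parser.add_argument("--model-base", default="microsoft/codebert-base",
                        help="name of the pretrained base model")
    parser.add_argument("--epochs", type=int, default=10,
                        help="number of passes over the training data")
    parser.add_argument("--batch-size", type=int, default=32,
                        help="examples per optimization step")
    parser.add_argument("--lr", type=float, default=2e-5,
                        help="learning rate")
    return parser


# Parse the same arguments as the command above.
args = build_arg_parser().parse_args([
    "--data", "data/train",
    "--model-base", "microsoft/codebert-base",
    "--epochs", "10",
    "--batch-size", "32",
    "--lr", "2e-5",
])
print(args.model_base, args.epochs, args.batch_size, args.lr)
```

Note that `argparse` exposes hyphenated flags with underscores, so `--model-base` becomes `args.model_base` and `--batch-size` becomes `args.batch_size`.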

### Stage 3: Model Evaluation

```bash
# Evaluation command
python src/evaluation/evaluate.py \
  --model models/best_model \
  --data data/test
```
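
The internals of `evaluate.py` are not shown here. To make the reported metrics concrete, the sketch below computes precision, recall, F1, and accuracy from predicted and true labels using only the standard library. It assumes a binary task with a single positive class, which may not match the actual model; `binary_metrics` is a hypothetical name.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, F1, and accuracy for a binary task.

    Assumption: labels equal to `positive` are the positive class and
    everything else is negative.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard every division so empty or one-sided inputs return 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Toy labels for illustration only; these are not results from the model.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

For a multi-class model, the same counts would need to be computed per class and then averaged; the averaging scheme is not specified in this document.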

### Stage 4: Model Export

```bash
# Export to ONNX format
python src/deployment/export.py \
  --model models/best_model \
  --output models/exported
```
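
After the export command finishes, a quick sanity check is to confirm that the output directory actually contains an ONNX file. The sketch below does only that: the file names and layout under `models/exported` are not specified in this document, so it searches for any `.onnx` file, and `find_onnx_files` is a hypothetical helper. It does not validate the model itself.

```python
from pathlib import Path


def find_onnx_files(export_dir):
    """Return a sorted list of .onnx files found anywhere under export_dir.

    Returns an empty list when the directory does not exist, so callers
    can treat "no directory" and "no model" the same way.
    """
    root = Path(export_dir)
    if not root.is_dir():
        return []
    return sorted(root.rglob("*.onnx"))


def export_succeeded(export_dir="models/exported"):
    """True when at least one .onnx file exists under the export directory."""
    return len(find_onnx_files(export_dir)) > 0
```

A stricter check would load the file with an ONNX runtime and run one sample input through it, which is outside the scope of this sketch.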

## Evaluation Metrics

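| Metric | Description | Target |
|--------|-------------|--------|
| Precision | Share of predicted positives that are correct | >= 0.90 |
| Recall | Share of actual positives that are found | >= 0.85 |
| F1 Score | Harmonic mean of precision and recall | >= 0.87 |
| Accuracy | Share of all predictions that are correct | >= 0.92 |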
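
The targets in the table can be turned into an automated gate. The sketch below is one possible way to do that, assuming the evaluation step yields a dictionary keyed by `precision`, `recall`, `f1`, and `accuracy`; those key names and the `failed_targets` helper are hypothetical, while the thresholds are taken from the table.

```python
# Thresholds copied from the table above; the key names are assumptions.
TARGETS = {
    "precision": 0.90,
    "recall": 0.85,
    "f1": 0.87,
    "accuracy": 0.92,
}


def failed_targets(metrics, targets=TARGETS):
    """Return {name: (actual, target)} for every metric below its target.

    A metric missing from `metrics` is reported with an actual value of
    None, so an incomplete report never passes silently.
    """
    failures = {}
    for name, target in targets.items():
        actual = metrics.get(name)
        if actual is None or actual < target:
            failures[name] = (actual, target)
    return failures


# Made-up numbers for illustration only; these are not measured results.
example = {"precision": 0.93, "recall": 0.84, "f1": 0.88, "accuracy": 0.95}
failures = failed_targets(example)
```

An empty result means every target is met; in a CI job, a non-empty result could be used to fail the build before the export stage runs.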