Fine-Tuning a Large Model Locally on a Mac (MLX + Qwen2.5)


Background

Learning the pieces of the fine-tuning workflow, such as cleaning datasets, tuning hyperparameters, and testing the latest models, calls for an efficient framework you can use to experiment and become familiar with the process. This article walks through fine-tuning a large model locally with MLX, the framework released by Apple. Cloud fine-tuning services mostly charge if you want to use them freely, so before you are comfortable with the technique, fine-tuning locally is a learning approach with a fairly high return on investment.

Fine-Tuning

Fine-tuning starts from an already trained neural network and makes small adjustments to its parameters so that it adapts better to a specific task or dataset. By continuing to train some or all of the model's layers on a new, small dataset, the model can be optimized for the new task while retaining its original knowledge, improving its performance in a particular domain. By the scope of the parameters being updated, fine-tuning falls into two categories: full fine-tuning, which updates all parameters, and partial (parameter-efficient) fine-tuning, which updates only a subset, for example through LoRA adapters.

In the vast majority of scenarios today we use partial fine-tuning. This approach reduces compute and storage costs and lowers the risk of overfitting, which makes it suitable for tasks with limited data, although it may not unlock the model's full potential on highly complex tasks.
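
To make the idea of partial fine-tuning concrete, here is a minimal sketch of the kind of LoRA update mlx-lm applies. It is purely illustrative (toy dimensions, random data) and assumes the mlx package is available; it is not the mlx-lm implementation itself.

# Conceptual LoRA sketch: the pretrained weight W stays frozen; only the small
# low-rank factors A and B are trained, so the effective weight is W + (alpha/r) * B @ A.
import mlx.core as mx

d_out, d_in, r, alpha = 896, 896, 8, 16     # toy dimensions; the rank r is much smaller than d_in
W = mx.random.normal((d_out, d_in))         # frozen pretrained weight
A = mx.random.normal((r, d_in)) * 0.01      # trainable, initialized small
B = mx.zeros((d_out, r))                    # trainable, starts at zero so the adapter is a no-op at init
W_eff = W + (alpha / r) * (B @ A)           # what the adapted layer actually applies

x = mx.random.normal((d_in,))
y = W_eff @ x                               # forward pass through the adapted layer
mx.eval(y)                                  # MLX is lazy; this forces the computation
print(y.shape)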

Large-model fine-tuning can also be categorized by the type of dataset used.


Workflow

  1. Prepare the dataset: gather data relevant to the task, make sure the quality and labels are accurate, and clean and preprocess it.
  2. Choose a model: pick a base model suited to the task and the data.
  3. Choose the fine-tuning strategy and parameters: based on the task requirements and available resources, select a suitable strategy and configure the LoRA parameters and training hyperparameters such as the learning rate so that the model converges.
  4. Train the model: train on the training set with the chosen hyperparameters and optimizer, updating the parameters to reduce the loss while guarding against overfitting.
  5. Evaluate the model: measure its performance on the validation set.

MLX


MLX is an array framework for machine learning built by Apple's machine learning research team. The open-source framework is designed and optimized for Apple Silicon; it draws inspiration from NumPy, PyTorch, JAX, and ArrayFire, offers a simple, friendly API, and runs ML training and inference on the Apple Silicon CPU and GPU. Like TensorFlow and PyTorch, it supports GPU-backed workloads, and it allows LLMs to be fine-tuned on the new Apple Silicon (M-series) chips, including with methods such as LoRA and QLoRA.

Website: ml-explore.github.io/mlx/build/h… GitHub: github.com/ml-explore/…
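
As a quick sanity check that MLX works on your machine, the snippet below multiplies two arrays on the default device (the GPU on Apple Silicon). It assumes the mlx package is installed; it gets pulled in as a dependency when mlx-lm is installed later on.

import mlx.core as mx

a = mx.random.normal((1024, 1024))   # arrays live in unified memory, shared by CPU and GPU
b = a @ a.T                          # builds a lazy computation graph
mx.eval(b)                           # forces evaluation
print(b.shape, b.dtype)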

Downloading the Model

Download the model from the command line. The model is roughly 1 GB; the commands are below, and the download can generally saturate your bandwidth.

Create the environment

conda create -n FineTune python=3.11.11
conda activate FineTune

Download the model

# Install the dependency (a proxy/VPN may be needed)
pip install -U huggingface_hub
# Set the environment variable to use the HF mirror
export HF_ENDPOINT=https://hf-mirror.com 
# Download the model and save it to the qwen2.5-0.5B directory
huggingface-cli download --resume-download Qwen/Qwen2.5-0.5B-Instruct --local-dir qwen2.5-0.5B

If the download fails, simply run the command again.
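
If you prefer to script the download, roughly the same thing can be done from Python with huggingface_hub's snapshot_download. This is a sketch under the assumption that the mirror endpoint above is still needed; note that the environment variable must be set before huggingface_hub is imported.

import os
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"  # must be set before importing huggingface_hub

from huggingface_hub import snapshot_download

# Download Qwen/Qwen2.5-0.5B-Instruct into ./qwen2.5-0.5B, skipping files already present
snapshot_download(repo_id="Qwen/Qwen2.5-0.5B-Instruct", local_dir="qwen2.5-0.5B")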

The list of files after the download completes:

(Screenshot: contents of the downloaded model directory)

Prepare the Dataset

{"prompt": "今天星期几", "completion": "星期八"}
{"prompt": "太阳什么时候升起?", "completion": "晚上八点"}
{"prompt": "忘情水是什么水", "completion": "忘情水是可以让人忘却烦恼的水"}
{"prompt": "蓝牙耳机坏了应该看什么科", "completion": "耳鼻喉科"}
{"prompt": "鲁迅为什么讨厌周树人", "completion": "因为他们是仇人"}

Prepare the Code

git clone git@github.com:ml-explore/mlx-examples.git

Replace the contents of the "train.jsonl" file under the lora/data directory with the fine-tuning dataset above. Since we are not running the test and validation passes here, the test and validation sets are left unchanged; we will simply verify the result later by asking the fine-tuned model a few questions.

(Screenshot: the lora/data directory with the modified train.jsonl)
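
Before starting a run it is worth sanity-checking the JSONL, since a single malformed line will abort training. The helper below is a small, hypothetical convenience (the path assumes it is run from the lora directory); it only verifies that each line is valid JSON with non-empty prompt and completion fields, which is the format used here.

import json

def check_jsonl(path: str) -> None:
    """Verify that every line is valid JSON with non-empty prompt/completion fields."""
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f, 1):
            record = json.loads(line)  # raises json.JSONDecodeError on a malformed line
            if not record.get("prompt") or not record.get("completion"):
                raise ValueError(f"line {i}: missing prompt or completion")
    print(f"{path}: OK")

check_jsonl("data/train.jsonl")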

Install Dependencies

pip install mlx-lm
pip install transformers
pip install torch
pip install numpy

Fine-Tune the Model

Enter the lora directory and run the command below to start fine-tuning. Since the goal is just to walk the fine-tuning flow end to end, we do not experiment with the training hyperparameters; we only specify the model path and the dataset path and leave everything else at the defaults.

cd /Users/maruifu/Desktop/AI/mlx-examples/lora
mlx_lm.lora --model /Users/maruifu/Desktop/AI/qwen2.5-0.5B --train --data ./data

Supported fine-tuning methods include LoRA, QLoRA, and full (full-parameter fine-tuning).
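
If you later do want to tune the run, the main knobs can be passed on the command line. The flags below (iterations, batch size, learning rate, number of adapted layers) are assumptions based on the mlx-lm release used here; confirm the exact names with mlx_lm.lora --help.

# Assumed flags -- verify with: mlx_lm.lora --help
mlx_lm.lora --model /Users/maruifu/Desktop/AI/qwen2.5-0.5B --train --data ./data \
    --iters 2000 --batch-size 8 --learning-rate 2e-5 --num-layers 24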

The fine-tuning run then starts. Because the dataset is tiny, it finishes quickly and the loss drops fast:

(FineTune) maruifu@XMG-M4ProMax lora % mlx_lm.lora --model /Users/maruifu/Desktop/AI/qwen2.5-0.5B --train --data ./data
Loading pretrained model
Loading datasets
Training
Trainable parameters: 0.109% (0.541M/494.033M)
Starting training..., iters: 1000
Iter 1: Val loss 2.755, Val took 2.665s
Iter 10: Train loss 5.209, Learning Rate 1.000e-05, It/sec 6.475, Tokens/sec 1120.136, Trained Tokens 1730, Peak mem 2.007 GB
Iter 20: Train loss 2.642, Learning Rate 1.000e-05, It/sec 9.758, Tokens/sec 1688.204, Trained Tokens 3460, Peak mem 2.007 GB
Iter 30: Train loss 1.472, Learning Rate 1.000e-05, It/sec 9.751, Tokens/sec 1686.856, Trained Tokens 5190, Peak mem 2.007 GB
Iter 40: Train loss 0.911, Learning Rate 1.000e-05, It/sec 9.773, Tokens/sec 1690.745, Trained Tokens 6920, Peak mem 2.007 GB
Iter 50: Train loss 0.615, Learning Rate 1.000e-05, It/sec 9.508, Tokens/sec 1644.925, Trained Tokens 8650, Peak mem 2.007 GB
Iter 60: Train loss 0.413, Learning Rate 1.000e-05, It/sec 9.701, Tokens/sec 1678.330, Trained Tokens 10380, Peak mem 2.007 GB
Iter 70: Train loss 0.248, Learning Rate 1.000e-05, It/sec 9.745, Tokens/sec 1685.828, Trained Tokens 12110, Peak mem 2.007 GB
Iter 80: Train loss 0.132, Learning Rate 1.000e-05, It/sec 9.744, Tokens/sec 1685.675, Trained Tokens 13840, Peak mem 2.007 GB
Iter 90: Train loss 0.087, Learning Rate 1.000e-05, It/sec 9.737, Tokens/sec 1684.579, Trained Tokens 15570, Peak mem 2.007 GB
Iter 100: Train loss 0.067, Learning Rate 1.000e-05, It/sec 9.653, Tokens/sec 1669.912, Trained Tokens 17300, Peak mem 2.007 GB
Iter 100: Saved adapter weights to adapters/adapters.safetensors and adapters/0000100_adapters.safetensors.
Iter 110: Train loss 0.060, Learning Rate 1.000e-05, It/sec 9.692, Tokens/sec 1676.734, Trained Tokens 19030, Peak mem 2.007 GB
Iter 120: Train loss 0.055, Learning Rate 1.000e-05, It/sec 9.355, Tokens/sec 1618.465, Trained Tokens 20760, Peak mem 2.007 GB
Iter 130: Train loss 0.053, Learning Rate 1.000e-05, It/sec 9.727, Tokens/sec 1682.790, Trained Tokens 22490, Peak mem 2.007 GB
Iter 140: Train loss 0.049, Learning Rate 1.000e-05, It/sec 9.707, Tokens/sec 1679.372, Trained Tokens 24220, Peak mem 2.007 GB
Iter 150: Train loss 0.048, Learning Rate 1.000e-05, It/sec 9.714, Tokens/sec 1680.461, Trained Tokens 25950, Peak mem 2.007 GB
Iter 160: Train loss 0.046, Learning Rate 1.000e-05, It/sec 9.678, Tokens/sec 1674.324, Trained Tokens 27680, Peak mem 2.007 GB
Iter 170: Train loss 0.046, Learning Rate 1.000e-05, It/sec 9.542, Tokens/sec 1650.749, Trained Tokens 29410, Peak mem 2.007 GB
Iter 180: Train loss 0.045, Learning Rate 1.000e-05, It/sec 9.681, Tokens/sec 1674.849, Trained Tokens 31140, Peak mem 2.007 GB
Iter 190: Train loss 0.044, Learning Rate 1.000e-05, It/sec 9.620, Tokens/sec 1664.245, Trained Tokens 32870, Peak mem 2.007 GB
Iter 200: Val loss 2.911, Val took 1.904s
Iter 200: Train loss 0.044, Learning Rate 1.000e-05, It/sec 79.695, Tokens/sec 13787.287, Trained Tokens 34600, Peak mem 2.021 GB
Iter 200: Saved adapter weights to adapters/adapters.safetensors and adapters/0000200_adapters.safetensors.
Iter 210: Train loss 0.042, Learning Rate 1.000e-05, It/sec 9.653, Tokens/sec 1669.991, Trained Tokens 36330, Peak mem 2.021 GB
Iter 220: Train loss 0.042, Learning Rate 1.000e-05, It/sec 9.545, Tokens/sec 1651.331, Trained Tokens 38060, Peak mem 2.021 GB
Iter 230: Train loss 0.041, Learning Rate 1.000e-05, It/sec 9.635, Tokens/sec 1666.780, Trained Tokens 39790, Peak mem 2.021 GB
Iter 240: Train loss 0.042, Learning Rate 1.000e-05, It/sec 9.524, Tokens/sec 1647.641, Trained Tokens 41520, Peak mem 2.021 GB
Iter 250: Train loss 0.041, Learning Rate 1.000e-05, It/sec 9.610, Tokens/sec 1662.556, Trained Tokens 43250, Peak mem 2.021 GB
Iter 260: Train loss 0.041, Learning Rate 1.000e-05, It/sec 9.489, Tokens/sec 1641.622, Trained Tokens 44980, Peak mem 2.021 GB
Iter 270: Train loss 0.039, Learning Rate 1.000e-05, It/sec 9.682, Tokens/sec 1675.023, Trained Tokens 46710, Peak mem 2.021 GB
Iter 280: Train loss 0.039, Learning Rate 1.000e-05, It/sec 9.622, Tokens/sec 1664.556, Trained Tokens 48440, Peak mem 2.021 GB
Iter 290: Train loss 0.038, Learning Rate 1.000e-05, It/sec 9.673, Tokens/sec 1673.366, Trained Tokens 50170, Peak mem 2.021 GB
Iter 300: Train loss 0.038, Learning Rate 1.000e-05, It/sec 9.561, Tokens/sec 1654.011, Trained Tokens 51900, Peak mem 2.021 GB
Iter 300: Saved adapter weights to adapters/adapters.safetensors and adapters/0000300_adapters.safetensors.
Iter 310: Train loss 0.038, Learning Rate 1.000e-05, It/sec 9.613, Tokens/sec 1663.060, Trained Tokens 53630, Peak mem 2.021 GB
Iter 320: Train loss 0.038, Learning Rate 1.000e-05, It/sec 9.650, Tokens/sec 1669.448, Trained Tokens 55360, Peak mem 2.021 GB
Iter 330: Train loss 0.037, Learning Rate 1.000e-05, It/sec 9.650, Tokens/sec 1669.512, Trained Tokens 57090, Peak mem 2.021 GB
Iter 340: Train loss 0.039, Learning Rate 1.000e-05, It/sec 9.703, Tokens/sec 1678.638, Trained Tokens 58820, Peak mem 2.021 GB
Iter 350: Train loss 0.038, Learning Rate 1.000e-05, It/sec 9.567, Tokens/sec 1655.136, Trained Tokens 60550, Peak mem 2.021 GB
Iter 360: Train loss 0.037, Learning Rate 1.000e-05, It/sec 9.584, Tokens/sec 1657.958, Trained Tokens 62280, Peak mem 2.021 GB
Iter 370: Train loss 0.038, Learning Rate 1.000e-05, It/sec 9.583, Tokens/sec 1657.870, Trained Tokens 64010, Peak mem 2.021 GB
Iter 380: Train loss 0.037, Learning Rate 1.000e-05, It/sec 9.584, Tokens/sec 1657.979, Trained Tokens 65740, Peak mem 2.021 GB
Iter 390: Train loss 0.037, Learning Rate 1.000e-05, It/sec 9.658, Tokens/sec 1670.766, Trained Tokens 67470, Peak mem 2.021 GB
Iter 400: Val loss 2.923, Val took 1.876s
Iter 400: Train loss 0.037, Learning Rate 1.000e-05, It/sec 70.241, Tokens/sec 12151.771, Trained Tokens 69200, Peak mem 2.021 GB
Iter 400: Saved adapter weights to adapters/adapters.safetensors and adapters/0000400_adapters.safetensors.
Iter 410: Train loss 0.038, Learning Rate 1.000e-05, It/sec 9.660, Tokens/sec 1671.174, Trained Tokens 70930, Peak mem 2.021 GB
Iter 420: Train loss 0.036, Learning Rate 1.000e-05, It/sec 9.672, Tokens/sec 1673.215, Trained Tokens 72660, Peak mem 2.021 GB
Iter 430: Train loss 0.037, Learning Rate 1.000e-05, It/sec 9.672, Tokens/sec 1673.253, Trained Tokens 74390, Peak mem 2.021 GB
Iter 440: Train loss 0.037, Learning Rate 1.000e-05, It/sec 9.677, Tokens/sec 1674.095, Trained Tokens 76120, Peak mem 2.021 GB
Iter 450: Train loss 0.036, Learning Rate 1.000e-05, It/sec 9.677, Tokens/sec 1674.059, Trained Tokens 77850, Peak mem 2.021 GB
Iter 460: Train loss 0.036, Learning Rate 1.000e-05, It/sec 9.640, Tokens/sec 1667.714, Trained Tokens 79580, Peak mem 2.021 GB
Iter 470: Train loss 0.036, Learning Rate 1.000e-05, It/sec 9.569, Tokens/sec 1655.492, Trained Tokens 81310, Peak mem 2.021 GB
Iter 480: Train loss 0.036, Learning Rate 1.000e-05, It/sec 9.662, Tokens/sec 1671.469, Trained Tokens 83040, Peak mem 2.021 GB
Iter 490: Train loss 0.036, Learning Rate 1.000e-05, It/sec 9.599, Tokens/sec 1660.569, Trained Tokens 84770, Peak mem 2.021 GB
Iter 500: Train loss 0.036, Learning Rate 1.000e-05, It/sec 9.292, Tokens/sec 1607.535, Trained Tokens 86500, Peak mem 2.021 GB
Iter 500: Saved adapter weights to adapters/adapters.safetensors and adapters/0000500_adapters.safetensors.
Iter 510: Train loss 0.036, Learning Rate 1.000e-05, It/sec 9.631, Tokens/sec 1666.207, Trained Tokens 88230, Peak mem 2.021 GB
Iter 520: Train loss 0.036, Learning Rate 1.000e-05, It/sec 9.650, Tokens/sec 1669.500, Trained Tokens 89960, Peak mem 2.021 GB
Iter 530: Train loss 0.036, Learning Rate 1.000e-05, It/sec 9.666, Tokens/sec 1672.258, Trained Tokens 91690, Peak mem 2.021 GB
Iter 540: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.625, Tokens/sec 1665.041, Trained Tokens 93420, Peak mem 2.021 GB
Iter 550: Train loss 0.036, Learning Rate 1.000e-05, It/sec 9.623, Tokens/sec 1664.755, Trained Tokens 95150, Peak mem 2.021 GB
Iter 560: Train loss 0.036, Learning Rate 1.000e-05, It/sec 9.636, Tokens/sec 1667.108, Trained Tokens 96880, Peak mem 2.021 GB
Iter 570: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.682, Tokens/sec 1674.924, Trained Tokens 98610, Peak mem 2.021 GB
Iter 580: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.674, Tokens/sec 1673.663, Trained Tokens 100340, Peak mem 2.021 GB
Iter 590: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.613, Tokens/sec 1663.114, Trained Tokens 102070, Peak mem 2.021 GB
Iter 600: Val loss 2.914, Val took 1.905s
Iter 600: Train loss 0.035, Learning Rate 1.000e-05, It/sec 78.821, Tokens/sec 13636.113, Trained Tokens 103800, Peak mem 2.021 GB
Iter 600: Saved adapter weights to adapters/adapters.safetensors and adapters/0000600_adapters.safetensors.
Iter 610: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.362, Tokens/sec 1619.649, Trained Tokens 105530, Peak mem 2.021 GB
Iter 620: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.369, Tokens/sec 1620.838, Trained Tokens 107260, Peak mem 2.021 GB
Iter 630: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.408, Tokens/sec 1627.645, Trained Tokens 108990, Peak mem 2.021 GB
Iter 640: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.428, Tokens/sec 1631.094, Trained Tokens 110720, Peak mem 2.021 GB
Iter 650: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.460, Tokens/sec 1636.570, Trained Tokens 112450, Peak mem 2.021 GB
Iter 660: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.505, Tokens/sec 1644.345, Trained Tokens 114180, Peak mem 2.021 GB
Iter 670: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.595, Tokens/sec 1659.934, Trained Tokens 115910, Peak mem 2.021 GB
Iter 680: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.625, Tokens/sec 1665.183, Trained Tokens 117640, Peak mem 2.021 GB
Iter 690: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.619, Tokens/sec 1664.156, Trained Tokens 119370, Peak mem 2.021 GB
Iter 700: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.641, Tokens/sec 1667.847, Trained Tokens 121100, Peak mem 2.021 GB
Iter 700: Saved adapter weights to adapters/adapters.safetensors and adapters/0000700_adapters.safetensors.
Iter 710: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.637, Tokens/sec 1667.250, Trained Tokens 122830, Peak mem 2.021 GB
Iter 720: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.659, Tokens/sec 1671.092, Trained Tokens 124560, Peak mem 2.021 GB
Iter 730: Train loss 0.034, Learning Rate 1.000e-05, It/sec 9.651, Tokens/sec 1669.550, Trained Tokens 126290, Peak mem 2.021 GB
Iter 740: Train loss 0.034, Learning Rate 1.000e-05, It/sec 9.636, Tokens/sec 1667.022, Trained Tokens 128020, Peak mem 2.021 GB
Iter 750: Train loss 0.034, Learning Rate 1.000e-05, It/sec 9.651, Tokens/sec 1669.658, Trained Tokens 129750, Peak mem 2.021 GB
Iter 760: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.624, Tokens/sec 1664.878, Trained Tokens 131480, Peak mem 2.021 GB
Iter 770: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.694, Tokens/sec 1677.147, Trained Tokens 133210, Peak mem 2.021 GB
Iter 780: Train loss 0.034, Learning Rate 1.000e-05, It/sec 9.659, Tokens/sec 1670.965, Trained Tokens 134940, Peak mem 2.021 GB
Iter 790: Train loss 0.034, Learning Rate 1.000e-05, It/sec 9.652, Tokens/sec 1669.851, Trained Tokens 136670, Peak mem 2.021 GB
Iter 800: Val loss 2.931, Val took 1.918s
Iter 800: Train loss 0.035, Learning Rate 1.000e-05, It/sec 70.855, Tokens/sec 12257.872, Trained Tokens 138400, Peak mem 2.021 GB
Iter 800: Saved adapter weights to adapters/adapters.safetensors and adapters/0000800_adapters.safetensors.
Iter 810: Train loss 0.034, Learning Rate 1.000e-05, It/sec 8.452, Tokens/sec 1462.112, Trained Tokens 140130, Peak mem 2.021 GB
Iter 820: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.346, Tokens/sec 1616.807, Trained Tokens 141860, Peak mem 2.021 GB
Iter 830: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.096, Tokens/sec 1573.530, Trained Tokens 143590, Peak mem 2.021 GB
Iter 840: Train loss 0.034, Learning Rate 1.000e-05, It/sec 8.860, Tokens/sec 1532.860, Trained Tokens 145320, Peak mem 2.021 GB
Iter 850: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.409, Tokens/sec 1627.715, Trained Tokens 147050, Peak mem 2.021 GB
Iter 860: Train loss 0.034, Learning Rate 1.000e-05, It/sec 9.511, Tokens/sec 1645.410, Trained Tokens 148780, Peak mem 2.021 GB
Iter 870: Train loss 0.034, Learning Rate 1.000e-05, It/sec 9.524, Tokens/sec 1647.594, Trained Tokens 150510, Peak mem 2.021 GB
Iter 880: Train loss 0.034, Learning Rate 1.000e-05, It/sec 9.534, Tokens/sec 1649.359, Trained Tokens 152240, Peak mem 2.021 GB
Iter 890: Train loss 0.034, Learning Rate 1.000e-05, It/sec 9.167, Tokens/sec 1585.897, Trained Tokens 153970, Peak mem 2.021 GB
Iter 900: Train loss 0.034, Learning Rate 1.000e-05, It/sec 9.380, Tokens/sec 1622.811, Trained Tokens 155700, Peak mem 2.021 GB
Iter 900: Saved adapter weights to adapters/adapters.safetensors and adapters/0000900_adapters.safetensors.
Iter 910: Train loss 0.034, Learning Rate 1.000e-05, It/sec 9.480, Tokens/sec 1640.073, Trained Tokens 157430, Peak mem 2.021 GB
Iter 920: Train loss 0.034, Learning Rate 1.000e-05, It/sec 9.489, Tokens/sec 1641.646, Trained Tokens 159160, Peak mem 2.021 GB
Iter 930: Train loss 0.034, Learning Rate 1.000e-05, It/sec 9.523, Tokens/sec 1647.469, Trained Tokens 160890, Peak mem 2.021 GB
Iter 940: Train loss 0.034, Learning Rate 1.000e-05, It/sec 9.282, Tokens/sec 1605.724, Trained Tokens 162620, Peak mem 2.021 GB
Iter 950: Train loss 0.034, Learning Rate 1.000e-05, It/sec 8.924, Tokens/sec 1543.773, Trained Tokens 164350, Peak mem 2.021 GB
Iter 960: Train loss 0.034, Learning Rate 1.000e-05, It/sec 9.216, Tokens/sec 1594.282, Trained Tokens 166080, Peak mem 2.021 GB
Iter 970: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.420, Tokens/sec 1629.594, Trained Tokens 167810, Peak mem 2.021 GB
Iter 980: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.440, Tokens/sec 1633.078, Trained Tokens 169540, Peak mem 2.021 GB
Iter 990: Train loss 0.035, Learning Rate 1.000e-05, It/sec 9.433, Tokens/sec 1631.846, Trained Tokens 171270, Peak mem 2.021 GB
Iter 1000: Val loss 2.941, Val took 2.211s
Iter 1000: Train loss 0.035, Learning Rate 1.000e-05, It/sec 66.983, Tokens/sec 11588.048, Trained Tokens 173000, Peak mem 2.021 GB
Iter 1000: Saved adapter weights to adapters/adapters.safetensors and adapters/0001000_adapters.safetensors.
Saved final weights to adapters/adapters.safetensors.

After 1000 training iterations, the fine-tuned adapter weights end up in the adapters directory under lora.

(Screenshot: the generated adapters directory)
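
Note that you do not strictly have to fuse before testing: mlx_lm.generate can load the adapter on top of the original model. The flag name below is assumed from the mlx-lm documentation; check mlx_lm.generate --help if it does not match your version.

# Generate with the base model plus the trained adapter, without fusing (assumed flag)
mlx_lm.generate --model /Users/maruifu/Desktop/AI/qwen2.5-0.5B --adapter-path adapters --prompt "今天星期几?"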

Merge the Model

Use the mlx_lm.fuse command to fuse the low-rank adapter into the original model, producing a new model named "qwen2.5-0.5B-new". After the fuse succeeds, a new model folder "qwen2.5-0.5B-new" is created.

mlx_lm.fuse --model /Users/maruifu/Desktop/AI/qwen2.5-0.5B --adapter-path adapters --save-path qwen2.5-0.5B-new

(Screenshot: the fused model folder qwen2.5-0.5B-new)

Verify the Results

Since this run is not meant for a real project, we will not evaluate on a test dataset here. If you need a quantitative check, you can build a custom "test.jsonl" dataset and run the command below to compute perplexity.

python lora.py --model <path_to_model> \
               --adapter-file <path_to_adapters.npz> \
               --test
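
The command above comes from the older lora.py script in mlx-examples, which expects .npz adapter files. With the mlx-lm package installed here, which saves adapters as .safetensors, the same test-set evaluation should be available directly from the CLI; the flags below are an assumption to verify against mlx_lm.lora --help.

# Evaluate test loss on data/test.jsonl using the saved adapters (assumed flags)
mlx_lm.lora --model /Users/maruifu/Desktop/AI/qwen2.5-0.5B --adapter-path adapters --data ./data --test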

Here we verify the fine-tuned model by simply running inference on a few of the questions; example commands are below.

# Ask the original model
mlx_lm.generate --model /Users/maruifu/Desktop/AI/qwen2.5-0.5B --prompt "蓝牙耳机坏了应该看什么科"
# Ask the fine-tuned model
mlx_lm.generate --model qwen2.5-0.5B-new --prompt "蓝牙耳机坏了应该看什么科"


# Ask the original model
mlx_lm.generate --model /Users/maruifu/Desktop/AI/qwen2.5-0.5B --prompt "今天星期几?"
# Ask the fine-tuned model
mlx_lm.generate --model qwen2.5-0.5B-new --prompt "今天星期几?"
(FineTune) maruifu@XMG-M4ProMax lora % mlx_lm.generate --model /Users/maruifu/Desktop/AI/qwen2.5-0.5B --prompt "蓝牙耳机坏了应该看什么科"
==========
Prompt: <|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>
<|im_start|>user
蓝牙耳机坏了应该看什么科<|im_end|>
<|im_start|>assistant

蓝牙耳机坏了,通常需要检查以下几个方面来确定问题所在:

1. **电源和连接线**:确保耳机的电源线和连接线没有损坏。如果连接线松动或损坏,可能会导致耳机无法正常工作。

2. **耳机内部**:检查耳机内部是否有损坏或松动的部件。例如,如果耳机的麦克风或扬声器出现问题,可能会导致耳机无法正常工作。

3. **耳机的电池**:如果耳机的电池已经
==========
Prompt: 36 tokens, 639.492 tokens-per-sec
Generation: 100 tokens, 220.372 tokens-per-sec
Peak memory: 1.005 GB
(FineTune) maruifu@XMG-M4ProMax lora % mlx_lm.generate --model qwen2.5-0.5B-new --prompt "蓝牙耳机坏了应该看什么科"
==========
Prompt: <|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>
<|im_start|>user
蓝牙耳机坏了应该看什么科<|im_end|>
<|im_start|>assistant

耳鼻喉科
==========
Prompt: 36 tokens, 607.020 tokens-per-sec
Generation: 5 tokens, 244.010 tokens-per-sec
Peak memory: 1.005 GB
(FineTune) maruifu@XMG-M4ProMax lora % mlx_lm.generate --model /Users/maruifu/Desktop/AI/qwen2.5-0.5B --prompt "今天星期几?"             
==========
Prompt: <|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>
<|im_start|>user
今天星期几?<|im_end|>
<|im_start|>assistant

很抱歉,我无法直接获取当前日期和时间。不过,我可以帮助你查询或回答关于日期和时间的问题。请告诉我你需要查询的具体日期和时间,我会尽力提供帮助。
==========
Prompt: 33 tokens, 501.995 tokens-per-sec
Generation: 41 tokens, 221.940 tokens-per-sec
Peak memory: 1.004 GB
(FineTune) maruifu@XMG-M4ProMax lora % mlx_lm.generate --model qwen2.5-0.5B-new --prompt "今天星期几?"                                   
==========
Prompt: <|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>
<|im_start|>user
今天星期几?<|im_end|>
<|im_start|>assistant

星期八
==========
Prompt: 33 tokens, 600.904 tokens-per-sec
Generation: 3 tokens, 279.420 tokens-per-sec
Peak memory: 1.004 GB
(FineTune) maruifu@XMG-M4ProMax lora % 

As you can see, the fine-tuned model has taken effect.
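
The fused model can also be used from Python rather than the CLI. The sketch below assumes the mlx_lm Python API (load and generate) behaves as in the version installed above and that the tokenizer exposes the usual apply_chat_template method; it mirrors what mlx_lm.generate does on the command line.

from mlx_lm import load, generate

# Load the fused model produced by mlx_lm.fuse
model, tokenizer = load("qwen2.5-0.5B-new")

# Build the same chat-formatted prompt the CLI builds
messages = [{"role": "user", "content": "今天星期几?"}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

print(generate(model, tokenizer, prompt=prompt, max_tokens=100))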

Closing Thoughts

This post walked through a simple fine-tuning run with MLX, Apple's official framework, in the hope of helping readers who are interested in fine-tuning understand each stage of the process. A local fine-tuning setup is rarely suitable for real production projects, but practicing the steps of the workflow, such as data cleaning, hyperparameter tuning, and model validation, still pays off and makes model fine-tuning feel increasingly familiar and intuitive.

Comments

  1. Hi OP, I followed the whole flow using the full 2,000 entries from the 弱智吧 dataset, but once the train loss reaches 1.5 it just bounces around and ends up stuck at 1.5, and the Q&A check shows the fine-tuning did not take. With a larger dataset, do other parameters need to be changed as well?

     Reply to @牧马人: if the dataset itself is fine, try adjusting the learning rate or adding regularization.