---
name: ai-ml-expert
description: >
  AI/machine learning expert. Use this skill when the user needs PyTorch, TensorFlow, deep learning, neural networks,
  NLP (natural language processing), CV (computer vision), LLMs (large language models), RAG (retrieval-augmented generation),
  prompt engineering, model fine-tuning, agent development, Hugging Face,
  or LangChain, or says "machine learning", "AI", or "model training".
allowed-tools: Read, Glob, Grep, Edit, Write, Bash
maturity: stable
last-reviewed: 2026-02-18
---

# AI/ML Expert

> **Output Style**: This skill uses the inline output conventions.

An AI/machine learning expert covering the full workflow of machine learning modeling, deep learning, NLP, CV, and LLM application development.

## Trigger keywords

| Category | Keywords |
|----------|----------|
| General | AI, machine learning, deep learning, neural networks, model training |
| Frameworks | PyTorch, TensorFlow, Keras, Transformers, scikit-learn |
| NLP | text classification, NER, text generation, semantic search, Embedding |
| CV | image classification, object detection, segmentation, OCR, YOLO |
| LLM | LLM, GPT, BERT, LLaMA, Qwen, ChatGPT, large models |
| Applications | RAG, Agent, LangChain, Prompt Engineering, fine-tuning, LoRA |
| Classical ML | XGBoost, LightGBM, classification, regression, clustering, feature engineering |

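## Core capabilities

| Domain | Tech stack |
|--------|------------|
| Classical ML | classification, regression, clustering, feature engineering, ensemble learning |
| Deep learning | CNN, RNN/LSTM, Transformer, GAN |
| NLP | text classification, NER, text generation, semantic search, RAG |
| CV | image classification, object detection, segmentation, OCR |
| LLM | Prompt Engineering, Fine-tuning, Agent, RAG |
| MLOps | training, evaluation, monitoring |
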
## Task-to-model cheat sheet

| Task type | Recommended models |
|-----------|--------------------|
| Tabular classification/regression | XGBoost, LightGBM, CatBoost |
| Text classification | BERT, RoBERTa; BERT-wwm for Chinese |
| Text generation | GPT series, LLaMA, Qwen |
| NER | BERT+CRF, GlobalPointer |
| Image classification | ResNet, EfficientNet, ViT |
| Object detection | YOLOv8, RT-DETR |
| Semantic segmentation | U-Net, DeepLabV3+ |
| RAG system | Embedding + VectorDB + LLM |

## Quick start

### PyTorch model template

```python
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self, config):
        super().__init__()
        # Define layers here

    def forward(self, x):
        # Placeholder: returns the input unchanged until layers are added
        return x

# Training loop.
# Template only: `model`, `train_loader`, `criterion`, `optimizer`, and `epochs`
# must be defined beforehand; each batch is assumed to be a dict with 'x' and 'y'.
for epoch in range(epochs):
    model.train()
    for batch in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(batch['x']), batch['y'])
        loss.backward()
        optimizer.step()
```
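
As one concrete, hedged instance of the template above, the sketch below fills in the missing pieces with made-up choices: a synthetic regression dataset, a hypothetical `TinyRegressor` model, `nn.MSELoss`, and Adam with `lr=1e-2`. None of these come from the original template; they are assumptions chosen only so the loop runs end to end. Batches here are `(x, y)` tuples from a `TensorDataset` rather than dicts.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)  # make the sketch reproducible

# Synthetic data (made up for illustration): y = 2*x1 - 3*x2 + 1 plus small noise
X = torch.randn(256, 2)
y = X @ torch.tensor([[2.0], [-3.0]]) + 1.0 + 0.01 * torch.randn(256, 1)
train_loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

class TinyRegressor(nn.Module):
    """Hypothetical two-layer MLP standing in for the template's Model."""

    def __init__(self, in_features: int, hidden: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x)

model = TinyRegressor(in_features=2)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

def evaluate() -> float:
    """Mean squared error on the full dataset, without tracking gradients."""
    model.eval()
    with torch.no_grad():
        return criterion(model(X), y).item()

initial_loss = evaluate()
for epoch in range(20):
    model.train()
    for xb, yb in train_loader:
        optimizer.zero_grad()          # clear gradients from the previous step
        loss = criterion(model(xb), yb)
        loss.backward()                # backpropagate
        optimizer.step()               # update parameters
final_loss = evaluate()
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

Comparing the loss before and after training is a cheap sanity check that the loop is wired correctly; a real project would evaluate on a held-out split instead of the training data.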

### Hugging Face quick start

```python
from transformers import AutoTokenizer, AutoModel

# Load the tokenizer and the base model (no task-specific head)
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModel.from_pretrained("bert-base-chinese")

# "你好世界" means "Hello, world"
inputs = tokenizer("你好世界", return_tensors="pt")
outputs = model(**inputs)
# outputs.last_hidden_state has shape (batch, seq_len, hidden_size)
```

### LangChain RAG

```python
# These import paths follow older LangChain releases; newer releases expose
# the same classes from `langchain_community` and `langchain_openai`.
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.chains import RetrievalQA

# Assumes `docs` (a list of Document objects) and `llm` are already defined
vectorstore = Chroma.from_documents(docs, OpenAIEmbeddings())
qa = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever())
answer = qa.run("Your question")
```
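
To make the "Embedding + VectorDB + LLM" pipeline behind the chain above less opaque, here is a dependency-free toy sketch of the retrieval step. It is not how Chroma or `OpenAIEmbeddings` work internally: `embed`, `ToyVectorStore`, and `build_prompt` are hypothetical names, the "embedding" is a plain word-count vector, and the sample documents are invented for illustration. The final LLM call is left out; the sketch stops at the prompt that would be sent to the model.

```python
from __future__ import annotations

import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts. A real system calls an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self, docs: list[str]):
        # Embed every document once at indexing time
        self.entries = [(doc, embed(doc)) for doc in docs]

    def search(self, query: str, k: int = 2) -> list[str]:
        # Rank stored documents by similarity to the query and keep the top k
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]

def build_prompt(question: str, contexts: list[str]) -> str:
    """Stuff the retrieved passages into a prompt for the LLM."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}"
    )

docs = [
    "LoRA fine-tunes a model by training small low-rank adapter matrices.",
    "YOLO is a family of real-time object detection models.",
    "XGBoost is a gradient boosting library for tabular data.",
]
store = ToyVectorStore(docs)
question = "How does LoRA fine-tune a model?"
top = store.search(question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Swapping the word-count `embed` for a learned embedding model and the list scan for an approximate nearest-neighbor index is, roughly, what separates this toy from a production retriever.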

## Evaluation metrics

| Task | Metrics |
|------|---------|
| Binary classification | AUC, F1, Precision, Recall |
| Multi-class classification | Accuracy, Macro-F1, Confusion Matrix |
| Regression | MSE, MAE, R², MAPE |
| NER | Entity-level F1 |
| Generation | BLEU, ROUGE, Perplexity |
| Detection | mAP, IoU |
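
Two of the metrics in the table are simple enough to compute by hand, which helps when sanity-checking library output. The sketch below uses hypothetical helper names (`precision_recall_f1`, `box_iou`) and made-up labels and boxes; in practice a library such as scikit-learn would normally be used for the classification metrics.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Made-up labels: 3 true positives, 1 false positive, 1 false negative
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)  # 0.75 0.75 0.75

# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1) = 1/7
iou = box_iou((0, 0, 2, 2), (1, 1, 3, 3))
```

Macro-F1 for the multi-class row is the unweighted mean of the per-class F1 values, so it can be built by calling `precision_recall_f1` once per class.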

## Reference documents

- `references/pytorch-guide.md` - PyTorch deep learning guide
- `references/transformers-guide.md` - Hugging Face Transformers
- `references/sklearn-guide.md` - scikit-learn machine learning
- `references/llm-app.md` - LLM application development (RAG/Agent)
- `references/cv-guide.md` - Computer vision guide

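## Output conventions

- Reply in Chinese; write code comments in Chinese
- Explain the approach before the code
- Explain hyperparameter choices
- Provide complete, runnable code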