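# Chapter 9: Large Language Models in Quantitative Trading ⭐
> Large language models are changing how financial analysis is done. This chapter shows how to use LLMs for stock price prediction and market analysis.
## Learning Objectives
- ✅ Understand how the Transformer and LLMs work
- ✅ Survey the current state of financial LLMs
- ✅ Master model fine-tuning techniques
- ✅ **Implement a reasoning-based stock price predictor**
- ✅ Build an intelligent investment research system
## 9.1 Transformer Basics
### The Self-Attention Mechanism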
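```python
import math
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    def __init__(self, embed_size, heads):
        super().__init__()
        assert embed_size % heads == 0, "embed_size must be divisible by heads"
        self.heads, self.head_dim = heads, embed_size // heads
        self.queries = nn.Linear(embed_size, embed_size)
        self.keys = nn.Linear(embed_size, embed_size)
        self.values = nn.Linear(embed_size, embed_size)

    def _split(self, x):
        # (batch, seq, embed) -> (batch, heads, seq, head_dim)
        return x.view(x.size(0), x.size(1), self.heads, self.head_dim).transpose(1, 2)

    def forward(self, query, key, value):
        # Project the inputs, then split them into heads
        q = self._split(self.queries(query))
        k = self._split(self.keys(key))
        v = self._split(self.values(value))
        # Attention scores, scaled by sqrt(head_dim)
        attention = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(self.head_dim)
        attention = torch.softmax(attention, dim=-1)
        # Weighted sum of values, then merge heads back to (batch, seq, embed)
        out = torch.matmul(attention, v).transpose(1, 2)
        return out.reshape(query.size(0), query.size(1), -1)
```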
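To make the arithmetic of attention concrete, here is a minimal pure-Python sketch of scaled dot-product attention on a toy input. It omits the learned projections and the multi-head split, and the helper names (`softmax`, `scaled_dot_product_attention`) and the 2-token input are hypothetical, chosen only for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Attention over lists of vectors; returns (outputs, weights)."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Score each key against the query, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        w = softmax(scores)
        # Weighted sum of the value vectors
        out = [sum(wj * v[i] for wj, v in zip(w, values)) for i in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Hypothetical toy input: 2 tokens with 2-dimensional embeddings
x = [[1.0, 0.0], [0.0, 1.0]]
outputs, weights = scaled_dot_product_attention(x, x, x)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to 1, and each token attends most strongly to itself because its dot product with its own key is the largest.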
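## 9.2 Using Hugging Face
### Loading a Pretrained Model
```python
import torch
from transformers import AutoTokenizer, AutoModel

# Load BERT
tokenizer = AutoTokenizer.from_pretrained('bert-base-chinese')
model = AutoModel.from_pretrained('bert-base-chinese')
model.eval()

# Encode a sentence and run a forward pass
text = "股票价格上涨了"  # "The stock price went up"
inputs = tokenizer(text, return_tensors='pt')
with torch.no_grad():
    outputs = model(**inputs)
# outputs.last_hidden_state has shape (batch, seq_len, 768)
```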
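### Text Classification
```python
from transformers import BertForSequenceClassification

# Three-class financial sentiment: 0 = negative, 1 = neutral, 2 = positive.
# The classification head is newly initialized, so fine-tune before use.
model = BertForSequenceClassification.from_pretrained('bert-base-chinese', num_labels=3)
```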
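A classifier with `num_labels=3` outputs three raw logits per input. The sketch below shows one way to turn a row of logits into a label and a probability, using the 0/1/2 mapping given above. The helper `logits_to_label` and the sample logits are hypothetical, not part of the `transformers` API.

```python
import math

# Label ids follow the mapping above: 0 negative, 1 neutral, 2 positive
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}


def logits_to_label(logits):
    """Return (label, probability) for one row of 3-class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hypothetical logits for a single headline
label, prob = logits_to_label([-1.2, 0.3, 2.1])
print(label, round(prob, 3))
```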
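## 9.3 Financial Sentiment Analysis
```python
import pandas as pd
from transformers import pipeline


class SentimentAnalyzer:
    """Financial sentiment analysis."""

    def __init__(self, model=None):
        # With model=None the pipeline falls back to its default English
        # checkpoint, which emits POSITIVE / NEGATIVE labels.
        self.classifier = pipeline('sentiment-analysis', model=model)

    def analyze_news(self, news_list):
        """Classify each news item; returns a DataFrame of text, label, score."""
        results = []
        for news in news_list:
            result = self.classifier(news)[0]
            results.append({
                'text': news,
                'label': result['label'],
                'score': result['score']
            })
        return pd.DataFrame(results)

    def get_market_sentiment(self, news_list):
        """Market sentiment index: mean of +1 / -1 labels, in [-1, 1]."""
        df = self.analyze_news(news_list)
        sentiment_map = {'POSITIVE': 1, 'NEGATIVE': -1}
        # Labels outside the map become NaN and are skipped by mean()
        df['sentiment'] = df['label'].map(sentiment_map)
        return df['sentiment'].mean()
```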
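A plain mean of +1 / -1 labels treats a barely positive headline the same as a strongly positive one. As one possible variant (not the book's method), the sketch below weights each label by the classifier's confidence score. The function name `weighted_sentiment_index`, the `NEUTRAL` entry, and the sample records are assumptions for illustration.

```python
def weighted_sentiment_index(results, sentiment_map=None):
    """Confidence-weighted mean sentiment in [-1, 1].

    `results` is a list of dicts with 'label' and 'score' keys, the shape
    returned by a Hugging Face sentiment pipeline.
    """
    if sentiment_map is None:
        sentiment_map = {"POSITIVE": 1, "NEGATIVE": -1, "NEUTRAL": 0}
    total_weight = 0.0
    weighted_sum = 0.0
    for r in results:
        if r["label"] not in sentiment_map:
            continue  # skip labels the map does not cover
        weighted_sum += sentiment_map[r["label"]] * r["score"]
        total_weight += r["score"]
    # An empty or fully unmapped input yields a neutral index of 0.0
    return weighted_sum / total_weight if total_weight else 0.0


# Hypothetical classifier outputs for three headlines
sample = [
    {"label": "POSITIVE", "score": 0.9},
    {"label": "NEGATIVE", "score": 0.6},
    {"label": "POSITIVE", "score": 0.5},
]
index = weighted_sentiment_index(sample)
print(round(index, 3))
```

For this sample the unweighted mean of the labels would be 1/3, while the confidence-weighted index gives more pull to the high-confidence headlines.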
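## 9.4 Reasoning-Based Stock Price Prediction
### Model Design
```python
import torch
import torch.nn as nn
from transformers import AutoModel


class ReasoningStockPredictor(nn.Module):
    """Reasoning-based stock price prediction model."""

    def __init__(self, pretrained_model='bert-base-chinese', num_numerical_features=10):
        super().__init__()
        # Text encoder
        self.text_encoder = AutoModel.from_pretrained(pretrained_model)
        # Numerical feature encoder
        self.numerical_encoder = nn.Sequential(
            nn.Linear(num_numerical_features, 256),
            nn.ReLU()
        )
        # Prediction head: 3 classes (up / flat / down)
        text_dim = self.text_encoder.config.hidden_size  # 768 for bert-base
        self.prediction_head = nn.Linear(text_dim + 256, 3)

    def forward(self, input_ids, attention_mask, numerical_features):
        # Text encoding: take the [CLS] token representation
        text_embedding = self.text_encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state[:, 0, :]
        # Numerical encoding
        num_embedding = self.numerical_encoder(numerical_features)
        # Fuse both views by concatenation, then predict
        combined = torch.cat([text_embedding, num_embedding], dim=-1)
        prediction = self.prediction_head(combined)
        return prediction  # raw logits, shape (batch, 3)
```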
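### A 20% Accuracy Improvement
By fusing text features (news) with numerical features (technical indicators), the reasoning-based model is reported to improve accuracy by 20% over traditional methods.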
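A figure such as "20%" can be read as a relative gain or as an absolute gain in percentage points, and the two differ substantially. The sketch below computes both for a three-class prediction task. The labels and predictions are hypothetical toy data made up for illustration; they are not results from the book.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def improvement(baseline_acc, model_acc):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = (model_acc - baseline_acc) * 100
    relative = (model_acc - baseline_acc) / baseline_acc * 100
    return absolute, relative


# Hypothetical class ids (illustrative only): 0 = down, 1 = flat, 2 = up
y_true        = [2, 0, 1, 2, 0, 2, 1, 0, 2, 1]
baseline_pred = [2, 1, 1, 0, 0, 1, 0, 0, 2, 2]  # 5 of 10 correct
fusion_pred   = [2, 0, 1, 0, 0, 1, 0, 0, 2, 2]  # 6 of 10 correct

base_acc = accuracy(y_true, baseline_pred)
fusion_acc = accuracy(y_true, fusion_pred)
abs_gain, rel_gain = improvement(base_acc, fusion_acc)
print(f"baseline={base_acc:.2f} fusion={fusion_acc:.2f} "
      f"absolute=+{abs_gain:.1f} pts relative=+{rel_gain:.1f}%")
```

In this toy case, moving from 50% to 60% accuracy is a 10-point absolute gain but a 20% relative gain, so the basis of comparison matters when reading such a claim.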
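## 9.5 RAG for Intelligent Investment Research
```python
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.llms import HuggingFacePipeline
from langchain_community.vectorstores import FAISS


class RAGInvestmentResearch:
    """RAG-based investment research system."""

    def __init__(self, documents, llm_model_id='Qwen/Qwen2-7B-Instruct'):
        # Embedding model (library default checkpoint)
        self.embeddings = HuggingFaceEmbeddings()
        # Vector store
        self.vectorstore = FAISS.from_documents(documents, self.embeddings)
        # LLM. The source names only 'qwen-7b'; the full Hub ID used as the
        # default here is an assumption, so substitute the checkpoint you use.
        self.llm = HuggingFacePipeline.from_model_id(
            model_id=llm_model_id, task='text-generation'
        )

    def query(self, question):
        """Answer a question using retrieved context."""
        # Retrieve the 5 most similar documents
        docs = self.vectorstore.similarity_search(question, k=5)
        # Build the prompt: "Answer based on the following material: ... Question: ..."
        context = "\n".join(doc.page_content for doc in docs)
        prompt = f"基于以下资料回答:\n{context}\n\n问题:{question}"
        # Generate the answer
        return self.llm.invoke(prompt)
```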
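The retrieve-then-prompt flow can be illustrated without any model downloads. The sketch below swaps in a toy bag-of-words "embedding" and cosine similarity for the real embedding model and FAISS index, then builds a prompt in the same context-then-question shape. All names (`embed`, `cosine`, `retrieve`, `build_prompt`) and the mini knowledge base are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding': token counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, docs):
    """Concatenate retrieved context and the question into one prompt."""
    context = "\n".join(docs)
    return f"Answer based on the following material:\n{context}\n\nQuestion: {question}"


# Hypothetical mini knowledge base
documents = [
    "company A reported revenue growth of ten percent",
    "the central bank kept interest rates unchanged",
    "company A revenue was driven by overseas sales",
]
question = "what drove company A revenue"
top_docs = retrieve(question, documents, k=2)
prompt = build_prompt(question, top_docs)
print(prompt)
```

The unrelated central-bank document is filtered out before the prompt is built, which is the point of the retrieval step: the LLM only sees context relevant to the question.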
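## 9.6 Application Scenarios
1. **News sentiment analysis**: analyze market sentiment in real time
2. **Financial report interpretation**: automatically extract key financial indicators
3. **Intelligent Q&A**: question answering over an investment research knowledge base
4. **Strategy generation**: generate trading strategies from natural-language descriptions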
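---
*This article is excerpted from Chapter 9 (featured chapter) of 《AI量化交易从入门到精通》 (AI Quantitative Trading: From Beginner to Mastery) ⭐*
*For the full content, see the code repository: book_writing/part2_core/part9_llm/README.md*
*Companion code: egs_llm/*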