I. Background
When storing a large field value in Elasticsearch, the following exception was thrown:
ElasticsearchStatusException: Elasticsearch exception [type=illegal_argument_exception, reason=Document contains at least one immense term in field="fieldZgbpka" (whose UTF8 encoding is longer than the max length 32766)...
As shown in the figure below:
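II. Error Analysis
This is a hard limit that Elasticsearch inherits from Lucene: a single indexed term may not be longer than 32766 bytes in its UTF-8 encoding. Elasticsearch does not silently skip the oversized value; it rejects the whole document and throws illegal_argument_exception.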
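Common triggers:
- An extremely long string is written to the index, such as a large JSON blob, rich text, or a Base64-encoded image;
- The field is indexed by default, so its value goes into the inverted index. When the value is not split into smaller tokens (as with a keyword field, or a text field whose analyzer leaves it whole), the entire string becomes one term.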
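To make the 32766-byte figure concrete, the sketch below measures the UTF-8 length of a string in Python before it is sent to Elasticsearch. The helper names utf8_length and exceeds_term_limit are illustrative only and are not part of any Elasticsearch client; the check is relevant only for values that would be indexed as a single term.

```python
# Lucene's limit on the UTF-8 length of a single indexed term, in bytes.
MAX_TERM_BYTES = 32766


def utf8_length(value: str) -> int:
    """Return the number of bytes the string occupies when encoded as UTF-8."""
    return len(value.encode("utf-8"))


def exceeds_term_limit(value: str, limit: int = MAX_TERM_BYTES) -> bool:
    """True if indexing the whole string as one term would exceed the limit."""
    return utf8_length(value) > limit


if __name__ == "__main__":
    ascii_text = "a" * 32766     # 1 byte per character in UTF-8
    chinese_text = "中" * 11000  # 3 bytes per character in UTF-8
    print(utf8_length(ascii_text), exceeds_term_limit(ascii_text))
    print(utf8_length(chinese_text), exceeds_term_limit(chinese_text))
```

Because the limit is counted in bytes rather than characters, 11000 Chinese characters (33000 bytes) already exceed it, while 32766 ASCII characters do not.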
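III. Solution: set index: false to disable indexing for the field
If the field is never searched and is only stored (attachment content, HTML, large text, JSON strings, and so on), it can be marked as not indexed:
"fieldZgbpka": {
  "type": "text",
  "index": false
}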
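For a mapping with many fields, the same change can be applied programmatically instead of by hand. This is a minimal sketch, assuming the properties section has already been loaded into a Python dict (for instance from the output of GET wk_single_pack/_mapping). with_unindexed_field is a hypothetical helper, and the original type shown for fieldZgbpka is a guess, since the post does not say what it was.

```python
import copy
import json


def with_unindexed_field(properties: dict, field_name: str) -> dict:
    """Return a copy of a mapping `properties` dict in which `field_name`
    is kept in the mapping but not indexed; other fields are untouched."""
    patched = copy.deepcopy(properties)
    patched[field_name] = {"type": "text", "index": False}
    return patched


if __name__ == "__main__":
    # Trimmed-down, illustrative properties; not the full real mapping.
    current = {
        "documentCode": {"type": "keyword"},
        "fieldZgbpka": {"type": "keyword"},  # assumed original type
    }
    print(json.dumps(with_unindexed_field(current, "fieldZgbpka"), indent=2))
```

The function returns a deep copy, so the dict read from the cluster stays available for comparison with the patched version.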
IV. Steps (my approach is admittedly a blunt one)
1. View the current mapping of the index
GET wk_single_pack/_mapping
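The result looks roughly like the figure below:
2. Delete the old index
The plan is to create a new index in which this field is changed to text with index set to false, leaving everything else unchanged. Note that deleting an index also deletes every document in it, so back up the data or be ready to re-import it first.
The command below deletes the existing index; a response containing "acknowledged": true means it succeeded.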
DELETE /wk_single_pack
3. Create the new index
PUT wk_single_pack
{
  "mappings": {
    "_doc": {
      "properties": {
        "checkStatus": {
          "type": "keyword",
          "fields": {
            "sort": {
              "type": "icu_collation_keyword",
              "language": "zh",
              "country": "CN"
            }
          }
        },
        "createTime": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss"
        },
        "createUserId": {
          "type": "keyword",
          "fields": {
            "sort": {
              "type": "icu_collation_keyword",
              "language": "zh",
              "country": "CN"
            }
          }
        },
        "createUserName": {
          "type": "keyword",
          "fields": {
            "sort": {
              "type": "icu_collation_keyword",
              "language": "zh",
              "country": "CN"
            }
          }
        },
        "documentCode": {
          "type": "keyword",
          "fields": {
            "sort": {
              "type": "icu_collation_keyword",
              "language": "zh",
              "country": "CN"
            }
          }
        },
        "fieldOouzhm": {
          "type": "keyword",
          "fields": {
            "sort": {
              "type": "icu_collation_keyword",
              "language": "zh",
              "country": "CN"
            }
          }
        },
        "fieldZgbpka": {
          "type": "text",
          "index": false
        },
        "fieldZorzlq": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword"
            },
            "sort": {
              "type": "icu_collation_keyword",
              "language": "zh",
              "country": "CN"
            }
          }
        },
        "orderNumber": {
          "type": "keyword",
          "fields": {
            "sort": {
              "type": "icu_collation_keyword",
              "language": "zh",
              "country": "CN"
            }
          }
        },
        "ownerDeptId": {
          "type": "keyword",
          "fields": {
            "sort": {
              "type": "icu_collation_keyword",
              "language": "zh",
              "country": "CN"
            }
          }
        },
        "ownerDeptName": {
          "type": "keyword",
          "fields": {
            "sort": {
              "type": "icu_collation_keyword",
              "language": "zh",
              "country": "CN"
            }
          }
        },
        "ownerUserId": {
          "type": "keyword",
          "fields": {
            "sort": {
              "type": "icu_collation_keyword",
              "language": "zh",
              "country": "CN"
            }
          }
        },
        "ownerUserName": {
          "type": "keyword",
          "fields": {
            "sort": {
              "type": "icu_collation_keyword",
              "language": "zh",
              "country": "CN"
            }
          }
        },
        "salesman": {
          "type": "keyword",
          "fields": {
            "sort": {
              "type": "icu_collation_keyword",
              "language": "zh",
              "country": "CN"
            }
          }
        },
        "totalAmount": {
          "type": "scaled_float",
          "scaling_factor": 100
        },
        "updateTime": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss"
        }
      }
    }
  }
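}
See the figure below, and adjust the mapping to your own situation. The _doc level under mappings matches Elasticsearch 6.x; from 7.0 on, mapping types are no longer accepted by default and properties goes directly under mappings.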
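Deleting the index first and recreating it afterwards discards the documents already stored in it. A less destructive variant, which is not what the steps above do, is to create the new index under a different name, copy the data across with the _reindex API, and only then drop the old index and point an alias with the old name at the new one. The sketch below only builds the list of requests; rebuild_plan and the index name wk_single_pack_v2 are assumptions made for illustration, and sending the requests is left to whichever client you use.

```python
def rebuild_plan(old_index: str, new_index: str, mappings: dict) -> list:
    """Return (method, path, body) tuples for rebuilding an index with a
    new mapping while copying the existing documents across first."""
    return [
        # 1. Create the new index with the corrected mapping.
        ("PUT", f"/{new_index}", {"mappings": mappings}),
        # 2. Copy every document from the old index into the new one.
        ("POST", "/_reindex", {
            "source": {"index": old_index},
            "dest": {"index": new_index},
        }),
        # 3. Remove the old index once the copy has been verified.
        ("DELETE", f"/{old_index}", None),
        # 4. Let existing clients keep using the old name through an alias.
        ("POST", "/_aliases", {
            "actions": [{"add": {"index": new_index, "alias": old_index}}],
        }),
    ]


if __name__ == "__main__":
    # The mapping body here is a trimmed-down placeholder.
    mappings = {"properties": {"fieldZgbpka": {"type": "text", "index": False}}}
    for method, path, body in rebuild_plan("wk_single_pack", "wk_single_pack_v2", mappings):
        print(method, path, body)
```

Step 3 should only run after the document counts of the two indices have been compared, since the delete cannot be undone.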
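V. Notes
- Avoid writing very large values into fields that are indexed by default;
- If you really need to store them, set index: false, or use keyword together with ignore_above so that over-long values are left out of the index (a plain keyword field is still subject to the 32766-byte limit);
- If the large field must be searchable (full-text search, for example), consider adding a dedicated search field, such as a summary;
- Design the mapping up front: with dynamic mapping, Elasticsearch indexes every new field by default.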
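The ignore_above option mentioned in the notes counts characters, while the 32766 limit counts bytes, so a safe value has to allow for the widest UTF-8 character (4 bytes). A minimal sketch of that arithmetic and of the resulting field definition; the function names are illustrative and not taken from the original mapping.

```python
MAX_TERM_BYTES = 32766       # Lucene's per-term limit, in bytes
MAX_UTF8_BYTES_PER_CHAR = 4  # widest possible UTF-8 encoded character


def safe_ignore_above(max_term_bytes: int = MAX_TERM_BYTES) -> int:
    """Largest character count that cannot exceed the byte limit,
    even if every character takes 4 bytes in UTF-8."""
    return max_term_bytes // MAX_UTF8_BYTES_PER_CHAR


def keyword_with_ignore_above() -> dict:
    """Field definition for a keyword field that skips over-long values."""
    return {"type": "keyword", "ignore_above": safe_ignore_above()}


if __name__ == "__main__":
    print(keyword_with_ignore_above())
```

Values longer than ignore_above are not indexed by that field but remain in _source, so the document is accepted instead of rejected.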
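VI. Summary
The problem was caused by a field value exceeding Elasticsearch's maximum term length of 32766 bytes. Setting index: false on the field keeps it out of the inverted index, which removes the exception while the data is still stored with the document.
In practice, plan each field's data structure and usage scenario in advance and design the indexing strategy accordingly, to avoid unnecessary performance cost and errors.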