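Contents
Preparation
Use cases
1. constant_score query: ignore document-frequency scoring and return documents that match more of the search keywords first
2. sort: when scores are equal, order by the specified price field
3. Recall with different scoring weights per field, ignoring TF/IDF
4. Different scoring weights per field ignoring TF/IDF, plus the value of a specified field added in to give the final score, e.g. title^3 + content^1 + time
5. Different weights for different brands within the same region, ignoring TF/IDF scores
6. Location-based search similar to Ziroom rentals: find nearby listings that are cheap and close, without a hard distance cutoff
7. Sorting by a configured parameter value, e.g. xiaomi phones score highest among all phones
8. BM25 similarity tuning: disabling normalization
9. Using query_string
10. The 黄桃 (yellow peach) / 罐头 (canned food) bad case: products matching both terms should rank first and partial matches after them; solutions
Monitoring
_stats index monitoring

Preparation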
Index mappings
{
  "shop_titled_index": {
    "mappings": {
      "properties": {
        "brand": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
        "price": {"type": "long"},
        "region": {"type": "long"},
        "shopId": {"type": "long"},
        "skuId": {"type": "long"},
        "title": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}}
      }
    }
  }
}
Sample data
{"_index": "shop_titled_index", "_type": "_doc", "_id": "dJAM3HYByj_ONITHr0gq", "_score": 1, "_source": {"brand": "iphone", "price": 8000, "title": "iphone 12 64G red 5G", "skuId": 2020122201, "shopId": 2, "region": 1001}}
{"_index": "shop_titled_index", "_type": "_doc", "_id": "9ZA6inYByj_ONITHT0bH", "_score": 1, "_source": {"brand": "iphone", "price": 8000, "title": "iphone 12 64G red 5G", "skuId": 2020122201, "shopId": 1, "region": 1001}}

Use cases
1. constant_score query: ignore document-frequency scoring and return documents that match more of the search keywords first
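As a rough illustration of why this works: in a bool query whose should clauses are all constant_score, a document's score is the sum of the boosts of the clauses it matches, so term frequency and document frequency play no part. The sketch below mimics that in plain Python. The whitespace tokenizer, the helper name `constant_score_should`, and the second title are assumptions made for illustration and are not Elasticsearch APIs or data from this article.

```python
def constant_score_should(title, clauses):
    """Sum the boost of every (term, boost) clause whose term occurs in the title."""
    tokens = set(title.lower().split())  # crude stand-in for the standard analyzer
    return sum(boost for term, boost in clauses if term.lower() in tokens)

# Two clauses, as in the query of this scenario; boost defaults to 1 when omitted.
clauses = [("iphone", 1.0), ("12", 1.0)]

both = constant_score_should("iphone 12 64G red 5G", clauses)
one = constant_score_should("iphone 11 128G black", clauses)        # hypothetical title
repeated = constant_score_should("iphone iphone iphone case", clauses)  # hypothetical title

print(both, one, repeated)
```

A title that repeats a keyword scores no higher than one that contains it once, which is the point of dropping TF/IDF here.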
{
  "query": {
    "bool": {
      "should": [
        {"constant_score": {"filter": {"match": {"title": "iphone"}}, "boost": 1}},
        {"constant_score": {"filter": {"match": {"title": "12"}}}}
      ]
    }
  }
}
2. sort: when scores are equal, order by the specified price field
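The ordering this scenario asks for is a two-key sort: score descending first, then price ascending to break ties. A minimal sketch of the same ordering in Python, where the three hits and their values are made up for illustration:

```python
# Hypothetical hits: two tie on score, one scores lower but is cheapest.
hits = [
    {"skuId": "A", "score": 2.0, "price": 8000},
    {"skuId": "B", "score": 2.0, "price": 6500},
    {"skuId": "C", "score": 1.0, "price": 3000},
]

# Score descending, then price ascending, like
# sort: [{_score: desc}, {price: asc}] in the query DSL.
ranked = sorted(hits, key=lambda h: (-h["score"], h["price"]))
order = [h["skuId"] for h in ranked]
print(order)
```

Price only decides between A and B; C stays last despite being cheapest because the score key takes precedence.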
{
  "query": {
    "bool": {
      "should": [
        {"constant_score": {"filter": {"match": {"title": "iphone"}}, "boost": 1}},
        {"constant_score": {"filter": {"match": {"title": "12"}}}}
      ]
    }
  },
  "sort": [
    {"_score": {"order": "desc"}},
    {"price": {"order": "asc"}}
  ]
}
3. Recall with different scoring weights per field, ignoring TF/IDF
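{
  "query": {
    "bool": {
      "should": [
        {"constant_score": {"filter": {"match": {"title": "red"}}, "boost": 1}},
        {"constant_score": {"filter": {"match": {"brand": "iphone"}}, "boost": 3}}
      ]
    }
  },
  "sort": [
    {"_score": {"order": "desc"}},
    {"price": {"order": "asc"}}
  ]
}
4. Different scoring weights per field ignoring TF/IDF, plus the value of a specified field added in to give the final score, e.g. title^3 + content^1 + time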
{
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "should": [
            {"constant_score": {"filter": {"match": {"title": "red"}}, "boost": 1}},
            {"constant_score": {"filter": {"match": {"brand": "iphone"}}, "boost": 3}}
          ]
        }
      },
      "field_value_factor": {"field": "shopId"},
      "boost_mode": "sum"
    }
  }
}
5. Different weights for different brands within the same region, ignoring TF/IDF scores
Documentation: https://www.elastic.co/guide/cn/elasticsearch/guide/current/function-score-filters.html
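Before the query itself, a small sketch of how function_score arrives at a final score may help: score_mode first merges the values of the functions whose filters matched, then boost_mode merges that result with the score of the main query. The Python below models only a few of the modes; the function name `combine` and the assumed main-query score of 1.0 are illustrative, and the case where no function matches is deliberately left unmodeled.

```python
import math

def combine(query_score, function_values, score_mode="multiply", boost_mode="multiply"):
    """Merge matched function values, then merge the result with the query score."""
    if not function_values:
        raise ValueError("the no-matching-function case is not modeled in this sketch")
    if score_mode == "sum":
        func = sum(function_values)
    elif score_mode == "multiply":
        func = math.prod(function_values)
    elif score_mode == "max":
        func = max(function_values)
    elif score_mode == "min":
        func = min(function_values)
    else:
        raise ValueError(f"unsupported score_mode: {score_mode}")

    if boost_mode == "sum":
        return query_score + func
    if boost_mode == "multiply":
        return query_score * func
    if boost_mode == "replace":
        return func
    raise ValueError(f"unsupported boost_mode: {boost_mode}")

# Assumed main-query score of 1.0; weights 3 and 1 as in this scenario.
huawei = combine(1.0, [3.0], score_mode="sum", boost_mode="sum")
xiaomi = combine(1.0, [1.0], score_mode="sum", boost_mode="sum")
print(huawei, xiaomi)
```

Under these assumptions a huawei document ends up ahead of a xiaomi document in the same region purely because of the configured weights.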
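{
  "query": {
    "function_score": {
      "query": {"term": {"region": 1002}},
      "boost": 1,
      "functions": [
        {"filter": {"term": {"brand.keyword": "huawei"}}, "weight": 3},
        {"filter": {"match": {"brand": "xiaomi"}}, "weight": 1}
      ],
      "score_mode": "sum",
      "boost_mode": "sum"
    }
  }
}
Note: the following query returns all documents, because function_score has no main query and therefore falls back to match_all.
{
  "query": {
    "function_score": {
      "functions": [
        {"filter": {"term": {"brand.keyword": "huawei"}}, "weight": 3},
        {"filter": {"match": {"brand": "xiaomi"}}, "weight": 1}
      ],
      "score_mode": "sum",
      "boost_mode": "sum"
    }
  }
}
6. Location-based search similar to Ziroom rentals: find nearby listings that are cheap and close, without a hard distance cutoff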
Reference: https://www.cnblogs.com/xiaoxiaoliu/p/11054405.html
1. Create the index
2. Create the mappings
POST geo_index/_mappings
{
  "properties": {
    "location": {"type": "geo_point"},
    "price": {"type": "double"},
    "name": {"type": "text"}
  }
}
3. Prepare the data
{"location": {"lon": 116.488781, "lat": 39.950565}, "price": 4000, "name": "朝阳公园 两室一厅 12m"}
{"location": {"lon": 116.327805, "lat": 39.900988}, "price": 2400, "name": "北京西站 三室一厅 9m"}
{"location": {"lon": 116.403981, "lat": 39.916485}, "price": 88888, "name": "故宫 无价之宝"}
{"location": {"lon": 116.341316, "lat": 39.948795}, "price": 3700, "name": "北京动物园 三室一厅 19m"}
4. geo_distance: find data within two kilometers
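To get a feel for which of the four sample listings a 2 km radius should keep, the distances can be checked by hand. The sketch below uses the haversine great-circle formula with a mean Earth radius of 6371 km; that is an approximation chosen for illustration, and Elasticsearch's own distance computation may differ slightly. The function name `haversine_km` is hypothetical.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two lat/lon points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

center = (39.93869837, 116.48357391)  # search origin used in this step's query

# (lat, lon) of the four sample listings prepared in step 3
listings = {
    "朝阳公园": (39.950565, 116.488781),
    "北京西站": (39.900988, 116.327805),
    "故宫": (39.916485, 116.403981),
    "北京动物园": (39.948795, 116.341316),
}

within_2km = [
    name for name, (lat, lon) in listings.items()
    if haversine_km(center[0], center[1], lat, lon) <= 2.0
]
print(within_2km)
```

Only the 朝阳公园 listing falls inside the radius; the other three are several kilometers away.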
GET geo_index/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "geo_distance": {
          "distance": "2km",
          "location": {"lat": 39.93869837, "lon": 116.48357391}
        }
      },
      "boost": 1.2
    }
  }
}
Output
{
  "took": 2,
  "timed_out": false,
  "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
  "hits": {
    "total": {"value": 1, "relation": "eq"},
    "max_score": 1.2,
    "hits": [
      {"_index": "geo_index", "_type": "_doc", "_id": "1JC14HYByj_ONITHikiw", "_score": 1.2, "_source": {"location": {"lon": 116.488781, "lat": 39.950565}, "price": 4000, "name": "朝阳公园 两室一厅 12m"}}
    ]
  }
}
5. Find the data and sort it by distance
Documentation: https://www.elastic.co/guide/cn/elasticsearch/guide/current/sorting-by-distance.html
{
  "query": {
    "constant_score": {
      "filter": {
        "geo_distance": {
          "distance": "10km",
          "location": {"lat": 39.93869837, "lon": 116.48357391}
        }
      },
      "boost": 1.2
    }
  },
  "sort": {
    "_geo_distance": {
      "location": [{"lat": 39.93869837, "lon": 116.48357391}],
      "unit": "km",
      "distance_type": "arc",
      "order": "asc"
    }
  }
}
6. Find listings by proximity and price
I care more about proximity, so the location function is given the higher weight. Reference: https://www.elastic.co/guide/cn/elasticsearch/guide/current/decay-functions.html#CO119-4
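The gauss decay used here can be sketched as follows, based on the formula given in the Elasticsearch function_score documentation: values within offset of the origin score 1.0, and the score falls to decay (0.5 by default) at a distance of offset + scale. The Python function name `gauss_decay` is hypothetical, and the example applies it to the numeric price field; for a geo_point field the distance would be a geographic distance instead of a plain difference.

```python
import math

def gauss_decay(value, origin, scale, offset=0.0, decay=0.5):
    """Gaussian decay: 1.0 inside the offset, `decay` at offset + scale."""
    sigma_sq = -scale ** 2 / (2.0 * math.log(decay))
    dist = max(0.0, abs(value - origin) - offset)
    return math.exp(-dist ** 2 / (2.0 * sigma_sq))

# Price settings from this step: origin 3000, offset 100, scale 500.
for price in (3000, 3100, 3600, 4000):
    print(price, round(gauss_decay(price, origin=3000, scale=500, offset=100), 4))
```

Because the curve never reaches a hard zero cutoff, listings slightly outside the preferred range are demoted instead of excluded, which is what "distance is not strictly limited" asks for.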
{
  "query": {
    "function_score": {
      "query": {"range": {"price": {"gte": 2000, "lte": 5000}}},
      "functions": [
        {
          "gauss": {
            "location": {
              "origin": {"lon": 116.47464752, "lat": 39.94606859},
              "offset": "100m",
              "scale": "1000m"
            }
          },
          "weight": 2.0
        },
        {
          "gauss": {
            "price": {"origin": 3000, "offset": 100, "scale": 500}
          }
        }
      ],
      "score_mode": "sum",
      "boost_mode": "replace"
    }
  }
}
Result
{
  "took": 5,
  "timed_out": false,
  "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
  "hits": {
    "total": {"value": 4, "relation": "eq"},
    "max_score": 0.7460326,
    "hits": [
      {"_index": "geo_index", "_type": "_doc", "_id": "95A14XYByj_ONITHg0if", "_score": 0.7460326, "_source": {"location": {"lon": 116.47155762, "lat": 39.9523853}, "price": 3500, "name": "亮马桥 两室一厅 12m"}},
      {"_index": "geo_index", "_type": "_doc", "_id": "1JC14HYByj_ONITHikiw", "_score": 0.36586136, "_source": {"location": {"lon": 116.488781, "lat": 39.950565}, "price": 4000, "name": "朝阳公园 两室一厅 12m"}},
      {"_index": "geo_index", "_type": "_doc", "_id": "1ZC34HYByj_ONITHRkht", "_score": 5.823735e-39, "_source": {"location": {"lon": 116.341316, "lat": 39.948795}, "price": 3700, "name": "北京动物园 三室一厅 19m"}},
      {"_index": "geo_index", "_type": "_doc", "_id": "1pC44HYByj_ONITHAkgJ", "_score": 0, "_source": {"location": {"lon": 116.327805, "lat": 39.900988}, "price": 2400, "name": "北京西站 三室一厅 9m"}}
    ]
  }
}
7. Some scenarios need sorting by a configured parameter value, e.g. xiaomi phones score highest among all phones
Sorting with function_score combined with script_score
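{
  "query": {
    "function_score": {
      "query": {"match_all": {}},
      "functions": [
        {
          "script_score": {
            "script": {
              "lang": "painless",
              "params": {"brand": "xiaomi"},
              "source": "if (doc['brand.keyword'].size() == 0) return 0f; String brandStr = doc['brand.keyword'].value ?: new String(); if (params.brand.compareTo(brandStr) == 0) { return 1f; } return 0f;"
            }
          }
        }
      ],
      "score_mode": "sum",
      "boost_mode": "replace"
    }
  }
}
score_mode defines how the values of the individual functions are combined into a single value; boost_mode defines how that combined value is applied to the score produced by the original query.
8. BM25 similarity tuning: disabling normalization
BM25 provides two tuning factors:
k1: controls how quickly the term-frequency contribution rises toward saturation. The default is 1.2. The smaller the value, the faster saturation is reached; the larger the value, the more slowly. In the term-frequency saturation chart in the official documentation, which plots score against term frequency, k1 controls the "tf of BM25" curve.
b: controls how much effect field-length normalization has. 0.0 disables normalization and 1.0 enables full normalization. The default is 0.75.
Mapping settings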
{
  "settings": {
    "index": {
      "number_of_shards": "1",
      "provided_name": "my_sim_index",
      "similarity": {
        "cbm25": {"type": "BM25", "b": "0"}
      },
      "creation_date": "1610181315498",
      "number_of_replicas": "1",
      "uuid": "V8NhMRofQRu-oPFt6hheWA",
      "version": {"created": "7070099"}
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "body": {"similarity": "BM25", "type": "text"},
        "title": {"similarity": "cbm25", "type": "text"}
      }
    }
  }
}
Sample data
{
  "title": "Elasticsearch allows you to configure a scoring algorithm or similarity per field. The similarity setting provides a simple way of choosing a similarity algorithm other than the default BM25, such as TF/IDF.",
  "body": "Elasticsearch allows you to configure a scoring algorithm or similarity per field. The similarity setting provides a simple way of choosing a similarity algorithm other than the default BM25, such as TF/IDF."
}
{
  "title": "A simple boolean similarity, which is used when full-text ranking is not needed and the score should only be based on whether the query terms match or not. Boolean similarity gives terms a score equal to their query boost.",
  "body": "A simple boolean similarity, which is used when full-text ranking is not needed and the score should only be based on whether the query terms match or not. Boolean similarity gives terms a score equal to their query boost."
}
{
  "title": "or similarity per field. The similarity setting provides a simple way of choosing a similarity",
  "body": "or similarity per field. The similarity setting provides a simple way of choosing a similarity"
}
Search on title: title uses cbm25, which ignores document-length normalization, so the scores do not depend on document length
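To see why b = 0 removes the influence of length, the BM25 score of a single term can be sketched in a few lines. The formula below is the classic form with the (k1 + 1) factor, which is assumed here to be what this Elasticsearch 7.x index reports; the function name `bm25` and the token counts (33 and 15) are rough hand counts used for illustration only.

```python
import math

def bm25(freq, doc_len, avg_len, doc_count, docs_with_term, k1=1.2, b=0.75):
    """Single-term BM25 score in the classic form, including the (k1 + 1) factor."""
    idf = math.log(1 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))
    tf = freq / (freq + k1 * (1 - b + b * doc_len / avg_len))
    return (k1 + 1) * idf * tf

# Three documents, all containing "similarity"; approximate lengths in tokens.
avg_len = (33 + 40 + 15) / 3

# b = 0 (the cbm25 similarity on title): document length drops out entirely.
long_doc = bm25(freq=3, doc_len=33, avg_len=avg_len, doc_count=3, docs_with_term=3, b=0)
short_doc = bm25(freq=3, doc_len=15, avg_len=avg_len, doc_count=3, docs_with_term=3, b=0)

# b = 0.75 (default BM25 on body): the shorter document is favored.
long_default = bm25(freq=3, doc_len=33, avg_len=avg_len, doc_count=3, docs_with_term=3)
short_default = bm25(freq=3, doc_len=15, avg_len=avg_len, doc_count=3, docs_with_term=3)

print(long_doc, short_doc, long_default, short_default)
```

With b = 0 the two documents with three occurrences score the same regardless of length, while the default b = 0.75 ranks the shorter one higher.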
GET my_sim_index/_search
{
  "query": {"match": {"title": "similarity"}}
}
Output
{
  "took": 1,
  "timed_out": false,
  "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
  "hits": {
    "total": {"value": 3, "relation": "eq"},
    "max_score": 0.20983505,
    "hits": [
      {
        "_index": "my_sim_index", "_type": "_doc", "_id": "nZBO5nYByj_ONITHhknJ", "_score": 0.20983505,
        "_source": {
          "title": "Elasticsearch allows you to configure a scoring algorithm or similarity per field. The similarity setting provides a simple way of choosing a similarity algorithm other than the default BM25, such as TF/IDF.",
          "body": "Elasticsearch allows you to configure a scoring algorithm or similarity per field. The similarity setting provides a simple way of choosing a similarity algorithm other than the default BM25, such as TF/IDF."
        }
      },
      {
        "_index": "my_sim_index", "_type": "_doc", "_id": "oZBW5nYByj_ONITHkEli", "_score": 0.20983505,
        "_source": {
          "title": "or similarity per field. The similarity setting provides a simple way of choosing a similarity",
          "body": "or similarity per field. The similarity setting provides a simple way of choosing a similarity"
        }
      },
      {
        "_index": "my_sim_index", "_type": "_doc", "_id": "npBP5nYByj_ONITHK0mo", "_score": 0.18360566,
        "_source": {
          "title": "A simple boolean similarity, which is used when full-text ranking is not needed and the score should only be based on whether the query terms match or not. Boolean similarity gives terms a score equal to their query boost.",
          "body": "A simple boolean similarity, which is used when full-text ranking is not needed and the score should only be based on whether the query terms match or not. Boolean similarity gives terms a score equal to their query boost."
        }
      }
    ]
  }
}
The top two hits share the score 0.20983505 even though their documents differ in length.
Search on body
GET my_sim_index/_search
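{
  "query": {"match": {"body": "similarity"}}
}
This shows that even when documents contain similarity the same number of times, the body score is still affected by document length.
9. Using query_string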
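{
  "query": {
    "query_string": {
      "query": "(title:red)^1.0 AND (brand:iphone)"
    }
  }
}
10. The 黄桃 (yellow peach) / 罐头 (canned food) bad case: products matching both terms should rank first and partial matches after them; solutions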
Option 1: use constant_score to add a query filter that ignores the TF/IDF score and assigns a custom score, so that products matching every term rank first.
"should": [
  {"constant_score": {"filter": {"query_string": {"query": "allWord:((黄桃) AND (罐头))"}}, "boost": 500}}
]
Option 2: add a filter with a weight to the functions of the existing function_score query.
"function_score": {
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "(title:((黄桃 罐头))^2.4 OR catBrand:((黄桃 罐头))^0.6 OR facet:((黄桃 罐头))^0.6 OR allWord:((黄桃 罐头))^0.0)",
            "fields": [],
            "use_dis_max": true,
            "tie_breaker": 0.0,
            "default_operator": "or",
            "auto_generate_phrase_queries": false,
            "max_determinized_states": 10000,
            "enable_position_increments": true,
            "fuzziness": "AUTO",
            "fuzzy_prefix_length": 0,
            "fuzzy_max_expansions": 50,
            "phrase_slop": 0,
            "escape": false,
            "split_on_whitespace": true,
            "boost": 1.0
          }
        }
      ],
      "filter": [
        {"term": {"skuDocType": {"value": 1, "boost": 1.0}}},
        {
          "bool": {
            "must_not": [{"term": {"spMask": {"value": 1, "boost": 1.0}}}],
            "disable_coord": false,
            "adjust_pure_negative": true,
            "boost": 1.0
          }
        }
      ],
      "disable_coord": false,
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  },
  "functions": [
    {"filter": {"query_string": {"query": "allWord:(黄桃 AND 罐头)"}}, "weight": 400},
    {
      "filter": {"match_all": {"boost": 1.0}},
      "script_score": {
        "script": {
          "id": "osop_score_script",
          "lang": "painless",
          "params": {
            "catSearch": false,
            "fakeCat": "cat16035591",
            "weight": true,
            "topSku": {"pop8013634719": 300.0, "1130765898": 300.0},
            "hotCatIds": {"cat16035591": 0.9666818804198996}
          }
        }
      }
    }
  ],
  "score_mode": "sum",
  "boost_mode": "sum",
  "max_boost": 3.4028235E38,
  "boost": 1.0
}

Monitoring
_stats index monitoring
Elasticsearch index monitoring: the Index Stats API in detail
Request
GET <index name>/_stats
Parameter explanation
{
  "_nodes": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "cluster_name": "ELKTEST",
  "nodes": {
    "lnlHC8yERCKXCuAc_2DPCQ": {
      "timestamp": 1534242595995,
      "name": "OPS01-ES01",
      "transport_address": "10.9.125.148:9300",
      "host": "10.9.125.148",
      "ip": "10.9.125.148:9300",
      "roles": [
        "master",
        "data",
        "ingest"
      ],
      "attributes": {
        "ml.machine_memory": "8203104256",
        "xpack.installed": "true",
        "ml.max_open_jobs": "20",
        "ml.enabled": "true"
      },
      "indices": {
        "docs": {
          "count": 8111612,    # number of documents on the node
          "deleted": 16604     # deleted documents that have not yet been purged from the segments
        },
        "store": {
          "size_in_bytes": 2959876263    # physical storage consumed by the node
        },
        "indexing": {    # indexing operations, kept as a cumulative counter. It does not decrease when documents are deleted. The value only ever increases; it is incremented whenever data is indexed internally, including updates
          "index_total": 17703152,
          "index_time_in_millis": 2801934,
          "index_current": 0,
          "index_failed": 0,
          "delete_total": 46242,
          "delete_time_in_millis": 2130,
          "delete_current": 0,
          "noop_update_total": 0,
          "is_throttled": false,
          "throttle_time_in_millis": 0    # a high value suggests the disk throttling limit is set too low
        },
        "get": {
          "total": 185179,
          "time_in_millis": 22341,
          "exists_total": 185178,
          "exists_time_in_millis": 22337,
          "missing_total": 1,
          "missing_time_in_millis": 4,
          "current": 0
        },
        "search": {
          "open_contexts": 0,    # number of searches currently active (open search contexts)
          "query_total": 495447,    # total number of queries
          "query_time_in_millis": 298344,    # total time spent on queries since the node started. query_time_in_millis / query_total is a rough indicator of query efficiency: the larger the ratio, the more time each query takes and the more reason to tune or optimize
          "query_current": 0,    # the fetch statistics that follow describe the second phase of a search (the fetch in query_then_fetch). If fetch takes more time than query, the disk is slow, too many documents are being fetched, or the pagination parameters are too large (for example size = 10000)
          "fetch_total": 130194,
          "fetch_time_in_millis": 51211,
          "fetch_current": 0,
          "scroll_total": 22,
          "scroll_time_in_millis": 2196665,
          "scroll_current": 0,
          "suggest_total": 0,
          "suggest_time_in_millis": 0,
          "suggest_current": 0
        },
        "merges": {    # Lucene segment merge information: how many merges are in progress, how many documents are involved, the total size of the segments being merged, and the total time spent merging. These statistics matter on write-heavy clusters, because merging consumes a lot of disk I/O and CPU; heavy indexing produces many merges
          "current": 0,
          "current_docs": 0,
          "current_size_in_bytes": 0,
..
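The ratio mentioned in the search comments can be computed directly from the counters. A minimal sketch, using the search figures from the sample output above; the `stats` dictionary is typed in by hand here, and the helper name `avg_ms` is hypothetical (in practice the values would come from the parsed API response).

```python
# Search counters copied from the sample output.
stats = {
    "search": {
        "query_total": 495447,
        "query_time_in_millis": 298344,
        "fetch_total": 130194,
        "fetch_time_in_millis": 51211,
    }
}

def avg_ms(total_ms, count):
    """Average milliseconds per operation; 0.0 when nothing has run yet."""
    return total_ms / count if count else 0.0

search = stats["search"]
avg_query_ms = avg_ms(search["query_time_in_millis"], search["query_total"])
avg_fetch_ms = avg_ms(search["fetch_time_in_millis"], search["fetch_total"])

print(f"avg query: {avg_query_ms:.3f} ms, avg fetch: {avg_fetch_ms:.3f} ms")
```

Since the counters are cumulative from node start, sampling them periodically and dividing the deltas gives a more current picture than the lifetime average.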