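I have "documents" (ActiveRecord models) with an attribute called "deviation". The attribute has values like "Bin X", "Bin $", "Bin q", "Bin %", etc.
I'm trying to search on that attribute using tire/elasticsearch. I'm indexing the deviation attribute with the whitespace analyzer. Here is the code I use to create the index: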
    settings :analysis => {
      :filter => {
        :ngram_filter => { :type => "nGram", :min_gram => 2, :max_gram => 255 },
        :deviation_filter => { :type => "word_delimiter", :type_table => ['$ => ALPHA'] }
      },
      :analyzer => {
        :ngram_analyzer => { :type => "custom", :tokenizer => "standard", :filter => ["lowercase", "ngram_filter"] },
        :deviation_analyzer => { :type => "custom", :tokenizer => "whitespace", :filter => ["lowercase"] }
      }
    } do
      mapping do
        indexes :id, :type => 'integer'
        [:equipment, :step, :recipe, :details, :description].each do |attribute|
          indexes attribute, :type => 'string', :analyzer => 'ngram_analyzer'
        end
        indexes :deviation, :analyzer => 'whitespace'
      end
    end
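The search seems to work fine when the query string doesn't contain special characters. For example, Bin X returns only those records that contain both the word Bin AND the word X. However, searching for something like Bin $ or Bin % returns every result that contains the word Bin, almost ignoring the symbol (results with the symbol do rank higher than results without it).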
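As a rough illustration of which tokens should end up in the index for these values, here is a plain-Ruby sketch. It assumes the whitespace analyzer splits on whitespace only and leaves symbols alone; `whitespace_tokens` is a hypothetical helper for illustration, not part of tire or Elasticsearch.

```ruby
# Plain-Ruby approximation of whitespace analysis (an assumption for
# illustration, not the Elasticsearch implementation): split the text on
# runs of whitespace and keep every non-empty piece, symbols included.
def whitespace_tokens(text)
  text.split(/\s+/).reject(&:empty?)
end

["Bin X", "Bin $", "Bin %"].each do |value|
  puts "#{value.inspect} -> #{whitespace_tokens(value).inspect}"
end
# "Bin X" -> ["Bin", "X"]
# "Bin $" -> ["Bin", "$"]
# "Bin %" -> ["Bin", "%"]
```

If that assumption holds, the symbol is indexed as its own token, so the symbol is more likely being lost on the query side than at index time.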
Here is the search method I created:
    def self.search(params)
      tire.search(load: true) do
        query { string "#{params[:term].downcase}:#{params[:query]}", default_operator: "AND" }
        size 1000
      end
    end
And here is how I build the search form:
    <div>
      <%= form_tag issues_path, :class=> "formtastic issue", method: :get do %>
        <fieldset class="inputs">
          <ol>
            <li class="string input medium search query optional stringish inline">
              <% opts = ["Description", "Detail","Deviation","Equipment","Recipe", "Step"] %>
              <%= select_tag :term, options_for_select(opts, params[:term]) %>
              <%= text_field_tag :query, params[:query] %>
              <%= submit_tag "Search", name: nil, class: "btn" %>
            </li>
          </ol>
        </fieldset>
      <% end %>
    </div>
You can sanitize your query string. Here is a sanitizer that has worked for everything I've tried throwing at it:
    def sanitize_string_for_elasticsearch_string_query(str)
      # Escape special characters
      # http://lucene.apache.org/core/old_versioned_docs/versions/2_9_1/queryparsersyntax.html#Escaping Special Characters
      escaped_characters = Regexp.escape('\\/+-&|!(){}[]^~*?:')
      str = str.gsub(/([#{escaped_characters}])/, '\\\\\1')

      # AND, OR and NOT are used by lucene as logical operators. We need
      # to escape them
      ['AND', 'OR', 'NOT'].each do |word|
        escaped_word = word.split('').map {|char| "\\#{char}" }.join('')
        str = str.gsub(/\s*\b(#{word.upcase})\b\s*/, " #{escaped_word} ")
      end

      # Escape odd quotes: with an unbalanced quote count, escape the last
      # quote and keep the text on both sides of it (\1 and \2)
      quote_count = str.count '"'
      str = str.gsub(/(.*)"(.*)/, '\1\"\2') if quote_count % 2 == 1

      str
    end

    params[:query] = sanitize_string_for_elasticsearch_string_query(params[:query])
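To see what the escaping step does to the string that reaches the `string` query, here is a minimal, self-contained sketch. The names `LUCENE_RESERVED`, `escape_reserved` and `build_query_string` are hypothetical. The character list is copied from the sanitizer above, and the `field:query` format mirrors the search method in the question. It covers only the reserved-character step, not the AND/OR/NOT or quote handling.

```ruby
# Hypothetical helper names; the reserved-character list is the one used
# by the sanitizer above.
LUCENE_RESERVED = Regexp.escape('\\/+-&|!(){}[]^~*?:')

# Put a backslash in front of every reserved character in the user's query.
def escape_reserved(query)
  query.gsub(/([#{LUCENE_RESERVED}])/, '\\\\\1')
end

# Build the "field:query" string in the same shape the search method uses.
# The colon separating field and query is added after escaping, so it stays
# a real field separator.
def build_query_string(term, query)
  "#{term.downcase}:#{escape_reserved(query)}"
end

puts build_query_string("Deviation", "Bin (X)")
# prints: deviation:Bin \(X\)
```

Escaping only the user-supplied part keeps the field prefix intact while stopping characters such as parentheses or colons in the input from being read as query syntax.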