6.6.1 Lowercase, Uppercase

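Text in English and other European languages has uppercase and lowercase letters, so search has to be set up to work regardless of case. The usual approach is to convert all terms to lowercase before storing them, and the Lowercase token filter does this. It is used in almost every text search use case.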

The Uppercase token filter converts all terms to uppercase and is configured the same way as Lowercase. The following example analyzes the sentence "Harry Potter and the Philosopher's Stone" with lowercase and with uppercase.

Analyzing a sentence with the lowercase token filter

GET _analyze
{
  "filter": [ "lowercase" ],
  "text": [ "Harry Potter and the Philosopher's Stone" ]
}

Result of analyzing a sentence with the lowercase token filter

{
  "tokens" : [
    {
      "token" : "harry potter and the philosopher's stone",
      "start_offset" : 0,
      "end_offset" : 40,
      "type" : "word",
      "position" : 0
    }
  ]
}
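
The request above names no tokenizer, so the whole sentence comes back as a single token (start_offset 0, end_offset 40). As a minimal sketch of how the filter behaves once the text is split into terms, the same request can also name a tokenizer. The built-in standard tokenizer used here is an assumption for illustration and is not part of the original example.

```console
# Same sentence, but split into terms first and then lowercased
GET _analyze
{
  "tokenizer": "standard",
  "filter": [ "lowercase" ],
  "text": [ "Harry Potter and the Philosopher's Stone" ]
}
```

With a tokenizer in place, each term should come back as its own lowercased token with its own offsets and position, rather than as one 40-character token.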

Analyzing a sentence with the uppercase token filter

GET _analyze
{
  "filter": [ "uppercase" ],
  "text": [ "Harry Potter and the Philosopher's Stone" ]
}

Result of analyzing a sentence with the uppercase token filter

{
  "tokens" : [
    {
      "token" : "HARRY POTTER AND THE PHILOSOPHER'S STONE",
      "start_offset" : 0,
      "end_offset" : 40,
      "type" : "word",
      "position" : 0
    }
  ]
}
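
Since terms are usually lowercased before they are stored, the filter is normally attached to an analyzer in the index settings rather than called ad hoc through _analyze. The sketch below shows one way this might look. The index name my_index and the analyzer name my_lowercase_analyzer are hypothetical, and the standard tokenizer is an assumed choice.

```console
# Hypothetical index with a custom analyzer that lowercases every term
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_lowercase_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [ "lowercase" ]
        }
      }
    }
  }
}

# Check the analyzer against the sample sentence
GET my_index/_analyze
{
  "analyzer": "my_lowercase_analyzer",
  "text": "Harry Potter and the Philosopher's Stone"
}
```

Replacing "lowercase" with "uppercase" in the filter array configures the Uppercase token filter in the same way.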

