Author: Soper, E.; Hosier, J.; Bales, D.; Gurbani, V. K.
Title: Semantic search pipeline: From query expansion to concept forging
Cord-id: 8gs8e426
Document date: 2021_1_1
ID: 8gs8e426
Document: When searching a database for a topic (e.g. Covid-19), a precise match may not exist, especially if the topic is novel. Furthermore, the topic may surface in the data under different guises ('Covid-19', 'coronavirus', 'pandemic', etc.). The results of a keyword search are limited by the querier's imagination and familiarity with the data. Such searches have high precision, but low recall. To increase the recall of searches, we present the Semantic Search Pipeline, a novel approach to document retrieval that uses distributional semantic models and locality-sensitive hashing to expand queries and efficiently identify other relevant documents that may not contain the obvious query terms. We evaluate the pipeline on a dataset curated from financial customer service call centers and observe a 32% increase in recall over a simple keyword baseline, with a negligible drop in precision. Furthermore, we present the notion of concept forging, a process of tracing a topic or concept through time and through its various surface realizations. Applied to Covid-19, the search pipeline retrieves a set of documents that allow us to uncover the short- and long-term effects of Covid-19 on the lives of the people and businesses impacted by it. Although Covid-19 is a timely test case, our search pipeline is general in nature and can easily be applied to a wide range of topics. © 2021 IEEE.
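The two ingredients named in the abstract, query expansion with a distributional semantic model and locality-sensitive hashing, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy vectors in `EMBEDDINGS`, the similarity threshold, and the function names `expand_query`, `make_hyperplanes`, `lsh_signature`, and `bucket_documents` are all assumptions made for illustration, and random-hyperplane hashing is only one common LSH scheme for cosine similarity. The record does not say which scheme the paper uses.

```python
import math
import random

# Toy word vectors (hypothetical values, for illustration only).
# A real system would load vectors from a trained distributional model.
EMBEDDINGS = {
    "covid": [0.9, 0.8, 0.1],
    "coronavirus": [0.85, 0.82, 0.12],
    "pandemic": [0.8, 0.7, 0.2],
    "mortgage": [0.1, 0.2, 0.9],
    "loan": [0.15, 0.1, 0.95],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def expand_query(term, threshold=0.95):
    """Return the query term plus every vocabulary word whose cosine
    similarity to it meets the (assumed) threshold, sorted alphabetically."""
    q = EMBEDDINGS[term]
    return sorted(w for w, v in EMBEDDINGS.items() if cosine(q, v) >= threshold)


def make_hyperplanes(n_planes, dim, seed=0):
    """Draw random hyperplane normals; the seed keeps the sketch repeatable."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]


def lsh_signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(int(sum(a * b for a, b in zip(vec, p)) >= 0) for p in planes)


def bucket_documents(doc_vectors, planes):
    """Group document ids by signature, so that only documents sharing a
    bucket with the query need an exact similarity comparison."""
    buckets = {}
    for doc_id, vec in doc_vectors.items():
        buckets.setdefault(lsh_signature(vec, planes), []).append(doc_id)
    return buckets


if __name__ == "__main__":
    print(expand_query("covid"))
    planes = make_hyperplanes(n_planes=4, dim=3)
    print(bucket_documents(EMBEDDINGS, planes))
```

In this sketch the expanded term list widens the keyword match, which is where a gain in recall would come from, while the signature buckets limit how many exact comparisons are needed. Vectors pointing in similar directions are likely, though not guaranteed, to share a bucket.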