Evaluations

To evaluate our system, we compiled two sets of keyword queries. The first set contained single-term queries such as python, hardware, and bat. The second set contained two-term queries such as python snake, computer hardware, and bat mammal. To build the test collection, we gathered a set of signal documents and a set of noise documents. For example, if the intent of the search is to find documents about the python snake, the signal documents use python in its snake sense, while the noise documents use it in its programming-language or entertainment senses.

For each keyword query, the system performed both a simple query search and an enhanced query search. For the simple search, a term vector was built from the original keyword(s) in the query text. For the enhanced search, we used the query generated by ARCH. In both cases, results were retrieved from the combined signal and noise document collection, with cosine similarity as the matching measure.
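
The comparison described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two toy documents, the whitespace tokenizer, the raw term-frequency weighting, and especially the enhanced query, whose terms and weights are hypothetical and do not reproduce the actual output of ARCH.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a raw term-frequency vector from lowercased, whitespace-split tokens."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors stored as dicts."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_vector, documents):
    """Return (score, label) pairs sorted from best to worst match."""
    scored = [
        (cosine_similarity(query_vector, term_vector(text)), label)
        for label, text in documents
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy collection (hypothetical): one signal and one noise document
# for the search intent "python snake".
documents = [
    ("signal", "the python is a large snake that constricts its prey"),
    ("noise", "python is a programming language used to write software"),
]

# Simple query: a term vector built from the original keyword only.
simple_query = term_vector("python")

# Enhanced query: hypothetical weighted terms standing in for ARCH output.
enhanced_query = {"python": 1.0, "snake": 0.8, "reptile": 0.5}

simple_ranking = rank(simple_query, documents)
enhanced_ranking = rank(enhanced_query, documents)
```

In this toy collection the single-term query cannot separate the two senses, and the slightly shorter noise document comes out on top, while the enhanced query ranks the signal document first. This only illustrates the mechanism on made-up data and says nothing about the results of the actual evaluation.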
