TREC Dynamic Domain Track
News

  • Jun 19, 2015 Topics released.
  • Jun 19, 2015 Track guidelines are updated.
  • Jun 4, 2015 Track guidelines are updated.
  • Nov 20, 2014 A Dynamic Domain Track planning session was held at the TREC conference, NIST, Gaithersburg.
Welcome to the Dynamic Domain Track website

  • What is Dynamic Domain (DD)? The Dynamic Domain track is new in TREC 2015. The name has two parts. "Dynamic" means that it is a dynamic search task: search over multiple rounds of interaction, in which participating systems are expected to adjust dynamically based on the relevance judgments provided along the way. "Domain" means that the search task spans multiple interesting domains, which usually give rise to complex, exploratory search with multiple rounds of user and search engine interaction.
  • Goal: The goal of the Dynamic Domain (DD) Track is to support research in dynamic, exploratory search within complex information domains.
  • How to participate: register here before May 1, 2015
  • Task: This is an interactive search task. Each participating system (your system) starts from an initial query (the only query provided), retrieves 5 documents, and submits them to a simulator program provided by the Track organizers. The simulator (which we call the jig) returns graded relevance judgments for the 5 documents your system just submitted. With these judgments, your system decides whether to retrieve more documents or stop. If your system decides to continue, it must again submit 5 documents, and the jig will again provide relevance judgments for them. The retrieval loop continues until your system decides to stop. All of the interactions, i.e. the multiple cycles of retrieval results, are used to evaluate your system's performance over the entire process. This is not one-shot retrieval but a process of multiple retrievals. An effective participating system is expected to find as many relevant documents as possible using as few rounds of interaction as possible. A sketch of this loop appears after this list.
  • Topics: Topics used in the TREC DD Track all contain multiple subtopics. The main topic is used as the query to start the search. For each subtopic, there is a set of passages manually judged as relevant to it. The topics, subtopics, and relevant passages will all be released to the participating systems; in other words, we are releasing the ground truth now. However, the subtopics and passages SHOULD NOT be used in your retrieval algorithms nor to generate your submitted runs.
  • Domains and datasets for 2015: The TREC DD Track provides interesting and understudied document domains. In 2015, we are releasing corpora from the following four domains:
  • illicit goods: this data relates to how illicit and counterfeit goods, such as fake Viagra, are made, advertised, and sold on the Internet. The dataset comprises 3.3 million posts (500k threads) from underground hacking forums.
  • ebola: this data relates to the Ebola outbreak in Africa in 2014-2015. The dataset comprises 165,000 tweets relating to the outbreak and some 500,000 web pages from sites hosted in the affected countries and from organizations such as the World Health Organization, the Financial Tracking Service, and the World Bank. These information resources are designed to provide information to citizens and aid workers on the ground.
  • local politics: this data is related to regional politics in the Pacific Northwest and the small-town politicians and personalities that work it. The dataset comprises 6.8 million web pages from the TREC 2014 KBA Stream Corpus.
  • polar: this data is a set of 50,000+ crawled web pages, scientific data (HDF, NetCDF, and GRIB files), zip files, PDFs, images, and science code related to the polar sciences, publicly available from the NSF-funded Advanced Cooperative Arctic Data and Information System (ACADIS), the NASA-funded Antarctic Master Directory (AMD), and the National Snow and Ice Data Center (NSIDC) Arctic Data Explorer.
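
The retrieval loop described in the Task item above can be summarized in a short sketch. The code below is a minimal illustration only: the searcher and jig objects, their method names, and the stopping heuristic are hypothetical stand-ins, not the actual jig interface; consult the track guidelines for how the real jig program is invoked.

    # Minimal sketch of the dynamic retrieval loop (hypothetical interfaces).
    def run_topic(topic_id, initial_query, searcher, jig, max_iterations=20):
        """Submit 5 documents per iteration, read the jig's graded relevance
        judgments, and decide whether to continue or stop."""
        feedback_history = []
        for _ in range(max_iterations):
            # Retrieve the next 5 documents, conditioning on all feedback so far.
            doc_ids = searcher.retrieve(initial_query, feedback_history, k=5)

            # The jig returns graded relevance judgments for these 5 documents.
            judgments = jig.get_feedback(topic_id, doc_ids)
            feedback_history.append(judgments)

            # The stopping policy is up to the system; this naive heuristic
            # stops when an iteration yields no relevant documents.
            if all(grade == 0 for grade in judgments.values()):
                break
        return feedback_history

A real system would replace the naive stopping heuristic with its own policy, since the evaluation rewards finding relevant documents in fewer rounds of interaction.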

Organizers
  • Grace Hui Yang - Georgetown University
  • John Frank - MIT, Diffeo
  • Ian Soboroff - NIST