Keeping up to date on emerging entities that appear every day is indispensable for various applications, such as social-trend analysis and marketing research. Previous studies have attempted to detect unseen entities that are not registered in a particular knowledge base as emerging entities. Consequently, they also find non-emerging entities, since the absence of an entity from a knowledge base does not guarantee its emergence. We therefore introduce a novel task of discovering truly emerging entities when they have just been introduced to the public through microblogs, and propose an effective method based on time-sensitive distant supervision, which exploits the distinctive early-stage contexts of emerging entities.
Our IJCAI paper has errata in the descriptions of the dataset used for evaluation. Please download the corrected version from arXiv.
-O option. For installation, this document might be useful.
data contains tweet IDs of the emerging and prevalent contexts of the target entities used for training our proposed model.
data_mapping contains IDs (each ID corresponds to a file in data), entity names (space characters were removed) and their types.
recall13406.id in the directory contains tweet IDs.
data_mapping contains entity names (space characters were removed), their types and the epoch times when each entity was first registered in Wikipedia.

@inproceedings{akasaki2019ee,
title = {Early Discovery of Emerging Entities in Microblogs},
author = {Satoshi Akasaki and Naoki Yoshinaga and Masashi Toyoda},
booktitle = {Proceedings of the 28th International Joint Conference on Artificial Intelligence},
pages = {to appear},
year = {2019},
}
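
The data_mapping file that stores entity names, types and epoch times could be read along the following lines. This is a minimal sketch, not the authors' loader: the tab-separated three-column layout, the column order, the assumption that epoch times are in seconds, and the names `EntityRecord`, `parse_mapping_line` and `load_mapping` are all hypothetical, so check the actual file before relying on it.

```python
from datetime import datetime, timezone
from typing import List, NamedTuple


class EntityRecord(NamedTuple):
    name: str             # entity name with space characters removed
    entity_type: str      # entity type
    first_seen: datetime  # time of first registration in Wikipedia (UTC)


def parse_mapping_line(line: str) -> EntityRecord:
    # ASSUMPTION: three tab-separated columns (name, type, epoch seconds).
    name, entity_type, epoch = line.rstrip("\n").split("\t")
    first_seen = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return EntityRecord(name, entity_type, first_seen)


def load_mapping(path: str) -> List[EntityRecord]:
    # Skip blank lines; every other line is parsed as one record.
    with open(path, encoding="utf-8") as f:
        return [parse_mapping_line(line) for line in f if line.strip()]


# Hypothetical row, made up for illustration only.
record = parse_mapping_line("ExampleEntity\tPERSON\t1500000000\n")
print(record.name, record.entity_type, record.first_seen.isoformat())
```

Converting the epoch time to a timezone-aware datetime makes it easy to compare an entity's Wikipedia registration time with tweet timestamps.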