Multi-session Twitter Dialogue Dataset

We create a real-world, long-term, open-domain dialogue dataset from Japanese conversation logs on Twitter. Ideally, dialogue systems should naturally understand dependencies on past sessions. We therefore use our archive of tweets, retrieved through the Twitter API, to construct multi-session Twitter dialogue datasets.

We regard one reply tree as one dialogue session and use only sessions that consist of utterances posted alternately by two specific users. After creating the test dataset, we remove tweets of the users appearing in the development dataset, and then remove tweets of the users seen in the test or development dataset from the training dataset. We also remove dialogues that contain URLs, images, or posts tweeted by bots. To exclude dialogues that are too short or too long, we use only episodes with 11-25 sessions, each consisting of 5-30 turns.
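
The length and speaker filters described above can be sketched roughly as follows. This is only an illustrative sketch, not the code used to build the dataset: the data layout (an utterance as a `(user_id, text)` pair), the function names, and the reading that one "turn" is one utterance and that the 5-30 limit applies to each session are all assumptions.

```python
from typing import List, Tuple

# Assumed layout: an utterance is (user_id, text), a session is a list of
# utterances (one reply tree), an episode is a list of sessions.
Utterance = Tuple[str, str]
Session = List[Utterance]


def is_alternating_two_party(session: Session) -> bool:
    """True if exactly two users post and the speaker changes at every turn."""
    speakers = [user for user, _ in session]
    if len(set(speakers)) != 2:
        return False
    return all(a != b for a, b in zip(speakers, speakers[1:]))


def keep_episode(
    episode: List[Session],
    min_sessions: int = 11,
    max_sessions: int = 25,
    min_turns: int = 5,
    max_turns: int = 30,
) -> bool:
    """Keep an episode only if its session count and every session fit the limits."""
    if not (min_sessions <= len(episode) <= max_sessions):
        return False
    return all(
        min_turns <= len(session) <= max_turns
        and is_alternating_two_party(session)
        for session in episode
    )
```

The sketch does not check that the same two users appear in every session of an episode, nor does it cover the URL, image, and bot filters, since the source does not say how those are detected.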

When training and testing our dialogue-context retriever for dialogue systems, we treat the user who starts the conversation in the final session as the human user and the other user as the dialogue system. The dialogue system is then asked to generate a response at every 2n-th (n > 0) utterance of the final session.
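
One possible reading of this setup, sketched under assumptions: the final session is a list of utterance strings in posting order, the human user posts the 1st, 3rd, 5th, ... utterances, and each 2n-th utterance is the reference response the system should produce given everything before it. The function name `response_targets` is hypothetical.

```python
from typing import List, Tuple


def response_targets(final_session: List[str]) -> List[Tuple[List[str], str]]:
    """Return (context, reference_response) pairs for the final session.

    Assumes utterance 1 is posted by the human user, so the 2n-th
    utterances (1-based) belong to the user treated as the system.
    """
    pairs = []
    # 0-based indices 1, 3, 5, ... are the 1-based positions 2, 4, 6, ...
    for idx in range(1, len(final_session), 2):
        context = final_session[:idx]
        reference = final_session[idx]
        pairs.append((context, reference))
    return pairs
```

In practice the context would also include the earlier sessions of the episode; they are left out here to keep the sketch short.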

Dataset [Download]

Since we provide only the IDs of the tweets used in our experiments, you need to collect the corresponding tweets yourself using those IDs. You may be unable to retrieve some tweets because the user has deleted them or has switched a public account to private. We therefore provide supplemental datasets that contain additional tweet IDs. [Download]
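
A minimal sketch of collecting tweets from IDs while keeping track of the ones that can no longer be retrieved. It does not call any real API: `fetch_batch` is a hypothetical callback that you would implement with your own Twitter API client, taking a list of IDs and returning a dict of the tweets it could find. The default batch size of 100 is an assumption; check the limit of the lookup endpoint you use.

```python
from typing import Callable, Dict, Iterator, List, Set, Tuple


def batched(ids: List[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def hydrate(
    ids: List[str],
    fetch_batch: Callable[[List[str]], Dict[str, str]],  # hypothetical client wrapper
    batch_size: int = 100,
) -> Tuple[Dict[str, str], Set[str]]:
    """Fetch tweets batch by batch; return (found tweets, missing IDs)."""
    found: Dict[str, str] = {}
    for batch in batched(ids, batch_size):
        found.update(fetch_batch(batch))
    # Deleted or protected tweets simply do not come back from the lookup.
    missing = {tweet_id for tweet_id in ids if tweet_id not in found}
    return found, missing
```

The `missing` set indicates which dialogues are incomplete, which is where the supplemental tweet IDs become useful.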

Data

The collected tweet IDs are split into the following datasets:

train.txt
- dialogue logs from 2011-2017
- used to train models
dev.txt
- dialogue logs from 2018
- used to evaluate model performance during training
test.txt
- dialogue logs from 2019
- used for the final evaluation of model performance
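
Because only tweet IDs are distributed, a rough sanity check of the year ranges above is possible without fetching anything: tweet IDs in this period are Snowflake IDs that embed a millisecond timestamp. The sketch below decodes that timestamp and maps the year to a split name. The function names are hypothetical, the year boundaries are taken in UTC (the dataset may have used a different time zone), and the source does not say which tweet of a dialogue decides its split, so treat the result as approximate.

```python
from datetime import datetime, timezone

# Snowflake epoch used by Twitter IDs, in milliseconds (2010-11-04 UTC).
TWITTER_EPOCH_MS = 1288834974657


def tweet_datetime(tweet_id: int) -> datetime:
    """Decode the creation time embedded in a Snowflake tweet ID."""
    ms = (tweet_id >> 22) + TWITTER_EPOCH_MS
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def expected_split(tweet_id: int) -> str:
    """Map a tweet's creation year to the split described above."""
    year = tweet_datetime(tweet_id).year
    if 2011 <= year <= 2017:
        return "train"
    if year == 2018:
        return "dev"
    if year == 2019:
        return "test"
    return "unknown"
```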

Citation

TBD