Dataset for the Japanese task: pairs consisting only of tweet IDs for dialogue tweets; the tweet texts themselves are not included.
The input data are tweets randomly sampled from the year 2015. The pool of tweets (the target for extraction) consists of randomly sampled mention-reply tweet pairs from the year 2014. The pool holds just over one million tweets, i.e., 500K pairs.
The organizers will provide the following data:
(1) Twitter data (referenced by tweet IDs), 1M tweets in size
(2) Development data: input samples and output samples annotated with reference labels by ten annotators.
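Since the organizers distribute only tweet IDs, participants have to hydrate the texts themselves. A minimal sketch of reading such ID pairs, assuming a hypothetical tab-separated file of (mention ID, reply ID) lines; the IDs below are made up:

```python
import csv
import io

def load_id_pairs(fp):
    """Parse mention-reply tweet-ID pairs from a tab-separated stream."""
    reader = csv.reader(fp, delimiter="\t")
    return [(mention_id, reply_id) for mention_id, reply_id in reader]

# Two fabricated ID pairs for illustration (the real file would hold 500K lines).
sample = io.StringIO(
    "650000000000000001\t650000000000000002\n"
    "650000000000000003\t650000000000000004\n"
)
pairs = load_id_pairs(sample)
print(len(pairs))  # -> 2
```

The texts would then be fetched via the Twitter API using these IDs.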
2015-11-21
NTCIR-12 Task on Short Text Conversation
http://ntcir12.noahlab.com.hk/japanese/stc-jpn.htm
2015-11-19
Teaching Machines to Read and Comprehend (slide)
http://lxmls.it.pt/2015/lxmls15.pdf
Slides from the Lisbon Machine Learning School 2015; the topic is natural language processing. From the conclusion slides:
Summary
* supervised machine reading is a viable research direction with the available data,
* LSTM based recurrent networks constantly surprise with their ability to encode dependencies in sequences,
* attention is a very effective and flexible modelling technique.
Future directions
* more and better data, corpus querying, and cross document queries,
* recurrent networks incorporating long-term and working memory are well suited to NLU tasks.
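The attention point above can be sketched in a few lines. This is a generic dot-product attention toy (not code from the slides): each position's value vector is weighted by how well its key matches the query, via a softmax over scores.

```python
import numpy as np

def attention(query, keys, values):
    """Dot-product attention: weight each value by its key's match to the query."""
    scores = keys @ query                 # one score per position
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()              # softmax over positions
    return weights @ values               # weighted sum of value vectors

rng = np.random.default_rng(0)
keys = rng.normal(size=(5, 4))    # 5 positions, key dim 4
values = rng.normal(size=(5, 3))  # value dim 3
query = keys[2]                   # a query resembling position 2
out = attention(query, keys, values)
print(out.shape)  # -> (3,)
```

The output stays in the value space regardless of sequence length, which is what makes attention easy to drop into recurrent encoders.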
2015-11-16
Computational Linguistics and Deep Learning
http://www.mitpressjournals.org/doi/abs/10.1162/COLI_a_00239
Christopher D. Manning
Stanford University
MIT Press Journals - Computational Linguistics - Early Access - Citation
2015-11-15
Deep Learning Teaching Materials - こんな夢を見た
http://hytae.hatenablog.com/entry/2015/11/14/Deep_Learning%E3%81%AE%E6%95%99%E6%9D%90
A list of well-organized teaching materials for studying Deep Learning.
2015-11-11
Marvin: Deep Learning in N Dimensions
http://marvin.is/
Marvin was born to be hacked, relying on few dependencies and basic C++. All code lives in two files (marvin.hpp and marvin.cu) and all numbers take up two bytes (FP16).
Marvin’s life depends on an NVIDIA GPU with CUDA 7.5 and cuDNN 3.
https://github.com/PrincetonVision/marvin/
Marvin is a GPU-only neural network framework made with simplicity, hackability, speed, memory consumption, and high dimensional data in mind.
The MIT License (MIT)
Copyright (c) 2015 Princeton Vision Group
Understanding Convolutional Neural Networks for NLP | WildML
http://www.wildml.com/2015/11/understanding-convolutional-neural-networks-for-nlp/
In this post I’ll try to summarize what CNNs are, and how they’re used in NLP.
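The core operation the post covers, convolving a filter over a sentence's word-embedding matrix and max-pooling over time, can be sketched as follows. This is my own toy NumPy illustration, not code from the post; the shapes are arbitrary:

```python
import numpy as np

def conv_maxpool(sentence, filt):
    """Slide a filter over consecutive word vectors, then max-pool over time."""
    n_words, _ = sentence.shape
    width = filt.shape[0]  # filter spans `width` consecutive words
    feats = [np.sum(sentence[i:i + width] * filt)   # one feature per window
             for i in range(n_words - width + 1)]
    return max(feats)                               # max-over-time pooling

rng = np.random.default_rng(1)
sentence = rng.normal(size=(7, 5))  # 7 words, 5-dim embeddings
filter_w = rng.normal(size=(3, 5))  # a trigram-width filter
feature = conv_maxpool(sentence, filter_w)
```

In a real model many such filters run in parallel, and their pooled features feed a classifier.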
2015-11-06
Code Excited Linear Prediction - Wikipedia
https://ja.wikipedia.org/wiki/Code_Excited_Linear_Prediction
Code Excited Linear Prediction (CELP) is a speech coding algorithm proposed in 1985 by M. R. Schroeder and B. S. Atal of AT&T. It is reportedly the base algorithm for speech compression in mobile phones.
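The "linear prediction" half of the name can be illustrated in isolation (CELP proper additionally searches a codebook for the excitation signal, which is omitted here). A minimal sketch with made-up predictor coefficients: the residual e[n] = x[n] - sum_k a[k]*x[n-k] is what gets coded, and the synthesis filter inverts it exactly.

```python
import numpy as np

def lp_residual(signal, coeffs):
    """Prediction residual: e[n] = x[n] - sum_k a[k] * x[n-k]."""
    p = len(coeffs)
    e = signal.copy()
    for n in range(p, len(signal)):
        e[n] = signal[n] - np.dot(coeffs, signal[n - p:n][::-1])
    return e

def lp_synthesize(residual, coeffs):
    """Inverse filter: rebuild x[n] from the residual and past outputs."""
    p = len(coeffs)
    x = residual.copy()
    for n in range(p, len(x)):
        x[n] = residual[n] + np.dot(coeffs, x[n - p:n][::-1])
    return x

rng = np.random.default_rng(2)
x = rng.normal(size=50)
a = np.array([0.5, -0.2])  # illustrative coefficients, not from a real codec
rebuilt = lp_synthesize(lp_residual(x, a), a)
print(np.allclose(rebuilt, x))  # -> True
```

In a real codec the coefficients are re-estimated per frame and only the quantized residual (excitation) is transmitted.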