Bookmarks: language model

2015-10-26

[1510.02693] Feedforward Sequential Memory Neural Networks without Recurrent Feedback

feedforward sequential memory networks (FSMN), which can learn long-term dependency without using recurrent feedback.

The proposed FSMN appears to be a network with the same structure as a non-recursive (FIR) digital filter.

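In the architecture in Table 2, the hidden layer with the memory block has 600 units. With a 30th-order FIR filter, that comes to 30 * 600 = 18k, which seems like too much computation. The number of hidden layers also differs between the methods, so it is doubtful whether the performance comparison is fair.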
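To make the FIR reading concrete, here is a minimal sketch in plain Python. It assumes the memory block is a per-unit FIR filter over the current and past hidden activations, which is how this note interprets the paper; it is not the authors' code. The names fir_memory and multiplies_per_step and the toy values are made up for illustration.

```python
def fir_memory(hidden, taps):
    """Apply a per-unit FIR filter along the time axis.

    hidden: list of T activation vectors (each a list of D floats).
    taps:   list of K coefficient vectors (each a list of D floats);
            taps[i] weights the activation from i steps back.
    """
    dim = len(hidden[0])
    out = []
    for t in range(len(hidden)):
        acc = [0.0] * dim
        for i, coeffs in enumerate(taps):
            if t - i < 0:
                break  # no history before the start of the sequence
            past = hidden[t - i]
            for d in range(dim):
                acc[d] += coeffs[d] * past[d]
        out.append(acc)
    return out


def multiplies_per_step(num_taps, dim):
    """Multiplications needed for one time step of the filter above."""
    return num_taps * dim


# Toy example (assumed values): 3 time steps, 2 hidden units, 2 taps.
hidden = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
taps = [[1.0, 1.0], [0.5, 0.5]]
memory = fir_memory(hidden, taps)

# The count discussed in the note: 30 taps over 600 units.
cost = multiplies_per_step(30, 600)
```

Here cost is the 30 * 600 figure from the note, counted per time step for one memory-block layer.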

For background on non-recursive digital filters, see the following page.

Digital Filters and the z-Transform (in Japanese)
http://laputa.cs.shinshu-u.ac.jp/~yizawa/InfSys1/basic/chap10/index.htm

2015-09-10

[1508.06615] Character-Aware Neural Language Models

http://arxiv.org/abs/1508.06615

We describe a simple neural language model that relies only on character-level inputs. Predictions are still made at the word-level. Our model employs a convolutional neural network (CNN) over characters, whose output is given to a long short-term memory (LSTM) recurrent neural network language model (RNN-LM).

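The input is characters; the output is words.
Each character of a word is converted into a 15-dimensional (distributional semantic) embedding vector, and these vectors form a matrix C^k.
A convolutional network (CNN) and max pooling are applied to the word's matrix C^k to produce a vector.
Sequence learning is done with an LSTM.
A highway network (HW-Net) is inserted between the layers. The model works without it, but performance improves with it.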
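The steps above can be sketched roughly as follows, in plain Python. This is an illustration under assumptions, not the paper's implementation: the filter widths, the random initialization, the tanh after the convolution, the ReLU inside the highway layer, and all function names are choices made here. Only the 15-dimensional character embedding comes from the note. Padding for words shorter than the widest filter is omitted, and the LSTM that would consume the resulting word vector is not shown.

```python
import math
import random

random.seed(0)
EMB_DIM = 15  # per-character embedding size, as in the note


def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def char_matrix(word, table):
    """Look up one embedding per character: len(word) rows of EMB_DIM values."""
    return [table[ch] for ch in word]


def conv_max(chars, kernel, bias):
    """Slide one filter (width x EMB_DIM) over the characters,
    apply tanh, then take the maximum over all positions."""
    width = len(kernel)
    feats = []
    for start in range(len(chars) - width + 1):
        s = bias
        for j in range(width):
            for d in range(EMB_DIM):
                s += kernel[j][d] * chars[start + j][d]
        feats.append(math.tanh(s))
    return max(feats)  # raises ValueError if the word is shorter than the filter


def highway(y, w_h, b_h, w_t, b_t):
    """Highway layer: gate * relu(W_h y + b_h) + (1 - gate) * y."""
    out = []
    for k in range(len(y)):
        h = sum(w_h[k][j] * y[j] for j in range(len(y))) + b_h[k]
        t = sum(w_t[k][j] * y[j] for j in range(len(y))) + b_t[k]
        gate = 1.0 / (1.0 + math.exp(-t))
        out.append(gate * max(0.0, h) + (1.0 - gate) * y[k])
    return out


# Toy setup (assumed): lowercase alphabet, four filters of widths 2, 3, 3, 4.
alphabet = "abcdefghijklmnopqrstuvwxyz"
table = {ch: [random.uniform(-0.1, 0.1) for _ in range(EMB_DIM)] for ch in alphabet}
filters = [(rand_matrix(w, EMB_DIM), 0.0) for w in (2, 3, 3, 4)]

chars = char_matrix("language", table)               # the matrix C^k for one word
y = [conv_max(chars, k, b) for k, b in filters]      # one feature per filter
n = len(y)
z = highway(y, rand_matrix(n, n), [0.0] * n, rand_matrix(n, n), [-2.0] * n)
```

The vector z is what would be passed to the LSTM as the representation of the word. With a strongly negative gate bias the highway layer passes y through almost unchanged, which matches the observation that the model still works without it.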