Active Learning with Verbcorner

GamesWithWords Admin
GamesWithWords
Published in
3 min readMar 14, 2018

--

Thanks to Han for writing this blog post about his work with Verbcorner!

Hi guys, my name’s Han. I’m a new research assistant at L3, and I’d like to tell you about the project which I’m working on currently.

Using Verbcorner, an online platform for hosting linguistic experiments, L3 has accumulated a large dataset containing over five hundred and sixty thousand unique responses from quizzes. Each response comes with a sentence, a question about the sentence, and the response given by a quiz participant.

I’ve made progress on compiling these responses into a single dataset, incorporating as much information as possible from VerbNet, an online dataset of verbs and associated classes and frames. VerbNet was used to generate the questions posed by Verbcorner, and contained information about the contextual classes and syntactic frames used for a Verbcorner sentence.

Each entry in the dataset now contains more information:

Machine learning tasks can be divided into two very broad categories: Supervised, and unsupervised.

Supervised models receive a set of data, with labels telling it what each data-point represents. They then learn to classify these data-points into distinct labeled categories, using regression. As it learns from more and more examples, it becomes better at generalizing to new data-points which it has never seen before.

Image Credit

Active learning is a semi-supervised task. The idea is to train a deep neural network on the existing set of Verbcorner data. The network receives a matrix of data-point features at its first layer, and a set of labels corresponding to each data-point at the final layer. It processes this input at every subsequent layer until it produces a prediction vector.

The difference between the predictions and the labels are then used to adjust weights and parameters throughout the network, which is how it learns. If the model can learn to predict responses to the Verbcorner data with reasonable accuracy, then it could be used for active learning.

A deep network, with more layers and nodes, can capture more complicated patterns, and result in higher accuracy, but the ideal number of layers and nodes is difficult to get right.

Image Credit

For active learning, the model can be shown novel questions which it has never seen before. Its response to the question can then be compared with responses from quiz participants for similarity. The more similar the responses are, the better the model is performing, and the comparison can be used by the neural network to adjust its parameters appropriately.

This will allow the model to keep learning as Verbcorner grows. If its performance manages to reach a high-enough level of accuracy, maybe it could even be a convincing stand-in for a quiz participant!

--

--