In our last few lecture notes (11, 12 & 13) we have been talking about Structured Prediction, i.e., predicting a label for an input sequence, where the space of all possible labels is too big, but has some inherent structure. Throughout those notes, we tried to exploit this structure to navigate the big output space efficiently.
Up until now, in each of our globally normalized models, we have been using DP (which exploits the structure of the output space) to calculate the partition function. But, we have had to…
Previous post: Lecture 12
Prerequisites: Context-Free Grammars, Chomsky Normal Form, CKY Algorithm. You can read about them from here.
In the following, “grammar” refers to a CFG. The CKY algorithm is sometimes also referred to as the CYK algorithm.
In this lecture, we will further explore ways of generating trees over (or “parsing”) our input sentence. In particular, we would like to use models which, instead of building the tree one action at a time (and hence making locally optimal choices), build it by choosing multiple actions, or the whole tree, at a time (and…
In this lecture, we learn how to adapt the structure of outputs of a neural net to predict some tree structure over an input sequence.
What kind of tree structure over the sequence?
First, note that the words in a sentence are related to other words in the sentence. Two words can be related if one is the subject of the other, or one is the object of the other, or in some other way. These relations usually end up forming trees over the input sentence.
But why do…
A collection of important points while going through the course “Audio Signal Processing for Music Applications” by Xavier Serra and Prof. Julius O. Smith, III on Coursera.
The STFT of a windowed sinusoidal signal has the shape of the DFT of the window, repeated at the frequencies of the original signal (k0 and -k0), and carries the phase of the original signal.
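The statement above can be checked numerically for a single STFT frame. The sketch below is not from the course material: the frame length `N`, the bin `k0`, the amplitude `A`, the phase `phi`, and the Hann window are all illustrative assumptions, and `k0` is taken to be an integer bin so that the shift is exact. It compares the DFT of the windowed sinusoid with (A/2)·(e^{jφ}·W[k−k0] + e^{−jφ}·W[k+k0]), where W is the DFT of the window.

```python
import cmath
import math


def dft(x):
    """Plain O(N^2) DFT, adequate for a short illustrative frame."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


# Illustrative values (assumptions, not taken from the course).
N, k0, A, phi = 64, 5, 0.8, 0.6

window = [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]  # Hann
signal = [A * math.cos(2 * math.pi * k0 * n / N + phi) for n in range(N)]

W = dft(window)
X = dft([w * s for w, s in zip(window, signal)])

# Window transform shifted to +k0 and -k0, each copy carrying the signal's phase.
predicted = [
    (A / 2) * (cmath.exp(1j * phi) * W[(k - k0) % N]
               + cmath.exp(-1j * phi) * W[(k + k0) % N])
    for k in range(N)
]

max_error = max(abs(a - b) for a, b in zip(X, predicted))
```

`max_error` should be at the level of floating-point round-off, which confirms that the frame's spectrum is two shifted, phase-rotated copies of the window's transform.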
Last time, we discussed two routing protocols, Link State Routing (L.S.R.) and the Distance Vector Routing Protocol (D.V.R.P.). This lecture discusses how networks running these routing protocols behave in case of failures, and how to resolve the resulting problems.
We assume all the above failures are of the “Fail & Stop” kind, i.e., when something fails, it just stops (it doesn’t go on sending wrong information).
Suppose that the link between…
Problem: We talked before about a special kind of computer in our network, called a switch, that implements a set of rules deciding which outward link to send a packet on after receiving it from some inward link. In this lecture, we try to formulate exactly the set of rules needed for this purpose.
A table (key: address or range of addresses; value: outward link to choose) is maintained in each switch. The switch looks up the entry corresponding to the destination address on the packet, and sends the packet along that link. This is called Packet Forwarding …
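A forwarding table of this kind can be sketched in a few lines. The class name `ForwardingTable`, the link names, and the prefixes below are hypothetical, and the tie-breaking rule (prefer the most specific matching range, i.e. longest-prefix match) is an assumption that the excerpt does not state.

```python
import ipaddress


class ForwardingTable:
    """Maps address ranges to outward links (illustrative sketch only)."""

    def __init__(self):
        self.entries = []  # list of (network, outward_link) pairs

    def add(self, prefix, link):
        self.entries.append((ipaddress.ip_network(prefix), link))

    def lookup(self, destination):
        """Return the link for the most specific range containing destination."""
        addr = ipaddress.ip_address(destination)
        matches = [(net.prefixlen, link) for net, link in self.entries if addr in net]
        if not matches:
            return None  # no rule applies; a real switch might drop the packet
        # Assumption: the longest matching prefix wins.
        return max(matches, key=lambda m: m[0])[1]


table = ForwardingTable()
table.add("10.0.0.0/8", "link-A")
table.add("10.1.0.0/16", "link-B")
table.add("0.0.0.0/0", "link-default")

chosen = table.lookup("10.1.2.3")  # inside both 10.0.0.0/8 and 10.1.0.0/16
```

A linear scan keeps the sketch short; real switches typically use a trie or dedicated hardware for the same lookup.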
The basic problem we are trying to solve is how to share a communication medium (Ethernet, a range of frequencies, etc.) between several nodes that all want to communicate over it.
We simplify the problem by assuming that only one device can communicate over the shared medium at a time. This is like observing only one lane of a 16-bit bus, or the minimum range of frequencies needed for communication. We also assume that each node has a queue that can store messages to be transmitted.
Now, we design a protocol that specifies rules…
1.) Problem :- You can design a reliable connection between 2 nodes. You want to allow any device in your network to communicate with any other, and the network should be application agnostic. [[May not be optimal for a particular application]]
2.) Three themes :- Reliability, Scalability, Efficiency
3.) Can’t connect every computer to every other, because of cost, or because the medium itself (like radio) might have limited range.
4.) Main Idea :- Design special computers called “switches”. …
“Structured prediction” is used for prediction problems where the input is a sentence, and the label assigned to the sentence comes from a space whose size is exponential in the length of the input sentence, or infinite. Some examples are POS tagging, NER, or even translation! Such a task usually has structure in its output space and is hence called a “Structured Prediction task”.
What “structure”? And what brings it about?
Imagine the labels in sentiment classification. And imagine the label (a tuple of POS tags) of a…
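A quick count shows how differently the two label spaces grow. The tag set, the sentence, and the helper `num_tag_sequences` below are toy assumptions chosen for illustration: a sentiment classifier picks one label from a fixed set, while a POS tagger picks one tag per word, so the number of candidate labels is the tag-set size raised to the sentence length.

```python
from itertools import product

# Toy label sets (assumptions for illustration).
SENTIMENT_LABELS = ["positive", "negative", "neutral"]
POS_TAGS = ["NOUN", "VERB", "ADJ"]


def num_tag_sequences(num_tags, sentence_length):
    """Number of distinct tag tuples for a sentence of the given length."""
    return num_tags ** sentence_length


sentence = ["dogs", "chase", "cats"]

# Sentiment: the label space does not depend on the sentence length.
num_sentiment_labels = len(SENTIMENT_LABELS)

# POS tagging: enumerate every tuple of tags, one tag per word.
all_labelings = list(product(POS_TAGS, repeat=len(sentence)))
num_pos_labelings = len(all_labelings)
```

Enumeration is only feasible for tiny inputs like this one; `num_tag_sequences(45, 20)` already gives a number far too large to enumerate, which is why the structure of the output space has to be exploited.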
Searching 🧐 for the forgotten and lost truths