ML Resources — September 22

Suhas Pai
Aggregate Intellect
Sep 22, 2021

A common discussion point surrounding language models is whether we should model the world as it is or the way we wish to see it, and consequently, whether we should curate the training data for large language models or leave it as is. But as Anna Rogers points out in her position paper, any training set is just a sample of the universe of language, so curation has already taken place, whether we like it or not.

The second update from the year-long BigScience workshop is out, with videos that include progress updates from the working groups and invited talks. Check them out here.

Nils Reimers gave a talk at the TMLS NLP conference yesterday on the latest developments in neural search. His talk includes an introduction to bi-encoders, cross-encoders, doc2query, and other modern architectures.

It has been shown that deeper neural models help with tasks that need reasoning. But it has always been assumed that training deeper models requires prohibitive amounts of data. Xu et al. show that this need not be the case. They provide optimization improvements, including a method called DT-Fixup, that enable training deeper models on small datasets.

Longer-range models make it possible to account for long-range dependencies in language. But how often do these models actually exploit these dependencies? Not a lot, according to Sun et al., who show that the longer context mostly improves prediction for a small set of tokens that need to be copied from the distant context.

Aggregate Intellect is a global marketplace where ML developers connect, collaborate, and build. Connect with peers & experts at https://ai.science or join our Slack Community.

  • Check out the user-generated Recipes that provide step-by-step, bite-sized guides on how to do various tasks
  • Join our ML Product Challenges to build AI-based products for a chance to win cash prizes
  • Connect with peers & experts through the ML Discussion Groups or Expert Office Hours
