Welcome back! EleutherAI open-sourced a brand new (and big) GPT model this past week. The JAX-based model was trained for five weeks on the Pile, Eleuther’s own ~800GB dataset. Called GPT-J, it’s a 6-billion-parameter model that rivals the performance of the comparably sized GPT-3 variant. And apparently it performs well on code generation:
Here’s a comparison of all the major language models on various datasets:
EleutherAI has a demo webpage for you to try out the model:
And a Colab for inference over TPUs 😁:
Want to thank Connected Papers for the shout-out this week! 😎
FYI, after the upcoming NLP Index update, we’ll pass the 6,000 repo mark! 🚀
TextStyleBrush can recognize the style of text in pictures and edit the words while maintaining that style.
It’s “… the first self-supervised AI model that replaces text in images of both handwriting and scenes — in one shot — using a single example word.”
Getting Started with Tensorflow-Metal PluggableDevice
Install TensorFlow v2.5 and the tensorflow-metal PluggableDevice to accelerate training with Metal on Mac GPUs.
Do You Really Need Redis? How to Get Away with Just PostgreSQL
Chris Farber highlights how to use Postgres for common Redis use cases. In all, he describes three: job queuing, application locks, and pub/sub. Have to say, the pub/sub example was surprising:
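The patterns from the article boil down to a handful of SQL statements. Here’s a sketch written out as Python string constants so the shape is visible without a live server; the `jobs` table and its columns are hypothetical, not from the article:

```python
# Job queue without Redis: a plain Postgres table plus row-level locking.
# Table assumed (hypothetical): jobs(id SERIAL PRIMARY KEY, payload TEXT).

ENQUEUE = "INSERT INTO jobs (payload) VALUES (%s);"

# FOR UPDATE SKIP LOCKED lets many workers poll the same table without
# blocking on rows another worker has already claimed in its transaction.
DEQUEUE = """
DELETE FROM jobs
WHERE id = (
    SELECT id FROM jobs
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING id, payload;
"""

# Pub/sub without Redis: Postgres' built-in LISTEN/NOTIFY commands.
SUBSCRIBE = "LISTEN job_events;"
PUBLISH = "NOTIFY job_events, 'job finished';"
```

A worker would run `DEQUEUE` inside a transaction via any Postgres driver; the `RETURNING` clause hands back the claimed row in the same round trip.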
Reasoning with Knowledge Graphs (Slides)
Goes over two papers:
Reasoning with Language Models and Knowledge Graphs for Question Answering https://arxiv.org/abs/2104.06378
Multi-hop logical reasoning on KGs https://arxiv.org/abs/2010.11465
Repo Cypher 👨‍💻
A collection of recently released repos that caught our 👁
A method to automatically generate slides for scientific papers, based on a corpus of 5,000 paper–slide pairs compiled from conference proceedings websites.
This article is published at the Scholarly Document Processing (SDP) 2021 workshop. Download the original papers and…
The first end-to-end model for cross-document (CD) coreference resolution from raw text, which extends the prominent model for within-document coreference to the CD setting.
This repository contains code and models for end-to-end cross-document coreference resolution, as described in our…
A corpus of over 40,000 StackOverflow question texts to be used in conjunction with their corresponding intents from the CoNaLa dataset.
This is the repository for the paper Reading StackOverflow Encourages Cheating: Adding Question Text Improves Extractive…
A dataset with 12,023 pairs of utterances and SQL queries collected from real usage on the Stack Exchange website.
Code and data from the paper: Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data.
Addressing Inquiries about History (AIH) has two stages: (1) in the inquiry stage, questions about facts and opinions mentioned in the conversation history are inserted into the conversation between chatbots; (2) in the contradiction-recognition stage, the responses to the inserted questions are collected, and automatic models or human judges decide whether the responses are consistent with the dialogue history.
This repository contains the code for Findings of ACL 2021 paper Addressing Inquiries about History: An Efficient and…
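The two-stage protocol can be sketched in a few lines. Everything below is an illustrative toy, not the paper’s actual models: the probing question is a template, and a naive substring match stands in for the automatic consistency judge.

```python
def inquiry_stage(fact_topic: str) -> str:
    """Stage 1: insert a probing question about a fact from the history."""
    return f"Earlier you mentioned {fact_topic}. What did you say about it?"

def contradiction_stage(response: str, expected_value: str) -> bool:
    """Stage 2: judge whether the answer is consistent with the history.
    A trivial substring check stands in for the paper's learned judge."""
    return expected_value.lower() in response.lower()

# Toy run: the history records that the bot's favourite food is pizza.
question = inquiry_stage("your favourite food")
consistent = contradiction_stage("I said it was pizza.", "pizza")
inconsistent = contradiction_stage("Sushi, definitely.", "pizza")
```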
Few-shot intent detection with/without Out-of-Scope (OOS) intents.
Few-Shot-Intent-Detection is a repository designed for few-shot intent detection with/without Out-of-Scope (OOS)…
A domain-specific T5 model that has been pre-trained on large biomedical corpora. The model outperforms the current SOTA methods (i.e. BERT, BioBERT, base T5) on named entity recognition, relation extraction, natural language inference, and question answering tasks.
SciFive provides a text-to-text framework for biomedical language and natural language in NLP. Under the T5's framework…
Provides implementations of sequence models (e.g. Bart, ProphetNet) for text generation, summarization, translation, and other tasks. It automatically optimizes inference speed based on popular NLP toolkits (e.g. FairSeq and HuggingFace Transformers) without accuracy loss.
FastSeq provides efficient implementation of popular sequence models (e.g. Bart, ProphetNet) for text…
A dataset of Python programming puzzles that can be used to teach and evaluate an AI’s programming proficiency.
This repo contains a dataset of python programming puzzles which can be used to teach and evaluate an AI's programming…
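Puzzles in this style are boolean “satisfaction” functions: a solver has to find an input that makes the function return `True`. A toy illustration of the format (this particular puzzle and the brute-force solver are made up for the example, not taken from the dataset):

```python
def sat(x: int) -> bool:
    """Toy puzzle: find an integer whose square ends in 44."""
    return x ** 2 % 100 == 44

def brute_force(puzzle, search_space):
    """Naive solver: try candidates until the puzzle is satisfied."""
    for candidate in search_space:
        if puzzle(candidate):
            return candidate
    return None

answer = brute_force(sat, range(1000))  # 12 ** 2 == 144, which ends in 44
```

The appeal of the format is that checking a solution is trivial even when finding one is not, which makes it a clean test bed for program synthesis.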
XtremeDistilTransformers comes with TensorFlow 2.3 and HuggingFace Transformers under a unified API, with the following features:
- Distil any supported pre-trained language models as teachers (e.g., BERT, Electra, RoBERTa)
- Initialize student model with any pre-trained model (e.g., MiniLM, DistilBERT, TinyBERT), or initialize from scratch
- Multilingual text classification and sequence tagging
- Distil multiple hidden states from teacher
- Distil deep attention networks from teacher
- Pairwise and instance-level classification tasks (e.g., MNLI, MRPC, SST)
- Progressive knowledge transfer with gradual unfreezing
- Fast mixed precision training for distillation (e.g., mixed_float16, mixed_bfloat16)
- ONNX runtime inference
Releasing [XtremeDistilTransformers] with TensorFlow 2.3 and HuggingFace Transformers with a unified API with the…
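Tooling aside, the heart of any teacher-student setup like this is a soft-target loss between the two models. Here’s a minimal pure-Python sketch of standard temperature-scaled distillation (Hinton-style); it is not XtremeDistilTransformers’ actual objective, which also transfers hidden states and attention:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of raw logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student's softened predictions against the
    teacher's softened distribution, scaled by T^2 (the conventional
    factor that keeps gradient magnitudes comparable across T)."""
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    ce = -sum(pi * math.log(qi) for pi, qi in zip(p, q))
    return temperature ** 2 * ce
```

Raising the temperature flattens the teacher’s distribution, exposing the “dark knowledge” in its non-argmax logits; the loss is minimized exactly when the student reproduces the teacher’s softened distribution.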
A new benchmark for lexical substitution, the task of finding appropriate substitutes for a target word in a context.
This repository houses the Stanford Word Substitution (Swords) benchmark. Swords ⚔️ is a benchmark for the task of…
The dataset consists of 87,026 verified claims. Each claim is annotated with evidence in the form of sentences and/or cells from tables in Wikipedia, as well as a label indicating whether this evidence supports, refutes, or does not provide enough information to reach a verdict.
This repository maintains the code to generate and prepare the dataset, as well as the code of the annotation platform…
A data augmentation approach that combines a self-trained neural retrieval model with a few-shot learned NLU model to automatically create MR-to-Text data from open-domain texts.
Code for paper "Xinnuo Xu, Guoyin Wang, Young-Bum Kim, Sungjin Lee, AUGNLG: Few-shot Natural Language Generation using…
Dataset of the Week: FLORES
What is it?
An evaluation benchmark for low-resource and multilingual machine translation. It’s a many-to-many multilingual translation benchmark consisting of 3,001 sentences extracted from English Wikipedia, covering a variety of topics and domains across 101 languages.
Where is it?
Every Sunday we do a weekly round-up of NLP news and code drops from researchers around the world.
For complete coverage, follow our Twitter: @Quantum_Stat