How to Extract Structured Data from Documents - Complete Tutorial
In this project we learn how to build AI methods for extracting information from documents that comprehensively beat traditional OCR.
optical-character-recognition information-extraction computer-vision natural-language-processing
COVID-Q: A Dataset of 1,690 Questions about COVID-19
This dataset consists of COVID-19 questions which have been annotated into a broad category (e.g. Transmission, Prevention) and a more specific class such ...
covid-19 question-answering dataset covid-q
An NLP experiment in storytelling that creates illustrations (text to sketch) and content (text generation).
natural-language-processing text-generation gpt transformers
Hugdatafast: huggingface/nlp + fastai
An elegant integration of huggingface/nlp and fastai2, plus handy transforms using pure huggingface/nlp.
natural-language-processing dataset fastai huggingface
Quora Question Pair Similarity
Identify which questions asked on Quora are duplicates of questions that have already been asked. Using text features, classifying them as duplicates or ...
text-classification machine-learning python natural-language-processing
Gutenberg Dialog
Build a dialog dataset from online books in many languages.
dataset language-modeling natural-language-processing datasets
The Abstraction and Reasoning Corpus (ARC)
Can a computer learn complex, abstract tasks from just a few examples? ARC can be used to measure a human-like form of general fluid intelligence.
artificial-general-intelligence common-sense-reasoning arc dataset
Dakshina Dataset
A collection of text in both Latin and native scripts for 12 South Asian languages.
dataset natural-language-processing languages dakshina
Library to scrape and clean web pages to create massive datasets.
dataset natural-language-processing data-collection text-mining
