tiny stories dataset

roneneldan/TinyStories · Datasets at Hugging Face huggingface.co › datasets › TinyStories

Dataset containing synthetically generated (by GPT-3.5 and GPT-4) short stories that only use a small vocabulary.

TinyStories: How Small Can Language Models Be and Still ... arxiv.org › cs

May 12, 2023 · We introduce TinyStories, a synthetic dataset of short stories that only contain words that typical 3- to 4-year-olds usually understand.

TinyStories | Kaggle www.kaggle.com › datasets › thedevastator › ti...

A Diverse, Richly Annotated Corpus of Short-Form Stories.

roneneldan/TinyStories-33M - Hugging Face huggingface.co › roneneldan › TinyStories-33M

Aug 12, 2024 · Model trained on the TinyStories dataset, see https://arxiv.org/abs/2305.07759. Based on the GPT-Neo architecture.

tiny-stories-ds - Kaggle www.kaggle.com › datasets › tomasbebra › tiny...

Kaggle is the world's largest data science community with powerful tools and resources to help you achieve your data science goals.

TinyStories: A Tiny Dataset with Big Impact | by Satwik Gawand satwikgawand.medium.com › tinystories-a-tiny-...

May 23, 2023 · It introduces TinyStories, a synthetic dataset of short stories made up of words that 3- to 4-year-olds can usually understand.

PraveenRaja42/Tiny-Stories-GPT - GitHub github.com › PraveenRaja42 › Tiny-Stories-GPT

A re-implementation of GPT language model in PyTorch, both training and inference. The model is trained on the TinyStories dataset with GPT-2 tokeniser.

Tinystories - A dataset for training tiny models to produce ... www.reddit.com › mlscaling › comments › tiny...

May 16, 2023 · A dataset for training tiny models to produce coherent English text with a small vocabulary. R, T, Emp, Data, Smol, MS. Related threads: Attempts to produce KB-level TinyStories models (r/LocalLLaMA); The Smallest GPT with Coherent English (by Microsoft).

TinyStories Is A Synthetic DataSet Created With GPT-4 & Used ... cobusgreyling.substack.com › tinystories-is-a-sy...

Jul 4, 2024 · The Small Language Model from Microsoft, called Phi-3, was trained using a novel dataset called TinyStories.

Tiny Stories and Your Favorite Author - BrainScriblr brainscriblr.beehiiv.com › brainscriblr-tiny-stories

The new constrained dataset, Tiny Stories, is for analyzing core AI language capabilities. Researchers created this focused corpus of short, simple stories.

Related searches

how to download a dataset from huggingface

how to use dataset from huggingface

prompts dataset

ai datasets