py150 dataset

150k Python Dataset | SRI Lab - ETH Zürich www.sri.inf.ethz.ch › py150

Version 1.0 Files [190MB]. An archive of source files used to generate the py150 Dataset. Download. The archive contains the following files: data.tar.gz ...

ETH Py150 Open Dataset - Papers With Code paperswithcode.com › dataset › eth-py150-open

Introduced by Kanade et al. in Learning and Evaluating Contextual Embedding of Source Code. A massive, deduplicated corpus of 7.4M Python files from GitHub.

google-research-datasets/eth_py150_open - Hugging Face huggingface.co › datasets › eth_py150_open

A redistributable subset of the ETH Py150 corpus, introduced in the ICML 2020 paper 'Learning and Evaluating Contextual Embedding of Source Code'

eth_py150_open - GitHub github.com › google-research-datasets › eth_py...

A redistributable subset of the ETH Py150 corpus [https://www.sri.inf.ethz.ch/py150], introduced in the ICML 2020 paper 'Learning and Evaluating Contextual ...

python-150k-code - Kaggle www.kaggle.com › datasets › pranithchowdary

150k Python Dataset. We provide the source files used to obtain the py150 dataset. The archive contains the following files: python_code_data.txt -- Contains raw ...

naturalcc/preprocessing/py150/README.md at main - GitHub github.com › CGCL-codes › naturalcc › blob

Py150 dataset for Code Completion task ; step 1. download py150 dataset. bash dataset/py150/download.sh ; step 2. flatten py150 into new ast data. python -m ...

CodeXGLUE - PY150 Benchmark (Code Completion) paperswithcode.com › sota › code-completion-...

The current state-of-the-art on CodeXGLUE - PY150 is CodeGPT-adapted. See a full comparison of 3 papers with code.

150k Python Dataset - Kaggle www.kaggle.com › datasets › 150k-python-data...

Dataset consisting of 150'000 Python ASTs. ... The dataset is split into two parts -- 100'000 files used for training and 50'000 files used for evaluation.

google-research-datasets/eth_py150_open at ... - Hugging Face huggingface.co › eth_py150_open › blame › et...

The number of characters in the ETH Py150 Open dataset ... www.researchgate.net › figure › The-number-o...

Download scientific diagram | The number of characters in the ETH Py150 Open dataset (Kanade et al., 2020). from publication: Neural Interpretation of ...

Related searches

python code dataset

python source code dataset

github code dataset

source code classification dataset

codexglue