The GitHub Code dataset consists of 115M code files from GitHub in 32 programming languages with 60 extensions, totaling 1 TB of data. The dataset was created ... |
The MS COCO (Microsoft Common Objects in Context) dataset is a large-scale object detection, segmentation, key-point detection, and captioning dataset. |
A collection of datasets (and other resources) for big code analysis. If you want to contribute to this list, please send a pull request. |
Lyra is a dataset for code generation that consists of Python code with embedded SQL. This dataset contains 2,000 carefully annotated database manipulation ... |
May 30, 2023 · The GitHub Code dataset consists of 115M code files from GitHub in 32 programming languages with 60 extensions, totaling 1 TB of data. 7. MBPP. |
We provide a dataset consisting of parsed ASTs that were used to train and evaluate the DeepSyn tool. |
Apr 11, 2023 · CodeSearchNet is a collection of datasets and benchmarks that explore the problem of code retrieval using natural language. |
This is a cleaner version of the Github-code dataset, with the following filters added: average line length < 100; alphanumeric character fraction > ... |
Find the latest code and datasets from Amazon scientists and researchers, which have been released across GitHub and other platforms. |
We collect, organize, and open-source the large-scale multimodal instruction dataset Infinity-MM, consisting of tens of millions of samples. Through quality ... |
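
The two filters named in the cleaner Github-code snippet above (average line length and alphanumeric character fraction) could be sketched roughly as follows. This is only an illustration, not the dataset authors' implementation: the function names are hypothetical, and because the snippet truncates the alphanumeric threshold, `min_alnum_fraction` is left as a value the caller must supply.

```python
def average_line_length(text: str) -> float:
    """Mean number of characters per line (0.0 for empty input)."""
    lines = text.splitlines()
    if not lines:
        return 0.0
    return sum(len(line) for line in lines) / len(lines)


def alphanumeric_fraction(text: str) -> float:
    """Share of characters that are letters or digits (0.0 for empty input)."""
    if not text:
        return 0.0
    return sum(ch.isalnum() for ch in text) / len(text)


def keep_file(
    text: str,
    min_alnum_fraction: float,  # assumption: threshold is elided in the snippet
    max_avg_line_length: float = 100.0,  # from the snippet: average line length < 100
) -> bool:
    """Return True if a file passes both filters."""
    return (
        average_line_length(text) < max_avg_line_length
        and alphanumeric_fraction(text) > min_alnum_fraction
    )
```

A call such as `keep_file(source_text, min_alnum_fraction=0.25)` would then decide whether one file is kept; the `0.25` here is an arbitrary placeholder, not the dataset's actual value.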