github code dataset - Axtarish in Google
The GitHub Code dataset consists of 115M code files from GitHub in 32 programming languages with 60 extensions, totaling 1 TB of data. The dataset was created ...
Apr 11, 2023 · CodeSearchNet is a collection of datasets and benchmarks that explore the problem of code retrieval using natural language.
A collection of datasets (and other resources) for big code analysis. If you want to contribute to this list, please send a pull request.
This is a cleaner version of the Github-code dataset, with the following filters added: average line length < 100; alphanumeric character fraction > ...
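The two filters named in this snippet can be illustrated with a minimal Python sketch. It is not the dataset's actual pipeline: the function names are hypothetical, the alphanumeric threshold is truncated in the snippet and is therefore left as a required argument, and counting whitespace and newlines in the character total is an assumption.

```python
def avg_line_length(text: str) -> float:
    """Mean number of characters per line (0.0 for empty input)."""
    lines = text.splitlines()
    if not lines:
        return 0.0
    return sum(len(line) for line in lines) / len(lines)


def alnum_fraction(text: str) -> float:
    """Share of characters that are alphanumeric.

    Assumption: every character, including whitespace and newlines,
    counts toward the total.
    """
    if not text:
        return 0.0
    return sum(ch.isalnum() for ch in text) / len(text)


def keep_file(
    text: str,
    min_alnum_fraction: float,  # hypothetical parameter; the real threshold is elided in the snippet
    max_avg_line_length: float = 100.0,  # "Average line length < 100" from the snippet
) -> bool:
    """Return True if the file passes both filters."""
    return (
        avg_line_length(text) < max_avg_line_length
        and alnum_fraction(text) > min_alnum_fraction
    )
```

A caller would pass each file's contents together with a chosen threshold, for example `keep_file(source, min_alnum_fraction=0.25)`, where 0.25 is only a placeholder value.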
This repository contains all the needed tools and scripts to reproduce the datasets, as well as the academic papers they may relate to.
This is a list of high-quality, topic-centric public data sources. They are collected and tidied from blogs, answers, and user responses.
This repository gathers all the code used to build the BigCode datasets, such as The Stack, as well as the preprocessing necessary for model training.
CoDesc is a noise-removed, large parallel dataset of source code and corresponding natural language descriptions. This dataset is procured from several similar ...
This dataset is a collection of 1052 GitHub repositories, along with other columns such as the primary language used in it, fork count, open pull requests, and ...
A new vulnerable source code dataset for deep learning based vulnerability detection (RAID 2023) https://surrealyz.github.io/files/pubs/raid23-diversevul.pdf