Documentation on Tools

Introduction

This directory contains detailed documentation on various tools that can be used to manipulate the source files in the Project CodeNet dataset.

It also contains working documents on proposals and ideas for tools, formats, and applications.

HSQLDB

HSQLDB is a simple but complete database system that can use CSV files as persistent storage. This document describes how to convert the Project CodeNet metadata for use with HSQLDB.

srcML

srcML is a tool for analyzing programming language source code. This document describes typical use cases.

Syntax-correct Tokenstream

A source code file can be converted into a stream of tokens (or token classes) in such a way that the result is still syntactically correct. This document shows how.

Universal Tokens

This document presents some thoughts and a proposal for normalizing the token classes across various programming languages.
