Blockchain for Student Data Privacy and Consent

Published in IEEE 2018 International Conference on Computer Communication and Informatics, 2018

Shlok Gilda, Maanav Mehrotra. ICCCI 2018.

[PDF]

Abstract

Programming languages are the primary tools of the software development industry. Today, the programming language of the vast majority of published source code is either specified manually or assigned programmatically based solely on the file extension. This work shows that the programming language can instead be identified automatically by an artificial neural network that uses supervised learning and intelligent feature extraction from source code files. We employ a multi-layer neural network (word embedding layers along with a Convolutional Neural Network) to achieve this goal. Our criteria for an automatic source code identification solution are high accuracy, fast performance, and broad programming language coverage. The model achieves 97% accuracy while classifying 60 programming languages.
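To make the architecture named in the abstract more concrete, the following is a minimal, illustrative sketch of a forward pass through an embedding layer, a 1D convolution with global max pooling, and a softmax output. It is not the authors' model: the tokenizer, the hashed vocabulary, every layer size, and the names `tokenize` and `TinyTextCNN` are assumptions made for illustration, and the weights are random and untrained, so the output probabilities carry no meaning.

```python
import math
import random
import re
import zlib


def tokenize(source):
    """Split source text into word-like tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", source)


class TinyTextCNN:
    """Hypothetical embedding + 1D CNN classifier (forward pass only, untrained)."""

    def __init__(self, vocab_size, embed_dim, num_filters, kernel_size,
                 num_classes, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            # Small random weights; a real model would learn these by training.
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        self.vocab_size = vocab_size
        self.kernel_size = kernel_size
        self.embedding = matrix(vocab_size, embed_dim)
        # Each filter spans kernel_size consecutive token vectors.
        self.filters = matrix(num_filters, kernel_size * embed_dim)
        self.dense = matrix(num_classes, num_filters)

    def predict_proba(self, source):
        # Map tokens to ids with a stable hash (an assumed stand-in for a
        # learned vocabulary).
        ids = [zlib.crc32(tok.encode("utf-8")) % self.vocab_size
               for tok in tokenize(source)]
        # Pad short inputs so at least one convolution window exists.
        while len(ids) < self.kernel_size:
            ids.append(0)
        vectors = [self.embedding[i] for i in ids]

        pooled = []
        for weights in self.filters:
            best = 0.0  # starting at 0 applies ReLU before max pooling
            for start in range(len(vectors) - self.kernel_size + 1):
                window = [x for vec in vectors[start:start + self.kernel_size]
                          for x in vec]
                activation = sum(w * x for w, x in zip(weights, window))
                best = max(best, activation)
            pooled.append(best)

        # Dense layer followed by a numerically stable softmax.
        logits = [sum(w * x for w, x in zip(row, pooled)) for row in self.dense]
        peak = max(logits)
        exps = [math.exp(z - peak) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]


# 60 output classes mirrors the language count stated in the abstract;
# all other sizes are arbitrary.
model = TinyTextCNN(vocab_size=1000, embed_dim=8, num_filters=16,
                    kernel_size=3, num_classes=60)
probabilities = model.predict_proba("def add(a, b):\n    return a + b\n")
```

Each entry of `probabilities` corresponds to one candidate language. In a trained model the largest entry would be taken as the predicted language; here the sketch only shows how a classifier of this shape turns raw source text into a distribution over classes.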