Jessica López Espejel and Mahaman Sanoussi Yahaya Alassan and Walid Dahhane and El Hassane Ettifouri
JaCoText: A Pretrained Model for Java Code-Text Generation
100 - 105
2023
17
2
International Journal of Computer and Systems Engineering
https://publications.waset.org/pdf/10012935
https://publications.waset.org/vol/194
World Academy of Science, Engineering and Technology
Pretrained Transformer-based models have shown high performance in natural language generation tasks. Recently, however, a new wave of interest has surged in automatic programming language generation, the task of translating natural language instructions into source code. Although well-known pretrained language generation models have achieved good performance in learning programming languages, automatic code generation still requires further effort. In this paper, we introduce JaCoText, a model based on the Transformer neural network that aims to generate Java source code from natural language text. JaCoText leverages the advantages of both natural language and code generation models. More specifically, we study findings from the state of the art and use them to (1) initialize our model from powerful pretrained models, (2) explore additional pretraining on our Java dataset, (3) carry out experiments combining unimodal and bimodal data in training, and (4) scale the input and output length during fine-tuning of the model. Experiments on the CONCODE dataset show that JaCoText achieves new state-of-the-art results.
Open Science Index 194, 2023