CodeRL

CodeRL is a framework for program synthesis that combines pretrained language models (LMs) with deep reinforcement learning (RL). It targets a limitation of code generation methods that rely on supervised fine-tuning alone: they do not use signals such as unit test results during training.

An Overview of CodeRL

Actor-Critic Architecture

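During training, CodeRL treats the code-generating LM as an "actor network" and trains a separate "critic network" to predict whether the generated programs are functionally correct. The critic's predictions are fed back to the actor as a training signal.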
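
As a rough illustration of how an actor can learn from unit-test outcomes and critic feedback, the sketch below computes a REINFORCE-style loss with a baseline. It is a simplified sketch under stated assumptions, not the CodeRL implementation: the reward values, the outcome keys, the per-token critic weights, and the function name `actor_loss` are all illustrative choices that do not come from this page.

```python
from typing import Sequence

# Assumed, illustrative reward for each unit-test outcome (not taken from this page).
OUTCOME_REWARD = {
    "compile_error": -1.0,
    "runtime_error": -0.6,
    "failed_tests": -0.3,
    "passed_tests": 1.0,
}


def actor_loss(
    token_log_probs: Sequence[float],
    critic_token_weights: Sequence[float],
    outcome: str,
    baseline_outcome: str,
) -> float:
    """REINFORCE-style loss for one sampled program.

    token_log_probs: log-probability the actor assigned to each sampled token.
    critic_token_weights: hypothetical per-token weights from a critic,
        used to spread the program-level reward over tokens.
    outcome / baseline_outcome: unit-test outcome of the sampled program
        and of a baseline program (for example, a greedy decode).
    """
    if len(token_log_probs) != len(critic_token_weights):
        raise ValueError("need one critic weight per token")
    # Advantage: how much better the sample did than the baseline.
    advantage = OUTCOME_REWARD[outcome] - OUTCOME_REWARD[baseline_outcome]
    weighted_log_prob = sum(
        w * lp for w, lp in zip(critic_token_weights, token_log_probs)
    )
    # Minimizing this loss raises the probability of better-than-baseline samples.
    return -advantage * weighted_log_prob
```

When the sampled program does no better than the baseline, the advantage is zero and the sample contributes no gradient, which is the usual motivation for subtracting a baseline.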

Critical Sampling Strategy

At inference time, CodeRL uses a sampling procedure that regenerates programs based on feedback from example unit tests and from the critic network's scores.
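
One simple way to combine the two feedback signals is to rank sampled programs by whether they pass the example unit tests and then by the critic's score. The sketch below shows only that ranking step and is a simplification, not the full CodeRL procedure. The `Candidate` type, its fields, and `select_candidates` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    code: str                   # sampled program text
    passed_example_tests: bool  # result of running the example unit tests
    critic_pass_prob: float     # critic's estimated probability of passing


def select_candidates(candidates: List[Candidate], top_k: int = 1) -> List[Candidate]:
    """Return the top_k candidates.

    Programs that pass the example tests rank first; ties are broken
    by the critic's estimated pass probability (higher is better).
    """
    ranked = sorted(
        candidates,
        key=lambda c: (c.passed_example_tests, c.critic_pass_prob),
        reverse=True,
    )
    return ranked[:top_k]
```

In a fuller pipeline, the selected programs would seed another round of generation, while programs that fail the example tests would be candidates for repair.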

New SOTA Results

CodeRL reports new state-of-the-art (SOTA) results on two distinct benchmarks: the challenging APPS benchmark and the simpler MBPP benchmark.

Model Highlights

CodeT5-large: A 770M-parameter CodeT5 model trained with the Masked Span Prediction objective on CSN (CodeSearchNet). It obtained new state-of-the-art results on various CodeXGLUE benchmarks.

CodeT5-large-ntp-py: A 770M-parameter CodeT5 model pretrained first with the Masked Span Prediction objective on CSN and GCPY, and then with the Next Token Prediction objective on GCPY.

CodeT5-finetuned_critic: A critic model based on CodeT5-base that predicts one of four unit-test outcomes: Compile Error, Runtime Error, Failed Tests, or Passed Tests.

CodeT5-finetuned_critic_binary: Similar to the previous model, but trained to predict only whether the unit tests passed or failed. This critic is used to assist the generation procedure during inference.

CodeT5-finetuned_CodeRL: A CodeT5 model initialized from the pretrained CodeT5-large-ntp-py and then fine-tuned on APPS following the CodeRL training framework.
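
To make the critic's four-way prediction concrete, the sketch below turns four raw scores (logits) into a probability for each outcome with a softmax and picks the most likely one. It is a minimal sketch under assumptions: the order of the classes, the use of a softmax head, and the function names are illustrative and are not documented on this page.

```python
import math
from typing import Dict, Sequence

# Assumed class order; the real critic's label order may differ.
OUTCOMES = ("Compile Error", "Runtime Error", "Failed Tests", "Passed Tests")


def outcome_probabilities(logits: Sequence[float]) -> Dict[str, float]:
    """Softmax over four logits, returned as outcome -> probability."""
    if len(logits) != len(OUTCOMES):
        raise ValueError("expected one logit per outcome")
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {name: e / total for name, e in zip(OUTCOMES, exps)}


def predicted_outcome(logits: Sequence[float]) -> str:
    """Return the outcome with the highest probability."""
    probs = outcome_probabilities(logits)
    return max(probs, key=probs.get)
```

A binary critic, such as the pass/fail variant listed above, would follow the same pattern with two classes instead of four.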
Task                                    Dataset     Score
Pass@1                                  APPS        2.69
Pass@5                                  APPS        6.81
Pass@1000                               APPS        20.98
1@k                                     APPS        8.48
5@k                                     APPS        12.62
Code-to-Text generation                 CodeXGLUE   19.87
Text-to-Code generation                 CodeXGLUE   45.08
Code-to-Code generation (Java to C#)    CodeXGLUE   83.56
Code-to-Code generation (C# to Java)    CodeXGLUE   79.77
Code refine (medium)                    CodeXGLUE   89.22
Zero-shot transfer ability              MBPP        63
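
For readers unfamiliar with Pass@k, the metric estimates the chance that at least one of k sampled programs passes all unit tests for a problem. The sketch below implements the widely used unbiased estimator, 1 - C(n - c, k) / C(n, k), where n programs are sampled and c of them are correct. This page does not state how the scores above were computed, so treat this as a general illustration; the function name `pass_at_k` is chosen here for convenience.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single problem.

    n: number of programs sampled for the problem
    c: number of those programs that pass all unit tests
    k: sample budget being evaluated (1 <= k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    if n - c < k:
        # Fewer than k incorrect programs: any k-subset contains a correct one.
        return 1.0
    # Probability that a random k-subset contains no correct program.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

A benchmark-level score is then the average of this value over all problems, typically reported as a percentage.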