Code LLMs Explained: CodeGeeX

CodeGeeX is a large-scale multilingual code generation model with 13 billion parameters, pre-trained on a large code corpus covering more than 20 programming languages. The model is published by the Knowledge Engineering Group (KEG) & Data Mining at Tsinghua University. As of June 22, 2022, CodeGeeX had been trained on more than 850 billion tokens on a cluster of 1,536 Ascend 910 AI Processors.

An Overview of CodeGeeX

CodeGeeX is a large-scale multilingual code generation model with 13 billion parameters, trained on an extensive code corpus covering more than 20 programming languages.

Extensive training on a massive dataset

850B tokens

CodeGeeX has been trained on a dataset of more than 850 billion tokens.

Variety of programming languages

10+ languages

CodeGeeX generates code in more than 10 popular programming languages, including Python, Java, C++, C, JavaScript, and Go.

Available in popular IDEs as an extension or plugin

Extensions

CodeGeeX offers a customizable programming assistant for VS Code and JetBrains IDEs, available as an extension or plugin, so users can work with the model directly in their editor.

Task                                               Dataset        Score
pass@100 avg                                       HumanEval-X    62
Crosslingual Code Translation (avg of pass@100)    HumanEval-X    72.5
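
The scores above are reported as pass@100. This page does not say how the metric was computed, so the following is only a minimal sketch. It assumes the widely used unbiased pass@k estimator introduced with the original HumanEval benchmark: for each problem, n samples are generated, c of them pass the unit tests, and pass@k = 1 - C(n-c, k) / C(n, k), averaged over all problems. The function names and the sample counts are hypothetical and chosen for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of samples generated for the problem
    c: number of those samples that passed all unit tests
    k: the sample budget in pass@k (must not exceed n)
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Probability that a random size-k subset of the n samples
    # contains no passing sample is C(n - c, k) / C(n, k).
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, given (n, c) pairs per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical counts for three problems: (samples, passing samples).
    hypothetical_results = [(200, 0), (200, 3), (200, 120)]
    print(mean_pass_at_k(hypothetical_results, k=100))
```

A benchmark-level figure such as "pass@100 avg" would then be this per-problem mean, possibly averaged again across languages; that second averaging step is also an assumption here.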