Code LLMs Explained: SantaCoder

SantaCoder is a 1.1B-parameter program synthesis model pre-trained on Python, Java, and JavaScript. The main model uses Multi Query Attention and was trained with the Fill-in-the-Middle objective; its training data was filtered using near-deduplication and comment-to-code ratio as criteria.

An Overview of SantaCoder

The SantaCoder model was trained on GitHub code. It is not an instruction-tuned model, so natural-language requests tend to work poorly. Users should either phrase commands the way they occur in source code (for example, as comments) or write a function signature and docstring and let the model complete the function body.
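The prompting advice above can be sketched as follows. This is a minimal illustration, not an official usage recipe: the helper names are hypothetical, the checkpoint name `bigcode/santacoder`, the `trust_remote_code=True` flag, and the `<fim-prefix>`, `<fim-suffix>`, `<fim-middle>` sentinel spellings are assumptions based on the public model card, and `complete` is only expected to work when `transformers` and `torch` are installed.

```python
def build_completion_prompt(signature: str, docstring: str) -> str:
    """Return a source-style prompt: a function signature followed by a docstring."""
    return f'{signature}\n    """{docstring}"""\n'


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before and after a gap in fill-in-the-middle sentinels.

    The sentinel spellings are an assumption, not something stated on this page.
    """
    return f"<fim-prefix>{prefix}<fim-suffix>{suffix}<fim-middle>"


def complete(prompt: str, checkpoint: str = "bigcode/santacoder",
             max_new_tokens: int = 64) -> str:
    """Generate a continuation for `prompt` (downloads the model on first use)."""
    # Imported lazily so the prompt helpers above work without the library.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    # Assumption: this checkpoint ships a custom model class and needs this flag.
    model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True)
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0])


# A prompt phrased like source code rather than like an instruction.
prompt = build_completion_prompt(
    "def fibonacci(n: int) -> int:",
    "Return the n-th Fibonacci number.",
)

# An infilling prompt: the model is asked to produce the missing function body.
fim_prompt = build_fim_prompt(
    "def add(a, b):\n    ",
    "\n\nprint(add(1, 2))\n",
)
```

Calling `complete(prompt)` would then ask the model to write the function body; an instruction such as "Write a Fibonacci function" is the phrasing to avoid.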

The model has 1.1 billion parameters and was trained on Java, JavaScript, and Python.

1.1 Billion Parameters

The base training dataset for the model contains 268 GB of Python, Java, and JavaScript files. The data was prepared by removing files covered by opt-out requests, applying near-deduplication, and redacting PII.
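To make the near-deduplication step concrete, here is a small sketch based on Jaccard similarity over token shingles. It is an illustration of the general idea only, not the pipeline used for SantaCoder: the function names, the shingle size, and the 0.85 threshold are all assumptions, and large-scale pipelines generally approximate this comparison (for example with MinHash) instead of comparing every pair of files.

```python
def shingles(code: str, k: int = 5) -> set:
    """Return the set of k-token windows of a file (whitespace tokenization)."""
    tokens = code.split()
    if len(tokens) < k:
        # Short files are represented by a single shingle, or none if empty.
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a & b| / |a | b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_deduplicate(files: list, threshold: float = 0.85) -> list:
    """Keep a file only if it is not too similar to any file already kept."""
    kept, kept_shingles = [], []
    for source in files:
        current = shingles(source)
        if all(jaccard(current, other) < threshold for other in kept_shingles):
            kept.append(source)
            kept_shingles.append(current)
    return kept


corpus = [
    "def add(a, b):\n    return a + b  # sum of two numbers",
    "def add(a, b):\n        return a + b  # sum of two numbers",  # re-indented copy
    "class Stack:\n    def __init__(self):\n        self.items = []",
]
unique_files = near_deduplicate(corpus)
```

Here the second file differs from the first only in indentation, so it is dropped, while the unrelated third file is kept.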

SantaCoder is trained with Multi Query Attention and Fill-in-the-Middle.

Multi Query Attention

Multi Query Attention can significantly speed up inference for larger batch sizes, while fill-in-the-middle enables code models to do infilling tasks.
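One way to see why Multi Query Attention helps at larger batch sizes is to compare the size of the key/value cache that must be read at every decoding step. The arithmetic below is a rough sketch: the layer count, head count, head dimension, sequence length, and 16-bit storage are assumptions chosen for illustration and are not stated on this page.

```python
def kv_cache_bytes(batch: int, seq_len: int, n_layers: int,
                   n_kv_heads: int, head_dim: int, bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for every layer and position."""
    # The leading 2 accounts for storing both a key and a value tensor.
    return 2 * batch * seq_len * n_layers * n_kv_heads * head_dim * bytes_per_value


# Assumed model shape (hypothetical values for illustration).
N_LAYERS, N_HEADS, HEAD_DIM = 24, 16, 128
BATCH, SEQ_LEN = 32, 2048

# Multi Head Attention: every query head has its own key/value head.
mha_bytes = kv_cache_bytes(BATCH, SEQ_LEN, N_LAYERS, N_HEADS, HEAD_DIM)
# Multi Query Attention: all query heads share a single key/value head.
mqa_bytes = kv_cache_bytes(BATCH, SEQ_LEN, N_LAYERS, 1, HEAD_DIM)

print(f"MHA cache: {mha_bytes / 2**30:.2f} GiB")  # 12.00 GiB
print(f"MQA cache: {mqa_bytes / 2**30:.2f} GiB")  # 0.75 GiB
```

Under these assumptions the cache shrinks by a factor equal to the number of heads, and because the cache grows linearly with batch size, the saving matters most when many sequences are decoded together.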

The model was trained on 236 billion tokens.

236 Billion tokens

The model, which uses Multi Query Attention and the Fill-in-the-Middle objective, was trained for a total of 600,000 iterations, processing 236 billion tokens over the course of training.
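The two figures above can be cross-checked with a little arithmetic. The iteration count and token total come from the text; the sequence length of 2,048 tokens and the global batch of 192 sequences are assumptions used only to show one configuration that is consistent with those figures.

```python
TOTAL_TOKENS = 236_000_000_000   # from the text
ITERATIONS = 600_000             # from the text

# Average number of tokens consumed per training iteration.
tokens_per_iteration = TOTAL_TOKENS / ITERATIONS   # about 393,333

# Assumed (not stated on this page): sequence length and global batch size.
SEQ_LEN = 2048
BATCH_SEQUENCES = 192

implied_tokens_per_iteration = SEQ_LEN * BATCH_SEQUENCES      # 393,216
implied_total = implied_tokens_per_iteration * ITERATIONS     # 235,929,600,000

print(f"{tokens_per_iteration:,.0f} tokens per iteration on average")
print(f"{implied_total / 1e9:.1f}B tokens under the assumed batch shape")
```

With those assumed values the implied total rounds to the quoted 236 billion tokens.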

Task / Variant                                   Dataset / Metric   Score
Multi Query Attention (FIM)                      HumanEval          0.34
Multi Head Attention                             HumanEval          0.37
Multi Query Attention                            HumanEval          0.37
Multi Query Attention (FIM)                      MBPP               0.61
Multi Head Attention                             MBPP               0.64
Multi Query Attention                            MBPP               0.62
Left-to-right (pass@100)                         HumanEval          0.45
Fill-in-the-middle (line filling, exact match)   HumanEval          0.55
Python docstring generation                      BLEU Score         18.13