LLMs Explained: OPT

Meta AI introduced OPT (Open Pre-trained Transformer), a family of language models, and released it in the metaseq repository on May 3rd, 2022. OPT was pre-trained primarily on English text, though a small amount of non-English data remains in the training corpus via CommonCrawl. OPT is a decoder-only model in the same family as GPT-3 and was trained with a self-supervised causal language modeling (CLM) objective. All models with parameters ranging from 125M to 66B have been released. Full research access to OPT-175B is granted upon request to academic researchers; those affiliated with organizations in government, civil society, and academia; and industry research laboratories. The model-creation logbook and the codebase, metaseq, have also been released; they document how OPT-175B was trained on 992 80GB A100 GPUs, reaching 147 TFLOP/s utilization per GPU.
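The released checkpoints are also available through the Hugging Face hub, so a minimal sketch of loading one and sampling from it looks as follows (this assumes the transformers library and the facebook/opt-125m checkpoint, the smallest of the released sizes):

```python
# Minimal sketch: loading a released OPT checkpoint via Hugging Face
# transformers (the official weights also ship through metaseq).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-125m"  # released sizes range up to opt-66b
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Causal language modeling: the model predicts the next token left-to-right.
inputs = tokenizer("Large language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```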


An Overview of OPT

OPT models are useful for many natural language processing tasks and have the potential to advance the field significantly. The OPT release is centered on sustainability and responsibility: its creators intend to share the models fully and responsibly with interested researchers.

OPT required only 1/7th the carbon footprint of GPT-3 to develop

Researchers show that OPT-175B is comparable to GPT-3 in terms of performance while requiring only one-seventh the carbon footprint to develop.

OPT-175B was trained on 992 80GB A100 GPUs

OPT-175B was trained on 992 80GB A100 GPUs using Fully Sharded Data Parallel with Megatron-LM Tensor Parallelism and achieved up to 147 TFLOP/s per GPU utilization.
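The sketch below shows the Fully Sharded Data Parallel pattern in plain PyTorch. It is a toy illustration of the sharding approach on a single Transformer layer, not the metaseq/Megatron-LM configuration actually used for OPT-175B:

```python
# Illustrative sketch of Fully Sharded Data Parallel (FSDP) in PyTorch.
# The real OPT-175B run combined metaseq's FSDP with Megatron-LM tensor
# parallelism across 992 GPUs; this toy example shows only the FSDP wrap.
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

dist.init_process_group("nccl")  # one process per GPU, launched via torchrun
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = torch.nn.TransformerEncoderLayer(d_model=512, nhead=8).cuda()
model = FSDP(model)  # parameters, gradients, and optimizer state are sharded

# Build the optimizer after wrapping, so it sees the sharded parameters.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
x = torch.randn(8, 16, 512, device="cuda")  # (seq, batch, d_model)
loss = model(x).sum()  # dummy loss for illustration
loss.backward()
optimizer.step()
```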

OPT was trained with 180B tokens of data

The training data for OPT contains 180B tokens, corresponding to roughly 800 GB of data. It is a union of the datasets used in RoBERTa, the Pile, and the PushShift.io Reddit dataset.
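As a rough sanity check on those figures (our own back-of-envelope arithmetic, not official metaseq accounting), 800 GB over 180B tokens works out to about 4.4 bytes per token, which is plausible for byte-pair-encoded English text:

```python
# Back-of-envelope check of the corpus figures quoted above; the exact
# byte count is an assumption based on the ~800 GB figure.
tokens = 180e9          # 180B training tokens
corpus_bytes = 800e9    # ~800 GB of raw text
print(f"{corpus_bytes / tokens:.1f} bytes per token on average")  # ~4.4
```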


  • Introduction

  • Business Applications

  • Model Features

  • Model Tasks

  • Fine-tuning

  • Benchmarking

  • Sample Codes

  • Limitations

  • Other LLMs