LLMs Explained, Dolly

Databricks' Dolly is a commercially licensed large language model built on the Databricks machine learning platform and designed to follow instructions. Based on EleutherAI's pythia-12b, it is fine-tuned on roughly 15,000 instruction/response records known as databricks-dolly-15k.
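Dolly is tuned on prompts that wrap each raw instruction in a fixed template before generation. A minimal sketch of building such a prompt (the template wording mirrors Databricks' `instruct_pipeline` helper, but treat the exact strings here as an assumption):

```python
# Prompt template in the style of Dolly's instruction tuning
# (wording assumed, modeled on Databricks' instruct_pipeline).
INTRO = ("Below is an instruction that describes a task. "
         "Write a response that appropriately completes the request.")
INSTRUCTION_KEY = "### Instruction:"
RESPONSE_KEY = "### Response:"

def build_prompt(instruction: str) -> str:
    """Wrap a raw instruction in the generation template."""
    return f"{INTRO}\n\n{INSTRUCTION_KEY}\n{instruction}\n\n{RESPONSE_KEY}\n"

print(build_prompt("Summarize the plot of Moby-Dick in one sentence."))
```

The model's completion is then read as everything it generates after the `### Response:` marker.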



Training Details

Training data

Dolly is trained on databricks-dolly-15k, a corpus of roughly 15,000 instruction/response fine-tuning records generated by Databricks employees across the capability domains from the InstructGPT paper: brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization.
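Each databricks-dolly-15k record is a JSON object with `instruction`, `context`, `response`, and `category` fields, where `category` names one of the capability domains above. A small sketch of tallying records per category, using illustrative in-memory records rather than the real dataset:

```python
from collections import Counter

# Illustrative records shaped like databricks-dolly-15k entries
# (field names match the dataset; the content here is made up).
records = [
    {"instruction": "Name three prime numbers.", "context": "",
     "response": "2, 3, 5", "category": "brainstorming"},
    {"instruction": "Is a tomato a fruit?", "context": "",
     "response": "Botanically, yes.", "category": "open_qa"},
    {"instruction": "Summarize the passage.", "context": "...",
     "response": "...", "category": "summarization"},
    {"instruction": "List two primes above 10.", "context": "",
     "response": "11, 13", "category": "brainstorming"},
]

by_category = Counter(r["category"] for r in records)
print(by_category.most_common())
```

The same tally over the full dataset shows how the seven InstructGPT-style domains are distributed across the 15k records.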

Training Observations

Evaluation results indicate that dolly-v2-12b is not state of the art and, on some benchmarks, even falls short of dolly-v1-6b. The researchers hypothesize that this gap stems from the composition and size of the underlying fine-tuning datasets, but a conclusive explanation will require further investigation.