Business-friendly LLMs Leaderboard

LLMs are game-changers for businesses. They're not just tools, but powerful catalysts capable of supercharging operational efficiency and accelerating growth.
The Problem: There are hundreds of LLMs available in the market, but most don't allow commercial use. Only a few are business-friendly, and finding them among the vast options can be time-consuming and challenging.
Our Solution: We created this business-friendly LLM leaderboard to help entrepreneurs easily identify the right open-source LLM with commercial usage rights.

Leaders

As of 8th August, the top three models in our business-friendly LLM leaderboard are T5, GenZ, and UL2. Based on our scoring methodology, which is explained below, these models scored 72, 70, and 60 respectively.

The current leader is T5. With an MMLU score of 55.1, 11B parameters, and an Apache 2.0 license, it is the best fit for businesses that want to use an open-source LLM in a commercial project without usage restrictions and need to fine-tune the model for specific needs. GenZ ranks second. It achieved state-of-the-art (SOTA) results in its category on the MT Bench benchmark, with 87% accuracy compared to ChatGPT, on par with the LLaMA 2 70B chat model, which is 5X bigger and requires 40X more GPU memory.

Leaderboard

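Capability Score: calculated based on MMLU benchmark and general model performance.
Usability Score: calculated based on MMLU benchmark and general model performance.
Ease of Adoption: calculated based on adoption cost and computing power requirements.
Ag. Score: aggregate score of the model.

| Rank | Model | License | Capability Score | Usability Score | Ease of Adoption | Ag. Score |
|------|-------|---------|------------------|-----------------|------------------|-----------|
| #1 | T5 11B | Apache 2.0 | 55.1 | 8.9 | 89 | 72.05 |
| #2 | GenZ 13B | Apache 2.0 | 53.68 | 8.7 | 87 | 70.34 |
| #3 | UL2 20B | Apache 2.0 | 39.2 | 8 | 80 | 59.6 |
| #4 | Pythia 12B | Apache 2.0 | 26.76 | 8.8 | 88 | 57.38 |
| #5 | Open Assistant 12B | Apache 2.0 | 26.55 | 8.8 | 88 | 57.275 |
| #6 | Cerebras-GPT 13B | Apache 2.0 | 25.92 | 8.7 | 87 | 56.46 |
| #7 | LLaMA 2 13B | Custom | 54.8 | 8.7 | 60.9 | 57.85 |
| #8 | GPT NeoX 20B | Apache 2.0 | 29.92 | 8 | 80 | 54.96 |
| #9 | LLaMA 2 34B | Custom | 62.6 | 6.6 | 46.2 | 54.4 |
| #10 | Dolly 12B | MIT | 25.92 | 8.8 | 61.6 | 43.76 |
| #11 | LLaMA 2 70B | Custom | 68.9 | 3 | 21 | 44.95 |
| #12 | MPT-30B | CC BY-SA-3.0 | 47.93 | 7 | 28 | 37.965 |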
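A quick arithmetic check on the leaderboard rows: in the rows sampled here, the Ag. Score equals the mean of the first and third score values in the row (55.1 and 89 for T5 11B, for example). The sketch below verifies that observation for a handful of rows. It is an inference from the published numbers only, not a statement of the scoring formula the leaderboard actually uses, and the names `ROWS`, `mean_of_two`, and `rows_consistent` are hypothetical.

```python
import math

# (model, first score value, third score value, published Ag. Score),
# copied from a few leaderboard rows.
ROWS = [
    ("T5 11B", 55.1, 89, 72.05),
    ("GenZ 13B", 53.68, 87, 70.34),
    ("LLaMA 2 13B", 54.8, 60.9, 57.85),
    ("Dolly 12B", 25.92, 61.6, 43.76),
    ("MPT-30B", 47.93, 28, 37.965),
]


def mean_of_two(a: float, b: float) -> float:
    """Arithmetic mean of two values."""
    return (a + b) / 2


def rows_consistent(rows, tol: float = 1e-6) -> bool:
    """Return True if every row's Ag. Score matches the mean of its two inputs."""
    return all(
        math.isclose(mean_of_two(a, b), agg, abs_tol=tol)
        for _, a, b, agg in rows
    )


if __name__ == "__main__":
    print(rows_consistent(ROWS))
```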

Ranking Methodology

The models are ranked on three criteria: the model's capabilities; ease of adoption, which accounts for the cost of adopting the model; and usability, which accounts for the availability of commercial usage rights.

The capability score is calculated from the model's MMLU (Massive Multitask Language Understanding) benchmark score. A higher MMLU score indicates that the model understands language well across a wide range of tasks, which suggests a robust capability that can benefit business applications. The MMLU score is normalized to a scale of 10 to calculate the capability score.

Ease of adoption is calculated from the model size. The bigger the model, the more computing power it needs for fine-tuning, which raises the cost of adopting it, so bigger models receive lower ease-of-adoption scores. For the evaluation, we only considered models bigger than 10B parameters. The score is calculated by mapping the parameter count inversely onto a scale of 0 to 100 and then normalizing it to a scale of 0 to 10 for simplicity in calculations.
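The inverse mapping described above can be sketched as follows. The exact formula is not given in the text, so this is a minimal sketch under an assumption: the 0 to 100 value is taken as 100 minus the parameter count in billions, clamped to that range, and then divided by 10. That assumption happens to reproduce values that appear in the leaderboard (8.9 for an 11B model, 3 for a 70B model), but it should not be read as the leaderboard's confirmed method. The function name `ease_of_adoption` is hypothetical.

```python
def ease_of_adoption(params_billion: float) -> float:
    """Map model size (billions of parameters) to a 0-10 ease-of-adoption score.

    Assumption: the inverse 0-100 value is 100 - params_billion, clamped
    to [0, 100]; dividing by 10 then gives the 0-10 score.
    """
    inverse_0_to_100 = max(0.0, min(100.0, 100.0 - params_billion))
    return round(inverse_0_to_100 / 10.0, 2)


if __name__ == "__main__":
    for size in (11, 13, 20, 34, 70):
        print(f"{size}B -> {ease_of_adoption(size)}")
```

Under this assumption, any model at or above 100B parameters would score 0, so a different mapping would be needed to distinguish very large models.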

The usability score is calculated based on the degree of restrictions present in the model's license. Apache 2.0 has the highest score, 10, because it offers the most flexibility for business and commercial use cases. It is followed by MIT with a score of 7 and CC BY-SA-3.0 with a score of 4. Because LLaMA 2 uses a custom license with usage restrictions, its usability score is set to 7. The scores for each license are defined based on the relative freedom each offers businesses to launch commercial applications.
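The license scores listed above amount to a simple lookup table. A minimal sketch, using the four scores stated in the text; the names `LICENSE_USABILITY` and `usability_score` are hypothetical, and the "Custom" key is assumed to stand for the LLaMA 2 custom license as it is labelled in the leaderboard:

```python
# Usability scores per license, as stated in the methodology text.
LICENSE_USABILITY = {
    "Apache 2.0": 10,
    "MIT": 7,
    "CC BY-SA-3.0": 4,
    "Custom": 7,  # assumed to mean the LLaMA 2 custom license
}


def usability_score(license_name: str) -> int:
    """Return the usability score for a license, or raise if it is not listed."""
    try:
        return LICENSE_USABILITY[license_name]
    except KeyError:
        raise ValueError(
            f"No usability score defined for license: {license_name!r}"
        ) from None


if __name__ == "__main__":
    print(usability_score("Apache 2.0"))
```

Raising an error for unlisted licenses, rather than returning a default, avoids silently scoring a license the methodology never assessed.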

The aggregate score of the model is calculated by averaging the capability, ease-of-adoption, and usability scores. The aggregate score directly determines the model's rank.
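As a literal illustration of the averaging step described above, the sketch below takes three component scores on a common 0 to 10 scale and returns their mean. Two assumptions are made: "normalized to 10" is read as dividing the MMLU percentage by 10, and all three components are weighted equally. The Ag. Score column in the leaderboard is reported on a larger scale, so the output here is not expected to match those figures. The function names are hypothetical.

```python
from statistics import mean


def capability_score(mmlu_percent: float) -> float:
    """Assumption: normalizing to 10 means dividing the MMLU percentage by 10."""
    return mmlu_percent / 10.0


def aggregate_score(capability: float, ease: float, usability: float) -> float:
    """Equal-weight average of three component scores on a 0-10 scale."""
    return mean([capability, ease, usability])


if __name__ == "__main__":
    # Illustrative inputs: MMLU 55.1, ease of adoption 8.9, Apache 2.0 usability 10.
    score = aggregate_score(capability_score(55.1), 8.9, 10)
    print(round(score, 2))
```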

Generative AI Adoption Framework

This whitepaper will explore generative AI and identify business growth opportunities it offers. We aim to provide business owners with a comprehensive guide to using AI to unlock new opportunities and achieve sustainable growth. We will explore how generative AI can be used to analyze data and identify patterns, as well as how it can be used to generate new ideas and solutions.

Frequently Asked Questions

Here are some of the most common questions we get asked. If you have a question that isn't on this list, please don't hesitate to contact us. We're always happy to help! We'll get back to you within 24 hours.