Open-source Large Language Models Leaderboard
Large Language Models (LLMs) have revolutionized natural language processing and have shown impressive results in various language tasks.
The Problem: Many LLMs are now available, but relevant information about them is scattered across the internet, which makes the models difficult to compare and evaluate.
The Solution: We created this leaderboard to help researchers easily identify the best open-source LLM with an intuitive leadership quadrant graph. We evaluate open-source LLMs and rank them based on their capabilities and market adoption.
Leaders
Based on our scoring methodology, LLaMA2, LLaMA, and T5 lead the board with 89, 87, and 81 points, respectively. The scoring methodology is explained below. The current leader, LLaMA2, is a collection of pretrained and fine-tuned large language models (LLMs) that range in scale from 7 billion to 70 billion parameters. The fine-tuned models, called Llama 2-Chat, are optimized for dialogue applications and surpass most open-source chat models on the benchmarks they were tested on. In our ranking, the LLaMA2 70B model outperforms all other open-source models.
Leaderboard
Adoption Rating: calculated based on the number of forks and stars on the official model repo.
Capability Rating: calculated based on the number of tasks and downstream tasks of the model.
Score: a weighted average of the adoption and capability scores of the model.
Rank | Model | Size | Architecture | Organization | Adoption Rating | Capability Rating | Score |
#1 | LLaMA2 | 70B | Transformer, Autoregressive | Meta AI | 86 | 92 | 89 |
#2 | LLaMA | 65B | Transformer, Autoregressive | Meta AI | 85 | 89 | 87 |
#3 | T5 | 11B | Transformer | Google | 74 | 88 | 81 |
#4 | Galactica | 120B | Transformer | Meta AI | 47 | 58 | 53 |
#5 | LongT5 | 11B | Transformer | Google | 37 | 62 | 50 |
#6 | GLM | 130B | Transformer, Autoregressive | Tsinghua University | 48 | 50 | 49 |
#7 | OPT | 175B | Transformer | Meta AI | 49 | 41 | 45 |
#8 | BERT | 340M | Transformer | Google | 57 | 27 | 42 |
#9 | UL2 | 20B | Transformer | Google | 52 | 19 | 36 |
#10 | GPT-NeoX | 20B | Transformer, Autoregressive | EleutherAI | 42 | 33 | 38 |
#11 | Switch | 1.6T | Transformer | Google | 27 | 18 | 23 |
#12 | H3 | 2.7B | State Space Model | Stanford University | 27 | 21 | 24 |
#13 | FLAN-T5 | 11B | Transformer | Google | 25 | 24 | 25 |
#14 | Pythia | 12B | Decoder-only autoregressive | EleutherAI | 21 | 21 | 21 |
Ranking Methodology
We considered only prominent, open-source LLMs to create this leaderboard. Note that this leaderboard is only a high-level indicator of overall performance. Depending on the specific use case and business requirements, a detailed analysis is required to choose the right model. The key parameters we used for scoring are:
- Benchmark results
- Model forks
Capability Rating (CR) is calculated as a weighted sum of the benchmark results (BR) published in the model's research paper.
Rank weights:
For performance ranks #1 to #5, rank weight = 3.
For performance ranks #6 to #10, rank weight = 2.
For performance ranks #11 to #20, rank weight = 1.
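The rank weights above can be sketched as a small lookup. This is only an illustration: the function names, the zero weight for ranks beyond #20, and the choice to sum one weight per benchmark are assumptions, since the page does not state how individual benchmark results are combined or normalized.

```python
def rank_weight(rank: int) -> int:
    """Return the weight for a model's performance rank on one benchmark."""
    if rank < 1:
        raise ValueError("rank must be a positive integer")
    if rank <= 5:
        return 3
    if rank <= 10:
        return 2
    if rank <= 20:
        return 1
    return 0  # assumption: ranks beyond #20 contribute nothing


def raw_capability(benchmark_ranks: list[int]) -> int:
    """Sum the rank weights over a model's benchmarks (assumed aggregation)."""
    return sum(rank_weight(r) for r in benchmark_ranks)


# Hypothetical model ranked #2, #7, and #15 on three benchmarks.
print(raw_capability([2, 7, 15]))  # 3 + 2 + 1 = 6
```

A raw sum like this would still need to be scaled to the 0 to 100 range used in the leaderboard; that step is not described in the source and is left out here.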
Adoption Rating (AR) is calculated from model forks (MF), with that value penalized against model performance. To calculate it, we sum the normalized fork count and the capability score, then normalize the result to a 100-point scale. The model score is the average of the Adoption Rating and the Capability Rating.
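As a rough sketch of this arithmetic, the snippet below averages the two ratings and shows one possible reading of the adoption calculation. The function names are hypothetical, and scaling by the maximum value is an assumption; the page does not say which normalization is used. Rounding halves up is chosen because it reproduces the published scores (for example, 47 and 58 give 53).

```python
def model_score(adoption: int, capability: int) -> int:
    """Average the two ratings, rounding halves up."""
    return (adoption + capability + 1) // 2


def adoption_ratings(forks: dict[str, int], capability: dict[str, int]) -> dict[str, float]:
    """Assumed reading: scale forks to 0-100, add capability, rescale to 100."""
    max_forks = max(forks.values())
    combined = {
        name: 100 * forks[name] / max_forks + capability[name]
        for name in forks
    }
    top = max(combined.values())
    return {name: 100 * value / top for name, value in combined.items()}


# Ratings taken from the leaderboard rows for LLaMA2 and Galactica.
print(model_score(86, 92))  # 89
print(model_score(47, 58))  # 53
```

Python's built-in `round` rounds halves to the nearest even number, so `round(52.5)` gives 52 rather than 53; the integer arithmetic above avoids that.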
Frequently Asked Questions
Here are some of the most common questions we get asked. If you have a question that isn't on this list, please don't hesitate to contact us. We're always happy to help! We'll get back to you within 24 hours.
Some of the popular and widely used open-source large language models are:
- GLM (General Language Model) is a family of pretrained language models developed by Tsinghua University. Its largest open version, GLM-130B, is a 130-billion-parameter bilingual model trained on English and Chinese text. GLM can be used for both language understanding and text generation tasks.
- Galactica is a large language model (LLM) developed by Meta AI and Papers with Code. It is a 120-billion-parameter model trained on a large dataset of scientific papers, textbooks, and code. Galactica is aimed at scientific tasks such as summarizing literature and answering scientific questions.
- T5, or Text-to-Text Transfer Transformer, is a Transformer-based language model developed by Google. Its largest version has 11 billion parameters and was trained on C4, a large corpus of cleaned web text. T5 can be used for various natural language processing tasks, including text summarization, translation, and question answering.
- GPT-NeoX is a 20-billion-parameter autoregressive language model developed by EleutherAI. It is trained on the Pile, a large dataset of text and code, and generates human-like text in response to a wide range of prompts. For example, it can provide summaries of factual topics or create stories. It is a base model rather than a chatbot, so it is not tuned for conversation out of the box.
- OPT (Open Pre-trained Transformer) is a family of large language models (LLMs) developed by Meta AI. Its largest version has 175 billion parameters and was trained on a large dataset of mostly English text. OPT was released to give researchers open access to a model comparable in scale to GPT-3.
Based on our ranking, GLM scores higher than GPT-NeoX (49 versus 38) because it has higher adoption and capability ratings (48 and 50 versus 42 and 33). A higher score does not settle every use case, however. GLM is a 130B model while GPT-NeoX is a 20B model, so GPT-NeoX needs far less compute and is typically faster to run. Ultimately, the best way to choose a language model is to consider the specific task it must perform.
T5 and BERT are Transformer-based language models pretrained on large amounts of text. T5 (Text-to-Text Transfer Transformer) is an encoder-decoder model that casts every task as text-to-text: it takes text as input and generates text as output. This lets one model handle translation, summarization, and question answering, which makes T5 more flexible than BERT. BERT (Bidirectional Encoder Representations from Transformers) is an encoder-only model that reads the context on both sides of each word. It does not generate free-form text and is typically fine-tuned for a specific task, such as classification or extractive question answering. As such, BERT is well suited to tasks that depend on understanding the context of the text.
Overall, T5 and BERT are both powerful models that can be used for various tasks. T5 is more flexible, while BERT is a strong choice for understanding text context. Newer variants of both continue to appear; FLAN-T5 in the leaderboard is one example. So it is important to consider the specific task you want to perform when choosing a language model for a business use case.
Choosing the right large language model (LLM) for your needs can be daunting. There are many factors to consider, such as the model's size, the type of data it was trained on, and the specific tasks you need it to perform. Here are a few suggestions that can help you choose the right LLM:
- Consider the model size: Larger models tend to be more capable. However, they also require more computing resources to run.
- Think about the type of data the model was trained on: If you need the model to perform a task requiring knowledge of a particular domain, such as medicine or law, choose a model trained on data from that domain.
- Determine the specific tasks you need the model to perform: Some models are better at certain tasks than others. For example, some models are better at generating text, while others are better at understanding text.
- Consider the cost of the model: Some models are free, while others require a subscription or pay-per-use fee.
- Think about the privacy implications of using the model: Some models collect and store user data, which could be used for marketing or other purposes.
- Be aware of the model's limitations: No model is perfect, and all models have limitations. It is important to be aware of these limitations before using the model.
- Choose a model aligned with your values: Some models are trained on biased or harmful data, so choosing a model aligned with your values and beliefs is important.
Some of the applications of large language models are:
- Customer Service: LLMs can be used to create chatbots that answer customer questions and provide support 24/7. This frees up human customer service representatives to focus on more complex issues.
- Sales & Marketing: LLMs can generate personalized marketing content that is more likely to resonate with each customer. This can help businesses improve their conversion rates and generate more leads.
- Product Development: LLMs can be used to gather feedback from customers and identify new product opportunities. This can help businesses develop products that are more in line with the needs of their customers.
- Research: LLMs can analyze large amounts of text data and identify patterns and trends. This helps businesses stay ahead of the competition and develop new products and services.
- Finance: LLMs can be used to analyze financial data and identify risks and opportunities. This can help businesses make more informed investment decisions.
- Law: LLMs can be used to analyze legal documents and identify potential risks and liabilities. This can help businesses avoid costly legal disputes.
- Medicine: LLMs can analyze medical data and help identify candidate treatments. This can help businesses develop new drugs and therapies to improve patients' lives.
- Education: LLMs can personalize learning and provide students with more engaging instruction. This can help students learn more effectively and achieve their educational goals.