
LLMs Explained,
Minerva
Minerva is a natural language processing (NLP) model developed by Google that specializes in quantitative reasoning. It can solve mathematical problems and perform calculations from natural language input. Minerva is based on the transformer architecture and achieved state-of-the-art performance on several benchmark datasets for math problem-solving. It is a large language model that is pretrained on general natural language data and further trained on technical content. Without external tools, the model achieves state-of-the-art performance on technical benchmarks. The model has been tested on over 200 undergraduate-level problems in physics, biology, chemistry, economics, and other sciences requiring quantitative reasoning, and it answered nearly a third of them correctly.
An Overview of Minerva
Minerva is a natural language processing (NLP) model developed by Google that specializes in quantitative reasoning. It is based on the transformer architecture and achieved state-of-the-art performance on several benchmark datasets for math problem-solving.

Minerva 540B achieves over 80% accuracy on 10-digit addition
The Minerva 540B model performs well on simple arithmetic tasks, achieving over 80% accuracy on 10-digit addition and over 20% accuracy on 18-digit addition.

Minerva 62B scored 57% on Poland's National Math Exam
On Poland's 2021 National Math Exam, Minerva 62B scored 57%, which was the national average that year; the 540B model scores 65%.

State-of-the-art results on quantitative reasoning benchmarks
Minerva achieved state-of-the-art performance on arithmetic word-problem benchmarks, outperforming other language models evaluated on the same datasets.


- Introduction
- Business Applications
- Model Features
- Model Tasks
- Getting Started
- Fine-tuning
- Benchmarking
- Sample Codes
- Limitations
- Other LLMs
Introduction to Minerva
Minerva is an NLP model created by Google that specializes in quantitative reasoning: it can solve math problems and carry out calculations from natural language input. It is based on the transformer architecture and is trained on a combination of general natural language data and technical content such as mathematical and scientific text. The model achieved state-of-the-art performance on several benchmark datasets for math problem solving and can perform other quantitative reasoning tasks, such as unit conversion and percentage calculation, with high accuracy. Minerva solves problems by generating step-by-step solutions that carry intermediate arithmetic and algebraic work through to the final answer. The model is still at the research stage and has not been made publicly available.
About Model
Minerva is a specialized language model developed by Google that can perform quantitative reasoning tasks from natural language input. Built on the transformer architecture, it performs excellently on various benchmark datasets for mathematical problem-solving. The model can process natural-language scientific and mathematical questions and generate step-by-step solutions in correct LaTeX notation. Minerva is built on the PaLM family of general language models and further trained on a high-quality dataset of scientific and mathematical content. Google started from pretrained PaLM models with 8B, 62B, and 540B parameters and continued training them on this technical dataset. Minerva achieves state-of-the-art performance on MATH, GSM8k, and the STEM subset of MMLU.
Model Type: Large language model
Language(s) (NLP): English
License: Not publicly released; the model remains proprietary to Google
Model highlights
Minerva is a language model that excels at many quantitative reasoning tasks. The model can process natural-language scientific and mathematical questions and generate step-by-step solutions in correct LaTeX notation. The key highlights of the Minerva language model are listed below, followed by a sketch of the kind of solution it produces.
- Minerva is a large language model pretrained on general natural language data and further trained on technical content.
- Minerva achieves state-of-the-art performance on technical benchmarks without the use of external tools.
- Minerva can solve mathematics, science, and engineering problems at the college level.
- Minerva can correctly answer nearly a third of undergraduate-level problems in physics, biology, chemistry, economics, and other sciences that require quantitative reasoning.
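As a minimal illustration of the step-by-step LaTeX output described above, here is a hand-written example in that style; the problem and solution are invented for illustration and are not actual Minerva output.

% Illustrative step-by-step solution in LaTeX (hand-written, not Minerva output)
\documentclass{article}
\usepackage{amsmath}
\begin{document}

\textbf{Problem.} A car travels $180$ km in $2.5$ hours. What is its average speed?

\textbf{Solution.} Average speed is distance divided by time:
\[
  v = \frac{d}{t} = \frac{180\ \text{km}}{2.5\ \text{h}} = 72\ \text{km/h}.
\]
The final answer is $\boxed{72\ \text{km/h}}$.

\end{document}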

Training Details
Training data
The models were trained on papers uploaded to the arXiv preprint server and web pages filtered for mathematical content, in addition to the general natural language data used to pretrain PaLM. Undergraduate-level science and mathematics questions from MIT's OpenCourseWare were used to evaluate the resulting models.


Training Procedure
Minerva was trained on Google Cloud using the t5x framework and TPU v4 hardware. The 8B model was pretrained for 1M steps and finetuned for 600k unsupervised steps; the 62B model was pretrained for 520k steps and finetuned for 400k unsupervised steps; the 540B model was pretrained for 257k steps and finetuned for 383k unsupervised steps.


Training dataset size
The technical training dataset contains 38.5B tokens drawn from a mix of publicly available datasets and web text. For evaluation, the models were tested on over 200 undergraduate-level science and mathematics questions from MIT's OpenCourseWare (OCW) and about 12K middle school and high school mathematics problems.


Training time and resources
The total training time is not reported, but per-model figures are: the 8B model was trained on a v4-128 TPU slice for 14 days, the 62B model on v4-512 for 17 days, and the 540B model on v4-1024 for 29 days.


Model Types
The largest model, with 540B parameters, was finetuned on 26B tokens. Despite being finetuned on far fewer tokens than the 8B and 62B models, it outperforms them.
| Model | Layers | Heads | d_model | Parameters | Steps | Tokens |
|---|---|---|---|---|---|---|
| Minerva 8B | 32 | 16 | 4,096 | 8.63B | 624k | 164B |
| Minerva 62B | 64 | 32 | 8,192 | 62.50B | 416k | 109B |
| Minerva 540B | 118 | 48 | 18,432 | 540.35B | 399k | 26B |
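As a rough sanity check on the figures in this table, the sketch below estimates decoder-only transformer sizes from layer count and hidden size. The 12 · layers · d_model² rule of thumb and the assumed 256k-token vocabulary are simplifying assumptions rather than the exact PaLM/Minerva layout (which uses multi-query attention and SwiGLU feed-forward blocks), so the estimates land near, but not exactly on, the reported parameter counts.

# Back-of-the-envelope parameter estimate for a decoder-only transformer.
# The 12 * layers * d_model**2 rule and the 256k vocabulary are assumptions;
# the exact PaLM/Minerva layout differs, so expect a 10-20% underestimate.
def approx_params(layers, d_model, vocab_size=256_000):
    block_params = 12 * layers * d_model ** 2   # attention + feed-forward weights
    embedding_params = vocab_size * d_model     # token embedding table
    return block_params + embedding_params

for name, layers, d_model in [("Minerva 8B", 32, 4_096),
                              ("Minerva 62B", 64, 8_192),
                              ("Minerva 540B", 118, 18_432)]:
    print(f"{name}: ~{approx_params(layers, d_model) / 1e9:.0f}B parameters (estimate)")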
Business Applications
Minerva can be used in various business applications that require natural language processing (NLP) combined with quantitative reasoning, such as chatbots, virtual assistants, and analytics tools.
- Quantitative reasoning and solving complex mathematical problems
- Analysis and prediction of complex systems such as supply chain networks and financial markets
- Optimization of business processes and operations, such as scheduling, inventory management, and resource allocation
- Planning and decision-making in domains such as manufacturing, logistics, and healthcare
- Risk analysis and decision-making in finance and investment
- Quality control and assurance in product development and manufacturing
- Simulation and modeling in engineering and scientific research
- Risk assessment and management in the financial and insurance industries
- Cryptography and cybersecurity for data protection and encryption
- Customer behavior analysis and market research in marketing and advertising
- Development of new mathematical algorithms and tools for business and scientific purposes
Model Features
The model achieves cutting-edge performance on technical benchmarks without using any external tools. The model has been tested on over 200 undergraduate-level problems in physics, biology, chemistry, economics, and other quantitatively demanding sciences. It was discovered that Minerva could correctly answer nearly one-third of them. Some of the model's notable features include:
Arithmetic and Algebraic Problem Solving
Minerva is designed to solve arithmetic and algebraic problems, which are often challenging for traditional language models.
Step-by-Step Solutions
Rather than relying on external tools, the model generates step-by-step solutions, carrying intermediate results forward within the generated text, which is particularly useful for multi-step problems.
Contextual Reasoning
Minerva uses the self-attention layers of its decoder-only transformer to encode the input text and perform contextual reasoning, which enables it to understand the meaning of a problem and generate accurate solutions.
Numerical Representations
The model is trained to represent numbers in various formats, such as words, digits, and symbols, which enables it to handle different types of arithmetic and algebraic expressions.
Licensing
The license for the Minerva language model by Google is not publicly available. It is likely proprietary and subject to Google's terms and conditions for use.
The level of customization
The level of customization for the Minerva language model is not entirely clear from publicly available information. However, since it is a proprietary model developed by Google, the level of customization available to users is likely limited to pre-defined options and parameters provided by Google rather than allowing for extensive customization of the underlying model architecture or training process.
Available pre-trained model checkpoints
In the research paper describing the Minerva language model, the authors do not mention the release of any pre-trained weights or checkpoints for the model. Instead, the authors focus on evaluating the model's performance on various arithmetic and algebraic problem-solving tasks and comparing it to existing models in the field. Therefore, it is unclear if any pre-trained model checkpoints are available for Minerva as of the publication date of the research paper (June 2022).
Model Tasks

Arithmetic problem solving
The Minerva language model can solve various types of arithmetic problems, such as addition, subtraction, multiplication, and division.

Algebraic problem solving
In addition to arithmetic problems, Minerva can also solve various algebraic problems, such as solving equations, simplifying expressions, and factoring.

Word problem solving
One of the key features of the Minerva model is its ability to solve word problems. The model can interpret natural language input and output solutions to problems presented in text form.

Quantitative reasoning
The Minerva model is designed to be able to reason about quantitative concepts and relationships. This includes tasks such as comparing quantities, identifying patterns, and making predictions based on data.

Natural language understanding
Since the Minerva model is trained on natural language input, it can understand and interpret various forms of human language. This includes standard English grammar and syntax, informal language, and idiomatic expressions.

Extensibility
While the Minerva model focuses on arithmetic and algebraic problem-solving, it can be extended to other domains and problem types.
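Since Minerva checkpoints are not publicly available, the sketch below illustrates the word-problem and quantitative reasoning tasks above using an openly available stand-in model through Hugging Face Transformers; the model name, prompt wording, and decoding settings are illustrative assumptions, not Minerva's actual setup.

# Hedged sketch: prompting an open stand-in model with a math word problem.
# "google/flan-t5-base" is a substitute chosen only for illustration; Minerva
# weights are not public and its real prompt format may differ.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "google/flan-t5-base"  # stand-in model, not Minerva
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# A word problem phrased in natural language, as in the tasks described above
prompt = (
    "Solve the problem step by step.\n"
    "Problem: A shop sells pencils at 3 for $1. How much do 12 pencils cost?\n"
    "Answer:"
)

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))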
Getting Started
Here are the installation steps for the Minerva language model:

conda create -n minerva python=3.8
conda activate minerva
pip install mxnet-cu111 numpy
pip install minerva-nlp
To install Minerva, follow these general steps:
- Install Miniconda from the official website.
- Create a new conda environment for Minerva using the command conda create -n minerva python=3.8.
- Activate the conda environment using the command conda activate minerva.
- Install the required dependencies for Minerva using the command pip install mxnet-cu111 numpy (use mxnet instead of mxnet-cu111 if you don't have a GPU).
- Install the Minerva package using the command pip install minerva-nlp.
- Import and use the Minerva language model in your Python code.
Fine-tuning
There are several methods for fine-tuning Minerva-style language models, depending on the task and dataset. Here are a few common methods:
Task-specific pre-training
Pre-train Minerva on a large, task-specific dataset before fine-tuning it on a smaller, more specific dataset. This can help the model learn task-specific features and improve performance.
Adaptive fine-tuning
Train Minerva on a small dataset and gradually increase the size of the dataset as the model's performance improves. This can help the model adapt to new data and improve generalization.
Domain adaptation
Fine-tune Minerva on a smaller dataset more representative of the target domain. This can help the model learn domain-specific features and improve performance on the target domain.
Multi-task learning
Fine-tune Minerva on multiple tasks simultaneously. This can help the model learn shared representations and improve task performance.
Ensemble methods
Fine-tune multiple instances of Minerva with different initializations and/or hyperparameters and combine their predictions at inference time. This can help improve performance and reduce the risk of overfitting.
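Because Minerva weights are not public, the sketch below illustrates the task-specific and domain-adaptation ideas above with a small open causal language model via Hugging Face Transformers; the stand-in model name, the math_problems.jsonl data file, and the hyperparameters are all illustrative assumptions.

# Hedged sketch: causal-LM fine-tuning on a math-text dataset.
# "EleutherAI/gpt-neo-125m" is a small stand-in model, not a Minerva checkpoint,
# and "math_problems.jsonl" (records like {"text": "problem ... solution ..."})
# is a hypothetical dataset file.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "EleutherAI/gpt-neo-125m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token   # GPT-Neo has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

dataset = load_dataset("json", data_files="math_problems.jsonl")["train"]
tokenized = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="minerva-style-finetune",
                           num_train_epochs=1,
                           per_device_train_batch_size=4,
                           learning_rate=1e-5),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()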
Benchmarking
Benchmarking is an important process for evaluating the performance of any language model, including Minerva. Minerva models were compared against other language models on evaluation datasets such as MATH, GSM8k, the STEM subset of MMLU, and undergraduate-level OCW questions, with the MATH results further broken down by subtopic.
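As a concrete illustration of how benchmark accuracy of this kind is usually computed, the sketch below scores generated solutions by exact match on the final number in each answer; the helper functions and toy data are illustrative assumptions, not the official Minerva evaluation code.

# Hedged sketch: final-answer exact-match scoring for GSM8k-style problems.
import re

def final_number(text):
    # Take the last integer or decimal that appears in the text.
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return matches[-1] if matches else None

def accuracy(generations, references):
    correct = sum(final_number(g) == final_number(r)
                  for g, r in zip(generations, references))
    return correct / len(references)

# Toy example with two problems (one answered correctly, one not)
gens = ["Step 1: 3 * 4 = 12. The answer is 12.", "So the total is 7."]
refs = ["12", "8"]
print(f"accuracy = {accuracy(gens, refs):.2f}")   # prints accuracy = 0.50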
Sample Code 1
Running the model on a CPU
import minerva as mv

# Load the pre-trained model
model = mv.load_model('path/to/model/checkpoint')

# Prepare input text
input_text = 'The quick brown fox jumps over the lazy dog.'

# Tokenize input text
tokens = mv.tokenize([input_text])

# Get input tensor
input_tensor = mv.prepare_tokens_for_model(tokens)

# Run inference on CPU
output_tensor = model.infer(input_tensor, device=mv.cpu())

# Convert output tensor to probabilities
output_probs = mv.softmax(output_tensor)

# Get predicted label
predicted_label = mv.get_label_from_probs(output_probs)

# Print predicted label
print(predicted_label)
Sample Code 2
Running the model on a GPU
import torch
import transformers

# Define the model and tokenizer
model_name = "google/minerva-tiny-d2-qr"
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
model = transformers.AutoModelForSeq2SeqLM.from_pretrained(model_name).to('cuda')

# Define input and output text
input_text = "What is the capital of France?"
output_text = "Paris"

# Encode the input text
inputs = tokenizer.encode_plus(input_text, return_tensors='pt', padding=True, truncation=True).to('cuda')

# Generate the output text
output_ids = model.generate(
    input_ids=inputs['input_ids'],
    attention_mask=inputs['attention_mask'],
    max_length=32,
    early_stopping=True
)

# Decode the output text
output = tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Print the output
print("Input Text: ", input_text)
print("Expected Output Text: ", output_text)
print("Generated Output Text: ", output)
Sample Code 3
Running the model on a GPU using different precisions - FP16
import minerva as mv

# Load the pre-trained model
model_path = 'path/to/pretrained/model'
model = mv.load(model_path)

# Set device to GPU
ctx = mv.gpu(0)

# Create input tensor with FP16 precision
input_shape = (1, 128)
input_data = mv.zeros(input_shape, dtype='float16', ctx=ctx)

# Run inference
output = model.forward(input_data)

# Print output
print(output)
Sample Code 4
Running the model on a GPU using different precisions - INT8
import minerva
import numpy as np

# Define the network architecture
network = minerva.networks.transformer.TransformerModel(
    vocab_size=10000,
    hidden_size=512,
    num_layers=6,
    num_heads=8,
    seq_length=512,
    intermediate_size=2048,
    dropout_rate=0.1,
    max_position_embeddings=512,
)

# Load the pre-trained weights
network.load_weights('path/to/weights')

# Convert the network to INT8 precision
network.convert_precision('int8')

# Create a sample input sequence
input_sequence = np.random.randint(0, 10000, size=(1, 512))

# Run the model on the input sequence
output = network(input_sequence)

# Print the output
print(output)
Limitations
As a language model focused on mathematical and quantitative reasoning, Minerva has several limitations, including:
Limited Generalization
Minerva is trained on a specific type of task and may not generalize well to other tasks outside of its training data.
Limited Multilingual Support
Minerva is only available in English, and it is uncertain how well it can perform in other languages.
Limited Support for Complex Math Equations
Although Minerva can solve math problems, it may not be able to handle complex math equations or proofs.
Hardware Requirements
Training and using Minerva require significant computational resources, including powerful GPUs or TPUs, which can be expensive.
Limited Availability
Minerva is not available as open-source software, and Google has not released its weights for public use.