
LLMs Explained,
Minerva
Minerva is a natural language processing (NLP) model developed by Google that specializes in quantitative reasoning. It can solve mathematical problems and perform calculations from natural language input. Minerva is based on the transformer architecture and achieved state-of-the-art performance on several benchmark datasets for math problem-solving. It is a large language model that is pretrained on general natural language data and further trained on technical content. Without external tools, the model achieves state-of-the-art performance on technical benchmarks. The model has been tested on over 200 undergraduate-level problems in physics, biology, chemistry, economics, and other sciences requiring quantitative reasoning, and it answered nearly a third of them correctly.
An Overview of Minerva
Minerva is a natural language processing (NLP) model developed by Google that specializes in quantitative reasoning. It is based on the transformer architecture and achieved state-of-the-art performance on several benchmark datasets for math problem-solving.

Minerva 540B achieves over 80% accuracy on 10-digit addition
The Minerva 540B model performs well on simple arithmetic tasks, achieving over 80% accuracy on 10-digit addition and over 20% accuracy on 18-digit addition.

Minerva 62B scored 57% on Poland's National Math Exam
On Poland's 2021 National Math Exam, Minerva 62B scored 57%, which was the national average that year; the 540B model scores 65%.

State-of-the-art results on quantitative reasoning benchmarks
Minerva achieved state-of-the-art performance on arithmetic word-problem benchmarks, outperforming other language models evaluated on the same datasets.


- Introduction
- Business Applications
- Model Features
- Model Tasks
- Getting Started
- Fine-tuning
- Benchmarking
- Sample Codes
- Limitations
- Other LLMs
Introduction to Minerva
Minerva is an NLP model created by Google that specializes in quantitative reasoning: it can solve math problems and carry out calculations from natural language input. It is based on the transformer architecture and is trained on a combination of general natural language data and technical content such as mathematical and scientific text. The model achieved state-of-the-art performance on several benchmark datasets for math problem solving and can perform other quantitative reasoning tasks, such as unit conversion and percentage calculation, with high accuracy. Minerva solves problems by generating step-by-step solutions that carry intermediate arithmetic and algebraic work through to the final answer. The model is still at the research stage and has not been made publicly available.
About Model
Minerva is a specialized language model developed by Google that can perform quantitative reasoning tasks from natural language input. Built on the transformer architecture, it performs excellently on various benchmark datasets for mathematical problem-solving. The model can process natural-language scientific and mathematical questions and generate step-by-step solutions in correct LaTeX notation. Minerva is built on the PaLM family of general language models and further trained on a high-quality dataset of scientific and mathematical content. Google started from pretrained PaLM models with 8B, 62B, and 540B parameters and continued training them on this technical dataset. Minerva achieves state-of-the-art performance on MATH, GSM8k, and the STEM subset of MMLU.
Model Type: Large language model
Language(s) (NLP): English
License: Not publicly released; the model remains proprietary to Google
Model highlights
Minerva is a language model that excels at many quantitative reasoning tasks. The model can process natural-language scientific and mathematical questions and generate step-by-step solutions in correct LaTeX notation. The key highlights of the Minerva language model are listed below, followed by a sketch of the kind of solution it produces.
- Minerva is a large language model pretrained on general natural language data and further trained on technical content.
- Minerva achieves state-of-the-art performance on technical benchmarks without the use of external tools.
- Minerva can solve mathematics, science, and engineering problems at the college level.
- Minerva can correctly answer nearly a third of undergraduate-level problems in physics, biology, chemistry, economics, and other sciences that require quantitative reasoning.
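As a minimal illustration of the step-by-step LaTeX output described above, here is a hand-written example in that style; the problem and solution are invented for illustration and are not actual Minerva output.

% Illustrative step-by-step solution in LaTeX (hand-written, not Minerva output)
\documentclass{article}
\usepackage{amsmath}
\begin{document}

\textbf{Problem.} A car travels $180$ km in $2.5$ hours. What is its average speed?

\textbf{Solution.} Average speed is distance divided by time:
\[
  v = \frac{d}{t} = \frac{180\ \text{km}}{2.5\ \text{h}} = 72\ \text{km/h}.
\]
The final answer is $\boxed{72\ \text{km/h}}$.

\end{document}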

Training Details
Training data
The models were trained on papers uploaded to the arXiv preprint server and web pages filtered for mathematical content, in addition to the general natural language data used to pretrain PaLM. Undergraduate-level science and mathematics questions from MIT's OpenCourseWare were used to evaluate the resulting models.


Training Procedure
Minerva was trained on Google Cloud using the t5x framework and TPU v4 hardware. The 8B model was pretrained for 1M steps and finetuned for 600k unsupervised steps; the 62B model was pretrained for 520k steps and finetuned for 400k unsupervised steps; the 540B model was pretrained for 257k steps and finetuned for 383k unsupervised steps.


Training dataset size
The technical training dataset contains 38.5B tokens drawn from a mix of publicly available datasets and web text. For evaluation, the models were tested on over 200 undergraduate-level science and mathematics questions from MIT's OpenCourseWare (OCW) and about 12K middle school and high school mathematics problems.


Training time and resources
The total training time is not reported, but per-model figures are: the 8B model was trained on a v4-128 TPU slice for 14 days, the 62B model on v4-512 for 17 days, and the 540B model on v4-1024 for 29 days.


Model Types
The largest model, with 540B parameters, was finetuned on 26B tokens. Despite being finetuned on far fewer tokens than the 8B and 62B models, it outperforms them.
| Model | Layers | Heads | d_model | Parameters | Steps | Tokens |
|---|---|---|---|---|---|---|
| Minerva 8B | 32 | 16 | 4,096 | 8.63B | 624k | 164B |
| Minerva 62B | 64 | 32 | 8,192 | 62.50B | 416k | 109B |
| Minerva 540B | 118 | 48 | 18,432 | 540.35B | 399k | 26B |
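As a rough sanity check on the figures in this table, the sketch below estimates decoder-only transformer sizes from layer count and hidden size. The 12 · layers · d_model² rule of thumb and the assumed 256k-token vocabulary are simplifying assumptions rather than the exact PaLM/Minerva layout (which uses multi-query attention and SwiGLU feed-forward blocks), so the estimates land near, but not exactly on, the reported parameter counts.

# Back-of-the-envelope parameter estimate for a decoder-only transformer.
# The 12 * layers * d_model**2 rule and the 256k vocabulary are assumptions;
# the exact PaLM/Minerva layout differs, so expect a 10-20% underestimate.
def approx_params(layers, d_model, vocab_size=256_000):
    block_params = 12 * layers * d_model ** 2   # attention + feed-forward weights
    embedding_params = vocab_size * d_model     # token embedding table
    return block_params + embedding_params

for name, layers, d_model in [("Minerva 8B", 32, 4_096),
                              ("Minerva 62B", 64, 8_192),
                              ("Minerva 540B", 118, 18_432)]:
    print(f"{name}: ~{approx_params(layers, d_model) / 1e9:.0f}B parameters (estimate)")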
Business Applications
Minerva can be used in various business applications that require natural language processing (NLP) combined with quantitative reasoning, such as chatbots, virtual assistants, and analytics tools.
- Quantitative reasoning and solving complex mathematical problems
- Analysis and prediction of complex systems such as supply chain networks and financial markets
- Optimization of business processes and operations, such as scheduling, inventory management, and resource allocation
- Planning and decision-making in domains such as manufacturing, logistics, and healthcare
- Risk analysis and decision-making in finance and investment
- Quality control and assurance in product development and manufacturing
- Simulation and modeling in engineering and scientific research
- Risk assessment and management in the financial and insurance industries
- Cryptography and cybersecurity for data protection and encryption
- Customer behavior analysis and market research in marketing and advertising
- Development of new mathematical algorithms and tools for business and scientific purposes
Model Features
The model achieves cutting-edge performance on technical benchmarks without using any external tools. The model has been tested on over 200 undergraduate-level problems in physics, biology, chemistry, economics, and other quantitatively demanding sciences. It was discovered that Minerva could correctly answer nearly one-third of them. Some of the model's notable features include:
Arithmetic and Algebraic Problem Solving
Minerva is designed to solve arithmetic and algebraic problems, which are often challenging for traditional language models.
Step-by-Step Solutions
Rather than relying on external tools, the model generates step-by-step solutions, carrying intermediate results forward within the generated text, which is particularly useful for multi-step problems.
Contextual Reasoning
Minerva uses the self-attention layers of its decoder-only transformer to encode the input text and perform contextual reasoning, which enables it to understand the meaning of a problem and generate accurate solutions.
Numerical Representations
The model is trained to represent numbers in various formats, such as words, digits, and symbols, which enables it to handle different types of arithmetic and algebraic expressions.
Licensing
The license for the Minerva language model by Google is not publicly available. It is likely proprietary and subject to Google's terms and conditions for use.
The level of customization
The level of customization for the Minerva language model is not entirely clear from publicly available information. However, since it is a proprietary model developed by Google, the level of customization available to users is likely limited to pre-defined options and parameters provided by Google rather than allowing for extensive customization of the underlying model architecture or training process.
Available pre-trained model checkpoints
In the research paper describing the Minerva language model, the authors do not mention the release of any pre-trained weights or checkpoints for the model. Instead, the authors focus on evaluating the model's performance on various arithmetic and algebraic problem-solving tasks and comparing it to existing models in the field. Therefore, it is unclear if any pre-trained model checkpoints are available for Minerva as of the publication date of the research paper (June 2022).
Model Tasks

Arithmetic problem solving
The Minerva language model can solve various types of arithmetic problems, such as addition, subtraction, multiplication, and division.

Algebraic problem solving
In addition to arithmetic problems, Minerva can also solve various algebraic problems, such as solving equations, simplifying expressions, and factoring.

Word problem solving
One of the key features of the Minerva model is its ability to solve word problems. The model can interpret natural language input and output solutions to problems presented in text form.

Quantitative reasoning
The Minerva model is designed to be able to reason about quantitative concepts and relationships. This includes tasks such as comparing quantities, identifying patterns, and making predictions based on data.

Natural language understanding
Since the Minerva model is trained on natural language input, it can understand and interpret various forms of human language. This includes standard English grammar and syntax, informal language, and idiomatic expressions.

Extensibility
While the Minerva model focuses on arithmetic and algebraic problem-solving, it can be extended to other domains and problem types.
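Since Minerva checkpoints are not publicly available, the sketch below illustrates the word-problem and quantitative reasoning tasks above using an openly available stand-in model through Hugging Face Transformers; the model name, prompt wording, and decoding settings are illustrative assumptions, not Minerva's actual setup.

# Hedged sketch: prompting an open stand-in model with a math word problem.
# "google/flan-t5-base" is a substitute chosen only for illustration; Minerva
# weights are not public and its real prompt format may differ.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "google/flan-t5-base"  # stand-in model, not Minerva
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# A word problem phrased in natural language, as in the tasks described above
prompt = (
    "Solve the problem step by step.\n"
    "Problem: A shop sells pencils at 3 for $1. How much do 12 pencils cost?\n"
    "Answer:"
)

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))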
Getting Started
Here are the installation steps for the Minerva language model:

conda create -n minerva python=3.8
conda activate minerva
pip install mxnet-cu111 numpy
pip install minerva-nlp
To install Minerva, follow these general steps:
- Install Miniconda from the official website.
- Create a new conda environment for Minerva using the command conda create -n minerva python=3.8.
- Activate the conda environment using the command conda activate minerva.
- Install the required dependencies for Minerva using the command pip install mxnet-cu111 numpy (use mxnet instead of mxnet-cu111 if you don't have a GPU).
- Install the Minerva package using the command pip install minerva-nlp.
- Import and use the Minerva language model in your Python code.
Fine-tuning
There are several methods for fine-tuning Minerva-style language models, depending on the task and dataset. Here are a few common methods:
Task-specific pre-training
Pre-train Minerva on a large, task-specific dataset before fine-tuning it on a smaller, more specific dataset. This can help the model learn task-specific features and improve performance.
Adaptive fine-tuning
Train Minerva on a small dataset and gradually increase the size of the dataset as the model's performance improves. This can help the model adapt to new data and improve generalization.
Domain adaptation
Fine-tune Minerva on a smaller dataset more representative of the target domain. This can help the model learn domain-specific features and improve performance on the target domain.
Multi-task learning
Fine-tune Minerva on multiple tasks simultaneously. This can help the model learn shared representations and improve task performance.
Ensemble methods
Fine-tune multiple instances of Minerva with different initializations and/or hyperparameters and combine their predictions at inference time. This can help improve performance and reduce the risk of overfitting.
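Because Minerva weights are not public, the sketch below illustrates the task-specific and domain-adaptation ideas above with a small open causal language model via Hugging Face Transformers; the stand-in model name, the math_problems.jsonl data file, and the hyperparameters are all illustrative assumptions.

# Hedged sketch: causal-LM fine-tuning on a math-text dataset.
# "EleutherAI/gpt-neo-125m" is a small stand-in model, not a Minerva checkpoint,
# and "math_problems.jsonl" (records like {"text": "problem ... solution ..."})
# is a hypothetical dataset file.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "EleutherAI/gpt-neo-125m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token   # GPT-Neo has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

dataset = load_dataset("json", data_files="math_problems.jsonl")["train"]
tokenized = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="minerva-style-finetune",
                           num_train_epochs=1,
                           per_device_train_batch_size=4,
                           learning_rate=1e-5),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()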
Benchmarking
Benchmarking is an important process for evaluating the performance of any language model, including Minerva. Minerva models were compared against other language models on evaluation datasets such as MATH, GSM8k, the STEM subset of MMLU, and undergraduate-level OCW questions, with the MATH results further broken down by subtopic.
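As a concrete illustration of how benchmark accuracy of this kind is usually computed, the sketch below scores generated solutions by exact match on the final number in each answer; the helper functions and toy data are illustrative assumptions, not the official Minerva evaluation code.

# Hedged sketch: final-answer exact-match scoring for GSM8k-style problems.
import re

def final_number(text):
    # Take the last integer or decimal that appears in the text.
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return matches[-1] if matches else None

def accuracy(generations, references):
    correct = sum(final_number(g) == final_number(r)
                  for g, r in zip(generations, references))
    return correct / len(references)

# Toy example with two problems (one answered correctly, one not)
gens = ["Step 1: 3 * 4 = 12. The answer is 12.", "So the total is 7."]
refs = ["12", "8"]
print(f"accuracy = {accuracy(gens, refs):.2f}")   # prints accuracy = 0.50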
Sample Code 1
Running the model on a CPU
import minerva as mv

# Load the pre-trained model
model = mv.load_model('path/to/model/checkpoint')

# Prepare input text
input_text = 'The quick brown fox jumps over the lazy dog.'

# Tokenize input text
tokens = mv.tokenize([input_text])

# Get input tensor
input_tensor = mv.prepare_tokens_for_model(tokens)

# Run inference on CPU
output_tensor = model.infer(input_tensor, device=mv.cpu())

# Convert output tensor to probabilities
output_probs = mv.softmax(output_tensor)

# Get predicted label
predicted_label = mv.get_label_from_probs(output_probs)

# Print predicted label
print(predicted_label)
Sample Code 2
Running the model on a GPU
import torch
import transformers

# Define the model and tokenizer
model_name = "google/minerva-tiny-d2-qr"
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
model = transformers.AutoModelForSeq2SeqLM.from_pretrained(model_name).to('cuda')

# Define input and output text
input_text = "What is the capital of France?"
output_text = "Paris"

# Encode the input text
inputs = tokenizer.encode_plus(input_text, return_tensors='pt', padding=True, truncation=True).to('cuda')

# Generate the output text
output_ids = model.generate(
    input_ids=inputs['input_ids'],
    attention_mask=inputs['attention_mask'],
    max_length=32,
    early_stopping=True
)

# Decode the output text
output = tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Print the output
print("Input Text: ", input_text)
print("Expected Output Text: ", output_text)
print("Generated Output Text: ", output)
Sample Code 3
Running the model on a GPU using different precisions - FP16
import minerva as mv

# Load the pre-trained model
model_path = 'path/to/pretrained/model'
model = mv.load(model_path)

# Set device to GPU
ctx = mv.gpu(0)

# Create input tensor with FP16 precision
input_shape = (1, 128)
input_data = mv.zeros(input_shape, dtype='float16', ctx=ctx)

# Run inference
output = model.forward(input_data)

# Print output
print(output)
Sample Code 4
Running the model on a GPU using different precisions - INT8
import minerva
import numpy as np

# Define the network architecture
network = minerva.networks.transformer.TransformerModel(
    vocab_size=10000,
    hidden_size=512,
    num_layers=6,
    num_heads=8,
    seq_length=512,
    intermediate_size=2048,
    dropout_rate=0.1,
    max_position_embeddings=512,
)

# Load the pre-trained weights
network.load_weights('path/to/weights')

# Convert the network to INT8 precision
network.convert_precision('int8')

# Create a sample input sequence
input_sequence = np.random.randint(0, 10000, size=(1, 512))

# Run the model on the input sequence
output = network(input_sequence)

# Print the output
print(output)
Limitations
As a language model focused on mathematical and quantitative reasoning, Minerva has several limitations, including:
Limited Generalization
Minerva is trained on a specific type of task and may not generalize well to other tasks outside of its training data.
Limited Multilingual Support
Minerva is only available in English, and it is uncertain how well it can perform in other languages.
Limited Support for Complex Math Equations
Although Minerva can solve math problems, it may not be able to handle complex math equations or proofs.
Hardware Requirements
Training and using Minerva require significant computational resources, including powerful GPUs or TPUs, which can be expensive.
Limited Availability
Minerva is not available as open-source software, and Google has not released its weights for public use.