An Overview of ChatGLM-6B
ChatGLM-6B was developed by the Knowledge Engineering Group (KEG) at Tsinghua University in China, together with Zhipu AI. The goal was to build a large language model that could be used for a variety of tasks, including machine translation, question answering, and chatbot development, while remaining efficient and easy to deploy on a range of devices. The model performs well across these tasks: for example, it has reportedly achieved state-of-the-art performance on the GLUE benchmark (a score of 89.4), a suite of natural language understanding tasks, and it has also proven effective for machine translation and chatbot development.
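As a rough sketch of how the model is used in practice for question answering and chat, the snippet below loads the publicly released checkpoint (the THUDM/chatglm-6b repository on the Hugging Face Hub) with the transformers library; the chat method comes from the checkpoint's bundled remote code and may vary between releases.

```python
from transformers import AutoTokenizer, AutoModel

# Load the released checkpoint; trust_remote_code is required because the
# modeling code ships with the checkpoint rather than with transformers itself.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
model = model.eval()

# Single-turn question answering.
response, history = model.chat(tokenizer, "What is ChatGLM-6B?", history=[])
print(response)

# Multi-turn chat: pass the returned history back in for follow-up questions.
response, history = model.chat(tokenizer, "Summarize that in one sentence.", history=history)
print(response)
```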
ChatGLM-6B also required far less compute to train than GPT-3: it was reportedly trained on a cluster of about 1,000 GPUs, compared with roughly 5,000 GPUs for GPT-3.
Large Size
With roughly 6.2 billion parameters, the model has the capacity to learn intricate patterns and relationships in language, which translates into strong performance across a wide range of tasks.
For example, it reportedly answers 95% of numerical queries correctly, compared with 85% for the GPT-3 family of models.
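If you want to verify the model's size yourself, one simple check (a sketch, assuming the same public checkpoint as above) is to sum the element counts of its weight tensors, which should come out to roughly 6.2 billion:

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)

# Total number of values across all weight tensors (~6.2e9 for ChatGLM-6B).
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params / 1e9:.1f}B parameters")
```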
Multilingual
The model is multilingual, trained mainly on Chinese and English text, so it can understand and generate text in both languages. This makes it valuable for applications that must serve users across languages, such as customer service chatbots and translation tools.
The model also produces more fluent answers to numerical queries than many other large language models.
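As a small illustration of the bilingual behavior, the same chat interface can be driven with English or Chinese prompts; this is a sketch that assumes the checkpoint loads as in the earlier example.

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda().eval()

# The same model handles prompts in either language.
reply_en, _ = model.chat(tokenizer, "Translate 'good morning' into Chinese.", history=[])
reply_zh, _ = model.chat(tokenizer, "请用一句话介绍什么是机器翻译。", history=[])
print(reply_en)
print(reply_zh)
```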
Efficient
The ChatGLM-6B model is exceptionally efficient in its use of GPU memory: with INT4 quantization it can run inference in roughly 6 GB of GPU memory, allowing it to be deployed on much smaller GPUs than other large language models.
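As a sketch of what efficient deployment can look like, the checkpoint's bundled remote code exposes a quantize() helper that converts the weights to INT4 or INT8 before moving the model to the GPU; the memory figures below are approximate and may change between releases.

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)

# quantize(4) converts the weights to INT4, bringing inference down to roughly
# 6 GB of GPU memory (INT8 needs about 8 GB, unquantized FP16 about 13 GB).
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).quantize(4).half().cuda()
model = model.eval()

response, _ = model.chat(tokenizer, "Hello! What can you do?", history=[])
print(response)
```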