llm.models.bert
Utilities for loading BERT models from HuggingFace.
Source: ColossalAI-Examples
BERT_BASE (module attribute)
BERT_BASE = dict(
    attention_probs_dropout_prob=0.1,
    hidden_act="gelu_new",
    hidden_dropout_prob=0.1,
    hidden_size=768,
    initializer_range=0.02,
    intermediate_size=3072,
    max_position_embeddings=512,
    num_attention_heads=12,
    num_hidden_layers=12,
    type_vocab_size=2,
    vocab_size=30522,
)
BERT-base HuggingFace configuration.
BERT_LARGE (module attribute)
BERT_LARGE = dict(
    attention_probs_dropout_prob=0.1,
    hidden_act="gelu_new",
    hidden_dropout_prob=0.1,
    hidden_size=1024,
    initializer_range=0.02,
    intermediate_size=4096,
    max_position_embeddings=512,
    num_attention_heads=16,
    num_hidden_layers=24,
    type_vocab_size=2,
    vocab_size=30522,
)
BERT-large HuggingFace configuration.
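Both dictionaries use keyword names that match `transformers.BertConfig`, so they can be unpacked straight into a configuration object. A minimal sketch, assuming only that the `transformers` package is installed:

```python
from transformers import BertConfig

# The dict keys mirror BertConfig keyword arguments, so unpacking works directly.
base_config = BertConfig(**BERT_BASE)
large_config = BertConfig(**BERT_LARGE)

print(base_config.hidden_size)         # 768
print(large_config.num_hidden_layers)  # 24
```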
from_config
from_config(
    config: dict[str, Any],
    checkpoint_gradients: bool = False,
) -> BertForPreTraining
Load a BERT model from the configuration.
Parameters:

- config (dict[str, Any]) – BERT configuration.
- checkpoint_gradients (bool, default: False) – Enable gradient checkpointing.

Returns:

- BertForPreTraining – BERT model.
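The function body is not shown on this page. The following is a minimal sketch of what a function with this signature plausibly does, assuming only the standard HuggingFace API (`BertConfig`, `BertForPreTraining`, and `gradient_checkpointing_enable`); the actual ColossalAI-Examples implementation may differ:

```python
from typing import Any

from transformers import BertConfig, BertForPreTraining


def from_config(
    config: dict[str, Any],
    checkpoint_gradients: bool = False,
) -> BertForPreTraining:
    """Load a BERT model from the configuration."""
    # Hypothetical sketch: build a randomly initialized model from the config dict.
    model = BertForPreTraining(BertConfig(**config))
    if checkpoint_gradients:
        # Trade compute for memory by re-running forward passes during backprop.
        model.gradient_checkpointing_enable()
    return model
```

Usage with one of the configurations above:

```python
model = from_config(BERT_BASE, checkpoint_gradients=True)
```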