llm.utils
HParamT (module-attribute)
Supported hyperparameter types (i.e., JSON types).
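A rough sketch of the alias, assuming it covers the scalar JSON types accepted by TensorBoard hyperparameter logging (an assumption; the exact definition lives in llm/utils.py):

# Assumed alias, not copied from the source.
HParamT = bool | float | int | str | None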
create_summary_writer
create_summary_writer(
tensorboard_dir: str,
hparam_dict: dict[str, HParamT] | None = None,
metrics: list[str] | None = None,
**writer_kwargs: Any
) -> SummaryWriter
Create a SummaryWriter instance for the run annotated with hyperparams.
https://github.com/pytorch/pytorch/issues/37738#issuecomment-1124497827
Parameters:
- tensorboard_dir (str) – TensorBoard run directory.
- hparam_dict (dict[str, HParamT] | None, default: None) – Optional hyperparameter dictionary to log alongside metrics.
- metrics (list[str] | None, default: None) – Optional list of metric tags that will be used with writer.add_scalar() (e.g., ['train/loss', 'train/lr']). Must be provided if hparam_dict is provided.
- writer_kwargs (Any, default: {}) – Additional keyword arguments to pass to SummaryWriter.
Returns:
- SummaryWriter – Summary writer instance.
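A minimal usage sketch (the run directory, hyperparameter values, and metric tags are illustrative):

from llm.utils import create_summary_writer

# Metric tags declared here should match the tags later passed to
# writer.add_scalar() so the values appear on the TensorBoard HParams view.
writer = create_summary_writer(
    'runs/example',
    hparam_dict={'lr': 3e-4, 'global_batch_size': 512},
    metrics=['train/loss', 'train/lr'],
)
writer.add_scalar('train/loss', 2.31, global_step=0)
writer.close()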
get_filepaths
get_filepaths(
directory: Path | str,
extensions: list[str] | None = None,
recursive: bool = False,
) -> list[str]
Get list of filepaths in directory.
Note
Only files (not sub-directories) will be returned, though sub-directories will be recursed into if recursive=True.
Parameters:
- directory (Path | str) – Pathlike object with the directory to search.
- extensions (list[str] | None, default: None) – Optionally only return files that match these extensions. Each extension should include the dot, e.g., ['.pdf', '.txt']. Match is case sensitive.
- recursive (bool, default: False) – Recursively search sub-directories.
Returns:
- list[str] – List of matching filepaths.
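A short usage sketch (the directory name and extensions are illustrative):

from llm.utils import get_filepaths

# Collect .pdf and .txt files under data/, descending into sub-directories.
filepaths = get_filepaths('data', extensions=['.pdf', '.txt'], recursive=True)
for filepath in filepaths:
    print(filepath)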
gradient_accumulation_steps
gradient_accumulation_steps(
global_batch_size: int,
local_batch_size: int,
world_size: int,
) -> int
Compute the gradient accumulation steps from the configuration.
Parameters:
- global_batch_size (int) – Target global/effective batch size.
- local_batch_size (int) – Per-rank batch size.
- world_size (int) – Number of ranks.
Returns:
- int – Gradient accumulation steps needed to achieve the global_batch_size.
Raises:
- ValueError – If the resulting gradient accumulation steps would be fractional.
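Conventionally the relation is global_batch_size = local_batch_size * world_size * accumulation_steps, so a worked example (with illustrative sizes) looks like:

from llm.utils import gradient_accumulation_steps

# 512 = 16 (per rank) * 8 (ranks) * 4 (accumulation steps)
steps = gradient_accumulation_steps(
    global_batch_size=512,
    local_batch_size=16,
    world_size=8,
)
assert steps == 4

# A global batch size of 500 with the same local/world sizes is not evenly
# divisible and would raise ValueError.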
init_logging
init_logging(
level: int | str = logging.INFO,
logfile: Path | str | None = None,
rich: bool = False,
distributed: bool = False,
) -> None
Configure global logging.
Parameters:
- level (int | str, default: logging.INFO) – Default logging level.
- logfile (Path | str | None, default: None) – Optional path to write logs to.
- rich (bool, default: False) – Use rich for pretty stdout logging.
- distributed (bool, default: False) – Configure distributed formatters and filters.
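A typical setup sketch (the logfile path is illustrative):

import logging

from llm.utils import init_logging

# Pretty stdout logging via rich plus a persistent logfile.
init_logging(level=logging.INFO, logfile='logs/train.log', rich=True)
logging.getLogger(__name__).info('logging configured')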
log_step
log_step(
logger: Logger,
step: int,
*,
fmt_str: str | None = None,
log_level: int = logging.INFO,
ranks: Iterable[int] = (0,),
skip_tensorboard: Iterable[str] = (),
tensorboard_prefix: str = "train",
writer: SummaryWriter | None = None,
**kwargs: Any
) -> None
Log a training step.
Parameters:
- logger (Logger) – Logger instance to log to.
- step (int) – Training step.
- fmt_str (str | None, default: None) – Format string used to format parameters for logging.
- log_level (int, default: logging.INFO) – Level to log the parameters at.
- ranks (Iterable[int], default: (0,)) – Ranks to log on (defaults to rank 0 only).
- skip_tensorboard (Iterable[str], default: ()) – List of parameter names to skip logging to TensorBoard.
- tensorboard_prefix (str, default: 'train') – Prefix for TensorBoard parameters.
- writer (SummaryWriter | None, default: None) – TensorBoard summary writer.
- kwargs (Any, default: {}) – Additional keyword arguments to log.
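A usage sketch with illustrative values; that the extra keyword arguments are written to TensorBoard under the prefix (unless listed in skip_tensorboard) is inferred from the parameter descriptions above:

import logging

from torch.utils.tensorboard import SummaryWriter

from llm.utils import log_step

logger = logging.getLogger('train')
writer = SummaryWriter('runs/example')

# Logs on rank 0 only (the default). 'epoch' is logged but skipped for
# TensorBoard; 'loss' and 'lr' go to the writer under the 'train/' prefix.
log_step(
    logger,
    step=100,
    log_level=logging.INFO,
    skip_tensorboard=('epoch',),
    tensorboard_prefix='train',
    writer=writer,
    loss=0.42,
    lr=3e-4,
    epoch=1,
)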