The reproducibility and transparency of large language models are crucial for advancing open research, ensuring the trustworthiness of results, and enabling investigations into data and model biases, as well as potential risks. To this end, we release OpenELM, a state-of-the-art open language model. OpenELM uses a layer-wise scaling strategy to efficiently allocate parameters within each layer of the transformer model, leading to enhanced accuracy. For example, with a parameter budget of approximately one billion parameters, OpenELM exhibits a 2.36% improvement in accuracy compared to OLMo while requiring 2× fewer pre-training tokens.
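To make the layer-wise scaling idea concrete, the sketch below interpolates the number of attention heads and the FFN expansion ratio across transformer depth, so that deeper layers receive a larger share of the parameter budget. The parameter names (alpha_min, alpha_max, beta_min, beta_max) and the linear schedule are illustrative assumptions, not the exact OpenELM configuration.

```python
# Minimal sketch of layer-wise scaling: instead of giving every transformer
# layer the same width, the attention-head count and FFN expansion ratio are
# interpolated across depth. Names and the linear rule are assumptions.

def layerwise_config(num_layers: int,
                     model_dim: int = 1280,
                     head_dim: int = 64,
                     alpha_min: float = 0.5, alpha_max: float = 1.0,
                     beta_min: float = 2.0, beta_max: float = 4.0):
    """Return per-layer (num_heads, ffn_dim) under a linear depth schedule."""
    configs = []
    for i in range(num_layers):
        t = i / max(num_layers - 1, 1)                    # 0 at first layer, 1 at last
        alpha = alpha_min + t * (alpha_max - alpha_min)   # attention-width scale
        beta = beta_min + t * (beta_max - beta_min)       # FFN expansion ratio
        num_heads = max(1, int(alpha * model_dim / head_dim))
        ffn_dim = int(beta * model_dim)
        configs.append((num_heads, ffn_dim))
    return configs

# Shallow layers get fewer heads and a smaller FFN than deep layers, so the
# same overall parameter budget is spent where it contributes more to accuracy.
for layer, (heads, ffn) in enumerate(layerwise_config(num_layers=4)):
    print(f"layer {layer}: heads={heads}, ffn_dim={ffn}")
```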
Diverging from prior practices that only provide model weights and inference code, and that pre-train on private datasets, our release includes the complete framework for training and evaluation of the language model on publicly available datasets, together with training logs, multiple checkpoints, and pre-training configurations. We also release code to convert models to the MLX library for inference and fine-tuning on Apple devices. This comprehensive release aims to empower and strengthen the open research community, paving the way for future open research endeavors.
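For readers who want to experiment with the released checkpoints, a minimal loading sketch follows. It assumes the weights are published on the Hugging Face Hub under an "apple/OpenELM-*" repository (the exact model id here is an assumption) and that the `transformers` package is installed; consult the release itself for the authoritative instructions.

```python
# Minimal sketch, assuming a Hugging Face checkpoint id such as
# "apple/OpenELM-1_1B" (an assumption) and the `transformers` library.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "apple/OpenELM-1_1B",     # assumed model id; pick the size you need
    trust_remote_code=True,   # allow the repository's custom modeling code
)
print(model.config)
```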