Saturday, July 12, 2025
Blockchain Broadcast

NVIDIA NeMo-RL Utilizes GRPO for Advanced Reinforcement Learning

July 10, 2025
in Blockchain




Peter Zhang
Jul 10, 2025 06:07

NVIDIA introduces NeMo-RL, an open-source library for reinforcement learning, enabling scalable training with GRPO and integration with Hugging Face models.





NVIDIA has unveiled NeMo-RL, an open-source library designed to enhance reinforcement learning (RL) capabilities, according to NVIDIA’s official blog. The library supports scalable model training, ranging from single-GPU prototypes to thousand-GPU deployments, and integrates with popular frameworks like Hugging Face.

NeMo-RL’s Architecture and Features

NeMo-RL is part of the broader NVIDIA NeMo Framework, known for its versatility and high-performance capabilities. The library includes native integration with Hugging Face models and optimized training and inference processes. It supports popular RL algorithms such as DPO and GRPO and uses Ray-based orchestration for efficiency.

The architecture of NeMo-RL is designed with flexibility in mind. It supports various training and rollout backends, ensuring that high-level algorithm implementations remain agnostic to backend specifics. This design allows models to scale without changes to the algorithm code, making it suitable for both small-scale and large-scale deployments.
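To illustrate what backend-agnostic algorithm code can look like, here is a minimal Python sketch. It is not NeMo-RL's actual API: `RolloutBackend`, `EchoBackend`, `ShardedBackend`, and `collect_rollouts` are hypothetical names invented for this example. The point is only that the algorithm-side function depends on an interface, so swapping a single-worker backend for a multi-worker one needs no change to that function.

```python
from typing import List, Protocol


class RolloutBackend(Protocol):
    """Hypothetical interface: anything that can generate completions."""

    def generate(self, prompts: List[str]) -> List[str]: ...


class EchoBackend:
    """Toy stand-in for a single-device generation backend."""

    def generate(self, prompts: List[str]) -> List[str]:
        return [p + " -> answer" for p in prompts]


class ShardedBackend:
    """Toy stand-in for a multi-worker backend that splits prompts across workers."""

    def __init__(self, workers: List[RolloutBackend]) -> None:
        self.workers = workers

    def generate(self, prompts: List[str]) -> List[str]:
        n = len(self.workers)
        results: List[str] = [""] * len(prompts)
        # Round-robin shard the prompts, then restore the original order.
        for w, worker in enumerate(self.workers):
            idx = list(range(w, len(prompts), n))
            outs = worker.generate([prompts[i] for i in idx])
            for i, out in zip(idx, outs):
                results[i] = out
        return results


def collect_rollouts(backend: RolloutBackend, prompts: List[str]) -> List[str]:
    """Algorithm-side code: depends only on the interface, not on the backend."""
    return backend.generate(prompts)
```

Calling `collect_rollouts(EchoBackend(), prompts)` and `collect_rollouts(ShardedBackend([EchoBackend(), EchoBackend()]), prompts)` returns the same list, which is the property the paragraph above describes.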

Implementing DeepScaleR with GRPO

The blog post explores the application of NeMo-RL to reproduce a DeepScaleR-1.5B recipe using the Group Relative Policy Optimization (GRPO) algorithm. This involves training high-performing reasoning models, such as Qwen-1.5B, to compete with OpenAI’s O1 on the AIME24 academic math challenge.
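The "group relative" part of GRPO refers to how advantages are computed: several responses are sampled for the same prompt, and each response's reward is normalized against the mean and standard deviation of its own group instead of a learned value function. The sketch below shows only that normalization step in plain Python. The function name, the `eps` constant, and the choice of population standard deviation are assumptions made for illustration, not details taken from NeMo-RL.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize each reward against the other samples for the same prompt.

    rewards: one scalar reward per sampled response in the group.
    eps: small constant (an assumption) to avoid dividing by zero when
         every response in the group received the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an illustrative choice
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Four sampled answers to one math problem: two correct (1.0), two wrong (0.0).
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

Correct answers receive a positive advantage and incorrect ones a negative advantage; if all responses in a group score the same, every advantage is zero and that group contributes no learning signal.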

The training process is structured in three steps, each increasing the maximum sequence length used: starting at 8K, then 16K, and finally 24K. This gradual increase helps manage the distribution of rollout sequence lengths, optimizing the training process.
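The three-stage schedule can be written down as plain data. The sketch below is illustrative only: the `Stage` class and `truncated_fraction` helper are hypothetical, and reading "K" as 1024 tokens is an assumption. It shows one quantity worth watching when deciding to move to the next stage, namely the share of rollouts that reach the current length cap.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stage:
    name: str
    max_seq_len: int  # maximum tokens per rollout in this stage


# Length limits from the article, taking "K" to mean 1024 tokens (an assumption).
STAGES = [
    Stage("stage1", 8 * 1024),
    Stage("stage2", 16 * 1024),
    Stage("stage3", 24 * 1024),
]


def truncated_fraction(rollout_lengths: List[int], max_seq_len: int) -> float:
    """Share of rollouts that would reach the stage's length cap."""
    if not rollout_lengths:
        return 0.0
    hits = sum(1 for n in rollout_lengths if n >= max_seq_len)
    return hits / len(rollout_lengths)


if __name__ == "__main__":
    lengths = [4000, 9000, 20000, 30000]  # made-up rollout lengths in tokens
    for stage in STAGES:
        print(stage.name, stage.max_seq_len, truncated_fraction(lengths, stage.max_seq_len))
```

With the made-up lengths above, fewer rollouts hit the cap at each later stage, which is the effect the gradual increase is meant to manage.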

Training Process and Evaluation

The training setup involves cloning the NeMo-RL repository and installing the necessary packages. Training is conducted in stages, with the model evaluated continuously to ensure performance benchmarks are met. The results showed that NeMo-RL achieved a training reward of 0.65 in only 400 steps.

Evaluation on the AIME24 benchmark showed that the trained model surpassed OpenAI O1, highlighting the effectiveness of NeMo-RL when combined with the GRPO algorithm.

Getting Started with NeMo-RL

NeMo-RL is available as open source, with detailed documentation and example scripts in its GitHub repository. This resource is ideal for those looking to experiment with reinforcement learning using scalable and efficient methods.

The library’s integration with Hugging Face and its modular design make it a powerful tool for researchers and developers seeking to apply advanced RL techniques in their projects.

Copyright © 2024 Blockchain Broadcast.
Blockchain Broadcast is not responsible for the content of external sites.
