FlashAttention-4 Hits 1,605 TFLOPS on NVIDIA Blackwell GPUs

Editor
Last updated: January 22, 2026 11:12 pm
Published: January 22, 2026


Contents
  • What the Numbers Show
  • Hardware-Software Co-Design
  • Production Integration


Alvin Lang
Jan 22, 2026 23:03

NVIDIA’s FlashAttention-4 achieves 71% hardware efficiency on Blackwell chips, delivering a 3.6x speedup over FA2 for AI training workloads.





NVIDIA has released FlashAttention-4, the latest optimization for transformer neural networks. It squeezes 1,605 TFLOPS out of the Blackwell architecture, capturing 71% of the hardware’s theoretical maximum performance.

The announcement matters for anyone watching AI infrastructure investments. As large language models push toward longer context windows, the attention mechanism’s quadratic memory complexity becomes a brutal bottleneck. FlashAttention-4 attacks this problem directly, and the benchmark numbers suggest meaningful gains for production AI workloads.

What the Numbers Show

On the B200 GPU, FA4 delivers a 3.6x speedup over FlashAttention-2 during forward passes at a sequence length of 32,768. The backward pass is 3.15x faster than FA2 under the same conditions. Against existing frameworks, FA4 posts a 1.3x improvement over cuDNN and 2.4x over Triton Inference Server implementations.

The memory efficiency gains are equally significant. Standard attention scales at O(N²) with sequence length, meaning that doubling your context window quadruples memory requirements. FA4 brings this down to O(N) through tiling and incremental softmax normalization. NVIDIA claims 20x lower memory usage compared to PyTorch baselines.
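
Tiling with incremental softmax normalization can be sketched in a few lines of plain Python. This is a minimal illustration of the general online-softmax idea, not NVIDIA's kernel: the function names `naive_attention` and `tiled_attention`, the `tile` parameter, and the toy inputs `Q`, `K`, `V` are all assumptions made for this example.

```python
import math

def naive_attention(q, k, v):
    """Reference version: computes all N scores for a query before normalizing."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for qi in q:
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w * vj[c] for w, vj in zip(weights, v)) / total
                    for c in range(len(v[0]))])
    return out

def tiled_attention(q, k, v, tile=2):
    """Same result, but keys/values are visited tile by tile with a running
    softmax, so only `tile` scores are held at once for each query."""
    scale = 1.0 / math.sqrt(len(q[0]))
    dv = len(v[0])
    out = []
    for qi in q:
        running_max = -math.inf  # largest score seen so far
        running_sum = 0.0        # softmax denominator, relative to running_max
        acc = [0.0] * dv         # unnormalized weighted sum of value rows
        for start in range(0, len(k), tile):
            k_tile = k[start:start + tile]
            v_tile = v[start:start + tile]
            scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k_tile]
            new_max = max(running_max, max(scores))
            # Rescale everything accumulated under the previous maximum.
            correction = math.exp(running_max - new_max)
            running_sum *= correction
            acc = [x * correction for x in acc]
            for s, vj in zip(scores, v_tile):
                w = math.exp(s - new_max)
                running_sum += w
                for c in range(dv):
                    acc[c] += w * vj[c]
            running_max = new_max
        out.append([x / running_sum for x in acc])
    return out

# Toy inputs (hypothetical): 3 queries, 5 keys/values, head dimension 2.
Q = [[1.0, 0.0], [0.5, -1.0], [0.2, 0.3]]
K = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5], [-2.0, 1.0], [0.5, 0.5]]
V = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0], [2.0, 2.0, 0.0],
     [-1.0, 0.5, 0.5], [0.3, 0.3, 0.3]]

reference = naive_attention(Q, K, V)
tiled = tiled_attention(Q, K, V, tile=2)
```

Both functions should agree to floating-point tolerance for any tile size. The tiled version never holds more than `tile` scores for a query, which is why working memory can grow with N rather than N². A real GPU kernel also tiles the queries and keeps the running statistics in on-chip memory, which this sketch does not model.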

Hardware-Software Co-Design

FA4 was built specifically for Blackwell’s quirks. The architecture presents an asymmetric scaling problem: compute power roughly doubles while memory bandwidth doesn't keep pace. Traditional approaches leave tensor cores sitting idle while waiting for data.
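
A simple roofline calculation shows why compute that grows faster than memory bandwidth leaves compute units idle. The sketch below is illustrative only: the function names are hypothetical, and the peak, bandwidth, and intensity figures are made-up round numbers, not Blackwell specifications.

```python
def attainable_tflops(peak_tflops, bandwidth_tbps, flops_per_byte):
    """Roofline bound: throughput is capped by compute or by memory traffic,
    whichever limit is lower. TB/s times FLOPs/byte gives TFLOP/s."""
    return min(peak_tflops, bandwidth_tbps * flops_per_byte)

def utilization(peak_tflops, bandwidth_tbps, flops_per_byte):
    """Fraction of peak compute a kernel can reach under the roofline bound."""
    return attainable_tflops(peak_tflops, bandwidth_tbps, flops_per_byte) / peak_tflops

def ridge_point(peak_tflops, bandwidth_tbps):
    """Arithmetic intensity (FLOPs per byte) needed to become compute-bound."""
    return peak_tflops / bandwidth_tbps

# Hypothetical generations: compute doubles, bandwidth grows by only 25%.
old_gen = {"peak_tflops": 1000.0, "bandwidth_tbps": 4.0}
new_gen = {"peak_tflops": 2000.0, "bandwidth_tbps": 5.0}

kernel_intensity = 250.0  # FLOPs per byte moved, also a made-up figure

old_util = utilization(flops_per_byte=kernel_intensity, **old_gen)
new_util = utilization(flops_per_byte=kernel_intensity, **new_gen)
```

With these invented figures the same kernel saturates the older part but reaches only a fraction of peak on the newer one, because the ridge point moves from 250 to 400 FLOPs per byte. Raising a kernel's effective intensity, for example by keeping intermediates on chip, is one way to close that gap.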

The solution leverages Blackwell’s dedicated Tensor Memory (TMEM), which provides 256 KB of on-chip memory per streaming multiprocessor. By storing intermediate calculations directly in TMEM instead of shared memory, FA4 sidesteps the bandwidth bottleneck that would otherwise throttle the faster compute units.

Larger tile sizes (up to 128×128) and deeper pipelines keep the hardware busy. The backward pass, typically the slower half of training, benefits from bypassing register accumulation entirely.
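
As a rough capacity check, the 256 KB TMEM figure and the 128×128 tile size can be related with simple arithmetic. The sketch assumes 4-byte (FP32) accumulator elements, which the article does not state, and the helper names are hypothetical. It counts raw bytes only and does not model how TMEM is actually allocated.

```python
TMEM_BYTES_PER_SM = 256 * 1024  # 256 KB per streaming multiprocessor (from the article)

def tile_bytes(rows, cols, bytes_per_element):
    """Storage needed for one dense tile of intermediate values."""
    return rows * cols * bytes_per_element

def tiles_that_fit(rows, cols, bytes_per_element, budget=TMEM_BYTES_PER_SM):
    """How many such tiles fit in the budget, ignoring any allocation overhead."""
    return budget // tile_bytes(rows, cols, bytes_per_element)

# Assumption: FP32 accumulators (4 bytes per element).
fp32_tile = tile_bytes(128, 128, 4)       # 65,536 bytes, i.e. 64 KB
fp32_count = tiles_that_fit(128, 128, 4)  # 4 tiles in 256 KB

# For comparison, 2-byte elements would halve the footprint per tile.
fp16_count = tiles_that_fit(128, 128, 2)  # 8 tiles in 256 KB
```

Under the FP32 assumption, a single 128×128 tile takes a quarter of the per-SM TMEM, which leaves room to keep several tiles resident at once for a deeper pipeline.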

Production Integration

Major inference frameworks including SGLang and vLLM already support FA4 prefill operations. NVIDIA has incorporated these techniques into cuDNN 9.14, making the optimizations accessible to developers without custom kernel work.

For AI companies burning through compute budgets, the efficiency gains translate directly to cost savings. A 3x+ speedup on training passes means either faster iteration cycles or the ability to train larger models within existing infrastructure constraints.

The broader trend here: as transformer models grow, algorithmic efficiency at the kernel level becomes as important as raw hardware capability. FlashAttention-4 represents the current frontier of that optimization work.
