FlashAttention-4 Hits 1,605 TFLOPS on NVIDIA Blackwell GPUs

Last updated: January 22, 2026 11:12 pm
Published: January 22, 2026
Alvin Lang
Jan 22, 2026 23:03

NVIDIA's FlashAttention-4 achieves 71% hardware efficiency on Blackwell chips, delivering a 3.6x speedup over FA2 for AI training workloads.





NVIDIA has launched FlashAttention-4, the latest optimization for transformer neural networks, which squeezes 1,605 TFLOPS out of its Blackwell architecture, capturing 71% of the hardware's theoretical maximum performance.
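As a quick sanity check on those two headline figures, the short sketch below backs out the theoretical peak they imply. It uses only the numbers quoted above; because the 71% figure is rounded, the result is an approximation (about 2,260 TFLOPS), not an official specification.

```python
# Figures quoted in the article.
achieved_tflops = 1605.0
utilization = 0.71  # rounded in the article, so the result is approximate

# Peak throughput implied by "1,605 TFLOPS is 71% of the maximum".
implied_peak_tflops = achieved_tflops / utilization

print(f"Implied theoretical peak: about {implied_peak_tflops:.0f} TFLOPS")
```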

The announcement matters for anyone watching AI infrastructure investments. As large language models push toward longer context windows, the attention mechanism's quadratic memory complexity becomes a brutal bottleneck. FlashAttention-4 attacks this problem directly, and the benchmark numbers suggest meaningful gains for production AI workloads.

What the Numbers Show

On the B200 GPU, FA4 delivers a 3.6x speedup over FlashAttention-2 during forward passes at a sequence length of 32,768. Backward pass performance is 3.15x faster than FA2 under the same conditions. Against existing frameworks, FA4 posts a 1.3x improvement over cuDNN and 2.4x over Triton Inference Server implementations.
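To put the sequence length and speedup in concrete terms, here is a rough estimate of forward-pass time. It uses the common convention of counting 4 x seq_len^2 x head_dim matmul FLOPs per head for non-causal attention. The head dimension and head count are illustrative assumptions, and the sketch also assumes the 1,605 TFLOPS figure applies to this exact configuration, which the article does not state.

```python
def attention_forward_flops(seq_len, head_dim, n_heads, batch=1):
    """Matmul FLOPs for a non-causal attention forward pass.

    Two matmuls (Q @ K^T and P @ V), each costing
    2 * seq_len^2 * head_dim FLOPs per head.
    """
    return 4 * batch * n_heads * seq_len ** 2 * head_dim


SEQ_LEN = 32_768            # from the article
FA4_TFLOPS = 1605.0         # from the article
FWD_SPEEDUP_OVER_FA2 = 3.6  # from the article
HEAD_DIM = 128              # assumption, not from the article
N_HEADS = 16                # assumption, not from the article

flops = attention_forward_flops(SEQ_LEN, HEAD_DIM, N_HEADS)
fa4_ms = flops / (FA4_TFLOPS * 1e12) * 1e3
fa2_ms = fa4_ms * FWD_SPEEDUP_OVER_FA2  # what a 3.6x slower kernel would take

print(f"Forward FLOPs: {flops:.3e}")
print(f"Estimated FA4 forward time: {fa4_ms:.2f} ms")
print(f"Implied FA2 forward time:   {fa2_ms:.2f} ms")
```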

The memory efficiency gains are equally significant. Standard attention scales at O(N²) with sequence length, meaning doubling your context window quadruples memory requirements. FA4 brings this down to O(N) through tiling and incremental softmax normalization. NVIDIA claims 20x lower memory usage compared to PyTorch baselines.
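The general idea behind tiling with incremental softmax normalization can be sketched in plain Python for a single query vector. This is a minimal illustration of the technique, not NVIDIA's kernel: the function names and the tile size are hypothetical, and a real implementation runs on GPU tiles rather than Python lists.

```python
import math


def naive_attention(q, k, v):
    """Reference: computes every score before normalizing (O(N) scores per query)."""
    d = len(q)
    scores = [sum(a * b for a, b in zip(q, key)) / math.sqrt(d) for key in k]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [sum(w * row[j] for w, row in zip(weights, v)) / total
            for j in range(len(v[0]))]


def tiled_attention(q, k, v, tile=4):
    """Same result, visiting keys/values one tile at a time.

    Only a running max, a running denominator, and an output accumulator
    are kept, so the working state does not grow with len(k).
    """
    d = len(q)
    running_max = float("-inf")
    running_sum = 0.0
    acc = [0.0] * len(v[0])
    for start in range(0, len(k), tile):
        k_tile = k[start:start + tile]
        v_tile = v[start:start + tile]
        scores = [sum(a * b for a, b in zip(q, key)) / math.sqrt(d)
                  for key in k_tile]
        new_max = max(running_max, max(scores))
        # Rescale everything accumulated under the previous max.
        scale = math.exp(running_max - new_max)
        running_sum *= scale
        acc = [a * scale for a in acc]
        for s, row in zip(scores, v_tile):
            w = math.exp(s - new_max)
            running_sum += w
            for j, x in enumerate(row):
                acc[j] += w * x
        running_max = new_max
    return [a / running_sum for a in acc]


# Small deterministic example: both paths should agree to rounding error.
q = [0.1, -0.2, 0.3]
k = [[math.sin(i + j) for j in range(3)] for i in range(10)]
v = [[math.cos(i * j) for j in range(2)] for i in range(10)]
reference = naive_attention(q, k, v)
tiled = tiled_attention(q, k, v, tile=4)
print(max(abs(a - b) for a, b in zip(reference, tiled)))
```

The rescaling step is what makes the tiled version exact: whenever a new tile raises the running maximum, earlier contributions are multiplied by the ratio of the old and new exponentials, so the full N×N score matrix never has to be stored.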

Hardware-Software Co-Design

FA4 was built specifically for Blackwell's quirks. The architecture presents an asymmetric scaling problem: compute power roughly doubles while memory bandwidth doesn't keep pace. Traditional approaches leave tensor cores sitting idle while waiting for data.
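A simple roofline model shows why this asymmetry leaves compute units idle. In the sketch below, every hardware number is a made-up illustration rather than a real GPU specification, and the kernel's arithmetic intensity (FLOPs per byte moved) is likewise an assumption.

```python
def attainable_tflops(peak_tflops, bandwidth_tb_s, flops_per_byte):
    """Roofline model: throughput is capped by compute or by memory traffic."""
    return min(peak_tflops, bandwidth_tb_s * flops_per_byte)


KERNEL_INTENSITY = 300  # FLOPs per byte; hypothetical kernel

# Hypothetical generations: compute doubles, bandwidth grows only 25%.
generations = {
    "older": {"peak_tflops": 1000, "bandwidth_tb_s": 4.0},
    "newer": {"peak_tflops": 2000, "bandwidth_tb_s": 5.0},
}

for name, hw in generations.items():
    achieved = attainable_tflops(hw["peak_tflops"], hw["bandwidth_tb_s"],
                                 KERNEL_INTENSITY)
    utilization = achieved / hw["peak_tflops"]
    print(f"{name}: {achieved:.0f} TFLOPS attainable, "
          f"{utilization:.0%} of peak")
```

With these illustrative numbers the same kernel is compute-bound on the older part but memory-bound on the newer one, so it can only use the extra compute if it moves fewer bytes per FLOP.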

The solution leverages Blackwell's dedicated Tensor Memory (TMEM), 256 KB of on-chip memory per streaming multiprocessor. By storing intermediate calculations directly in TMEM instead of shared memory, FA4 sidesteps the bandwidth bottleneck that would otherwise throttle the faster compute units.

Larger tile sizes (up to 128×128) and deeper pipelines keep the hardware busy. The backward pass, typically the slower half of training, benefits from bypassing register accumulation entirely.
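A back-of-the-envelope calculation relates the 128×128 tile size to the 256 KB of TMEM mentioned above. It assumes 4-byte (FP32) accumulator elements and 1 KB = 1,024 bytes. Which buffers FA4 actually keeps in TMEM is not stated in the article, so treat this purely as a capacity check.

```python
TMEM_BYTES = 256 * 1024  # 256 KB per streaming multiprocessor (from the article)
TILE_ROWS = 128          # from the article
TILE_COLS = 128          # from the article
BYTES_PER_ELEMENT = 4    # assumption: FP32 accumulators


def tile_bytes(rows, cols, bytes_per_element):
    """Storage needed for one dense tile."""
    return rows * cols * bytes_per_element


one_tile = tile_bytes(TILE_ROWS, TILE_COLS, BYTES_PER_ELEMENT)
tiles_that_fit = TMEM_BYTES // one_tile

print(f"One tile: {one_tile // 1024} KB")
print(f"Tiles that fit in TMEM: {tiles_that_fit}")
```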

Production Integration

Major inference frameworks including SGLang and vLLM already support FA4 prefill operations. NVIDIA has incorporated these techniques into cuDNN 9.14, making the optimizations accessible to developers without custom kernel work.

For AI companies burning through compute budgets, the efficiency gains translate directly to cost savings. A 3x+ speedup on training passes means either faster iteration cycles or the ability to train larger models within existing infrastructure constraints.

The broader trend here: as transformer models grow, algorithmic efficiency at the kernel level becomes as important as raw hardware capability. FlashAttention-4 represents the current frontier of that optimization work.
