James Ding
Apr 09, 2026 16:48
Notion migrated from Spark on EMR to Ray, cutting embedding costs 80% and improving query latency 10x. Uber and Salesforce shared similar AI infrastructure wins.
Notion has slashed its AI embedding pipeline costs by more than 80% after migrating from Apache Spark to Ray, the distributed computing framework backed by Anyscale. The productivity software company also achieved 10x improvements in query latency while consolidating three separate jobs per region into one.
The migration details emerged at Ray Day Seattle on April 9, 2026, where ML engineers from Notion, Uber, Salesforce, and Apple shared hard-won lessons about scaling AI infrastructure.
What Notion Actually Changed
Mickey Liu, a software engineer on Notion's search platform team, walked through the overhaul. Their original setup used a three-step Spark pipeline running on Amazon EMR: data chunking, third-party API calls for embedding generation, and writes to a vector store.
The pain points were predictable but severe. Double compute costs. Third-party API rate limits throttling throughput. Debugging nightmares when failures occurred across tools: driver and executor logs weren't even persisted in YARN.
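Rate limits from a third-party embedding provider are usually handled with retry logic. The sketch below shows one common pattern, exponential backoff with jitter; it is a generic illustration, not Notion's code, and `call_api` is a placeholder for whatever embedding client is in use.

```python
import random
import time

def embed_with_backoff(call_api, texts, max_retries=5, base_delay=1.0):
    """Call a rate-limited embedding API, backing off exponentially on failure.

    `call_api` is a placeholder for any third-party embedding client; it is
    expected to raise an exception when the provider returns a rate-limit error.
    """
    for attempt in range(max_retries):
        try:
            return call_api(texts)
        except Exception:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff with jitter: ~1s, 2s, 4s, ... plus noise,
            # so many workers don't all retry at the same instant.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

Jitter matters at pipeline scale: without it, every worker that hit the same rate-limit window retries in lockstep and triggers the limit again.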
The new architecture streams Kafka data directly into a Ray cluster that handles CPU chunking, GPU embedding generation, and vector store writes in a single pipeline. No intermediate S3 handoffs. What started as the backend for a Q&A feature in 2023 now powers all of Notion AI and custom agents.
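The shape of that single pipeline can be sketched in plain Python. In the Ray version each stage would be a streaming transform (CPU workers for chunking, GPU workers for embedding); here the stages are composed in-process purely to show the data flow, and `embed_fn` and `write_fn` are hypothetical stand-ins for the embedding model and the vector store client.

```python
def chunk(record, size=512, overlap=64):
    """CPU stage: split a document's text into overlapping fixed-size chunks."""
    text = record["text"]
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def pipeline(records, embed_fn, write_fn):
    """Single-pass pipeline: chunk -> embed -> write, no intermediate handoffs.

    `embed_fn` (the GPU stage in the real pipeline) and `write_fn` (the
    vector store writer) are placeholders. Returns the number of chunks
    written.
    """
    written = 0
    for record in records:
        chunks = chunk(record)
        vectors = embed_fn(chunks)               # GPU embedding generation
        write_fn(record["id"], chunks, vectors)  # vector store writes
        written += len(chunks)
    return written
```

The point of the consolidation is visible even in the toy version: each record flows through all three stages in one pass, so there is no serialized intermediate dataset to land in S3 between jobs.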
Uber and Salesforce Report Similar Gains
Uber's Peng Zhang detailed how their Michelangelo ML platform evolved from TensorFlow/Horovod to Ray with PyTorch. The standout move: separating CPU data-loading nodes from GPU training nodes in a heterogeneous cluster design. The result? GPU utilization jumped 20%, and training time dropped roughly 50% in select pipelines.
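The win from that split comes from overlap: loading the next batch on CPU while the GPU trains on the current one. The stdlib sketch below illustrates the idea with a background thread and a bounded queue; it is a schematic of the pattern, not Uber's implementation. In Ray the same separation would be expressed with per-worker resource requests (CPU-only loader workers, GPU training workers).

```python
import queue
import threading

def train_with_prefetch(load_batch, train_step, num_batches, prefetch=4):
    """Overlap CPU data loading with (GPU) training via a bounded queue.

    `load_batch` stands in for the CPU data-loading nodes, `train_step` for
    the GPU training nodes. The bounded queue keeps a few batches ready so
    the training side rarely waits on I/O.
    """
    q = queue.Queue(maxsize=prefetch)

    def producer():
        for i in range(num_batches):
            q.put(load_batch(i))  # CPU-side work runs off the training path
        q.put(None)               # sentinel: no more batches

    threading.Thread(target=producer, daemon=True).start()

    steps = 0
    while (batch := q.get()) is not None:
        train_step(batch)         # GPU-side work, fed from the queue
        steps += 1
    return steps
```

When the loader and trainer share the same nodes, the GPU idles during every load; decoupling them is what lets utilization climb.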
Salesforce tackled a different beast: summarizing documents up to 200,000 tokens long (roughly a short novel) with P95 latency under 15 seconds. Their team used Ray to chunk documents and run parallel inference across a distributed actor pool with vLLM, then merge the results. They landed on 1-2 GPU data parallelism as the sweet spot after running scaling experiments directly on Ray.
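That chunk-summarize-merge flow is a map-reduce over the document. A minimal sketch, with a thread pool standing in for the Ray actor pool and `summarize_fn` as a placeholder for the vLLM call (chunking on whitespace is a crude proxy for token counting):

```python
from concurrent.futures import ThreadPoolExecutor

def summarize_long_doc(text, summarize_fn, chunk_tokens=8000, workers=4):
    """Map-reduce summarization: chunk, summarize chunks in parallel, merge.

    `summarize_fn` is a placeholder for an LLM inference call; the thread
    pool stands in for a distributed actor pool. Whitespace splitting is a
    rough stand-in for real tokenization.
    """
    words = text.split()
    chunks = [" ".join(words[i:i + chunk_tokens])
              for i in range(0, len(words), chunk_tokens)]

    # Map: fan the chunk summaries out across the pool; order is preserved.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(summarize_fn, chunks))

    # Reduce: a final pass merges the partial summaries into one.
    return summarize_fn(" ".join(partials))
```

The latency budget is dominated by the slowest chunk plus the merge pass, which is why parallel fan-out across the pool, rather than sequential chunk summarization, is what makes a sub-15-second P95 plausible for novel-length inputs.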
Why This Matters Beyond These Companies
Robert Nishihara, Ray's co-creator and Anyscale co-founder, opened the event by framing the core problem: AI infrastructure keeps getting harder. Multimodal data processing, reinforcement learning workloads, and multi-node LLM inference are pushing existing tools past their limits.
Every speaker landed on the same conclusion from different angles: their previous tooling had run out of road.
Apple engineers Charlie Chen and Haocheng Bian highlighted foundation model training challenges: massive unstructured data, billion-plus parameters, and sparse architectures like Mixture of Experts. Traditional engines fall short because data pipelines and training frameworks run in separate environments with no shared context.
What's Next
Ray Day Seattle kicked off Anyscale's 2026 "Ray on the Road" tour: eight cities across three countries. The company is also running invite-only customer roundtables at each stop to preview its product roadmap.
For teams hitting similar walls with Spark or other distributed frameworks, Notion's full technical writeup is available on their engineering blog under "Two Years of Vector Search at Notion." The 80% cost reduction and 10x latency improvement offer a concrete benchmark for anyone evaluating comparable migrations.
Image source: Shutterstock
