Anyscale Nabs $100M, Unleashes Parallel, Serverless Computing in the Cloud
With a fresh $100 million in the bank and a $1 billion valuation, UC Berkeley’s RISELab alum Anyscale is now set to scale up its business as the latest data unicorn. The company also announced the general availability of its service, which allows developers to scale data and AI applications developed on a laptop to run on AWS and Google Cloud in a distributed and serverless manner.
Anyscale was founded in 2019 by Robert Nishihara, Philipp Moritz, Ion Stoica, and Michael I. Jordan, with a plan to build a commercial outfit around Ray. Ray, of course, is the runtime framework created by Nishihara and Moritz at RISELab (under the supervision of Stoica and Jordan) for simplifying distributed computing.
The open source Ray framework carries the promise of enabling a developer to take a program she wrote on her laptop and scale it out to run on an arbitrary number of nodes in a distributed manner, without the work or expertise typically required to accomplish that. Parallel computing has long been a stumbling block for scaling big data and AI applications (not to mention HPC), and Ray provides a simplified path forward.
“There’s a huge gap between what it takes to write a program on your laptop and what it takes to write a scalable program that runs across hundreds of machines,” Nishihara, the CEO of Anyscale, told Datanami earlier this year. “The latter takes a huge amount of expertise…We are trying to make it so if you know how to program on your laptop, then that’s enough.”
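In practice, Ray closes that gap with a small core API: an ordinary Python function becomes a parallel task once it is decorated as remote. The sketch below is a minimal illustration of the pattern, not code from Anyscale; the process() function and its toy workload are hypothetical.

```python
import ray

# Connects to a running cluster if one is configured; otherwise starts Ray locally.
ray.init()

@ray.remote
def process(chunk):
    # Stand-in for real per-chunk work (parsing, feature extraction, inference, etc.).
    return sum(chunk)

# Launch the tasks in parallel; Ray schedules them across whatever nodes are
# available, whether that is a laptop or a multi-node cluster.
futures = [process.remote(list(range(i, i + 1000))) for i in range(0, 10_000, 1000)]
print(sum(ray.get(futures)))
```

The same script runs unchanged on a laptop or a cluster; only the resources behind ray.init() differ.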
Ray has spread quickly over the past couple of years and, with more than 18,000 GitHub stars, is now the fastest-growing open source project in distributed AI, Anyscale says. The company says the software is used by thousands of organizations globally, including Amazon, Ant Group, LinkedIn, Shopify, Uber, and Visa.
Anyscale has been in preview mode for the past year or so, but that changes today, as the company is set to announce the general availability of its service on AWS and Google Cloud, with plans to support Microsoft Azure in 2022.
Customers that do the work of integrating their applications “deeply” into Ray, which means writing against the Ray APIs or using Ray’s native libraries, gain the ability to run those applications in a serverless manner on AWS and Google Cloud. Ray also supports a lighter-weight “shallow” integration path that requires only about a day’s worth of work to get an application running in parallel, though it does not bring the same benefits as deep integration.
“You do need to write your application against the Ray APIs or against the native Ray libraries (i.e., deeply integrate it with Ray) for the application to run in a serverless manner on Anyscale,” Stoica and Nishihara tell Datanami via email. “There is no need to specify a cluster in order to run the application. Anyscale will automatically create the cluster, autoscale it to meet the application’s demands, and then tear down the cluster once the application finishes.”
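For a deeply integrated application, the connection point is where the managed service takes over cluster provisioning. The snippet below is a sketch under stated assumptions: the “anyscale://” address format and the cluster name are illustrative of a managed, Ray Client-style connection, not a documented Anyscale invocation.

```python
import ray

@ray.remote
def score(record):
    # Hypothetical per-record work.
    return len(str(record))

# Assumption: pointing ray.init() at the managed service (an "anyscale://"-style
# address is shown purely for illustration) is what triggers cluster creation,
# autoscaling, and teardown, per the description above. Locally, ray.init()
# with no arguments suffices.
ray.init("anyscale://my-staging-cluster")

results = ray.get([score.remote(r) for r in ["a", "bb", "ccc"]])
print(results)
```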
Ray integrates with a couple of dozen machine learning libraries and frameworks, including TensorFlow, PyTorch, scikit-learn, XGBoost, Horovod, Hugging Face, and Dask. The work has already been done to integrate many of these in a deep manner, which Anyscale expects to be the route most developers will eventually choose.
“Note that most of the applications and libraries use deep integration,” Stoica and Nishihara continue. “We expect shallow integration to be only used as a stop gap, as it gives developers the ability to integrate their distributed application with Ray in a few hours and experiment with it. However, we expect that eventually developers will leverage the deep integration to take advantage of all Ray features like autoscaling and fault tolerance.”
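As a concrete example of the deep path, Ray’s native libraries follow the same pattern: hand Ray a function and let it handle scheduling and scaling. The sketch below uses the Ray 1.x-era Tune API with a toy objective and search space of our own invention; it illustrates the approach rather than a recommended configuration.

```python
import ray
from ray import tune

def objective(config):
    # Hypothetical training step; a real objective would train and evaluate a model here.
    tune.report(score=config["lr"] * 100 - config["batch_size"] * 0.01)

ray.init()

# Tune fans the trials out across the cluster; on a managed Ray cluster they
# benefit from autoscaling and fault tolerance without extra code.
analysis = tune.run(
    objective,
    config={
        "lr": tune.grid_search([0.001, 0.01, 0.1]),
        "batch_size": tune.choice([16, 32, 64]),
    },
)
print("Best config:", analysis.get_best_config(metric="score", mode="max"))
```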
As serverless runtimes become more popular in the cloud, the ability to eliminate the need to manage a cluster on Anyscale may be as important as the simplified development of distributed applications.
“Compute requirements for AI apps have been doubling every 3.5 months,” Nishihara said in a press release. “Ray introduces a necessary and fundamental shift in how AI applications are developed and scaled. With the Anyscale managed Ray offering, we are excited to bring the power of Ray to the broader developer community and organizations worldwide.”
The potential to radically simplify not only the development of distributed AI applications but also their management in a serverless manner certainly has the attention of Andreessen Horowitz, which co-led the $100 million Series C funding round with Addition.
“With the creators of Ray at the helm, Anyscale is poised to disrupt a $13 trillion market opportunity in AI,” David George, a general partner with Andreessen Horowitz, said in a press release. “The growth of the Ray community and the success of organizations globally on Anyscale continues to impress. Most importantly, the Anyscale team has a unique combination of technical leadership and experience scaling organizations to win in the market.”
With its service generally available on two clouds and a valuation of $1 billion, Anyscale is now poised to scale up its business. Priorities at this point include the go-to-market side of the business (sales, marketing), as well as R&D, “across both open source and the managed offering…to deliver a more polished experience to Ray and Anyscale users,” the company says.
Anyscale is planning a launch event on December 15. Details can be found here.
Related Items:
From Amazon to Uber, Companies Are Adopting Ray
Scaling to Great Heights at the Ray Summit
Meet Ray, the Real-Time Machine-Learning Replacement for Spark