Pathway Launches Data Processing Engine for Real-Time AI That Can ‘Unlearn’
PARIS, July 26, 2023 — Pathway today announced the general launch of its data processing engine, which benchmarks show to be up to 90x faster than existing streaming solutions. The platform uniquely unifies batch and streaming data workflows to enable real-time machine learning and, critically, the ability for machines to ‘learn to forget.’
Until now, it has been nearly impossible for machines to learn and react to changes in real time the way humans do. Because of the complexity of designing streaming workflows, intelligent systems are typically trained on static (frozen) data uploads, including large language models like ChatGPT. This means their intelligence is stuck at a moment in time. Unlike humans, machines are not in a continuous state of learning and therefore cannot iteratively ‘unlearn’ information they were previously taught when it is found to be false, inaccurate, or outdated.
Pathway overcomes this thanks to its unique ability to mix batch and streaming logic in the same workflow. Systems can be continuously trained on new streaming data, with revisions made to individual data points without requiring a full batch re-upload. This can be compared to updating the value of one cell in an Excel spreadsheet, which doesn’t reprocess the whole document, only the cells that depend on it. Inaccurate source information can therefore be seamlessly corrected to improve system outputs.
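To make the spreadsheet analogy concrete, the following is a minimal, self-contained Python sketch of the incremental principle itself (not Pathway’s internals or API): when one data point is corrected, only the results that depend on it are adjusted, instead of everything being recomputed from scratch. The names and values are purely illustrative.

```python
# Conceptual sketch of incremental recomputation (illustrative, not Pathway internals).
# A correction to a single input adjusts dependent results by the delta,
# rather than reprocessing the full dataset.

values = {"sensor_a": 10.0, "sensor_b": 12.0, "sensor_c": 11.0}
total = sum(values.values())  # initial full computation: 33.0

def revise(key: str, new_value: float) -> None:
    """Correct one data point and patch the dependent aggregate."""
    global total
    total += new_value - values[key]  # apply only the change
    values[key] = new_value

revise("sensor_b", 9.0)  # the earlier reading turned out to be wrong
print(total)             # 30.0 -- updated without a full re-aggregation
```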
It has traditionally been extremely hard to design efficient systems that combine batch and streaming workflows. The situation has become even more complex since a third workflow entered the scene: generative AI, which needs fast and secure learning of context to deliver value.
Most organizations typically design two or more separate systems, which are unable to perform incremental updates to revise preliminary results. This has reduced confidence in machine learning systems and stalled the adoption of enterprise AI among organizations that need to make decisions based on accurate real-time data, such as in manufacturing, financial services and logistics. Bringing together batch and streaming data overcomes this challenge and enables true real-time systems for resource management, observability and monitoring, predictive maintenance, anomaly detection, and strategic decision-making.
Pathway Enables a Paradigm Shift Towards Real-Time Data
The Pathway data processing engine is enabling organizations to perform real-time data processing at scale. Existing clients include DB Schenker, which has reduced the time-to-market of anomaly-detection analytics projects from three months to one hour, and La Poste, which enabled a fleet CAPEX reduction of 16%.
Unique capabilities of the Pathway data processing engine supporting this shift to real-time include:
- Fastest data processing engine on the market, with unified batch and streaming: Capable of processing millions of data points per second, it largely surpasses current reference technologies such as Spark (in both batch and streaming), Kafka Streams, and Flink. Benchmarking of WordCount and PageRank against these systems also found that Pathway supports more advanced operations and is up to 90x faster, thanks to higher throughput and lower latency. The benchmarks were stress-tested by the developer community and are publicly available so the tests can be replicated. A detailed description of the benchmarks is available in the HAL preprint.
- Facilitates real-time systems: Pathway allows a seamless transition from existing batch systems to real-time and LLM architectures, with real-time machine learning that integrates fully into the enterprise context.
- Ease of development: Batch and streaming workflows can be designed with the same code logic in Python, which is then transposed into Rust (see the sketch after this list). This democratizes the design of streaming workflows, which has traditionally required a specialist skill set, and brings together teams that have often been separate within an organization. Thanks to this, Pathway becomes the lingua franca of all data pipelines: stream, batch, and generative AI.
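Below is a minimal sketch of what such a unified Python workflow could look like. The file paths, schema, and connector arguments are illustrative assumptions, and exact signatures may vary between Pathway versions; it is meant to convey the shape of a pipeline whose batch or streaming behavior is a configuration choice rather than a rewrite.

```python
import pathway as pw

# Hypothetical input schema and paths, for illustration only.
class Reading(pw.Schema):
    sensor: str
    value: float

# Read CSV input; switching between a one-off batch run and a live feed
# is assumed to be a connector setting, not a change to the pipeline logic.
readings = pw.io.csv.read(
    "./readings/",       # directory of CSV files, watched for changes when streaming
    schema=Reading,
    mode="streaming",    # use "static" for a batch run over the same code
)

# Per-sensor aggregation; results stay up to date as new or revised rows arrive.
averages = readings.groupby(readings.sensor).reduce(
    readings.sensor,
    avg=pw.reducers.avg(readings.value),
)

pw.io.csv.write(averages, "./averages.csv")
pw.run()  # hands the dataflow to the underlying engine
```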
Zuzanna Stamirowska, CEO and cofounder of Pathway, commented: “Until now, the complexity of building batch and streaming architectures has resulted in a division between the two approaches. This has slowed the adoption of data streaming for AI systems and fixed their intelligence at a moment in time. But there is a critical need for real-time data, both to optimize processing and to enable AI to unlearn for improved, continuous accuracy.
“That’s why our mission has been to enable real-time data processing, while giving developers a simple experience regardless of whether they work with batch, streaming, or LLM systems. Pathway is truly facilitating the convergence of historical and real-time data for the first time.”
The general launch of the Pathway platform follows the company’s $4.5m pre-seed round in December 2022, which was led by CEE VCs Inovo and Market One Capital, with angel investors Lukasz Kaiser, co-author of TensorFlow and informally known as the “T” in ChatGPT, and Roger Crook, the former global CEO of German delivery giant DHL.
About Pathway
Pathway develops real-time intelligence technology. Real-time learning is made possible by an effective and scalable engine, which powers LLMs and machine learning models. These models are automatically updated thanks to a framework that combines streaming and batch data and is user-friendly and flexible for developers, data engineers, and data scientists. The team, headed by Zuzanna Stamirowska, is made up of leading experts in artificial intelligence. They include CTO Jan Chorowski, who has co-authored research with Geoff Hinton and Yoshua Bengio, as well as business angel Lukasz Kaiser, who co-authored TensorFlow and is also known as the “T” in ChatGPT.
Source: Pathway