DENVER, Nov. 13, 2023 — NeuReality has launched its much-anticipated, fully integrated NR1 AI Inference solution this week at the international SC23 conference, offering a long-awaited cure for the inefficiency and expense that plague today's big CPU-centric data centers.
Now delivering 10x performance and 90 percent cost savings per AI operation, and backed by a line-up of business partners and customers, NeuReality will demonstrate the world's first affordable, ultra-scalable AI-centric servers designed purely for inference, that is, the day-to-day operation of trained AI models.
As expensive as it is to run live AI data in the world's data centers, AI inferencing remains a blind spot in the industry, according to NeuReality Co-founder and CEO Moshe Tanach: "ChatGPT is a new and popular example, of course, but generative AI is in its infancy. Today's businesses are already struggling to run everyday AI applications affordably – from voice recognition systems and recommendation engines to computer vision and risk management," says Tanach. "Generative AI is on their horizon too, so it's a compounding problem that requires an entirely new AI-centric design ideal for inferencing. Our customers will benefit immediately from deploying our easy-to-install and easy-to-use solution with established hardware and solution providers."
Anticipating the need for more affordable, faster, and more scalable AI inference dates back to before NeuReality's founding in 2019. The company focuses on one of the biggest problems in artificial intelligence: making the inference phase both economically sustainable and scalable enough to meet consumer and enterprise demand as AI adoption accelerates.
Today, for every $1 spent on training an AI model, businesses spend about $8 to run that model, according to Tanach. "That astronomical energy and financial cost will only grow as AI software, applications and pipelines ramp up in the years to come on top of larger, more sophisticated AI models."
With the NR1 system, future AI-centric data centers will see 10x performance capability, empowering financial, healthcare, government, and small businesses to create better customer experiences with more AI inside their products. That, in turn, can help companies generate more top-line revenue while cutting bottom-line costs by 90 percent.
“NeuReality’s AI inference system comes at the right time when customers not only desire scalable performance and lower total cost of ownership, but also want open-choice, secure and seamless AI solutions that meet their unique business needs,” said Scott Tease, Vice President, General Manager, Artificial Intelligence and HPC WW at Lenovo.
“NeuReality is bringing highly efficient and easy-to-use AI innovation to the data center. Working together with NeuReality, Lenovo looks forward to extending this transformative AI solution to customer data and delivering rapid AI adoption for all. As a leader in our Lenovo AI Innovators Program, NeuReality’s technologies will help us to deliver proven cognitive solutions to customers as they embark on their AI journeys,” said Tease.
At SC23 this week, NeuReality will demonstrate its easy-to-deploy software development kit, APIs, and two flavors of hardware: the NR1-M AI Inference Module and the NR1-S AI Inference Appliance. Each demo, run alongside OEM and Deep Learning Accelerator (DLA) partners, addresses specific market sectors and AI applications, showcasing the breadth of NeuReality's technology stack and its compatibility with all DLAs. The system architecture will feature one-of-a-kind, patented technologies, including:
- NR1 AI-Hypervisor hardware IP: a novel hardware sequencer that offloads data movement and processing from the CPU, an architectural cornerstone for heterogeneous-compute semiconductor devices;
- NR1 AI-over-Fabric network engine: an embedded NIC (Network Interface Controller) with offload capabilities for an optimized network protocol dedicated to inference. The AIoF (AI-over-Fabric) protocol optimizes networking between AI clients and servers, as well as between connected servers forming a large language model (LLM) cluster or other large AI pipelines;
- NR1 NAPU (Network Addressable Processing Unit): a network-attached heterogeneous chip for complete AI-pipeline offloading, leveraging Arm cores to host Linux-based server applications with native Kubernetes for cloud and data center orchestration.
“We are thrilled to be working with NeuReality to deliver inference-as-a-service in banking, insurance and investment services,” said PJ Go, CEO, Cirrascale Cloud Services. “As a specialized cloud and managed services provider deploying the latest training and inference compute with high-speed storage at scale, we focus on helping customers choose the right platform and performance criteria for their cloud service needs. Working with NeuReality to help solve for inference – arguably the biggest issue facing AI companies today – will undoubtedly unlock new experiences and revenue streams for our customers.”
About NeuReality Ltd.:
NeuReality is an AI technology innovation company creating purpose-built AI platforms for ultra-scalability of real-life AI applications. Founded in 2019 and led by a seasoned management team with extensive experience in data center architecture, systems, and software, NeuReality has positioned itself as a pioneer in the deep learning and AI solutions market. NeuReality is headquartered in Caesarea, Israel. For more information, visit www.neureality.ai.