We are excited to bring Transform 2022 back in-person July 19 and virtually July 20 - 28. Join AI and data leaders for insightful talks and exciting networking opportunities. Register today!


Enterprise MLops platform Domino Data Lab announced its new Nexus hybrid enterprise MLops architecture today. The company says Nexus, previewed at its recent Rev3 conference, allows companies to rapidly scale, control and orchestrate data science work across different compute clusters (in different geographic regions, on-premises, and even across multiple clouds) without sacrificing reliability, security or usability.

According to Forrester Research, two-thirds of IT decision makers have already invested in hybrid support for AI workload development. With Nexus, customers can gain maximum cost optimization by leveraging owned on-premises Nvidia GPUs, as well as the ability to move workloads to cloud-based GPUs when additional capacity is needed, the company explained in a press release. Domino Data Lab has already begun development of Nexus with Nvidia as a launch partner, while specific solution architectures validated for Nvidia technologies are planned for later this year.

Enterprises want machine learning flexibility

According to Nick Elprin, CEO of Domino Data Lab, the launch of Nexus comes as companies grapple with the reality that their data and large compute needs may sit both in the cloud and on-premises, and as many move some cloud workloads back on-premises for cost and efficiency reasons.

“The rise and importance of cloud is a real trend, but it oversimplifies the reality of what enterprises are dealing with,” Elprin told VentureBeat. “It’s not as simple as ‘all compute is in the cloud,’ when data science and machine learning workloads are in many different places.” Enterprises want one unified platform and architecture that lets them execute machine learning workloads wherever the data lives and gives them the flexibility of managing costs efficiently, he explained. 

Manuvir Das, VP of enterprise computing at Nvidia, said that if companies are willing to run a relatively steady-state workload outside of the public cloud, “the economics outside of the public cloud are far better.” He added: “Now, if you’re doing things where you do a lot of work one day and very little the next day, then the cloud’s perfect because you get that flexibility and elasticity.”

Helping bridge the gap between IT and data science

Hybrid MLops will also help bridge the longstanding gap between IT and data science teams, Elprin explained. For example, one of the things IT cares about most is data security, but as modern enterprise data strategies evolve, data ends up in many different places, often for security reasons.

“When data science teams want to build models that operate on sensitive data, they’re always running up against IT and InfoSec restrictions about how they have to move that data around,” Elprin said. 

IT also cares about cost, he added. “They always have and they always will – and machine learning workloads are very expensive,” he said. “As more computationally intensive algorithms get developed, and data science teams are pushing the boundaries of having more impact with more compute spend, that’s creating a natural friction or tension between IT and data science.” 

Nexus helps by giving data science organizations options for managing the cost of machine learning workloads. “One of our customers has something like 500 GPU boxes on premise and they run those 24-7, so the cost of moving those GPU workloads to the cloud would be millions of dollars a year,” Elprin said. “Nexus gives IT the flexibility to manage costs and still unlocks productivity for data scientists.”

IT as data science hero

Das shared the example of a pharmaceutical company customer that began using data science in one team. Then other teams followed. “Before they knew it there were ten different mini organizations around their company, living in silos, doing these machine learning workloads for their own particular purpose and also geographically distributed,” he said.

The data science teams at this client pushed their machine learning workloads so far that they ran up against the availability of resources in the cloud they were using. “The data science teams were complaining saying, ‘we need more computers,’ while IT was saying, ‘we’re hitting cloud limits and the cloud provider can’t get us more GPUs,’” said Elprin.

That’s when Nvidia and Domino Data Lab really engaged with the IT team to provide, essentially, a center of excellence for all the data scientists across their company. “It has completely changed the relationship between IT and the data science teams,” said Das. “IT can be a huge part of this – they can be the hero.” 

Unlocking the hybrid potential of MLops

“I believe we’re in an era where more availability of computers will lead to more breakthroughs, so Nexus is unlocking more pools of compute resources,” said Elprin. Previously, he explained, if you were limited by compute, there might have been ideas you didn’t even bother to test. “It unlocks the art of the possible – the kinds of ideas that data science teams can test and experiment with.”

In addition, Elprin said that he believes the most successful enterprises don’t think of data science, machine learning and AI as a new function to be stood up the way one would build a marketing team. “In reality, they think about it as an integrated cross-functional capability, that is weaving all the different parts of the business together – which is critical to unlock the potential of MLops.”

Finally, every company needs a hybrid strategy, said Das, where everything runs everywhere – but that has rarely been realized in practice. 

“I think the Nexus interface is a pragmatic example of how to run a hybrid workload,” said Das. “There has been a lot of buzz on [hybrid] for a decade, but not that many pragmatic examples of it – this is a way of really proving out the fact that yes, hybrid can work.”
