AI / ML / DL with OpenShift and IBM Power Systems

OpenShift for ML on IBM Power Systems is an ideal solution to reduce costs, modernize your deployments, accelerate your ML/DL training and increase collaboration between your teams, thanks to hardware designed specifically for these new workloads, the possibilities offered by new GPU virtualization technologies, and the market-leading Kubernetes-based hybrid cloud solution from Red Hat and IBM.

What are Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL)?

Artificial Intelligence (AI)

AI is the ability of machines to mimic intelligent human behavior and perform tasks that normally require human intelligence.

Machine Learning (ML)

ML is, within AI, the ability to learn, using different models, without being directly programmed to do so. Algorithms and statistical models are used to identify patterns and draw inferences, enabling a degree of unsupervised learning.

Deep Learning (DL)

Deep learning goes one step further: it extracts information progressively from any data input. Its layered architecture enables the processing of images and even human language, including speech and object recognition.

What does OpenShift bring to DL and ML systems?

By using containers within our hybrid cloud to deploy our Deep Learning and Machine Learning workloads, we can make much better use of our infrastructure investment: storage, servers and networking. Since OpenShift version 4.7, when deployed on Power Systems (specifically on the AC922 and IC922 models), different ML and DL models can run side by side, even sharing GPUs. This is a real revolution for on-premises ML projects and brings more than significant cost savings compared to existing cloud alternatives: on top of the cost of the training runs themselves, all the data must be uploaded to and downloaded from the cloud, with the high transfer costs that this entails.
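From a workload's point of view, GPU sharing on OpenShift comes down to a resource request in the pod specification: a device plugin exposes the GPUs (or GPU slices) as schedulable resources. The sketch below is illustrative, not from the original text; the pod name, the container image and the `nvidia.com/gpu` resource name (exposed by NVIDIA's device plugin) are assumptions about a typical setup.

```yaml
# Hypothetical pod requesting one GPU (or one GPU slice, if the
# device plugin is configured for sharing).
apiVersion: v1
kind: Pod
metadata:
  name: ml-training                # illustrative name
spec:
  containers:
    - name: trainer
      image: quay.io/example/pytorch-train:latest   # illustrative image
      resources:
        limits:
          nvidia.com/gpu: 1        # resource exposed by the NVIDIA device plugin
```

Applied with `oc apply -f pod.yaml`, the scheduler places the pod on a node with a free GPU (or slice), so several teams' training jobs can share the same accelerators.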

OpenShift Container Storage and DevOps

With the help of OpenShift Container Storage (OCS), each developer can manage different instances and versions of the same model following DevOps practices, on our own storage. When the model is ready to be deployed, the user can start a configuration and deployment process at any time, backed by version control and advanced orchestration capabilities, including automatic testing of the new code. This is made possible by the latest advances in GPU virtualization technologies and by integration with the only hardware platform that has dedicated NVLink connections between GPUs and CPU sockets. This avoids bottlenecks, with bandwidth several times that of Intel processor-based architectures. We can also run different models simultaneously on the GPUs of the same card.
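As a minimal sketch of how each model version can get its own storage, a developer could carve a volume out of OCS with a persistent volume claim. The claim name is illustrative, and `ocs-storagecluster-ceph-rbd` is assumed here as the storage class (the default block storage class OCS typically creates):

```yaml
# Hypothetical persistent volume claim for one model version's data,
# backed by OpenShift Container Storage.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-v2-data                              # illustrative name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  storageClassName: ocs-storagecluster-ceph-rbd    # assumed OCS block storage class
```

Each training pod then mounts its own claim, so versions of the same model remain isolated while living on shared on-premises storage.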

Collaboration between data scientists and developers

OpenShift is a unified platform where data scientists, software developers and system administrators can collaborate in a simple and robust way. This allows you to accelerate the deployment of applications of all kinds, including ML/AI, in minutes thanks to its self-service portal. Quickly create, scale, reproduce, test and share the results of AI/DL/ML models in an agile way with everyone involved in these projects, including project managers, mathematicians, programmers and customers.

Can I still use AWS, Azure, or Google Cloud for ML?

Of course. For certain models, workloads or projects it will make sense, for various reasons, to use cloud provider services. For others, whether because of high costs or data protection requirements, we will choose to run them on our own infrastructure. OpenShift allows you to manage both easily and transparently.

SiXe Ingeniería