The trend towards hyperscale AI started a
few years ago, and many companies and institutions have been increasing their
investment in GPU cluster systems and clouds. Gangwon Jo co-founded Moreh in
2020 to make it easier to build and utilize AI infrastructure at scale. He
believes that many infrastructure-level challenges are due to the limitations
of the legacy AI software stacks, specifically deep learning frameworks and
parallel computing platforms. From this insight, he leads the development of
the MoAI platform, a set of fully integrated software components from deep
learning primitive libraries to application-level APIs. The platform bridges AI
applications and underlying accelerators in a more efficient, scalable, and
flexible way.
The MoAI platform provides 100%
PyTorch/TensorFlow compatibility and ensures the same user experience as
traditional AI software stacks. Numerous existing AI applications can run on
the platform without any modification. However, the internal structure of the
platform is completely different from existing deep learning frameworks. The
platform adopts new techniques, including runtime IR construction and
just-in-time compilation: it first records the behavior of an application as a
computational graph. The just-in-time compiler then finds the optimal way of
executing the application based on this bird's-eye view of what the user
wants to do.
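The record-then-compile idea can be illustrated with a minimal sketch. This is not Moreh's actual implementation; the `LazyTensor` class and `evaluate` function below are hypothetical stand-ins showing how operations can be recorded into a graph at runtime instead of being executed eagerly, so that the whole graph is visible before anything runs:

```python
# Illustrative sketch only (not the MoAI platform's real internals):
# arithmetic on LazyTensor builds a computational graph; evaluate()
# walks the recorded graph later, when the full picture is known.

class LazyTensor:
    def __init__(self, op, inputs, value=None):
        self.op, self.inputs, self.value = op, inputs, value

    def __add__(self, other):
        return LazyTensor("add", [self, other])   # record, don't compute

    def __mul__(self, other):
        return LazyTensor("mul", [self, other])   # record, don't compute

def constant(v):
    return LazyTensor("const", [], value=v)

def evaluate(node):
    # A real JIT compiler would optimize the whole graph here
    # (operator fusion, parallelization) before executing it.
    if node.op == "const":
        return node.value
    args = [evaluate(i) for i in node.inputs]
    return args[0] + args[1] if node.op == "add" else args[0] * args[1]

x = constant(2)
y = constant(3)
z = (x + y) * x        # nothing is computed yet; only the graph exists
print(evaluate(z))     # the recorded graph is executed as a whole -> 10
```

Because execution is deferred, the compiler sees every operation the application will perform before choosing how to run it, which is what enables whole-graph optimization.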
The platform features accelerator portability,
single device abstraction, and application-level virtualization. It can run AI
applications on non-NVIDIA GPUs and other types of accelerators such as NPUs.
AI infrastructure can be built on the most cost-effective hardware without
concern for software compatibility. It encapsulates a large cluster system as a
single device. Users can easily implement AI applications, especially those
that deal with large AI models such as GPT-3, without considering
parallelization across multiple accelerators and nodes. Lastly, it can provide
users with virtual devices instead of exposing physical
accelerators directly. The mapping between virtual and physical devices is
solely managed by the platform. This enables more flexible AI cloud services
and drastically improves the average utilization of accelerators.
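To make the virtualization idea concrete, here is a hedged sketch of a virtual-to-physical device mapping. The `VirtualDevicePool` class is hypothetical and much simpler than a real scheduler; it only illustrates the principle that the platform, not the user, decides which physical accelerators back each virtual device, so idle hardware can be reassigned to other tenants:

```python
# Hypothetical sketch: the platform owns the mapping between virtual
# devices (what users see) and physical accelerators (what exists),
# so it can reassign idle hardware and raise average utilization.

class VirtualDevicePool:
    def __init__(self, physical_gpus):
        self.free = list(physical_gpus)   # idle physical accelerators
        self.mapping = {}                 # virtual device id -> physical GPUs

    def allocate(self, virtual_id, count):
        # Back one virtual device with `count` physical accelerators.
        if len(self.free) < count:
            raise RuntimeError("not enough idle accelerators")
        self.mapping[virtual_id] = [self.free.pop() for _ in range(count)]
        return self.mapping[virtual_id]

    def release(self, virtual_id):
        # Return the accelerators to the pool for other tenants to reuse.
        self.free.extend(self.mapping.pop(virtual_id))

pool = VirtualDevicePool(["gpu0", "gpu1", "gpu2", "gpu3"])
pool.allocate("job-a", 2)   # job A sees one virtual device backed by 2 GPUs
pool.allocate("job-b", 1)
pool.release("job-a")       # job A's GPUs immediately become reusable
```

The key design point is that users only ever hold a virtual device id; because no physical accelerator is pinned to a user, the platform is free to remap hardware as workloads come and go.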