REDWOOD
CITY, Calif., April 3,
2024 /PRNewswire/ -- FriendliAI, a frontrunner in
inference serving for generative AI, is thrilled to announce
Friendli Dedicated Endpoints, which offers the capabilities of
Friendli Container as a managed service. This latest addition to
the Friendli Suite eliminates the complexities of containerization
and development, providing customers with automated,
cost-effective, and high-performance custom model
serving.
Friendli Dedicated Endpoints is the managed cloud service
alternative to Friendli Container. Friendli Container, already adopted by startups and enterprises alike to deploy Large Language Models (LLMs) at scale within private environments, delivers significant reductions in GPU costs through the highly GPU-optimized Friendli Engine, which also powers Friendli Dedicated Endpoints.
In addition to leveraging the Friendli Engine, Friendli
Dedicated Endpoints streamlines the process of building and serving
LLMs through automation, making the workflow more cost- and time-efficient. Friendli Dedicated Endpoints manages and operates generative AI deployments end to end, from custom model fine-tuning to cloud resource procurement to automated deployment monitoring.
For instance, users can fine-tune and deploy a quantized Llama 2 or
Mixtral model using the powerful Friendli Engine in just a few
clicks, bringing cutting-edge GPU-optimized serving to users of all
technical backgrounds.
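To make the workflow concrete, here is a minimal sketch of how a deployed custom model might be queried over HTTP once it is live. The endpoint URL, model identifier, token environment variable, and OpenAI-style chat-completions request shape are all assumptions made for illustration; the actual values and API details come from the Friendli Dedicated Endpoints documentation and dashboard.

```python
import os
import requests

# Placeholder values for illustration only; real endpoint URLs, model IDs,
# and tokens come from the Friendli Dedicated Endpoints dashboard.
ENDPOINT_URL = "https://inference.friendli.ai/v1/chat/completions"  # assumed URL
MODEL_ID = "my-fine-tuned-llama-2"                                   # assumed model ID
API_TOKEN = os.environ["FRIENDLI_TOKEN"]                             # assumed env var name

# Send a single chat-completion request (OpenAI-style schema assumed).
response = requests.post(
    ENDPOINT_URL,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={
        "model": MODEL_ID,
        "messages": [
            {"role": "user", "content": "Summarize the benefits of managed inference."}
        ],
        "max_tokens": 128,
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```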
Byung-Gon Chun, CEO of FriendliAI, highlighted the importance of democratizing generative AI, emphasizing its role in driving innovation and organizational productivity.
"With Friendli Dedicated Endpoints, we're eliminating the hassle
of infrastructure management so that customers can unlock the full
potential of generative AI with the power of Friendli Engine.
Whether it's text generation, image creation, or beyond, our
service opens the doors to endless possibilities for users of all
backgrounds."
Key features of Friendli Dedicated Endpoints:
- Dedicated GPU Instances: Users can reserve entire GPUs
for serving their custom generative AI models, ensuring consistent
and reliable access to high-performance GPU resources.
- Custom Model Support: Users can upload, fine-tune, and
deploy models, enabling tailored solutions for diverse AI
applications.
- Superior Performance and Efficiency: A single GPU running the optimized Friendli Engine delivers performance equivalent to up to seven GPUs running vLLM. Friendli Engine cuts GPU costs by 50% to 90% and boasts up to 10x faster query response times.
- Intelligent Operation: Friendli Dedicated Endpoints seamlessly adapts to fluctuating workloads and failures through automated failure management and auto-scaling that adjusts resource allocation to traffic patterns, ensuring uninterrupted operation and efficient resource use during peak demand.
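Auto-scaling and failure recovery happen on the service side, but client code is often paired with a small amount of defensive retry logic for the brief moments when capacity is shifting. The sketch below is a generic exponential-backoff wrapper around the request from the earlier example; it is a common client-side pattern shown as an assumption, not a documented requirement or behavior of Friendli Dedicated Endpoints.

```python
import time
import requests

def post_with_retries(url, headers, payload, max_attempts=5):
    """Retry a POST with exponential backoff on transient failures
    (429/5xx responses and connection errors). Generic pattern, not
    FriendliAI-specific."""
    delay = 1.0
    for attempt in range(1, max_attempts + 1):
        try:
            resp = requests.post(url, headers=headers, json=payload, timeout=60)
            if resp.status_code != 429 and resp.status_code < 500:
                resp.raise_for_status()  # surface non-retryable 4xx errors
                return resp              # success (2xx)
        except requests.exceptions.ConnectionError:
            pass  # treat as transient and retry
        if attempt == max_attempts:
            raise RuntimeError("endpoint still unavailable after retries")
        time.sleep(delay)
        delay *= 2  # back off: 1s, 2s, 4s, ...
```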
By eliminating technical barriers and optimizing GPU usage, FriendliAI aims to ensure that infrastructure constraints no longer hinder innovation in generative AI.
Chun says, "We're thrilled to welcome new users on our journey
to make generative AI models fast and affordable."
For more information about Friendli Dedicated Endpoints or
Friendli Container, please visit https://friendli.ai/
About FriendliAI:
FriendliAI is a leader in inference serving for generative AI,
committed to democratizing access to cutting-edge generative AI
technologies. By providing accessible generative AI infrastructure
services for developers, FriendliAI aims to accelerate innovation
in the field of generative AI.
For media inquiries or interview requests, please contact
Sujin Oh at
press@friendli.ai
View original content: https://www.prnewswire.com/news-releases/friendliai-introduces-friendli-dedicated-endpoints-a-managed-service-version-of-friendli-container-to-increase-accessibility-302105724.html
SOURCE FriendliAI