Intelligent CIO APAC Issue 47 | Page 71

INTELLIGENT BRANDS // Software for Business

DataStax to deliver high-performance RAG solution using NVIDIA microservices

Cutting-edge collaboration enables enterprises to use DataStax Astra DB with NVIDIA NIM inference microservices to create instantaneous vector embeddings that fuel real-time GenAI use cases.

DataStax is supporting enterprise retrieval-augmented generation (RAG) use cases by integrating the new NVIDIA NIM inference microservices and NeMo Retriever microservices with Astra DB to deliver high-performance RAG data solutions for superior customer experiences.

With this integration, users will be able to create instantaneous vector embeddings 20x faster than other popular cloud embedding services and benefit from an 80% reduction in cost for services.
Organizations building generative AI applications face daunting technological complexity, security and cost barriers associated with vectorizing both existing and newly acquired unstructured data for seamless integration into large language models (LLMs). The urgency of generating embeddings in near-real time, and of effectively indexing data within a vector database on standard hardware, further compounds these challenges.

DataStax is collaborating with NVIDIA to help solve this problem .
NVIDIA NeMo Retriever generates over 800 embeddings per second per GPU, pairing well with DataStax Astra DB, which can ingest new embeddings at more than 4,000 transactions per second at single-digit-millisecond latencies on low-cost commodity storage. Together they deliver lightning-fast embedding generation and indexing while greatly reducing total cost of ownership for users.
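The throughput figures quoted above can be sanity-checked with a quick back-of-the-envelope calculation. This is an illustrative sketch only; real-world throughput depends on batch size, document length and hardware configuration:

```python
# Back-of-the-envelope sizing for an embedding pipeline, based on the
# figures quoted in the article. Illustrative only -- actual numbers
# depend on batch size, document length and hardware.

EMBEDDINGS_PER_SEC_PER_GPU = 800   # NeMo Retriever rate, per the article
DB_INGEST_TPS = 4000               # Astra DB ingest rate, per the article


def gpus_to_saturate_db(db_tps: int = DB_INGEST_TPS,
                        gpu_rate: int = EMBEDDINGS_PER_SEC_PER_GPU) -> float:
    """GPUs of embedding throughput needed to match the DB ingest rate."""
    return db_tps / gpu_rate


def time_to_embed(corpus_size: int, num_gpus: int = 1) -> float:
    """Seconds to embed a corpus, bounded by GPU embedding throughput."""
    return corpus_size / (EMBEDDINGS_PER_SEC_PER_GPU * num_gpus)


if __name__ == "__main__":
    print(gpus_to_saturate_db())        # 5.0 GPUs keep one ingest pipeline busy
    print(time_to_embed(1_000_000, 5))  # 250.0 s for a million documents
```

At these rates, roughly five GPUs of embedding throughput saturate a single 4,000 TPS ingest pipeline, which is why pairing fast embedding generation with fast ingest matters for end-to-end latency.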
With embedded inferencing built on NVIDIA NeMo and NVIDIA Triton Inference Server software, Astra DB achieved 9.48 ms latency when embedding and indexing documents in RAG use cases running on NVIDIA H100 Tensor Core GPUs, a 20x improvement.
When combined with NVIDIA NeMo Retriever, Astra DB and DataStax Enterprise (DataStax's on-premises offering) provide a fast vector database RAG solution built on a scalable NoSQL database that can run on any storage medium. Out-of-the-box integration with RAGStack (powered by LangChain and LlamaIndex) makes it easy for developers to replace their existing embedding model with NIM. In addition, using the RAGStack compatibility matrix tester, enterprises can validate the availability and performance of various combinations of embedding and LLM models for common RAG pipelines.
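In a LangChain-based RAGStack pipeline, swapping the embedding model for a NIM-served one typically comes down to changing the embedder passed into the vector store. A minimal sketch follows; the class and parameter names are assumptions based on the public `langchain-nvidia-ai-endpoints` and `langchain-astradb` packages, so verify them against the current documentation before use:

```python
# Hypothetical sketch: pointing a LangChain/RAGStack pipeline at a
# NIM-served embedding model and Astra DB. Class and parameter names
# are assumptions -- check current package documentation before use.

def build_vector_store(nim_base_url: str, astra_token: str, astra_endpoint: str):
    """Construct an Astra DB vector store backed by a NIM embedding model."""
    # Imports are kept inside the function so the sketch can be read
    # without the third-party packages installed.
    from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings
    from langchain_astradb import AstraDBVectorStore

    # Point the embedder at a self-hosted NIM endpoint instead of a
    # cloud embedding service.
    embeddings = NVIDIAEmbeddings(base_url=nim_base_url)

    return AstraDBVectorStore(
        embedding=embeddings,
        collection_name="rag_documents",
        token=astra_token,
        api_endpoint=astra_endpoint,
    )
```

The rest of the retrieval chain is unchanged: because the embedding model sits behind a common interface, replacing a cloud embedding service with a NIM endpoint does not require touching the retriever or LLM stages.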
DataStax is also launching, in developer preview, a new feature called Vectorize, which performs embedding generation at the database tier. This enables customers to use Astra DB to generate embeddings with its own NeMo microservices instance rather than provisioning their own, passing the cost savings directly on to customers.
Chet Kapoor, Chairman and CEO, DataStax, said: "Integrating NVIDIA NIM into RAGStack cuts down the barriers enterprises are facing and brings them the high-performing RAG solutions they need to make significant strides in their GenAI application development."
Kari Briski, Vice President of AI Software, NVIDIA, said: "Enterprises are looking to leverage their vast amounts of unstructured data to build more advanced generative AI applications.
"Using the integration of NVIDIA NIM and NeMo Retriever microservices with DataStax Astra DB, businesses can significantly reduce latency and harness the full power of AI-driven data solutions."