
February 28, 2024

More AI Added to Google Cloud’s Databases

(Michael Vi/Shutterstock)

Google Cloud is bolstering its analytics and transactional databases, including BigQuery, AlloyDB, and Spanner, with new capabilities designed to drive the development of generative AI applications among its customers.

BigQuery, which is Google Cloud’s top database for powering analytical and AI workloads, received several AI enhancements. First, the company rolled out the preview of an integration between BigQuery and Vertex AI for text and speech. This will allow users to extract insights from unstructured data like images and documents, Google Cloud says.

Gemini, the company’s largest and most capable AI model, is also now available to BigQuery customers via Vertex AI. The model has been the subject of some controversy following a rocky consumer debut last week.

These AI capabilities come on the heels of the previously announced vector search capability in BigQuery. The vector search function, also in preview, enables critical components of GenAI applications, such as similarity search and retrieval-augmented generation (RAG) using large language models.
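To make the similarity search and RAG terms concrete, here is a minimal Python sketch of the retrieval step. It is an illustration only, not BigQuery's vector search function: the three-dimensional embeddings are hand-made toy values, and the names `cosine_similarity`, `top_k`, `build_prompt`, and `CORPUS` are hypothetical. A real application would get embeddings from a model and search them inside the database.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, corpus, k=2):
    """Return the k texts whose embeddings are most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


def build_prompt(question, query_vec, corpus, k=2):
    """RAG step: prepend the retrieved passages to the user's question."""
    context = "\n".join(top_k(query_vec, corpus, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


# Toy corpus: (text, embedding) pairs. The vectors are made up for the example.
CORPUS = [
    ("Refund policy: 30 days", [0.9, 0.1, 0.0]),
    ("Shipping takes 5 days", [0.1, 0.9, 0.0]),
    ("Returns need a receipt", [0.8, 0.2, 0.1]),
]

if __name__ == "__main__":
    print(build_prompt("Can I get a refund?", [1.0, 0.0, 0.0], CORPUS))
```

The prompt returned by `build_prompt` is what would be sent to a large language model, so that the model answers from the retrieved passages rather than from its training data alone.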

Having access to Vertex AI directly within BigQuery bolsters the ease-of-use story for Google Cloud AI customers in several ways, said Gerrit Kazmaier, GM and VP for data analytics.

“As a data analytic practitioner, you can access all of the Vertex AI models, including our Gemini [model] just from your SQL command line or BigQuery embedded Python API,” Kazmaier said in a press conference yesterday. “That is amazing because it means you don’t need to go to a data scientist or machine learning platform. You can access it right in the domain you’re working in, right on the data you have at hand.”

The second big benefit of the integration is better access to data for AI models, Kazmaier said. Prior to this integration, getting data to the AI models typically required the construction and operation of a data pipeline to move the data. That is no longer needed, he said. “All of that complexity just goes away,” he said.

The ability to combine text- and image-based AI models within Vertex, now available to data analysts via BigQuery, will also benefit customers significantly, Kazmaier said.

“This unlocks [a] whole new step of analytical scenarios,” he said. “The summarization, sentiment extraction, classification, enrichment, translation of structured and unstructured data. And that is a huge deal. This is really the news here, because 90%, roughly speaking, of the data out there is unstructured. This data is usually not used in enterprise data analytics because you couldn’t work with them in a meaningful way.”

On the transactional (or operational) front, Google Cloud announced the general availability of AlloyDB AI, the AI-specific version of the hosted Postgres database the company unveiled at its Next 23 conference last year. Because AlloyDB AI can store vector embeddings and perform vector searches, Google Cloud sees it as a core component of its customers’ GenAI use cases.

Google Cloud also rolled out a new integration with LangChain, a popular open source framework that helps connect customers’ data to large language models (LLMs). All of Google Cloud’s databases will be integrated with LangChain, said Andi Gutmans, Google Cloud’s GM and VP for databases.

The new capabilities were developed in response to customer demand for ways to get more GenAI value from their data, Gutmans said.

“That’s really what Gerrit and I spend our time on,” Gutmans said in the press conference with Kazmaier. “We own the data. We know AI cannot be successful without the data and so how do we make sure that this AI can really work with the data in concert and with data in real time.”

The company also announced that it’s adding vector search capabilities to other databases that it hosts for customers on its cloud, including its Redis and MySQL offerings. Cloud Spanner, Firestore, and Bigtable will also be getting vector capabilities, Gutmans said.

“What’s special about Spanner is this will be exact nearest-neighbor search capability, which is slightly a different variant,” Gutmans said. “What’s really exciting about that is customers who have very, very large use cases–for example, trillions of vectors, highly partitioned based on users for example. You can imagine some of the Google internal apps are kind of partitioned by user–they will be able to store and search vectors at a trillion [vector] scale.”
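The difference between exact and approximate nearest-neighbor search, and the reason user partitioning helps, can be shown with a small Python sketch. It is an illustration under stated assumptions, not Spanner's implementation: the class `PartitionedVectorStore` and its methods are hypothetical, and Euclidean distance is an arbitrary choice of metric. An exact search compares the query against every candidate vector, which stays affordable when each query only has to scan one user's partition.

```python
import heapq
import math
from collections import defaultdict


class PartitionedVectorStore:
    """Toy in-memory store that keeps each user's vectors in a separate partition."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def add(self, user_id, item_id, vector):
        """Store one vector under the given user's partition."""
        self._partitions[user_id].append((item_id, vector))

    def exact_nearest(self, user_id, query, k=1):
        """Exact k-nearest-neighbor search within a single user's partition.

        Every vector in the partition is compared to the query, so the result
        is exact rather than approximate. Other users' vectors are never scanned.
        """
        rows = self._partitions.get(user_id, [])
        nearest = heapq.nsmallest(k, rows, key=lambda row: math.dist(query, row[1]))
        return [item_id for item_id, _ in nearest]


if __name__ == "__main__":
    store = PartitionedVectorStore()
    store.add("alice", "a1", [0.0, 0.0])
    store.add("alice", "a2", [5.0, 5.0])
    store.add("bob", "b1", [0.1, 0.1])
    # Only alice's two vectors are scanned, even though bob's is closer overall.
    print(store.exact_nearest("alice", [0.2, 0.2], k=1))
```

In this sketch the cost of a query grows with the size of one partition, not with the total number of vectors stored.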

All databases will eventually need vector functions, including the capability to store vector embeddings as well as some type of vector search function, Gutmans said.

“Our belief is really any database, any place where you’re storing operational data that you may need to use in a GenAI use case should also have vector capabilities,” he said. “This is no different from 15 to 20 years ago when database[s] all added JSON support. We believe good vector capabilities should just [be a] foundational capability of the database.”

Related Items:

Google Vertex AI Search Add News GenAI Capabilities And Enterprise-Ready Features

Google Cloud Overhauls AI with Vertex Launch

Google Cloud Launches New Postgres-Compatible Database, AlloyDB
