By Shubham Sharma, VentureBeat

2023 was the year of generative AI. However, as companies moved to strengthen their AI strategies, they also realized the value of clean, high-quality data, which brought the need for robust data infrastructure back into focus. From Snowflake to Microsoft, data ecosystem vendors cashed in on this opportunity, launching new offerings and sometimes even acquiring notable players, to give their customers the ability to tap their data for various AI applications and to implement AI capabilities in their products.

These are VentureBeat’s top 5 data stories of 2023.

1. Microsoft’s move to beat Amazon and Google in the cloud war

In May, Microsoft announced Fabric, an end-to-end analytics platform that combines all the data and analytics tools organizations need, including Azure Synapse Analytics and Power BI, into a single unified product. We spoke with analysts to understand what makes this offering, which aims to unlock the potential of data and lay the foundation for AI, unique, and how it might help Microsoft “leapfrog” Amazon and other cloud providers, such as Google, at least when it comes to serving large enterprise companies. “With all these capabilities coming together, Microsoft definitely has a slight advantage over the other hyperscalers at the moment,” Noel Yuhanna, an analyst at Forrester, told VentureBeat.

2. The rise of the vector database, a new kind of database for the AI era

With generative AI the talking point for every business, Charles Xie, the CEO and founder of Zilliz, discussed the rise of vector databases, a new category of database management and a paradigm shift for making use of the exponential volumes of unstructured data sitting untapped in object stores. Vector databases offer a powerful new level of capability for searching unstructured data in particular, but they can also handle semi-structured and even structured data. Xie also talked about how companies should approach vector databases to target their respective use cases.

3. Databricks’ $1.3 billion acquisition of MosaicML

Databricks made headlines ahead of its annual summit in June when it announced the acquisition of AI company MosaicML for $1.3 billion. The idea was to bring MosaicML’s entire team and AI models under its umbrella, providing enterprises with a unified platform where they could manage data assets and use them to build secure generative AI applications. “Every organization should be able to benefit from the AI revolution with more control over how their data is used. Databricks and MosaicML have an incredible opportunity to democratize AI and make the lakehouse the best place to build generative AI and LLMs,” said Ali Ghodsi, cofounder and CEO of Databricks.

4. Salesforce partners up for stronger data foundations

Over the last year, customer relationship management (CRM) giant Salesforce strengthened its AI strategy with several product enhancements. To support these initiatives, in September, the Marc Benioff-led company announced that its proprietary Data Cloud, which brings together information from different sources to host unified customer profiles in real time, will support bi-directional data sharing and access with Databricks’ data lakehouse platform and Snowflake’s data cloud. The move allows joint customers of the companies to enrich their datasets and power additional use cases, including building and deploying more capable models targeting different business-critical problems.

5. Snowflake’s Document AI for unstructured data search

Snowflake made waves in June with the launch of Document AI, a new large language model (LLM) tool that allows enterprises to quickly extract value from their barrage of unstructured documents (imagine PDF invoices). The move marked a major development for the company — which started with a focus on structured data — by giving teams an easy way to mobilize useful unstructured information that often remains scattered across silos. “We’re unlocking a new data era for customers, leveraging AI and eliminating silos previously bound by format, location and more to revolutionize how organizations put their data to work and drive insights with the Data Cloud,” said Snowflake SVP of product Christian Kleinerman.