

Amazon Web Services used a big data conference in the backyard of some of its largest government customers to showcase its AI and machine learning tools that are helping to funnel ever-larger volumes of data into its storage and computing infrastructure.
Making a pitch for better data management tools like metadata systems, AWS executives addressing a big data conference in Tysons Corner, Va., said the public cloud giant aims to go beyond democratizing big data to “demystify” AI and machine learning.
The combination of organized data and analytics will accelerate the building and deployment of machine learning models, many of which currently never make it to production. Those that are deployed often require up to 18 months to roll out, noted Ben Snively, a solution architect at AWS (NASDAQ: AMZN).
Open source tools for model development often advance a generation or two in the time it takes many enterprises to develop, train and launch a machine learning model, he added.
Snively asserted that the combination of big data and analytics along with AI and machine learning creates a “flywheel effect” in which organized, accessible data leads to faster insights, better products and—completing the cycle—more data.
(Hence, the cloud vendor forecasts as much as 180 zettabytes of widely varied and fast-moving data by 2025.)
As they seek to demystify machine automation technologies and move beyond the current technology “hype phase,” AWS executives note that deploying machine learning models and, eventually, full-blown platforms remains hard. Among the reasons is “dirty” data that must be cleansed before it can be made accessible. The company estimates that 80 percent of data lakes currently lack metadata management systems that help determine data sources, formats and other attributes needed to wrangle big data.
That makes the heavy investments in data lakes “inefficient,” stressed Alan Halamachi, a senior manager for AWS solution architectures. “If data is not in a format where it can be widely consumed and accessible,” Halamachi said, machine learning developers will find themselves in “data jail.”
Once big data is wrangled and secured (“Hackers would like nothing more than to engineer a single breach with access to all of it,” Halamachi said), it can be combined with analytics on the inference side to accelerate training of machine learning models, Snively said.
Noting that most machine learning models built by enterprises never make it to production, the AWS engineers pitched several new tools, including the company's SageMaker machine and deep learning stack, introduced in November. Snively described SageMaker as a tool for taking the “muck” out of developing machine learning models, adding that it is also designed to free data scientists from IT chores like standing up a server for model development.
The cloud giant is seeing more experimentation among its customers as they seek to connect big data with machine learning development. “Voice [recognition] systems are here to stay,” Snively asserted, and developers are investigating “new ways of interacting with those systems.”
“It’s really about demystifying AI and machine learning” and getting beyond the “magic box” phase, he added.