Amazon SageMaker FAQs
General
What is the next generation of Amazon SageMaker?
The next generation of SageMaker is a unified platform for data, analytics, and AI. Bringing together widely adopted AWS machine learning (ML) and analytics capabilities, the next generation of SageMaker delivers an integrated experience for analytics and AI with unified access to all your data. SageMaker allows you to collaborate and build faster from a unified studio (preview) using familiar AWS services for model development, generative AI, data processing, and SQL analytics, accelerated by Amazon Q Developer, the most capable generative AI assistant for software development. Additionally, you can access all your data whether it’s stored in data lakes, data warehouses, or third-party or federated data sources, with governance built in to address enterprise security needs.
How is the new SageMaker different from what I am using today for my ML workflows?
We expanded the widely adopted SageMaker service with the comprehensive set of AWS data, analytics, and AI capabilities to deliver a unified platform for data, analytics, and AI. Going forward, the existing set of AI/ML capabilities in SageMaker for data wrangling, building, training, and deploying AI models will be referred to as Amazon SageMaker AI. SageMaker AI is integrated within the next generation of SageMaker and is also available as a standalone service for those who wish to focus specifically on building, training, and deploying AI and ML models at scale.
The next generation SageMaker includes:
- Amazon SageMaker Unified Studio (preview): Build in a single development environment to access and use familiar tools and functionality from purpose-built AWS analytics and AI/ML services like Amazon EMR, AWS Glue, Amazon Athena, Amazon Redshift, Amazon Bedrock, and SageMaker AI.
- Amazon SageMaker Lakehouse: Unify data across Amazon Simple Storage Service (Amazon S3) data lakes, Amazon Redshift data warehouses, and third-party and federated data sources.
- Amazon SageMaker Data and AI Governance: Securely discover, govern, and collaborate on data and AI with Amazon SageMaker Catalog, built on Amazon DataZone.
What capabilities are included with the next generation of SageMaker?
The next generation of SageMaker includes the following capabilities:
- SageMaker Unified Studio (preview): Build with all your data and tools for analytics and AI in a single environment.
- SageMaker Lakehouse: Unify data across Amazon S3 data lakes, Amazon Redshift data warehouses, and third-party and federated data sources.
- SageMaker Data and AI Governance: Securely discover, govern, and collaborate on data and AI with SageMaker Catalog, built on Amazon DataZone.
- Model development: Build, train, and deploy ML and foundation models (FMs) with fully managed infrastructure, tools, and workflows with SageMaker AI (formerly SageMaker).
- Generative AI app development: Build and scale generative AI applications with Amazon Bedrock.
- SQL analytics: Gain insights with Amazon Redshift, the most price-performant SQL engine.
- Data processing: Analyze, prepare, and integrate data for analytics and AI using open source frameworks on Athena, Amazon EMR, and AWS Glue.
Why should I use the next generation of SageMaker?
SageMaker is a unified platform for data, analytics, and AI. Bringing together widely adopted AWS ML and analytics capabilities, the next generation of SageMaker delivers an integrated experience for analytics and AI with unified access to all your data. This unified approach helps you work more efficiently with your data, increase collaboration across teams, and enhance overall productivity.
SageMaker allows you to:
- Collaborate and build faster with a single data and AI development environment, using familiar AWS services for model development, generative AI, data processing, and SQL analytics.
- Develop and scale your AI use cases with a broad set of tools to train, customize, and deploy ML and FMs, and rapidly create generative AI applications tailored to your business.
- Reduce data silos with an open lakehouse to unify all your data across Amazon S3 data lakes, Amazon Redshift data warehouses, and third-party or federated data sources.
- Meet your enterprise security needs with built-in data and AI governance to control access to the right data, ML models, generative AI development artifacts, and compute resources, by the right user for the right purpose.
Can I use individual AWS services without using SageMaker?
Yes. You can continue to independently use individual AWS services such as SageMaker AI (formerly SageMaker), Amazon EMR for big data processing, AWS Glue, and Amazon Redshift for data warehousing based on your specific business requirements. There is no impact on how you use these services today.
SageMaker offers an additional benefit by providing a unified, user-friendly interface that enables access to these services. This approach helps you more effectively innovate with your data, increase collaboration across teams, and enhance overall productivity.
What existing AWS services can I use within SageMaker?
SageMaker brings together a comprehensive set of AWS AI and analytics services across SageMaker Unified Studio (preview), SageMaker Data and AI Governance, and SageMaker Lakehouse.
From SageMaker Unified Studio, you can access capabilities for data processing, SQL analytics, ML, and generative AI application development using existing AWS services. For data processing, services like Athena, AWS Glue, Amazon EMR, and Amazon Managed Workflows for Apache Airflow (Amazon MWAA) analyze, prepare, integrate, and orchestrate data for analytics and AI at any scale. For SQL analytics, Amazon Redshift seamlessly integrates with SageMaker Lakehouse to provide powerful SQL analytic capabilities on your unified data across Amazon Redshift data warehouses and Amazon S3 data lakes. ML capabilities are delivered by SageMaker AI (previously known as SageMaker) for building, training, and deploying ML and FMs. Additionally, you can develop generative AI applications using Amazon Bedrock IDE (preview).
SageMaker Data and AI Governance provides end-to-end, built-in governance through a unified data management experience in SageMaker Catalog, built on Amazon DataZone, to securely discover, govern, and collaborate on data and AI.
SageMaker Lakehouse is built on multiple catalog services across AWS Glue Data Catalog, AWS Lake Formation, and Amazon Redshift to provide unified data access across Amazon S3 data lakes, Amazon Redshift data warehouses, and third-party and federated data sources.
In addition, these services remain available as standalone capabilities through the AWS Management Console, giving you flexibility based on your use cases. We will enhance the SageMaker platform with more services in 2025 to unify experiences across analytics and AI. These include search analytics with Amazon OpenSearch Service, business intelligence with Amazon QuickSight, and streaming with the AWS streaming portfolio of services.
How do I get started with SageMaker?
Getting started with SageMaker is easy. The first step is to navigate to the SageMaker Unified Studio (preview) management console to create a domain, the organizing entity that connects your assets, users, and their projects for your business unit. In the console, choose Create domain, and you will be presented with two options: Quick setup and Manual setup. Choose Quick setup to get started with a set of default configurations that can be customized later. Alternatively, you can choose Manual setup, which gives you full control over your settings as you create your domain. Once your domain is created, you can navigate to SageMaker Unified Studio (a browser-based web application), where you can use all your data and configured tools for analytics and AI. To learn more about how to get started, visit the SageMaker documentation.
I currently use existing AWS services that are now included in SageMaker. How do I upgrade to the unified experience in SageMaker?
Your existing data development experiences in AWS services like Amazon EMR, AWS Glue, and Athena remain available. This means all existing code and resources you've created can continue to be used without disruption. We will provide easy-to-use upgrade scripts and comprehensive guidelines to bring your existing code base to the unified SageMaker experience in Q1 2025.
Is the next generation of SageMaker generally available?
We are extending SageMaker, a widely adopted ML service, into a data and AI platform by integrating the comprehensive set of AWS data, analytics, and AI tools already used by customers today. We’ve also added new capabilities to the new SageMaker platform, including SageMaker Unified Studio (preview), SageMaker Lakehouse (GA), and SageMaker Catalog (GA).
The new SageMaker platform includes virtually all of the components you need for SQL analytics with Amazon Redshift, data processing with Amazon EMR, AI model development with SageMaker AI, and generative AI app development with the new Amazon Bedrock IDE (preview), all delivered through an integrated development experience in the unified studio (preview).
Product experience
What is a project in SageMaker?
A project in SageMaker helps users organize their work and provides business context for the jobs they are performing. It is a shared workspace where users can collaborate on data and artifacts such as ML models, notebooks, queries, dashboards, and generative AI applications. Projects are secured so that only users who are explicitly added to the project are able to access the data and tools within it. The project creates AWS Identity and Access Management (IAM) roles based on the project-selected capabilities (for example, a data lake) that provide users with the access required to do their job. Projects also provide work isolation within the same account, as well as a security boundary (security group and IAM roles).
How does Amazon Q Developer enhance productivity in SageMaker?
Amazon Q Developer is a generative AI conversational assistant integrated into the SageMaker experience that enhances your productivity throughout the development lifecycle. Through a chat interface, you can use natural language to ask questions about SageMaker, get help with code, and explore resources such as datasets. When you chat with Amazon Q Developer, it uses the context of your current conversation to provide personalized guidance and automated assistance throughout the SageMaker development experience. Amazon Q Developer can help you with code discussions, provide inline code completions, generate SQL queries, find and integrate datasets, and offer intelligent support tailored to your specific development needs.
By understanding the nuances of your work, Amazon Q Developer delivers targeted, context-aware assistance that streamlines your development process and enhances overall productivity in the SageMaker environment.
What tools are available in SageMaker for analytics and AI jobs?
SageMaker provides a unified, web-based environment that brings together powerful tools for complete data and AI workflows. Built-in IDEs enable AI/ML development, allowing you to process large data volumes from various sources using frameworks and services like PySpark, AWS Glue, and Amazon EMR.
For version control and workflow management, you can commit to Git and define workflows using Amazon MWAA. The integrated SQL query editor allows you to explore, analyze, and visualize data, with the ability to more easily save and share queries and create new datasets.
Model development is streamlined through familiar SageMaker AI tools, including Amazon SageMaker notebooks, JumpStart, HyperPod, MLflow, Pipelines, and Model Registry. Throughout these processes, Amazon Q Developer is seamlessly integrated across SageMaker tools, providing intelligent assistance in data discovery, preparation, pipeline creation, model building and training, and code deployment.
How do I build generative AI applications in SageMaker?
The Amazon Bedrock IDE (preview), integrated within SageMaker Unified Studio (preview), provides a comprehensive environment for developing generative AI applications. This intuitive interface helps you accelerate application development in a trusted and secure setting, offering access to the high-performing FMs and advanced customization capabilities of Amazon Bedrock.
You can use powerful features such as Amazon Bedrock Knowledge Bases, Guardrails, Agents, and Prompt Flows, allowing your team to rapidly tailor generative AI applications to your specific business needs while adhering to your responsible AI guidelines. The platform supports governed access and enables secure cross-functional collaboration through access-controlled sharing and Git-backed auditability.
What types of data sources does SageMaker support?
SageMaker Lakehouse unifies data across AWS data lakes, data warehouses, third-party applications, and operational databases. It gives you fast, streamlined access to your data in one place through zero-ETL integrations, federated query sources, and 240+ connectors.
How do I ensure that the data in SageMaker is properly governed and secured?
SageMaker provides end-to-end, built-in governance through a unified data management experience in SageMaker Catalog, built on Amazon DataZone. This approach enables you to catalog, discover, access, analyze, and govern both structured and unstructured data assets, ML models, and applications across your organization. The platform ensures that the right people have the appropriate access to the right assets, maintaining robust security and compliance standards.
How do I create and manage data pipelines in SageMaker?
You can create and manage data pipelines in SageMaker in multiple ways. Amazon SageMaker Data Processing brings together Amazon EMR, Athena, AWS Glue, and Amazon MWAA to help you integrate, prepare, and explore your data in a unified experience. You can build pipelines for ML-specific model orchestration with SageMaker AI and data pipelines and workflows with Amazon MWAA. You can also use zero-ETL integrations, which simplify data movement by removing complex extract, transform, and load (ETL) processes and enabling direct data replication across services. Visit What is zero-ETL? to learn more.
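Amazon MWAA is a managed service for Apache Airflow, which models a workflow as a directed acyclic graph (DAG) of tasks and starts each task only after its upstream tasks have finished. The following is a minimal conceptual sketch of that dependency ordering using only the Python standard library. It is not the Airflow or SageMaker API, and the task names (`extract_orders`, `transform_orders`, `load_to_lakehouse`, `train_model`) are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key is a task, and each value is the set of
# tasks that must finish before that task can start.
PIPELINE = {
    "extract_orders": set(),
    "transform_orders": {"extract_orders"},
    "load_to_lakehouse": {"transform_orders"},
    "train_model": {"load_to_lakehouse"},
}


def run_pipeline(dependencies, tasks):
    """Run every task in dependency order and return the order used."""
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        tasks[name]()  # call the function registered for this task
    return order


if __name__ == "__main__":
    # Placeholder task bodies that only print; real tasks would call services.
    tasks = {name: (lambda n=name: print(f"running {n}")) for name in PIPELINE}
    run_pipeline(PIPELINE, tasks)
```

In a real workflow, an orchestrator such as Airflow would also handle scheduling, retries, and logging; this sketch only shows why declaring dependencies is enough to determine a valid run order.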
Pricing
How does SageMaker pricing work?
When using SageMaker, you will be charged as per the pricing model for the various AWS services accessible through SageMaker. There is no separate cost for using the SageMaker Unified Studio (preview), the data and AI development environment that provides the integrated experience within SageMaker. Visit SageMaker pricing for more information.
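As a rough illustration of this model, the sketch below adds up per-service charges and treats SageMaker Unified Studio as a line item with no charge of its own. All quantities, units, and rates are hypothetical placeholders, not real AWS prices; consult each service's pricing page for actual rates.

```python
from decimal import Decimal

# HYPOTHETICAL usage and unit rates, for illustration only.
# Each tuple is (service, quantity, unit, rate per unit in USD).
USAGE = [
    ("Amazon Redshift", Decimal("100"), "compute-hours", Decimal("0.25")),
    ("AWS Glue", Decimal("40"), "DPU-hours", Decimal("0.50")),
    ("SageMaker AI", Decimal("20"), "instance-hours", Decimal("1.00")),
    # The studio itself has no separate cost, so its rate is zero.
    ("SageMaker Unified Studio", Decimal("1"), "environment", Decimal("0")),
]


def estimate_bill(usage):
    """Return per-service charges and their total, billing each service separately."""
    lines = {service: qty * rate for service, qty, _unit, rate in usage}
    total = sum(lines.values(), Decimal("0"))
    return lines, total


if __name__ == "__main__":
    lines, total = estimate_bill(USAGE)
    for service, charge in lines.items():
        print(f"{service}: ${charge}")
    print(f"Total: ${total}")
```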
Can I try SageMaker for free?
The SageMaker Free Tier helps you quickly get started innovating with data and AI at no cost. Refer to SageMaker pricing for details.
Availability
In which AWS Regions is SageMaker available?
The next generation of SageMaker is available in the US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Tokyo), and Europe (Ireland) AWS Regions. SageMaker Unified Studio and Amazon Bedrock IDE are available in preview in these same AWS Regions. For future updates, see the AWS Regional Services List.
Does SageMaker offer an SLA?
Yes. SageMaker is engineered to provide the consistent performance and uptime that mission-critical analytics and AI workloads demand. Because SageMaker is a unified platform composed of multiple service components, service availability depends on the service component you use.
For detailed information on the service level agreement (SLA) for each individual service, refer to its respective SLA documentation. Each SLA states the specific uptime guarantees and reliability commitments for a service that makes up the SageMaker experience.
Available SLA documentation includes: