Search the Community
Showing results for tags 'glue'.
-
Auto Scaling in AWS Glue Streaming ETL is now generally available. AWS Glue Streaming ETL jobs can now dynamically scale resources up and down based on the input stream. Auto Scaling helps customers reduce the cost and manual effort required to optimize resources by allocating the right resources necessary for Streaming ETL jobs. View the full article
-
AWS Glue Streaming ETL (extract, transform, and load) can now detect compressed data streaming from Amazon Kinesis, Amazon Managed Streaming for Apache Kafka (Amazon MSK), and self-managed Apache Kafka. It then automatically decompresses this data without customers having to write code, saving them development hours. AWS Glue Streaming ETL jobs continuously consume data from streaming sources, clean and transform the data in flight, and make it available for analysis in seconds. Customers compress data before streaming it in order to improve performance and to stay within the throttling limits imposed by Amazon Kinesis and Amazon MSK. Prior to this feature, customers had to write user-defined functions to decompress data from a stream, which is time-consuming. View the full article
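For context, the kind of hand-written helper this announcement describes customers no longer needing might look roughly like the sketch below. It is not AWS Glue's implementation: it assumes gzip is the only compression format in play, and `maybe_decompress` is a hypothetical name chosen for illustration.

```python
import gzip
import zlib

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def maybe_decompress(payload: bytes) -> bytes:
    """Return the payload decompressed if it looks like gzip, else unchanged.

    Hypothetical helper: a stand-in for the user-defined function a
    customer would previously have written by hand.
    """
    if payload[:2] == GZIP_MAGIC:
        try:
            return gzip.decompress(payload)
        except (OSError, EOFError, zlib.error):
            # The magic bytes matched by coincidence; treat as uncompressed.
            return payload
    return payload


# A mixed stream: one compressed record, one plain record.
records = [gzip.compress(b'{"id": 1}'), b'{"id": 2}']
decoded = [maybe_decompress(r) for r in records]
```

Checking the leading bytes lets compressed and uncompressed records share a stream, which is the situation automatic detection is meant to cover.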
-
AWS Glue Visual Job APIs are now generally available, allowing customers to programmatically create, read, update, and delete AWS Glue Studio visual jobs. AWS Glue Studio provides an intuitive visual interface for users to author data integration jobs. Customers want to create visual jobs programmatically in AWS Glue Studio so that they can migrate from other ETL tools and copy jobs to other environments. View the full article
-
Streaming extract, transform, and load (ETL) jobs in AWS Glue can now automatically detect the schema of incoming records and gracefully handle schema changes on a per-record basis. Previously, you needed to specify the schema of incoming data using the AWS Glue Data Catalog and update ETL scripts to handle schema changes. The AWS Glue job can now do both for you, saving time on reworking code and increasing the flexibility of your ETL jobs. View the full article
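To make "per-record" schema handling concrete, here is a minimal sketch of the general idea: infer a schema from each incoming record and fold it into a running schema, so that new fields and changed types do not break the job. It illustrates the concept only and does not reflect how AWS Glue implements it. The function names, the sample records, and the `int|str` notation for conflicting types are all assumptions.

```python
import json


def infer_schema(record: dict) -> dict:
    """Map each top-level field to the name of its Python type."""
    return {key: type(value).__name__ for key, value in record.items()}


def merge_schemas(current: dict, incoming: dict) -> dict:
    """Fold one record's schema into the running schema.

    New fields are added; a field seen with different types keeps all of
    them, joined with '|' (an illustrative notation, not a Glue format).
    """
    merged = dict(current)
    for field, type_name in incoming.items():
        if field not in merged:
            merged[field] = type_name
        elif merged[field] != type_name:
            seen = set(merged[field].split("|")) | {type_name}
            merged[field] = "|".join(sorted(seen))
    return merged


# The second record adds a field and changes the type of "id".
stream = [
    '{"id": 1, "name": "a"}',
    '{"id": "2", "name": "b", "email": "b@example.com"}',
]

schema = {}
for line in stream:
    schema = merge_schemas(schema, infer_schema(json.loads(line)))
```

After both records are processed, `schema` contains `email` and records both types observed for `id`, with no schema declared up front.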
-
You can now use Amazon Elasticsearch Service as a target data store with AWS Glue Elastic Views. Now in limited preview, AWS Glue Elastic Views is a new capability of AWS Glue that makes it easy to combine and replicate data across multiple data stores without you having to write custom code. With AWS Glue Elastic Views, you can use familiar Structured Query Language (SQL) to quickly create a virtual table—called a view—from multiple different source data stores. Based on this view, AWS Glue Elastic Views copies data from each source data store and creates a replica—called a materialized view—in a target data store. AWS Glue Elastic Views monitors for changes to data in your source data stores continuously, and provides updates to your target data stores automatically, ensuring data accessed through the materialized view is always up-to-date. View the full article
-
You now can use Amazon DynamoDB as a source data store with AWS Glue Elastic Views to combine and replicate data across multiple data stores—without having to write custom code. With AWS Glue Elastic Views, you can use Structured Query Language (SQL) to quickly create a virtual table—called a view—from multiple source data stores. Based on this view, AWS Glue Elastic Views copies data from each source data store and creates a replica—called a materialized view—in a target data store. AWS Glue Elastic Views monitors continuously for changes to data in your source data stores, and provides updates to your target data stores automatically, ensuring that data accessed through the materialized view is always up to date. View the full article
-
Now in preview, AWS Glue Elastic Views is a new capability of AWS Glue that makes it easy to build materialized views that combine and replicate data across multiple data stores without you having to write custom code. With AWS Glue Elastic Views, you can use familiar Structured Query Language (SQL) to quickly create a virtual table—a materialized view—from multiple different source data stores. AWS Glue Elastic Views copies data from each source data store and creates a replica in a target data store. AWS Glue Elastic Views continuously monitors for changes to data in your source data stores, and provides updates to the materialized views in your target data stores automatically, ensuring data accessed through the materialized view is always up-to-date. View the full article
-
Errors in Spark applications commonly arise from inefficient Spark scripts, distributed in-memory execution of large-scale transformations, and dataset abnormalities. AWS Glue workload partitioning is the newest offering from AWS Glue to address these issues and improve the reliability of Spark applications and the consistency of their runtime. Workload partitioning lets you specify how much data to process in each job run and, using AWS Glue job bookmarks, track how much of the data AWS Glue has processed. View the full article
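The interplay between a per-run bound and a bookmark can be sketched in plain Python as follows. This is a conceptual simulation, not the AWS Glue API: `run_job`, `max_files_per_run`, and the example S3 paths are hypothetical, and the bookmark is modeled as a simple index into a sorted file list.

```python
def run_job(all_files, bookmark, max_files_per_run):
    """Process at most max_files_per_run files that come after the bookmark.

    Returns the batch that was processed and the advanced bookmark.
    """
    batch = all_files[bookmark:bookmark + max_files_per_run]
    for path in batch:
        pass  # the real transformation of each file would run here
    return batch, bookmark + len(batch)


# Five input files, processed two at a time across successive job runs.
files = [f"s3://example-bucket/input/part-{i:03d}.json" for i in range(5)]

bookmark = 0
runs = []
while bookmark < len(files):
    batch, bookmark = run_job(files, bookmark, max_files_per_run=2)
    runs.append(len(batch))
```

Each run handles a bounded slice of the backlog, so memory use per run stays predictable, and the bookmark ensures that the next run resumes where the previous one stopped.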
-
AWS Glue Schema Registry, a serverless feature of AWS Glue, enables you to validate and control the evolution of streaming data using registered Apache Avro schemas, at no additional charge. Through Apache-licensed serializers and deserializers, the Schema Registry integrates with Java applications developed for Apache Kafka/Amazon Managed Streaming for Apache Kafka (MSK), Amazon Kinesis Data Streams, Apache Flink/Amazon Kinesis Data Analytics for Apache Flink, and AWS Lambda. View the full article
-
AWS Glue DataBrew is a new visual data preparation tool for AWS Glue that helps you clean and normalize data without writing code, reducing the time it takes to prepare data for analytics and machine learning by up to 80% compared to traditional approaches to data preparation. AWS Glue DataBrew features an easy-to-use visual interface that helps data analysts and data scientists of all technical levels understand, combine, clean, and transform data. View the full article
-
AWS Glue crawlers now support Amazon DocumentDB (with MongoDB compatibility) and MongoDB collections. You can now use AWS Glue crawlers to infer the schema of Amazon DocumentDB (with MongoDB compatibility) and MongoDB collections and create or update a table in the AWS Glue Data Catalog. A configuration option lets you specify whether the crawler should crawl the entire data set or only a sample of the data to reduce crawl time. View the full article
-
Streaming extract, transform, and load (ETL) jobs in AWS Glue can now ingest data from Apache Kafka clusters that you manage yourself. Previously, AWS Glue supported reading specifically from Amazon Managed Streaming for Apache Kafka (Amazon MSK). With this update, AWS Glue allows you to perform streaming ETL on data from Apache Kafka whether it is deployed on-premises or in the cloud. View the full article