Errors in Spark applications commonly arise from inefficient Spark scripts, distributed in-memory execution of large-scale transformations, and dataset abnormalities. AWS Glue workload partitioning is the newest offering from AWS Glue to address these issues and improve the reliability of Spark applications and consistency of run-time. Workload partitioning enables you to specify how much data to process in each job-run and, using AWS Glue job bookmarks, track how much of the data AWS Glue processed.
View the full article