Showing results for tags 'apache'.
-
We are excited to launch two new features that help enforce access controls with Amazon EMR on EC2 clusters (EMR Clusters). These features are supported with jobs that are submitted to the cluster using the EMR Steps API. The first is Runtime Role with EMR Steps. A Runtime Role is an AWS Identity and Access Management (IAM) role that you associate with an EMR Step. An EMR Step uses this role to access AWS resources. The second is integration with AWS Lake Formation to apply table and column-level access controls for Apache Spark and Apache Hive jobs with EMR Steps.
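As a rough illustration of associating a runtime role with a step, the sketch below builds the keyword arguments for the boto3 EMR `add_job_flow_steps` call, which accepts an `ExecutionRoleArn` alongside the step list. The cluster ID, role ARN, step name, and S3 path are hypothetical placeholders, and the actual API call is left as a comment so the snippet runs without AWS credentials.

```python
def build_step_request(cluster_id, role_arn, script_uri):
    """Build keyword arguments for boto3's EMR add_job_flow_steps call."""
    return {
        "JobFlowId": cluster_id,
        # Runtime role that the step uses to access AWS resources.
        "ExecutionRoleArn": role_arn,
        "Steps": [
            {
                "Name": "spark-job-with-runtime-role",  # hypothetical name
                "ActionOnFailure": "CONTINUE",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": ["spark-submit", "--deploy-mode", "cluster", script_uri],
                },
            }
        ],
    }


# All identifiers below are placeholders, not real resources.
request = build_step_request(
    "j-EXAMPLE",
    "arn:aws:iam::111122223333:role/ExampleRuntimeRole",
    "s3://example-bucket/job.py",
)

# With credentials configured, the request could be submitted as:
#   boto3.client("emr").add_job_flow_steps(**request)
```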
-
Amazon Managed Streaming for Apache Kafka (Amazon MSK) now supports Apache Kafka versions 3.1.1 and 3.2.0 for new and existing clusters. Apache Kafka 3.1.1 and Apache Kafka 3.2.0 include several bug fixes and new features that improve performance. Some of the key features include enhancements to metrics and the use of topic IDs. MSK will continue to use and manage ZooKeeper for quorum management in this release for stability. For a complete list of improvements and bug fixes, see the Apache Kafka release notes for 3.1.1 and 3.2.0.
-
Amazon EMR release 6.6 now supports Apache Spark 3.2, Apache Spark RAPIDS 22.02, CUDA 11, Apache Hudi 0.10.1, Apache Iceberg 0.13, Trino 0.367, and PrestoDB 0.267. You can use the performance-optimized version of Apache Spark 3.2 on EMR on EC2, EKS, and the recently released EMR Serverless. In addition, Apache Hudi 0.10.1 and Apache Iceberg 0.13 are available on EC2, EKS, and Serverless. Apache Hive 3.1.2 is available on EMR on EC2 and EMR Serverless. Trino 0.367 and PrestoDB 0.267 are only available on EMR on EC2.
-
AWS Glue can now connect to Apache Kafka using additional client authentication mechanisms. AWS Glue now supports SASL (Simple Authentication and Security Layer) using either SCRAM (Salted Challenge Response Authentication Mechanism) or GSSAPI (Kerberos).
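For readers unfamiliar with SASL/SCRAM, the sketch below assembles the standard Apache Kafka client properties for that mechanism. These are generic Kafka client settings, not the option names of an AWS Glue connection; the function name and the credentials are hypothetical, and the use of `SASL_SSL` with `SCRAM-SHA-512` is an assumption about a typical setup.

```python
def scram_client_properties(username, password, mechanism="SCRAM-SHA-512"):
    """Return generic Kafka client properties for SASL/SCRAM over TLS."""
    jaas = (
        "org.apache.kafka.common.security.scram.ScramLoginModule required "
        f'username="{username}" password="{password}";'
    )
    return {
        "security.protocol": "SASL_SSL",  # assumed: SASL carried over TLS
        "sasl.mechanism": mechanism,
        "sasl.jaas.config": jaas,
    }


# Placeholder credentials for illustration only.
props = scram_client_properties("example-user", "example-secret")
for key, value in props.items():
    print(f"{key}={value}")
```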
-
GoAccess is an interactive, real-time web server log analyzer that lets you quickly analyze and view web server logs. It is open source and runs on the command line on Unix/Linux operating systems. The post GoAccess (A Real-Time Apache and Nginx) Web Server Log Analyzer first appeared on Tecmint: Linux Howtos, Tutorials & Guides.
-
Amazon EMR on Amazon EKS provides a new deployment option for Amazon EMR that allows you to run Apache Spark on Amazon Elastic Kubernetes Service (Amazon EKS). If you already use Amazon EMR, you can now run Amazon EMR-based applications with other types of applications on the same Amazon EKS cluster to improve resource utilization and simplify infrastructure management across multiple AWS Availability Zones. If you already run big data frameworks on Amazon EKS, you can now use Amazon EMR to automate provisioning and management, and run Apache Spark up to 3x faster. With this deployment option, you can focus on running analytics workloads while Amazon EMR on Amazon EKS builds, configures, and manages containers.
-
We are pleased to announce the general availability of Amazon MSK Serverless, a type of Amazon MSK cluster that makes it easier for developers to run Apache Kafka without having to manage capacity. MSK Serverless automatically provisions and scales compute and storage resources and offers throughput-based pricing, so you can use Apache Kafka on demand and pay for the data you stream and retain.
-
Amazon Managed Streaming for Apache Kafka (Amazon MSK) now supports Apache Kafka version 2.7.1 for new and existing clusters. Apache Kafka 2.7.1 includes several bug fixes. For a complete list of fixes, see the Apache Kafka release notes for 2.7.1.
-
Amazon Managed Streaming for Apache Kafka (Amazon MSK) now supports Apache Kafka version 2.6.2 for new and existing clusters. Apache Kafka 2.6.2 includes several bug fixes and security fixes. Version 2.6.2 will replace 2.6.1 as the default recommended version for new clusters created in Amazon MSK. For a complete list of fixes, see the Apache Kafka release notes for 2.6.2.
-
Amazon Managed Streaming for Apache Kafka (Amazon MSK) now supports Apache Kafka version 2.8.0 for new and existing clusters. Apache Kafka 2.8.0 includes several bug fixes and new features that improve performance. Some of the key features include connection rate limiting to avoid problems with misconfigured clients (KIP-612) and topic identifiers, which provide performance benefits (KIP-516). There is also an early access feature to replace ZooKeeper with a self-managed metadata quorum (KIP-500); however, this is not recommended for use in production. For a complete list of improvements and bug fixes, see the Apache Kafka release notes for 2.8.0.
-
Developing your website from scratch can be a daunting task. It’s time-consuming and expensive if you are planning to hire a developer. An easy way to get your blog or website off the ground... The post How to Install Drupal with Apache on Debian and Ubuntu first appeared on Tecmint: Linux Howtos, Tutorials & Guides.
-
Apache Flink Kinesis Consumer now supports Enhanced Fan Out (EFO) and the HTTP/2 data retrieval API for Amazon Kinesis Data Streams. EFO allows Amazon Kinesis Data Streams consumers to scale by offering each consumer a dedicated read throughput of up to 2 MB/second. The HTTP/2 data retrieval API reduces latency of data delivery from producers to consumers to 70 milliseconds or better. In combination, these two features allow you to build low-latency Apache Flink applications that utilize dedicated throughput from Amazon Kinesis Data Streams.
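To make the benefit of dedicated throughput concrete, the sketch below compares the read throughput available to each consumer with and without EFO. It assumes that, without EFO, all consumers of a shard split a single 2 MB/second read allowance evenly; that shared-mode behavior and the function name are assumptions for illustration and are not stated in the announcement.

```python
READ_MB_PER_SEC = 2.0  # read throughput figure quoted in the announcement


def per_consumer_throughput(consumers, enhanced_fan_out):
    """Approximate read throughput (MB/second) available to each consumer."""
    if consumers < 1:
        raise ValueError("consumers must be at least 1")
    if enhanced_fan_out:
        # EFO: every registered consumer gets its own dedicated allowance.
        return READ_MB_PER_SEC
    # Assumed shared mode: consumers divide one allowance between them.
    return READ_MB_PER_SEC / consumers


for n in (1, 2, 4):
    shared = per_consumer_throughput(n, enhanced_fan_out=False)
    dedicated = per_consumer_throughput(n, enhanced_fan_out=True)
    print(f"{n} consumer(s): shared {shared} MB/s, EFO {dedicated} MB/s")
```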
-
Amazon Managed Streaming for Apache Kafka (Amazon MSK) now supports Apache Kafka version 2.7.0 for new and existing clusters. Apache Kafka 2.7.0 includes several bug fixes and new features that improve performance. Some key features include the ability to throttle create topic, create partition, and delete topic operations (KIP-599) and configurable TCP connection timeout (KIP-601). For a complete list of improvements and bug fixes, see the Apache Kafka release notes for 2.7.0.
-
Amazon Managed Workflows is a new managed orchestration service for Apache Airflow that makes it easier to set up and operate end-to-end data pipelines in the cloud at scale. Apache Airflow is an open-source tool used to programmatically author, schedule, and monitor sequences of processes and tasks referred to as “workflows”.
-
Amazon Kinesis Data Analytics for Apache Flink now provides access to the Apache Flink Dashboard, giving you greater visibility into your applications and advanced monitoring capabilities. You can now view your Apache Flink application’s environment variables, over 120 metrics, logs, and the directed acyclic graph (DAG) of the Apache Flink application in a simple, contextualized user interface.
-
You can now build and run streaming applications using Apache Flink version 1.11 in Amazon Kinesis Data Analytics for Apache Flink. Apache Flink v1.11 provides improvements to the Table and SQL API, which is a unified, relational API for stream and batch processing and acts as a superset of the SQL language specially designed for working with Apache Flink. Apache Flink v1.11 capabilities also include an improved memory model and RocksDB optimizations for increased application stability, and support for task manager stack traces in the Apache Flink Dashboard.
-
Amazon Managed Streaming for Apache Kafka (Amazon MSK) now supports Apache Kafka version 2.6.0 for new and existing clusters. Apache Kafka 2.6.0 includes several bug fixes and new features that improve performance. Some key features include native APIs to manage client quotas (KIP-546) and explicit rebalance triggering to enable advanced consumer use cases (KIP-568). For a complete list of improvements and bug fixes, see the Apache Kafka release notes for 2.6.0.
-
Streaming extract, transform, and load (ETL) jobs in AWS Glue can now read data encoded in the Apache Avro format. Previously, streaming ETL jobs could read data in the JSON, CSV, Parquet, and XML formats. With the addition of Avro, streaming ETL jobs now support all the same formats as batch AWS Glue jobs.
-
Streaming extract, transform, and load (ETL) jobs in AWS Glue can now ingest data from Apache Kafka clusters that you manage yourself. Previously, AWS Glue supported reading specifically from Amazon Managed Streaming for Apache Kafka (Amazon MSK). With this update, AWS Glue allows you to perform streaming ETL on data from Apache Kafka whether it is deployed on-premises or in the cloud.