Showing results for tags 'databases'.

Valkey is Rapidly Overtaking Redis

Devops.com posted a topic in Databases, Data Engineering & Data Science

Redis is taking it in the chops, as both maintainers and customers move to the Valkey Redis fork.

Friday at 09:45 PM
  Tagged with:
  - redis
  - valkey
  - databases

How to Navigate the Costs of Legacy SIEMS with Snowflake

Snowflake posted a topic in Security, Governance, Risk & Compliance

Legacy security information and event management (SIEM) solutions, like Splunk, are powerful tools for managing and analyzing machine-generated data. They have become indispensable for organizations worldwide, particularly for security teams. But as much as security operations center (SOC) analysts have come to rely on solutions like Splunk, there is one complaint that some raise: costs can quickly add up. The issue centers on their volume-based pricing model, which can force security teams to make difficult decisions about what data to ingest. A number of online threads are dedicated to how best to control costs while limiting how much an organization has to compromise its security. But what if security teams didn’t have to make tradeoffs? This blog post explores how Snowflake can help with this challenge. Let’s start with five cost factors organizations need to consider with their legacy SIEM solution and how Snowflake can help.

Legacy SIEM cost factors to keep in mind

Data ingestion: Traditional SIEMs often impose limits on data ingestion and data retention. Snowflake allows security teams to store all their data in a single platform and keep it readily accessible, with virtually unlimited cloud data storage capacity. There are a few ways to ingest data into Snowflake. Security sources can be ingested directly through native means such as streaming, stages, syslog, native connectors or secure data sharing. Snowflake’s Snowpipe service helps bring in new data easily, at a price that is tailored to an organization’s needs. The most common method is Snowpipe auto-ingest, which works for security teams who regularly ingest machine data. But this method isn’t for everyone, because loading small amounts of data slowly, or loading many small files, can cost more than other options. Snowpipe Streaming is another method that can save security teams money.
With Snowpipe Streaming there’s no need to prepare files before loading, which makes ingestion costs more predictable. Security teams can also reduce their costs by loading certain datasets in batches instead of continuously. For example, they could load high-volume data that isn’t needed for instant detection three times a day instead of constantly streaming it, which can lead to more significant savings.

Data retention: Many legacy SIEMs delete activity logs, transaction records, and other details from their systems after a few days, weeks or months. With Snowflake, security teams don’t have to work around these data retention windows. Instead, all data is always accessible for analysis, which simplifies cost planning and the data management strategy. It also provides more reliable generation of key security metrics such as visibility coverage, SLA performance, mean time to detect (MTTD) and mean time to respond (MTTR). Snowflake also helps security teams save time by automatically compressing and encrypting the data, making it ready to query.

Detection and investigation processing: Security teams depend on detection rules to find important events automatically. These rules need computing power to analyze data and spot attacks. In the cloud, computing can be measured in various ways, such as bytes scanned or CPU cycles. The choice of measure affects how much detections cost to process and how predictable those costs are. While computing costs might not have been a concern with fixed hardware in the past, it’s a whole new game in the cloud. Investigations, like detections, require computational power to analyze the collected data. Some solutions use different engines, such as stream or batch processing, for detections and investigations, while others employ the same engine for both tasks.
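The MTTD and MTTR metrics named under data retention are averages over incident timestamps, so they depend on the underlying records still being available. Below is a minimal sketch of how they might be computed. The `Incident` record and its field names are hypothetical, and MTTR is measured here from detection to resolution, which is only one of several definitions in use.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass(frozen=True)
class Incident:
    """Hypothetical incident record; the field names are illustrative only."""
    occurred: datetime  # when the malicious activity began
    detected: datetime  # when a rule or analyst flagged it
    resolved: datetime  # when the response was completed


def mean_time_to_detect(incidents: list[Incident]) -> timedelta:
    """Average gap between occurrence and detection."""
    return timedelta(seconds=mean(
        (i.detected - i.occurred).total_seconds() for i in incidents))


def mean_time_to_respond(incidents: list[Incident]) -> timedelta:
    """Average gap between detection and resolution (an assumed definition)."""
    return timedelta(seconds=mean(
        (i.resolved - i.detected).total_seconds() for i in incidents))


# Made-up sample data for illustration.
incidents = [
    Incident(datetime(2024, 4, 1, 8, 0), datetime(2024, 4, 1, 10, 0),
             datetime(2024, 4, 1, 16, 0)),
    Incident(datetime(2024, 4, 3, 9, 0), datetime(2024, 4, 3, 13, 0),
             datetime(2024, 4, 3, 15, 0)),
]
mttd = mean_time_to_detect(incidents)
mttr = mean_time_to_respond(incidents)
print(f"MTTD: {mttd}, MTTR: {mttr}")
```

If older incidents have aged out of a retention window, they simply drop out of the `incidents` list and the averages are computed over a smaller sample.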
Snowflake helps security teams understand how the query engine functions at a basic level, which helps them estimate the cost of their investigations.

Moving away from volume ingest-based pricing

A traditional SIEM typically manages all the data ingestion, transformation, detection and investigation processing for security teams. While out-of-the-box connectors and normalization can be useful, customers end up paying more because legacy SIEMs use ingest volume-based pricing models. It’s important here to understand how this pricing model works. Ingest volume-based pricing can vary among legacy SIEM vendors, but the basic principle remains the same: the more data security teams send to the SIEM for analysis, the higher the cost.

By moving away from traditional volume-based pricing models, security teams can gain more control over which logs they have access to and how much they are spending. A consumption-based pricing model, like Snowflake’s, allows security teams to have all the data on hand while paying for only the compute resources they use, making security more cost-effective. Snowflake’s pricing model is designed to offer flexibility and scalability, giving security teams the ability to pay only for the resources they use without being tied to long-term contracts or upfront commitments.

How Snowflake Works

An open-architecture deployment with a modern security data lake, and best-of-breed applications from Snowflake, can keep costs down while improving an organization’s security posture. A security data lake eliminates data silos by removing limits on ingest and retention. Organizations can use a security data lake to scale resources up and down automatically and only pay for the resources they use, potentially controlling their costs without compromising their security. Security data lakes can also help analysts apply complex detection logic and security policies to log data and security tool output.
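Detection logic over log data often needs context from other data sets before it produces a useful alert. The sketch below is an illustration only: the events, the asset inventory, and the port-scan rule are all made up, and in practice both data sets would be tables in the data lake joined with a query rather than Python dictionaries.

```python
# Hypothetical asset inventory keyed by IP address.
ASSET_INVENTORY = {
    "10.0.0.5": {"owner": "it-admins", "role": "vulnerability-scanner"},
    "10.0.0.9": {"owner": "finance", "role": "workstation"},
}

# Hypothetical log events.
EVENTS = [
    {"src_ip": "10.0.0.5", "action": "port_scan"},
    {"src_ip": "10.0.0.9", "action": "port_scan"},
    {"src_ip": "10.0.0.77", "action": "port_scan"},
]


def enrich(event: dict, inventory: dict) -> dict:
    """Left-join one event against the inventory on source IP."""
    asset = inventory.get(event["src_ip"], {"owner": None, "role": "unknown"})
    return {**event, **asset}


def is_suspicious(enriched: dict) -> bool:
    """Treat port scans as expected from sanctioned scanners only."""
    return (enriched["action"] == "port_scan"
            and enriched["role"] != "vulnerability-scanner")


alerts = [e for e in (enrich(ev, ASSET_INVENTORY) for ev in EVENTS)
          if is_suspicious(e)]
for alert in alerts:
    print(alert["src_ip"], alert["role"])
```

Without the inventory, all three events would raise an alert. With it, the scan from the sanctioned scanner is suppressed, while the scan from an unknown host is kept.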
Security analysts can quickly join security logs with contextual data sets, such as asset inventory, user details, configuration details, and other information, to eliminate would-be false positives and identify stealthy threats. The value proposition is clear: organizations can consolidate their security data affordably and gain the flexibility to query that data at any time. Snowflake empowers organizations to make data-driven choices for long-term gain. We’ll dive into some customer success stories to show the potential of this approach.

Real customer success stories

When the approach is done right, Snowflake customers can experience remarkable cost savings. Let’s take a closer look at some notable success stories across various industries.

At Comcast, Snowflake’s security data lake is now an integral component of their security data fabric. Instead of employees managing on-premises infrastructure, the Comcast security data lake built on Snowflake’s elastic engine in the cloud stores over 10 petabytes (PB) of data with hot retention for over a year, saving millions of dollars. Automated sweeps of over 50,000 indicators of compromise (IOCs) across the 10-PB security data lake can now be completed in under 30 minutes.

Guild Education can claim “up to 50% cost savings” working with Snowflake and is just one example that highlights the potentially significant financial benefits organizations can unlock with the Snowflake Data Cloud.

By adopting Snowflake as its data lake for security events, corporate travel management company Navan achieved a best-of-breed security architecture that is both cost-efficient and cutting-edge. The results are impressive:
- Over 70% cost savings by adopting a modern SIEM-less architecture
- 15K+ hours saved in 8 months
- 4x improvements in MITRE ATT&CK coverage in 8 months

Ready to witness the transformative power of Snowflake?
Watch our demo and discover how you can revolutionize your data management strategy, unlock substantial cost savings, and propel your organization into a new era of efficiency and innovation. Learn how you can augment your Splunk strategy with Snowflake today. The post How to Navigate the Costs of Legacy SIEMS with Snowflake appeared first on Snowflake.

Thursday at 06:47 PM
  Tagged with:
  - siem
  - snowflake
  - databases

gitops 5 Best practices for implementing GitOps for stateful applications and databases in Kubernetes

Amazic posted a topic in CI/CD, GitOps, Orchestration & Scheduling

GitOps represents a transformative approach to managing and deploying applications within Kubernetes environments, offering many benefits ranging from automation to enhanced collaboration. By centralizing operations around Git repositories, GitOps streamlines processes, fosters reliability, and nurtures teamwork. However, as teams embrace GitOps principles, the natural question arises: can these principles extend to managing databases? The answer is a resounding yes! Yet, while GitOps seamlessly aligns with stateless application management, applying it to stateful workloads, especially databases, presents distinct challenges. In this article, we’ll delve into the landscape of implementing GitOps for stateful applications and databases in Kubernetes, exploring five essential best practices to navigate this terrain effectively.

Considerations for applying GitOps to stateful applications

Versioning and managing stateful data

When applying GitOps principles, it’s crucial to version control persistent data alongside application code. Tools like Git LFS (Large File Storage) can help you manage large datasets efficiently. Ensure that changes to stateful data are captured in Git commits and properly documented to maintain data integrity and facilitate reproducibility.

Handling database schema changes and migrations

Database schema changes and migrations require careful handling in GitOps workflows. Define database schema changes as code and store migration scripts in version-controlled repositories. Test and apply migrations consistently across environments with automated tools and continuous integration/continuous delivery processes.

Backup and disaster recovery strategies

Develop robust backup and disaster recovery strategies for stateful applications. Regularly back up data and configuration files to resilient storage solutions. Test data recovery and automate backup procedures to ensure preparedness for unforeseen events or data loss.
You can also leverage GitOps practices to manage backup configurations and version-controlled recovery plans.

Managing migrations like any other GitOps application

When applications are deployed and managed using GitOps principles, migrations should follow suit. This means defining migration tasks as declarative configurations stored in Git repositories alongside other application artifacts. These migration configurations should specify the desired state of the database schema or data transformation, including any dependencies or prerequisites. GitOps operators, such as the Atlas Operator for databases, can then pull these migration configurations from Git repositories and apply them to target databases. The operator ensures that the database’s actual state aligns with the desired state defined in the Git repository, automating the process of applying migrations and maintaining consistency across environments.

Handling stateful application upgrades and rollbacks

Plan and execute stateful application upgrades and rollbacks carefully. Define upgrade strategies that minimize downtime and data loss during the migration process. Utilize GitOps principles to manage version-controlled manifests for application upgrades, ensuring consistency and reproducibility across environments. Implement automated rollback mechanisms to revert to previous application versions in case of failures or issues during upgrades.

Best practices for implementing GitOps with stateful applications

Infrastructure as Code (IaC) for provisioning storage resources

Make your storage resources part of your Infrastructure as Code (IaC) practices. Define your storage configurations using tools like Terraform or Kubernetes manifests, and keep them version-controlled alongside your application code. This ensures consistency and reproducibility in your infrastructure deployments.
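The reconciliation idea described above for migrations (compare the desired state in Git with the actual state of the database, then apply the difference) can be sketched in a few lines. This is a generic illustration using Python’s built-in `sqlite3` module, not how the Atlas Operator or any other operator is implemented. The `MIGRATIONS` list, the `schema_migrations` table, and the `reconcile` function are hypothetical names; in a real workflow the migrations would be files in the repository.

```python
import sqlite3

# Hypothetical desired state: ordered migrations that would live in Git.
MIGRATIONS = [
    (1, "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE users ADD COLUMN email TEXT"),
]


def applied_versions(conn: sqlite3.Connection) -> set[int]:
    """Read the actual state: versions already recorded in the database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations "
        "(version INTEGER PRIMARY KEY)")
    return {row[0] for row in
            conn.execute("SELECT version FROM schema_migrations")}


def reconcile(conn: sqlite3.Connection, migrations) -> list[int]:
    """Apply pending migrations in version order; return those applied."""
    done = applied_versions(conn)
    newly_applied = []
    for version, statement in sorted(migrations):
        if version in done:
            continue  # already in the desired state for this step
        try:
            # Each migration runs in its own transaction, so a failure
            # leaves the schema at the last successfully applied version.
            conn.execute("BEGIN")
            conn.execute(statement)
            conn.execute(
                "INSERT INTO schema_migrations (version) VALUES (?)",
                (version,))
            conn.execute("COMMIT")
        except sqlite3.Error:
            if conn.in_transaction:
                conn.execute("ROLLBACK")
            raise
        newly_applied.append(version)
    return newly_applied


# isolation_level=None leaves transaction control to the explicit
# BEGIN/COMMIT/ROLLBACK statements above.
conn = sqlite3.connect(":memory:", isolation_level=None)
first = reconcile(conn, MIGRATIONS)   # applies every pending migration
second = reconcile(conn, MIGRATIONS)  # nothing left to apply
print(first, second)
```

Running `reconcile` repeatedly is safe because applied versions are skipped. That idempotence is the property that lets a controller re-run it on every sync.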
Using Helm charts or operators

Simplify the deployment and management of your stateful applications, including databases, by using Helm charts or Kubernetes operators. Helm helps package and template complex configurations while operators automate common operational tasks. Pick the best fit for your needs and keep your application management consistent.

Implementing automated testing

Automate your testing as much as possible for your stateful applications, including database changes. Develop thorough test suites to check everything from functionality to performance. Tools like Kubernetes Testing Framework (KTF) can help simulate production-like environments and catch issues early on.

Continuous integration/continuous delivery (CI/CD) pipelines

Set up CI/CD pipelines tailored to your stateful applications, focusing on databases. Automate your build, test, and deployment processes to ensure smooth operation. Remember to trigger pipeline executions based on version-controlled changes so you have consistent deployments across different environments.

Storing everything in Git

By storing everything in Git repositories, teams benefit from version control, traceability, and collaboration. Every change made to configurations or migrations is tracked, providing a clear history of modifications and enabling easy rollback to previous states if necessary. Moreover, Git’s branching and merging capabilities facilitate collaborative development efforts, allowing multiple team members to work concurrently on different features or fixes without stepping on each other’s toes.

Mastering GitOps for stateful workloads

Applying GitOps principles to stateful applications, including databases, brings numerous benefits to development and operations teams. By storing everything in Git repositories, including data, migration scripts, and configurations, teams ensure version control, traceability, and collaboration.
Handling database schema changes, migrations, and backups within GitOps workflows ensures consistency and reliability across environments. Moreover, managing migrations and upgrades as part of GitOps applications streamlines deployment processes and reduces the risk of errors. Implementing best practices such as Infrastructure as Code (IaC), leveraging Helm charts or operators, and implementing automated testing and CI/CD pipelines further enhances the efficiency and reliability of managing stateful applications. By adopting GitOps for stateful applications, organizations can achieve greater agility, scalability, and resilience in their software delivery processes. With a solid foundation of GitOps principles and best practices in place, teams can confidently navigate the complexities of managing stateful applications in Kubernetes environments, enabling them to focus on efficiently delivering value to their customers. The post 5 Best practices for implementing GitOps for stateful applications and databases in Kubernetes appeared first on Amazic.

April 18
- best practices
- kubernetes
- (and 2 more)
  Tagged with:

google cloud next 2024 All 218 things we announced at Google Cloud Next ‘24 – a recap

Google Cloud Platform posted a topic in Google Cloud Platform

Google Cloud Next made a big splash in Las Vegas this week! From our opening keynote showcasing incredible customer momentum to exciting product announcements, we covered how AI is transforming the way that companies work. You can catch up on the highlights in our 14 minute keynote recap! Developers were front and center at our Developer keynote and in our buzzing Innovators Hive on the Expo floor (which was triple the size this year!). Our nearly 400 partner sponsors were also deeply integrated throughout Next, bringing energy from the show floor to sessions and evening events throughout the week. Last year, we talked about the exciting possibilities of generative AI, and this year it was great to showcase how customers are now using it to transform the way they work. At Next ‘24, we featured 300+ customer and partner AI stories, 500+ breakout sessions, hands-on demos, interactive training sessions, and so much more. It was a jam-packed week, so we’ve put together a summary of our announcements which highlight how we’re delivering the new way to cloud. Read on for a complete list of the 218 (yes, you read that right) announcements from Next ‘24: Gemini for Google Cloud We shared how Google's Gemini family of models will help teams accomplish more in the cloud, including: 1. Gemini for Google Cloud, a new generation of AI assistants for developers, Google Cloud services, and applications. 2. Gemini Code Assist, which is the evolution of the Duet AI for Developers. 3. Gemini Cloud Assist, which helps cloud teams design, operate, and optimize their application lifecycle. 4. Gemini in Security Operations, generally available at the end of this month, converts natural language to new detections, summarizes event data, recommends actions to take, and navigates users through the platform via conversational chat. 5. Gemini in BigQuery, in preview, enables data analysts to be more productive, improve query performance and optimize costs throughout the analytics lifecycle. 
6. Gemini in Looker, in private preview, provides a dedicated space in Looker to initiate a chat on any topic with your data and derive insights quickly. 7. Gemini in Databases, also in preview, helps developers, operators, and database administrators build applications faster using natural language; manage, optimize and govern an entire fleet of databases from a single pane of glass; and accelerate database migrations. Customer Stories We shared new customer announcements, including: 8. Cintas is leveraging Google Cloud’s gen AI to develop an internal knowledge center that will allow its customer service and sales employees to easily find key information. 9. Bayer will build a radiology platform that will help Bayer and other companies create and deploy AI-first healthcare apps that assist radiologists, ultimately improving efficiency and diagnosis turn-around time. 10. Best Buy is leveraging Google Cloud’s Gemini large language model to create new and more convenient ways to give customers the solutions they need, starting with gen AI virtual assistants that can troubleshoot product issues, reschedule order deliveries, and more. 11. Citadel Securities used Google Cloud to build the next generation of its quantitative research platform that increased its research productivity and price-performance ratio. 12. Discover Financial is transforming customer experience by bringing gen AI to its customer contact centers to improve agent productivity through personalized resolutions, intelligent document summarization, real-time search assistants, and enhanced self-service options. 13. IHG Hotels & Resorts is using Gemini to build a generative AI-powered chatbot to help guests easily plan their next vacation directly in the IHG Hotels & Rewards mobile app. 14. Mercedes-Benz will expand its collaboration with Google Cloud, using our AI and gen AI technologies to advance customer-facing use cases across e-commerce, customer service, and marketing. 15. 
Orange is expanding its partnership with Google Cloud to deploy generative AI closer to Orange’s and its customers’ operations to help meet local requirements for trusted cloud environments and accelerate gen AI adoption and benefits across autonomous networks, workforce productivity, and customer experience. 16. WPP will leverage Google Cloud’s gen AI capabilities to deliver personalization, creativity, and efficiency across the business. Following the adoption of Gemini, WPP is already seeing internal impacts, including real-time campaign performance analysis, streamlined content creation processes, AI narration, and more. 17. Covered California, California’s health insurance marketplace, will simplify the healthcare enrollment process using Google Cloud’s Document AI, enabling the organization to verify more than 50,000 healthcare documents per month with an 84% verification rate. Workspace and collaboration The next wave of innovations and enhancements is coming to Google Workspace: 18. Google Vids, a key part of our Google Workspace innovations, is a new AI-powered video creation app for work that sits alongside Docs, Sheets and Slides. Vids will be released to Workspace Labs in June. 19. Gemini is coming to Google Chat in preview, giving you an AI-powered teammate to summarize conversations, answer questions, and more. 20. The new AI Meetings and Messaging add-on is priced at $10 per user, per month, and includes: Take notes for me, now in preview; Translate for me, coming in June, which automatically detects and translates captions in Meet, with support for 69 languages; and automatic translation of messages and on-demand conversation summaries in Google Chat, coming later this year. 21. Using large language models, Gmail can now block 20% more spam and evaluate 1,000 times more user-reported spam every day. 22. 
A new AI Security add-on allows IT teams to automatically classify and protect sensitive files in Google Drive, and is available for $10 per user, per month. 23. We’re extending DLP controls and classification labels to Gmail in beta. 24. We’re adding experimental support for post-quantum cryptography (PQC) in client-side encryption with our partners Thales and Fortanix. 25. Voice prompting and instant polish in Gmail: Send emails easily when you’re on the go with voice input in Help me write, and convert rough notes to a complete email with one click. 26. A new tables feature in Sheets (generally available in the coming weeks) formats and organizes data with a sleek design and a new set of building blocks, from project management to event planning templates, with automatic alerts based on custom triggers like a change in a status field. 27. Tabs in Docs (generally available in the coming weeks) allow you to organize information in a single document rather than linking to multiple documents or searching through Drive. 28. Docs now supports full-bleed cover images that extend from one edge of your browser to the other; generally available in the coming weeks. 29. Generally available in the coming weeks, Chat will support increased member capacity of up to 500,000 in spaces. 30. Messaging interoperability for Slack and Teams is now generally available through our partner Mio. AI infrastructure 31. Cloud TPU v5p is now generally available. 32. Google Kubernetes Engine (GKE) now supports Cloud TPU v5p and TPU multi-host serving, also generally available. 33. The A3 Mega compute instance, powered by NVIDIA H100 GPUs, offers double the GPU-to-GPU networking bandwidth of A3, and will be generally available in May. 34. Confidential Computing is coming to the A3 VM family, in preview later this year. 35. 
The NVIDIA Blackwell GPU platform will be available on the AI Hypercomputer architecture in two configurations: NVIDIA HGX B200 for the most demanding AI, data analytics, and HPC workloads; and the liquid-cooled GB200 NVL72 GPU for real-time LLM inference and training massive-scale models. 36. New caching capabilities for Cloud Storage FUSE improve training throughput and serving performance, and are generally available. 37. The Parallelstore high-performance parallel filesystem now includes caching in preview. 38. Hyperdisk ML in preview is a next-generation block storage service optimized for AI inference/serving workloads. 39. MaxDiffusion is a new open-source, high-performance and scalable reference implementation for diffusion models. 40. MaxText, a JAX LLM, now supports new models including Gemma, GPT3, Llama 2 and Mistral across both Cloud TPUs and NVIDIA GPUs. 41. PyTorch/XLA 2.3 will follow the upstream release later this month, bringing single program, multiple data (SPMD) auto-sharding, and asynchronous distributed checkpointing features. 42. For Hugging Face PyTorch users, the Hugging Face Optimum-TPU package lets you train and serve Hugging Face models on TPUs. 43. JetStream is a new open-source, throughput- and memory-optimized LLM inference engine for XLA devices (starting with TPUs); it supports models trained with both JAX and PyTorch/XLA, with optimizations for popular open models such as Llama 2 and Gemma. 44. Google models will be available as NVIDIA NIM inference microservices. 45. Dynamic Workload Scheduler now offers two modes: flex start mode (in preview), and calendar mode (in preview). 46. We shared the latest performance results from MLPerf™ Inference v4.0 using A3 virtual machines (VMs) powered by NVIDIA H100 GPUs. 47. We shared performance benchmarks for Gemma models using Cloud TPU v5e and JetStream. 48. 
We introduced ML Productivity Goodput, a new metric to measure the efficiency of an overall ML system, as well as an API to integrate into your projects, and methods to maximize ML Productivity Goodput. Vertex AI 49. Gemini 1.5 Pro is now available in public preview in Vertex AI, bringing the world’s largest context window to developers everywhere. 50. Gemini 1.5 Pro on Vertex AI can now process audio streams including speech, and the audio portion of videos. 51. Imagen 2.0, our family of image generation models, can now be used to create short, 4-second live images from text prompts. 52. Image editing is generally available in Imagen 2.0, including inpainting/outpainting and digital watermarking powered by Google DeepMind’s SynthID. 53. We added CodeGemma, a new model from our Gemma family of lightweight models, to Vertex AI. 54. Vertex AI has expanded grounding capabilities, including the ability to directly ground responses with Google Search, now in public preview. 55. Vertex AI Prompt Management, in preview, helps teams improve prompt performance. 56. Vertex AI Rapid Evaluation, in preview, helps users evaluate model performance when iterating on the best prompt design. 57. Vertex AI AutoSxS is now generally available, and helps teams compare the performance of two models. 58. We expanded data residency guarantees for data stored at-rest for Gemini, Imagen, and Embeddings APIs on Vertex AI to 11 new countries: Australia, Brazil, Finland, Hong Kong, India, Israel, Italy, Poland, Spain, Switzerland, and Taiwan. 59. When using Gemini 1.0 Pro and Imagen, you can now limit machine-learning processing to the United States or European Union. 60. Vertex AI hybrid search, in preview, integrates vector-based and keyword-based search techniques to ensure relevant and accurate responses for users. 61. 
The new Vertex AI Agent Builder, in preview, lets developers build and deploy gen AI experiences using natural language or open-source frameworks like LangChain on Vertex AI. 62. Vertex AI includes two new text embedding models in public preview: the English-only text-embedding-preview-0409, and the multilingual text-multilingual-embedding-preview-0409. Core infrastructure 63. We expanded Google Cloud’s compute portfolio, with major product releases spanning compute and storage for general-purpose workloads, as well as for more specialized workloads like SAP and high-performance databases. 64. Google Axion is our first custom Arm-based CPU designed for the data center, and will be in preview in the coming months. 65. Now in preview, the Compute Engine C4 general-purpose VM provides high performance paired with a controlled maintenance experience for your mission-critical workloads. 66. The general-purpose N4 machine series is built for price-performance with Dynamic Resource Management, and is generally available. 67. C3 bare-metal machines, available in an upcoming preview, provide workloads with direct access to the underlying server’s CPU and memory resources. 68. New X4 memory-optimized instances are now in preview, with access through an interest form. 69. Z3 VMs are designed for storage-dense workloads that require SSD, and are generally available. 70. Hyperdisk Storage Pools Advanced Capacity, in general availability, and Advanced Performance in preview, allow you to purchase and manage block storage capacity in a pool that’s shared across workloads. 71. Coming to general availability in May, Hyperdisk Instant Snapshots provide near-zero RPO/RTO for Hyperdisk volumes. 72. Google Compute Engine users can now use zonal flexibility, VM family flexibility, and mixed on-demand and spot consumption to deploy their VMs. As part of our Google Distributed Cloud (GDC) offering, we announced: 73. 
A generative AI search packaged solution powered by Gemma open models will be available in preview in Q2 2024 on GDC to help customers retrieve and analyze data at the edge or on-premises. 74. GDC has achieved ISO27001 and SOC2 compliance certifications. 75. A new managed Intrusion Detection and Prevention Solution (IDPS) integrates Palo Alto Networks threat prevention technology with GDC, and is now generally available. 76. GDC Sandbox, in preview, helps application developers build and test services designed for GDC in a Google Cloud environment, without needing to navigate the air-gap and physical hardware. 77. A preview GDC storage flexibility feature can help you grow your storage independent of compute, with support for block, file, or object storage. 78. GDC can now run in disconnected mode for up to seven days, and offers a suite of offline management features to help ensure deployments and workloads are accessible and working while they are disconnected; this capability is generally available. 79. New Managed GDC Providers who can sell GDC as a managed service include Clarence, T-Systems, and WWT, and a new Google Cloud Ready — Distributed Cloud badge signals that a solution has been tuned for GDC. 80. GDC servers are now available with an energy-efficient NVIDIA L4 Tensor Core GPU. 81. Google Distributed Cloud Hosted (GDC Hosted) is now authorized to host Top Secret and Secret missions for the U.S. Intelligence Community, and Top Secret missions for the Department of Defense (DoD). From our Google Cloud Networking family, we announced: 82. Gemini Cloud Assist, in preview, provides AI-based assistance to solve a variety of networking tasks such as generating configurations, recommending capacity, correlating changes with issues, identifying vulnerabilities, and optimizing performance. 83. 
Now generally available, the Model as a Service Endpoint solution, which uses Private Service Connect, Cloud Load Balancing, and App Hub, lets model creators own the model service endpoint to which application developers then connect. 84. Later this year, Cloud Load Balancing will add enhancements for inference workloads: Cloud Load Balancing with custom metrics, Cloud Load Balancing for streaming inference, and Cloud Load Balancing with traffic management for AI models. 85. Cloud Service Mesh is a fully managed service mesh that combines Traffic Director’s control plane and Google’s open-source Istio-based service mesh, Anthos Service Mesh. A service-centric Cross-Cloud Network delivers a consistent, secure experience from any cloud to any service, and includes the following enhancements: 86. Private Service Connect transitivity over Network Connectivity Center, available in preview this quarter, enables services in a spoke VPC to be transitively accessible from other spoke VPCs. 87. Cloud NGFW Enterprise (formerly Cloud Firewall Plus), now GA, provides network threat protection powered by Palo Alto Networks, plus network security posture controls for org-wide perimeter and Zero Trust microsegmentation. 88. Identity-based authorization with mTLS integrates the Identity-Aware Proxy with our internal application Load Balancer to support Zero Trust network access, including client-side and, soon, back-end mutual TLS. 89. In-line network data-loss prevention (DLP), in preview soon, integrates Symantec DLP into Cloud Load Balancers and Secure Web Proxy using Service Extensions. 90. Partners Imperva, HUMAN Security, Palo Alto Networks and Traceable are integrating their advanced web protection services into Service Extensions, as are web services providers Cloudinary, Nagra, Queue-it, and Datadog. 91. Service Extensions now has a library of code examples to customize origin selection, adjust headers, and more. 92. 
Private Service Connect is now fully integrated with Cloud SQL, and generally available. There are many improvements to our storage offerings: 93. Generate insights with Gemini lets you use natural language to analyze your storage footprint, optimize costs, and enhance security across billions of objects. It is available now through the Google Cloud console as an allowlist experimental release. 94. Google Cloud NetApp Volumes is expanding to 15 new Google Cloud regions in Q2’24 (GA) and includes a number of enhancements: dynamically migrating files by policy to lower-cost storage based on access frequency (in preview Q2’24); increasing Premium and Extreme service levels up to 1PB in size, with throughput performance up to 3X (preview Q2’24). NetApp Volumes also includes a new Flex service level enabling volumes as small as 1GiB. 95. Filestore now supports single-share backup for Filestore Persistent Volumes and GKE (generally available) and NFS v4.1 (preview), plus expanded Filestore Enterprise capacity up to 100TiB. For Cloud Storage: 96. Cloud Storage Anywhere Cache now uses zonal SSD read cache across multiple regions within a continent (allowlist GA). 97. Cloud Storage soft delete protects against accidental or malicious deletion of data by preserving deleted items for a configurable period of time (generally available). 98. The new Cloud Storage managed folders resource type allows granular IAM permissions to be applied to groups of objects (generally available). 99. Tag-based at-scale backup helps manage data protection for Compute Engine VMs (generally available). 100. The new high-performance backup option for SAP HANA leverages persistent disk (PD) snapshot capabilities for database-aware backups (generally available). 101. As part of Backup and DR Service Report Manager, you can now customize reports with data from Google Cloud Backup and DR using Cloud Monitoring, Cloud Logging, and BigQuery (generally available). Databases 102. 
Database Studio, a part of Gemini in Databases, brings SQL generation and summarization capabilities to our rich SQL editor in the Google Cloud console, as well as an AI-driven chat interface. 103. Database Center lets operators manage an entire fleet of databases through intelligent dashboards that proactively assess availability, data protection, security, and compliance issues, as well as with smart recommendations to optimize performance and troubleshoot issues. 104. Database Migration Service is also integrated with Gemini in Databases, including assistive code conversion (e.g., from Oracle to PostgreSQL) and explainability features. Likewise, AlloyDB gains a lot of new functionality: 105. AlloyDB AI lets gen AI developers build applications that accurately query data with natural language, just like they do with SQL; available now in AlloyDB Omni. 106. AlloyDB AI now includes a new pgvector-compatible index based on Google’s approximate nearest neighbor algorithms, or ScaNN; it’s available as a technology preview in AlloyDB Omni. 107. AlloyDB model endpoint management makes it easier to call remote Vertex AI, third-party, and custom models; available in AlloyDB Omni today and soon on AlloyDB in Google Cloud. 108. AlloyDB AI “parameterized secure views” secures data based on end-users’ context; available now in AlloyDB Omni. Bigtable, which turns 20 this year, got several new features: 109. Bigtable Data Boost, a pre-GA offering, delivers high-performance, workload-isolated, on-demand processing of transactional data, without disrupting operational workloads. 110. Bigtable authorized views, now generally available, allow multiple teams to leverage the same tables and securely share data directly from the database. 111. New Bigtable distributed counters in preview process high-frequency event data like clickstreams directly in the database. 112. 
Bigtable large nodes, the first of other workload-optimized node shapes, offer more performance stability at higher server utilization rates, and are in private preview. Memorystore for Redis Cluster, meanwhile: 113. Now supports both AOF (Append Only File) and RDB (Redis Database)-based persistence and has new node shapes that offer better performance and cost management. 114. Offers ultra-fast vector search, now generally available. 115. Includes new configuration options to tune max clients, max memory, max memory policies, and more, now in preview. Firestore users, take note: 116. Gemini Code Assist now incorporates assistive capabilities for developing with Firestore. 117. Firestore now has built-in support for vector search using exact nearest neighbors, the ability to automatically generate vector embeddings using popular embedding models via a turn-key extension, and integrations with popular generative AI libraries such as LangChain and LlamaIndex. 118. Firestore Query Explain in preview can help you troubleshoot your queries. 119. Firestore now supports Customer Managed Encryption Keys (CMEK) in preview, which allows you to encrypt data stored at-rest using your own specified encryption key. 120. You can now deploy Firestore in any available supported Google Cloud region, and Firestore’s Scheduled Backup feature can now retain backups for up to 98 days, up from seven days. 121. Cloud SQL Enterprise Plus edition now offers advanced failover capabilities such as orchestrated switchover and switchback Data analytics 122. BigQuery is now Google Cloud’s single integrated platform for data to AI workloads, with BigLake, BigQuery’s unified storage engine, providing a single interface across BigQuery native and open formats for analytics and AI workloads. 123. BigQuery better supports Iceberg, DDL, DML and high-throughput support in preview, while BigLake now supports the Delta file format, also in preview. 124. 
BigQuery continuous queries are in preview, providing continuous SQL processing over data streams, enabling real-time pipelines with AI operators or reverse ETL. The above-mentioned Gemini in BigQuery enables all manner of new capabilities and offerings: 125. New BigQuery integrations with Gemini models in Vertex AI support multimodal analytics and vector embeddings, and fine-tuning of LLMs. 126. BigQuery Studio provides a collaborative data workspace, the choice of SQL, Python, Spark or natural language directly, and new integrations for real-time streaming and governance; it is now generally available. 127. The new BigQuery data canvas provides a notebook-like experience with embedded visualizations and natural language support courtesy of Gemini. 128. BigQuery can now connect models in Vertex AI with enterprise data, without having to copy or move data out of BigQuery. 129. You can now use BigQuery with Gemini 1.0 Pro Vision to analyze both images and videos by combining them with your own text prompts using familiar SQL statements. 130. Column-level lineage in BigQuery and expanded lineage capabilities for Vertex AI pipelines will be in preview soon. Other updates to our data analytics portfolio include: 131. Apache Kafka for BigQuery as a managed service is in preview, to enable streaming data workloads based on open source APIs. 132. A serverless engine for Apache Spark integrated within BigQuery Studio is now in preview. 133. Dataplex features expanded data-to-AI governance capabilities in preview. Developers & operators Gemini Code Assist includes several new enhancements: 134. Full codebase awareness, in preview, uses Gemini 1.5 Pro to make complex changes, add new features, and streamline updates to your codebase. 135. A new code transformation feature available today in Cloud Workstations and Cloud Shell Editor lets you use natural language prompts to tell Gemini Code Assist to analyze, refactor, and optimize your code. 136. 
Gemini Code Assist now has extended local context, automatically retrieving relevant local files from your IDE workspace and displaying references to the files used. 137. With code customization in private preview, Gemini Code Assist lets you integrate private codebases and repositories for hyper-personalized code generation and completions, and connects to GitLab, GitHub, and Bitbucket source-code repositories. 138. Gemini Code Assist extends to Apigee and Application Integration in preview, to access and connect your applications. 139. We extended our partnership with Snyk to Gemini Code Assist, letting you learn about vulnerabilities and common security topics right within your IDE. 140. The new App Hub provides an accurate, up-to-date representation of deployed applications and their resource dependencies. Integrated with Gemini Cloud Assist, App Hub is generally available. Users of our Cloud Run and Google Kubernetes Engine (GKE) runtime environments can look forward to a variety of features: 141. Cloud Run application canvas lets developers generate, modify and deploy Cloud Run applications with integrations to Vertex AI, Firestore, Memorystore, and Cloud SQL, as well as load balancing and Gemini Cloud Assist. 142. GKE now supports container and model preloading to accelerate workload cold starts. 143. GPU sharing with NVIDIA Multi-Process Service (MPS) is now offered in GKE, enabling concurrent processing on a single GPU. 144. GKE supports GCS FUSE read caching, now generally available, using a local directory as a cache to accelerate repeat reads for small and random I/Os. 145. GKE Autopilot mode now supports NVIDIA H100 GPUs, TPUs, reservations, and Compute Engine committed use discounts (CUDs). 146. Gemini Cloud Assist in GKE is available to help with optimizing costs, troubleshooting, and synthetic monitoring. Cloud Billing tools help you track and understand Google Cloud spending, pay your bill, and optimize your costs; here are a few new features: 147. 
Support for Cloud Storage costs at the bucket level and storage tags is included out of the box with Cloud Billing detailed data exports to BigQuery. 148. A new BigQuery data view for FOCUS allows users to compare costs and usage across clouds. 149. You can now convert cost management reports into BigQuery billing queries right from the Cloud Billing console. 150. A new Cloud FinOps Anomaly Detection feature is in private preview. 151. FinOps hub is now generally available and adds support for viewing top savings opportunities, and a preview of our FinOps hub dashboard lets you analyze costs by project, region, or machine type. 152. A new CUD Analysis solution is available across Google Compute Engine resource families including TPU v5e, TPU v5p, A3, H3, and C3D. 153. There are new spend-based CUDs available for Memorystore, AlloyDB, Bigtable, and Dataflow. Security Building on natural language search and case summaries in Chronicle, Gemini in Security Operations is coming to the entire investigation lifecycle, including: 154. A new assisted investigation feature, generally available at the end of this month, that guides analysts through their workflow in Chronicle Enterprise and Chronicle Enterprise Plus. 155. The ability to ask Gemini for the latest threat intelligence from Mandiant directly in-line, including any indicators of compromise found in their environment. 156. Gemini in Threat Intelligence, in public preview, allows you to tap into Mandiant’s frontline threat intelligence using conversational search. 157. VirusTotal now automatically ingests OSINT reports, which Gemini summarizes directly in the platform; generally available now. 158. Gemini in Security Command Center, which now lets security teams search for threats and other security events using natural language in preview, provides summaries of critical- and high-priority misconfiguration and vulnerability alerts, and summarizes attack paths. 159. 
Gemini Cloud Assist also helps with security tasks, via: IAM Recommendations, which can provide straightforward, contextual recommendations to remove roles from over-permissioned users or service accounts; Key Insights, which help during encryption key creation based on its understanding of your data, your encryption preferences, and your compliance needs; and Confidential Computing Insights, which recommends options for adding confidential computing protection to sensitive workloads based on your data and your compute usage. Other security news includes: 160. The new Chrome Enterprise Premium, now generally available, combines the popular browser with Google threat and data protection, Zero Trust access controls, enterprise policy controls, and security insights and reporting. 161. Applied threat intelligence in Google Security Operations, now generally available, automatically takes global threat visibility and applies it to each customer’s unique environment. 162. Security Command Center Enterprise is now generally available and includes Mandiant Hunt, now in preview. 163. Identity and Access Management Privileged Access Manager (PAM), now available in preview, provides just-in-time, time-bound, and approval-based access elevations. 164. Identity and Access Management Principal Access Boundary (PAB) is a new, identity-centered control now in preview that enforces restrictions on IAM principals. 165. Cloud Next-Gen Firewall (NGFW) Enterprise is now generally available, including threat protection from Palo Alto Networks. 166. Cloud Armor Enterprise is now generally available and offers a pay-as-you-go model that includes advanced network DDoS protection, web application firewall capabilities, network edge policy, adaptive protection, and threat intelligence. 167. Sensitive Data Protection integration with Cloud SQL is now generally available, and is deeply integrated into the Security Command Center Enterprise risk engine. 168. 
Key management with Autokey is now in preview, simplifying the creation and management of customer-managed encryption keys (CMEK). 169. Bare metal HSM deployments in PCI-compliant facilities are now available in more regions. 170. Regional Controls for Assured Workloads is now in preview and is available in 32 cloud regions in 14 countries. 171. Audit Manager automates control verification with proof of compliance for workloads and data on Google Cloud, and is in preview. 172. Advanced API Security, part of Apigee API Management, now offers shadow API detection in preview. As part of our Confidential Computing portfolio, we announced: 173. Confidential VMs on Intel TDX are now in preview and available on the C3 machine series with Intel TDX. For AI and ML workloads, we support Intel AMX, which provides CPU-based acceleration by default on C3 series Confidential VMs. 174. Confidential VMs on general-purpose N2D machine series with AMD Secure Encrypted Virtualization-Secure Nested Paging (SEV-SNP) are now in preview. 175. Live Migration on Confidential VMs is now generally available on N2D machine series across all regions. 176. Confidential VMs on the A3 machine series with NVIDIA Tensor Core H100 GPUs will be in private preview later this year. Migration 177. The Rapid Migration Program (RaMP) now covers migration and modernization use cases that span across applications and the underlying infrastructure, data and analytics. For example, as part of RaMP for Storage: Storage egress costs from Amazon S3 to Google Cloud Storage are now completely free. Cloud Storage's client libraries for Python, Node.js, and Java now support parallelization of uploads and downloads. Migration Center also includes several excellent new additions: 178. Migration use case navigator, for mapping out how to migrate your resources (servers, databases, data warehouses, etc.) 
from on-prem and other clouds directly into Google Cloud, including new Cloud Spend Estimators for rapid TCO assessments of on-premises VMware and Exadata environments. 179. Database discovery and assessment for Microsoft SQL Server, PostgreSQL and MySQL to Cloud SQL migrations. Google Cloud VMware Engine, an integrated VMware service on Google Cloud now offers: 180. The intent to support VMware Cloud Foundation License Portability 181. General availability of larger instance type (ve2-standard-128) offerings. 182. Networking enhancements including next-gen VMware Engine Networking, automated zero-config VPC peering, and Cloud DNS for workloads. 183. Terraform Infrastructure as Code Automation. Migrate to Virtual Machines helps teams migrate their workloads. Here’s what we announced: 184. A new Disk Migration solution for migrating disk volumes to Google Cloud. 185. Image Import (preview) as a managed service. 186. BIOS to UEFI Conversion in preview, which automatically converts bootloaders to the newer UEFI format. 187. Amazon Linux Conversion in preview, for converting Amazon Linux to Rocky Linux in Google Compute Engine. 188. CMEK support, so you maintain control over your own encryption keys. When replatforming VMs to containers in GKE or Cloud Run, there’s: 189. The new Migrate to Containers (M2C) CLI, which generates artifacts that you can deploy to either GKE or Cloud Run. 190. M2C Cloud Code Extension, in preview, which migrates applications from VMs to containers running on GKE directly in Visual Studio. Here are the enhancements to our Database Migration Service: 191. Database Migration Service now offers AI-powered last-mile code conversion from Oracle to PostgreSQL. 192. Database Migration Service now performs migration from SQL Server (on any platform) to Cloud SQL for SQL Server, in preview. 193. In Datastream, SQL Server as a source for CDC performs data movement to BigQuery destinations. Migrating from a mainframe? 
Here are some new capabilities: 194. The Mainframe Assessment Tool (MAT) now powered by gen AI analyzes the application codebase, performing fit assessment and creating application-level summarization and test cases. 195. Mainframe Connector sends a copy of your mainframe data to BigQuery for off-mainframe analytics. 196. G4 refactors mainframe application code (COBOL, RPG, JCL etc.) and data from their original state/programming language to a modern stack (JAVA). 197. Dual Run lets you run a new system side by side with your existing mainframe, duplicating all transactions and checking for completeness, quality and effectiveness of the new solution. Partners & ecosystem 198. Partners showcased more than 100 solutions that leverage Google AI on the Next ‘24 show floor. 199. We announced the 2024 Google Cloud Partner of the Year winners. 200. Gemini models will be available in the SAP Generative AI Hub. 201. GitLab announced that its authentication, security, and CI/CD integrations with Google Cloud are now in public beta for customers. 202. Palo Alto Networks named Google Cloud its AI provider of choice and will use Gemini models to improve threat analysis and incident summarization for its Cortex XSIAM platform. 203. Exabeam is using Google Cloud AI to improve security outcomes for customers. 204. Global managed security services company Optiv is expanding support for Google Cloud products. 205. Alteryx, Dynatrace, and Harness are launching new features built with Google Cloud AI to automate workflows, support data governance, and enable users to better observe and manage the data. 206. A new Generative AI Services Specialization is available for partners who demonstrate the highest level of technical proficiency with Google Cloud gen AI. 207. We introduced new Generative AI Delivery Excellence and Technical Bootcamps, and advanced Challenge Labs in generative AI. 208. 
The Google Cloud Ready - BigQuery initiative has 21 new partners: Actable, AgileData, Amplitude, Boostkpi, CaliberMind, Calibrate Analytics, CloudQuery, DBeaver, Decube, DinMo, Estuary, Followrabbit, Gretel, Portable, Precog, Retool, SheetGo, Tecton, Unravel Data, Vallidio, and Vaultree 209. The Google Cloud Ready - AlloyDB initiative has six new partners: Boostkpi, DBeaver, Estuary, Redis, Thoughtspot, and SeeBurger 210. The Google Cloud Ready - Cloud SQL initiative has five new partners: BoostKPI, DBeaver, Estuary, Redis, and Thoughtspot 211. Crowdstrike is integrating its Falcon Platform with Google Cloud products. Members of our Google for Startups program, meanwhile, will be interested to learn that: 212. The Google for Startups Cloud Program has a new partnership with the NVIDIA Inception startup program. The benefits include providing Inception members with access to Google Cloud credits, go-to-market support, technical expertise, and fast-tracked onboarding to Google Cloud Marketplace. 213. As part of the NVIDIA Inception partnership, Google for Startups Cloud Program members can join NVIDIA Inception and gain access to technological expertise, NVIDIA Deep Learning Institute course credits, NVIDIA hardware and software, and more. Eligible members of the Google for Startups Cloud Program also can participate in NVIDIA Inception Capital Connect, a platform that gives startups exposure to venture capital firms interested in the space. 214. The new Google for Startups Accelerator: AI-First program for startups building AI solutions based in the U.S. and Canada has launched, and its cohort includes 15 AI startups: Aptori, Augmend, Backpack Healthcare, BrainLogic AI, Cicerai, CLIKA, Easel AI, Findly, Glass Health, Kodif, Liminal, mbue, Modulo Bio, Rocket Doctor, and Sibli. 215. 
The Startup Learning Center provides startups with curated content to help them grow with Google Cloud, and will be launching an offering for startup developers and future founders via Innovators Plus in the coming months. Finally, Google Cloud Consulting has the following services to help you build out your Google Cloud environment: 216. Google Cloud Consulting is offering no-cost, on-demand training to top customers through Google Cloud Skills Boost, including new gen AI skill badges: Prompt Design in Vertex AI, Develop Gen AI Apps with Gemini and Streamlit, and Inspect Rich Documents with Gemini Multimodality and Multimodal RAG. 217. The new Isolator solution protects healthcare data used in collaborations between parties using a variety of Google Cloud technologies including Chrome Enterprise Premium, VPC Service Controls, Chrome Enterprise, and encryption. 218. Google Cloud Consulting’s Delivery Navigator is now generally available to all Google Cloud qualified services partners. Phew. What a week! On behalf of Google Cloud, we’re so grateful you joined us at Next ‘24, and can’t wait to host you again next year back in Las Vegas at the Mandalay Bay on April 9-11, 2025! View the full article

April 12
- 1
- google cloud next
- google gemini

google cloud next 2024 Get inspired: Database success stories at Google Cloud Next

Google Cloud Platform posted a topic in Databases, Data Engineering & Data Science

Inspiration awaits! Google Cloud Next takes over Las Vegas on April 9-11, bringing together a powerhouse collection of innovative customers who are pushing the boundaries with Google Cloud. In this blog, we'll shine a spotlight on customers leveraging Google Cloud databases to transform their businesses. And don’t forget to add these sessions to your event agenda to catch their insights and experiences at Next ‘24.

Nuro
Autonomous driving company Nuro uses vector similarity search to help classify objects that autonomous vehicles encounter while driving on the road and ultimately trigger the right action. Nuro currently has hundreds of millions of vectors that are moving to AlloyDB AI in order to simplify their application architecture.
Fei Meng, Head of Data Platform, Nuro

Lightricks
Lightricks utilizes the popular pgvector extension on Cloud SQL for PostgreSQL to categorize video "Templates" within their Videoleap application. Videoleap UGC is a comprehensive social platform developed by Lightricks, which is designed for editing and sharing videos. Template categorization allows users to easily search through the provided templates to find one that matches their needs and generate their own customized videos. "The usage of pgvector has enabled us to use semantic search instead of traditional keyword search. The retrieval rate due to the use of pgvector increased by 40% and the template usage from the retrieved results increased by 40% as well. The usage of the pgvector hnsw index enabled us to query millions of embeddings with high accuracy and response times below 100ms."
David Gang, Tech Lead, Brands

Bayer
Bayer Crop Science is a division of Bayer dedicated to agricultural advancements. Their modern data solution, “Field Answers,” which stores and analyzes vast amounts of observational data, experienced an increase in data load and latency requirements. And with the fall harvest season looming, Bayer needed a solution that would hold up to the upcoming demand. The team turned to AlloyDB for PostgreSQL, drawn by its compatibility with existing systems and low replication lag. This upgrade has helped streamline operations, centralize solutions, and improve collaboration with data scientists across the company.

Yahoo!
Yahoo!’s global reach demanded a data solution that would allow them to offer a transformative experience at scale. With audacious goals for modernization, Yahoo! leveraged Spanner as a database to meet its strategy requirements. With Spanner’s superior performance, low cost, low operational overhead and global consistency, Yahoo! plans to consolidate diverse databases and expand Spanner’s footprint to support other services.

Statsig
Statsig helps companies ship, test, and manage software and application features with confidence. Facing bottlenecks and connectivity issues, the company realized it needed a performant, reliable, scalable, and fully managed Redis service, and Memorystore for Redis Cluster ticked all the boxes. With real-time analytics capabilities and robust storage (99.99% SLA) at a lower cost, Memorystore provides a higher queries per second (QPS) capacity. This allows Statsig to refocus on its core mission: building a full product observability platform that maximizes impact.

Hit the databases jackpot at Google Cloud Next
If these stories have sparked your imagination, then get ready for even more inspiration at Google Cloud Next '24! Register now and be sure to add the breakout sessions mentioned above to your agenda to experience firsthand how Google Cloud databases are empowering businesses to achieve amazing things. We'll see you in Las Vegas! View the full article

amazon redshift Successfully conduct a proof of concept in Amazon Redshift

Amazon Web Services posted a topic in Databases, Data Engineering & Data Science

Amazon Redshift is a fast, scalable, and fully managed cloud data warehouse that allows you to process and run your complex SQL analytics workloads on structured and semi-structured data. It also helps you securely access your data in operational databases, data lakes, or third-party datasets with minimal movement or copying of data. Tens of thousands of customers use Amazon Redshift to process large amounts of data, modernize their data analytics workloads, and provide insights for their business users. In this post, we discuss how to successfully conduct a proof of concept in Amazon Redshift by going through the main stages of the process, available tools that accelerate implementation, and common use cases. Proof of concept overview A proof of concept (POC) is a process that uses representative data to validate whether a technology or service fulfills a customer’s technical and business requirements. By testing the solution against key metrics, a POC provides insights that allow you to make an informed decision on the suitability of the technology for the intended use case. 
There are three major POC validation areas:
- Workload – Take a representative portion of an existing workload and test it on Amazon Redshift, such as an extract, transform, and load (ETL) process, reporting, or management
- Capability – Demonstrate how a specific Amazon Redshift feature, such as zero-ETL integration with Amazon Redshift, data sharing, or Amazon Redshift Spectrum, can simplify or enhance your overall architecture
- Architecture – Understand how Amazon Redshift fits into a new or existing architecture along with other AWS services and tools

A POC is not:
- Planning and implementing a large-scale migration
- User-facing deployments, such as deploying a configuration for user testing and validation over extended periods (this is more of a pilot)
- End-to-end implementation of a use case (this is more of a prototype)

Proof of concept process
For a POC to be successful, it is recommended to follow and apply a well-defined and structured process. For a POC on Amazon Redshift, we recommend a three-phase process of discovery, implementation, and evaluation.

Discovery phase
The discovery phase is considered the most essential among the three phases and the longest. It defines through multiple sessions the scope of the POC and the list of tasks that need to be completed and later evaluated. The scope should contain inputs and data points on the current architecture as well as the target architecture. 
The following items need to be defined and documented to have a defined scope for the POC:
- Current state architecture and its challenges
- Business goals and the success criteria of the POC (such as cost, performance, and security) along with their associated priorities
- Evaluation criteria that will be used to evaluate and interpret the success criteria, such as service-level agreements (SLAs)
- Target architecture (the communication between the services and tools that will be used during the implementation of the POC)
- Dataset and the list of tables and schemas

After the scope has been clearly defined, you should proceed with defining and planning the list of tasks that need to be run during the next phase in order to implement the scope. Also, depending on the technical familiarity with the latest developments in Amazon Redshift, a technical enablement session on Amazon Redshift is also highly recommended before starting the implementation phase. Optionally, a responsibility assignment matrix (RAM) is recommended, especially in large POCs.

Implementation phase
The implementation phase takes the output of the previous phase as input. It consists of the following steps:
1. Set up the environment by respecting the defined POC architecture.
2. Complete the implementation tasks such as data ingestion and performance testing.
3. Collect data metrics and statistics on the completed tasks.
4. Analyze the data and then optimize as necessary.

Evaluation phase
The evaluation phase is the POC assessment and the final step of the process. It aggregates the implementation results of the preceding phase, interprets them, and evaluates the success criteria described in the discovery phase. It is recommended to use percentiles instead of averages whenever possible for a better interpretation.

Challenges
In this section, we discuss the major challenges that you may encounter while planning your POC. 
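Before turning to the individual challenges, the evaluation-phase advice to prefer percentiles over averages can be illustrated with a short sketch. The function name `summarize_latencies` and the sample latencies are hypothetical and not taken from the original post; the sketch only uses Python's standard `statistics` module and is meant as an illustration, not a prescribed evaluation method.

```python
import statistics


def summarize_latencies(latencies_ms):
    """Return the mean plus the p50, p90, and p99 of a list of query latencies."""
    # quantiles(n=100) returns 99 cut points; index k-1 holds the k-th percentile.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "mean": statistics.mean(latencies_ms),
        "p50": cuts[49],
        "p90": cuts[89],
        "p99": cuts[98],
    }


# Hypothetical sample: 19 fast queries and one slow outlier, in milliseconds.
sample = [100] * 19 + [5000]
summary = summarize_latencies(sample)
print(summary)
```

In this made-up sample, the mean (345 ms) describes neither the typical query (100 ms at p50 and p90) nor the slow tail (above 4,000 ms at p99), which is why percentiles tend to give a more interpretable picture when checking results against SLA-style success criteria.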
Scope You may face challenges during the discovery phase while defining the scope of the POC, especially in complex environments. You should focus on the crucial requirements and prioritized success criteria that need to be evaluated so you avoid ending up with a small migration project instead of a POC. In terms of technical content (such as data structures, transformation jobs, and reporting queries), make sure to identify and consider as little as possible of the content that will still provide you with all the necessary information at the end of the implementation phase in order to assess the defined success criteria. Additionally, document any assumptions you are making. Time A time period should be defined for any POC project to ensure it stays focused and achieves clear results. Without an established time frame, scope creep can occur as requirements shift and unnecessary features get added. This may lead to misleading evaluations about the technology or concept being tested. The duration set for the POC depends on factors like workload complexity and resource availability. If a period such as 3 weeks has been committed to already without accounting for these considerations, the scope and planned content should be scaled to feasibly fit that fixed time period. Cost Cloud services operate on a pay-as-you-go model, and estimating costs accurately can be challenging during a POC. Overspending or underestimating resource requirements can impact budget allocations. It’s important to carefully estimate the initial sizing of the Redshift cluster, monitor resource usage closely, and consider setting service limits along with AWS Budget alerts to avoid unexpected expenditures. Technical The team running the POC has to be ready for initial technical challenges, especially during environment setup, data ingestion, and performance testing. 
Each data warehouse technology has its own design and architecture, which sometimes requires some initial tuning at the data structure or query level. This is an expected challenge that needs to be considered in the implementation phase timeline. Having a technical enablement session beforehand can alleviate such hurdles. Amazon Redshift POC tools and features In this section, we discuss tools that you can adapt based on the specific requirements and nature of the POC being conducted. It’s essential to choose tools that align with the scope and technologies involved. AWS Analytics Automation Toolkit The AWS Analytics Automation Toolkit enables automatic provisioning and integration of not only Amazon Redshift, but database migration services like AWS Database Migration Service (AWS DMS), AWS Schema Conversion Tool (AWS SCT), and Apache JMeter. This toolkit is essential in most POCs because it automates the provisioning of infrastructure and setup of the necessary environment. AWS SCT The AWS SCT makes heterogeneous database migrations predictable, secure, and fast by automatically converting the majority of the database code and storage objects to a format that is compatible with the target database. Any objects that can’t be automatically converted are clearly marked so that they can be manually converted to complete the migration. In the context of a POC, the AWS SCT becomes crucial by streamlining and enhancing the efficiency of the schema conversion process from one database system to another. Given the time-sensitive nature of POCs, the AWS SCT automates the conversion process, facilitating planning, and estimation of time and efforts. Additionally, the AWS SCT plays a role in identifying potential compatibility issues, data mapping challenges, or other hurdles at an early stage of the process. Furthermore, the database migration assessment report summarizes all the action items for schemas that can’t be converted automatically to your target database. 
Getting started with AWS SCT is a straightforward process. Also, consider following the best practices for AWS SCT. Amazon Redshift auto-copy The Amazon Redshift auto-copy (preview) feature can automate data ingestion from Amazon Simple Storage Service (Amazon S3) to Amazon Redshift with a simple SQL command. COPY statements are invoked and start loading data when Amazon Redshift auto-copy detects new files in the specified S3 prefixes. This also makes sure that end-users have the latest data available in Amazon Redshift shortly after the source files are available. You can use this feature for the purpose of data ingestion throughout the POC. To learn more about ingesting from files located in Amazon S3 using a SQL command, refer to Simplify data ingestion from Amazon S3 to Amazon Redshift using auto-copy (preview). The post also shows you how to enable auto-copy using COPY jobs, how to monitor jobs, and considerations and best practices. Redshift Auto Loader The custom Redshift Auto Loader framework automatically creates schemas and tables in the target database and continuously loads data from Amazon S3 to Amazon Redshift. You can use this during the data ingestion phase of the POC. Deploying and setting up the Redshift Auto Loader framework to transfer files from Amazon S3 to Amazon Redshift is a straightforward process. For more information, refer to Migrate from Google BigQuery to Amazon Redshift using AWS Glue and Custom Auto Loader Framework. Apache JMeter Apache JMeter is an open-source load testing application written in Java that you can use to load test web applications, backend server applications, databases, and more. In a database context, it’s an extremely valuable tool for repeating benchmark tests in a consistent manner, simulating concurrency workloads, and scalability testing on different database configurations. 
When implementing your POC, benchmarking Amazon Redshift is often one of the main components of evaluation and a key source of insight into the price-performance of different Amazon Redshift configurations. With Apache JMeter, you can construct high-quality benchmark tests for Amazon Redshift. Workload Replicator If you are currently using Amazon Redshift and looking to replicate your existing production workload or isolate specific workloads in a POC, you can use the Workload Replicator to run them across different configurations of Redshift clusters (ra3.xlplus, ra3.4xl, ra3.16xl, serverless) for performance evaluation and comparison. This utility can mimic COPY and UNLOAD workloads and can run the transactions and queries in the same time interval as they’re run in the production cluster. However, it’s crucial to assess the limitations of the utility and AWS Identity and Access Management (IAM) security and compliance requirements. Node Configuration Comparison utility If you’re using Amazon Redshift and have stringent SLAs for query performance in your Amazon Redshift cluster, or you want to explore different Amazon Redshift configurations based on the price-performance of your workload, you can use the Amazon Redshift Node Configuration Comparison utility. This utility helps evaluate the performance of your queries using different Redshift cluster configurations in parallel and compares the end results to find the best cluster configuration that meets your needs. Similarly, if you’re already using Amazon Redshift and want to migrate from your existing DC2 or DS2 instances to RA3, you can refer to our recommendations on node count and type when upgrading. Before doing that, you can use this utility in your POC to evaluate the new cluster’s performance by replaying your past workloads; it integrates with the Workload Replicator utility to evaluate performance metrics for different Amazon Redshift configurations.
This utility functions in a fully automated manner and has similar limitations as the workload replicator. However, it requires full permissions across various services for the user running the AWS CloudFormation stack. Use cases You have the opportunity to explore various functionalities and aspects of Amazon Redshift by defining and selecting a business use case you want to validate during the POC. In this section, we discuss some specific use cases you can explore using a POC. Functionality evaluation Amazon Redshift consists of a set of functionalities and options that simplify data pipelines and effortlessly integrate with other services. You can use a POC to test and evaluate one or more of those capabilities before refactoring your data pipeline and implementing them in your ecosystem. Functionalities could be existing features or new ones such as zero-ETL integration, streaming ingestion, federated queries, or machine learning. Workload isolation You can use the data sharing feature of Amazon Redshift to achieve workload isolation across diverse analytics use cases and achieve business-critical SLAs without duplicating or moving the data. Amazon Redshift data sharing enables a producer cluster to share data objects with one or more consumer clusters, thereby eliminating data duplication. This facilitates collaboration across isolated clusters, allowing data to be shared for innovation and analytic services. Sharing can occur at various levels such as databases, schemas, tables, views, columns, and user-defined functions, offering fine-grained access control. It is recommended to use Workload Replicator for performance evaluation and comparison in a workload isolation POC. The following sample architectures explain workload isolation using data sharing. The first diagram illustrates the architecture before using data sharing. The following diagram illustrates the architecture with data sharing. 
Migrating to Amazon Redshift If you’re interested in migrating from your existing data warehouse platform to Amazon Redshift, you can try out Amazon Redshift by developing a POC on a selected business use case. In this type of POC, it is recommended to use the AWS Analytics Automation Toolkit for setting up the environment, auto-copy or Redshift Auto Loader for data ingestion, and AWS SCT for schema conversion. When the development is complete, you can perform performance testing using Apache JMeter, which provides data points to measure price-performance and compare results with your existing platform. The following diagram illustrates this process. Moving to Amazon Redshift Serverless You can migrate your unpredictable and variable workloads to Amazon Redshift Serverless, which enables you to scale as and when needed and pay as per usage, making your infrastructure scalable and cost-efficient. If you’re migrating your full workload from provisioned (DC2, RA3) to serverless, you can use the Node Configuration Comparison utility for performance evaluation. The following diagram illustrates this workflow. Conclusion In a competitive environment, conducting a successful proof of concept is a strategic imperative for businesses aiming to validate the feasibility and effectiveness of new solutions. Amazon Redshift provides you with better price-performance compared to other cloud-centered data warehouses, and a large list of features that help you modernize and optimize your data pipelines. For more details, see Amazon Redshift continues its price-performance leadership. With the process discussed in this post and by choosing the tools needed for your specific use case, you can accelerate the process of conducting a POC. This allows you to collect the data metrics that can help you understand the potential challenges, benefits, and implications of implementing the proposed solution on a larger scale. 
A POC provides essential data points that evaluate price-performance as well as feasibility, which plays a vital role in decision-making. About the Authors Ziad WALI is an Acceleration Lab Solutions Architect at Amazon Web Services. He has over 10 years of experience in databases and data warehousing, where he enjoys building reliable, scalable, and efficient solutions. Outside of work, he enjoys sports and spending time in nature. Omama Khurshid is an Acceleration Lab Solutions Architect at Amazon Web Services. She focuses on helping customers across various industries build reliable, scalable, and efficient solutions. Outside of work, she enjoys spending time with her family, watching movies, listening to music, and learning new technologies. Srikant Das is an Acceleration Lab Solutions Architect at Amazon Web Services. His expertise lies in constructing robust, scalable, and efficient solutions. Beyond the professional sphere, he finds joy in travel and shares his experiences through insightful blogging on social media platforms. View the full article
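The evaluation-phase advice in the Redshift POC post above, to prefer percentiles over averages, can be illustrated with a small sketch. The runtimes below are hypothetical sample values, and the nearest-rank method is just one common way to compute a percentile; neither comes from the original post.

```python
import math
from statistics import mean


def percentile(values, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical query runtimes (seconds) collected during a POC benchmark run.
# One slow outlier is included on purpose.
runtimes = [1.2, 1.1, 1.3, 1.2, 1.4, 1.1, 1.3, 1.2, 1.5, 30.0]

print("mean:", round(mean(runtimes), 2))
print("p50: ", percentile(runtimes, 50))
print("p90: ", percentile(runtimes, 90))
print("p99: ", percentile(runtimes, 99))
```

A single outlier pulls the mean well above what most queries experienced, while the median and the 90th percentile still describe the typical run and the tail shows up only in the highest percentile. That is why an SLA is usually easier to evaluate against percentiles.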

March 27
- poc
- databases

databases What is a Database? Everything You Need to Know

KDnuggets posted a topic in Databases, Data Engineering & Data Science

Unlocking Database Basics. View the full article

March 26

Choosing a suitable database for your startup: An overview of AlloyDB and Spanner

Google Cloud Platform posted a topic in Databases, Data Engineering & Data Science

In today's fast-paced business environment, startups need to leverage the power of the cloud to achieve scale, performance, and consistency for their apps. Google Cloud provides three popular cloud databases that enable reliable PostgreSQL: Spanner, AlloyDB and Cloud SQL. In this article, we will explore the features and benefits of these databases, focusing on AlloyDB and Spanner and how startups can use them — together or separately — to simplify infrastructure, reduce operational costs, and maximize performance. Spanner: a scalable and globally distributed database Spanner is a fully managed database for both relational and non-relational workloads that is designed to scale horizontally across multiple regions and continents. Combining strong consistency, high availability, and low latency, it stands as the ideal solution for mission-critical applications demanding high throughput and rapid response times. Spanner provides a PostgreSQL interface, ensuring your schemas and queries are portable to other environments within or outside of Google Cloud. This also allows developers to leverage many of the tools and techniques they already know, flattening the learning curve when transitioning to Spanner. One of the key features of Spanner is its ability to replicate data across multiple regions while maintaining strong, ACID (atomicity, consistency, isolation, durability) transactions and a familiar SQL interface. On top of that, Spanner offers schema changes without downtime, fully automatic data replication, and data redundancy. As a result, developers can build applications that operate seamlessly across multiple regions without worrying about data consistency issues, regional failures, or planned maintenance. When is Spanner the right fit? 
Spanner also offers automatic horizontal scaling, from an inexpensive slice of one compute node to thousands of nodes (see the graph below), making it easy for a startup to increase or decrease their query and data capacity based on their workload demands. Spanner allows you to resize elastically without downtime or other disruption, so you can better align your usage with the workload. As a result, startups save money by paying only for the resources needed. With legacy scale-up databases, by contrast, changing capacity typically involves 1) standing up new infrastructure, 2) migrating the schema and all of the data, and 3) a big-bang cutover in coordination with downstream applications. Spanner allows you to adjust both read and write capacity on the fly with no downtime. A built-in managed autoscaler adjusts the capacity for you based on signals, such as CPU usage. Spanner scales linearly from tiny workloads (100 processing units, the equivalent of 0.1 node, and 400GB of data) to thousands of nodes, handling petabytes of data and millions of queries per second. Recent improvements have raised the storage capacity to 10TB per node and increased the throughput by 50%. For example, Niantic runs 5,000 node instances handling the traffic for Pokémon GO. This elasticity saves you money, reduces risks, and provides scale insurance. Even if you aren’t there today, rest assured that with Spanner you can grow to Niantic or Gmail-sized workloads without disruptive re-architecture. Start small and scale with Spanner AlloyDB: A cloud-native and managed PostgreSQL database Google Cloud AlloyDB for PostgreSQL is a fully managed, PostgreSQL-compatible database service that's designed for your most demanding workloads, including 1) transactional, 2) analytical, and 3) hybrid transactional and analytical processing (HTAP).
In Google’s performance tests, AlloyDB delivers up to 100X faster analytical queries than standard PostgreSQL and is more than 2X faster than Amazon’s comparable PostgreSQL-compatible service for transactional workloads. AlloyDB also offers a number of features designed to simplify application development. For example, it supports standard PostgreSQL syntax and extensions, making it easy to write queries and manipulate data. Another important consideration: AlloyDB may be the better choice if you’re planning to build GenAI apps, thanks to AlloyDB AI, a built-in set of capabilities for working with vectors, models, and data. AlloyDB uses columnar storage for its columnar engine, which is designed to accelerate analytical queries. The columnar engine stores frequently queried data in an in-memory, columnar format, which can significantly improve the performance of these queries. Intelligent, workload-aware dynamic data organization leverages both row-based and column-based formats. Multiple layers of cache ensure excellent price-performance. Choosing the right database for your startup When it comes to choosing the right database for your startup, there are several factors to consider. First and foremost, you need to consider your application's requirements in terms of performance, availability, global consistency, and scalability. Are you building a consumer app for millions of concurrent users? Maybe a corporate app that will be used for real-time analytics? Each database has its own strengths. 
Feature | AlloyDB | Spanner
Type | Cloud-native, managed PostgreSQL database | Globally distributed scalable database
Supported engines | PostgreSQL | PostgreSQL, GoogleSQL
Security | Data encryption at rest and in transit | Data encryption at rest and in transit
Data residency | Single region by default, multi-region available | Multi-region by default
Best for | Hybrid transactional & analytical workloads, AI applications | Mission-critical apps with high data consistency & global reach (multi-writer across regions)

Spanner is the ideal choice for mission-critical applications that demand high scalability, unwavering consistency, and 99.999% SLA availability. Teams that are evaluating sharding or active-active configurations to work around scaling limitations can benefit from Spanner’s built-in, hands-free operations. Spanner empowers development teams with its familiar SQL interface (including PostgreSQL dialect support) for seamless large-scale data processing. This ensures portability and flexibility, and simplifies use cases requiring high write scaling, global consistency, and adaptability to variable traffic. AlloyDB is a good choice for applications that need a high-performance, reliable, and scalable database with built-in support for advanced analytics and full PostgreSQL compatibility. AlloyDB supports real-time analytics applications because of its automatic data placement across tiers (e.g., buffer cache, ultra-fast cache, and block storage), and its ability to process up to 64 TiB of data per cluster in real time. AlloyDB is also reliable and offers a 99.99% SLA, including maintenance. Another option to consider is Cloud SQL, an enterprise-ready, fully managed relational database service that offers PostgreSQL, MySQL, and SQL Server engines.
It is user-friendly: it pairs a straightforward user interface with the familiar SQL interfaces of PostgreSQL, MySQL, and SQL Server, and it takes only minutes to get your database up and running. Another important factor to keep in mind is your team's expertise and familiarity with different database technologies. If your team is already familiar with relational databases and the Google Cloud ecosystem, then Spanner may be the easier choice. If your team is more comfortable with PostgreSQL, then AlloyDB may be the better fit. Conclusion In conclusion, Spanner and AlloyDB are two powerful databases that offer different benefits and features for startups and can be used together or separately, depending on your needs. Together, AlloyDB and Spanner are a dynamic duo with which you can achieve performance and scalability based on Google’s innovations, delivering both responsive user interactions and robust, scalable back-end functionalities. With PostgreSQL and Google Cloud as the unifying threads, both services can co-exist seamlessly, forming a powerful combination for any application demanding high performance and unwavering reliability. For example, Character.ai uses AlloyDB and Spanner together in the same app that is at the core of their business: AlloyDB for powering the interactive experience: At the user-facing front-end, AlloyDB shines as the engine behind quick, responsive interactions. Its speed and performance ensure a smooth and intuitive user experience, critical for engaging with the AI model. Spanner as the backbone of history and workflow: Behind the scenes, Spanner maintains the complete history and workflow data integral to the AI integration. Its scale and availability guarantee seamless data management, regardless of load or complexity. Both Spanner and AlloyDB operate within the familiar PostgreSQL ecosystem, offering a consistent and unified development experience.
This empowers developers to leverage their existing skills and knowledge, accelerating integration and workflow. Additionally, the Google Cloud Platform provides a robust and secure environment for both services, ensuring seamless data management and operational efficiency. View the full article
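As a rough aid for the Spanner sizing figures quoted in the post above (100 processing units equal 0.1 node, and up to 10TB of storage per node), here is a minimal sketch of the arithmetic. The function names are hypothetical, and scaling the per-node storage figure linearly down to fractional nodes is an assumption of this sketch, not a statement from the post.

```python
# From the article: 100 processing units are the equivalent of 0.1 node,
# so one node corresponds to 1,000 processing units.
PROCESSING_UNITS_PER_NODE = 1000


def nodes_from_processing_units(processing_units):
    """Convert Spanner processing units to the equivalent number of nodes."""
    return processing_units / PROCESSING_UNITS_PER_NODE


def storage_limit_tb(processing_units, tb_per_node=10):
    """Estimate the storage ceiling in TB for a given compute size.

    tb_per_node defaults to the 10TB-per-node figure quoted in the article;
    applying it proportionally to fractional nodes is an assumption.
    """
    return processing_units * tb_per_node / PROCESSING_UNITS_PER_NODE


for pu in (100, 1000, 5_000_000):
    print(pu, "PU =", nodes_from_processing_units(pu), "nodes, up to",
          storage_limit_tb(pu), "TB")
```

The last row corresponds to the 5,000-node scale the post attributes to Niantic. Treat the storage numbers as an upper-bound estimate under the stated assumption and check current limits before sizing a real instance.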

March 22
- startups
- databases
- (and 2 more)
  Tagged with:
  - startups
  - databases
  - alloydb
  - spanner

dora metrics Database Observability Extends DORA Metrics and More to Database DevOps

Devops.com posted a topic in DevOps & SRE General Discussion

Database observability unlocks DORA metrics along with other indicators that matter to your DevOps, application, database and IT teams. View the full article

courses 5 Free University Courses to Learn Databases and SQL

KDnuggets posted a topic in Databases, Data Engineering & Data Science

Looking to learn SQL and databases to level up your data science skills? Learn SQL, database internals, and much more with these free university courses. View the full article

March 5
- university
- training
- (and 4 more)
  Tagged with:
  - university
  - training
  - learning
  - databases
  - sql
  - free

mongodb MongoDB in C++

Linux Hint posted a topic in Databases, Data Engineering & Data Science

This article is about using MongoDB from C++. MongoDB is a widely used database that stores data as JSON-like documents. It is an open-source, document-oriented NoSQL database that offers a flexible approach to storing and managing records. With MongoDB in C++, the user can insert, delete, and update documents. Let’s learn how the MongoDB driver is installed and used in C++ to manage the database of any system, with examples to aid understanding. How to Install the MongoDB Driver in C++ We will learn how to install the Mongo driver in C++. The official Mongo driver for C++ is the MongoDB C++ driver (mongocxx), which can be installed on any system that has a C++ environment. We must install the MongoDB driver library and connect the database to the C++ project using a URI string. The MongoDB driver has built-in connection management that connects to the database on user request and reconnects if the connection is lost. The MongoDB driver also supports authentication and authorization of the requests that C++ sends to the database. Create a MongoDB Database in the System Install MongoDB on the system. After installing MongoDB, open the bin folder inside the MongoDB folder under “C:\Program Files”. Copy the path of the bin folder and add it to the PATH environment variable in Windows so that the MongoDB commands are available. Ensure that MongoDB Compass, which provides the graphical user interface, is installed. We can see the address of this database, and we can access this database through the local host whose port number is “27017”. Open the command prompt in your system. Run the command -> mongo --version to show the version of MongoDB. Create a New Database in MongoDB Using Cmd We can easily create a new database in MongoDB by just running a command in the cmd of our system.
We run the following command: > use mydb Show All Running Databases in MongoDB To list the databases in MongoDB, we run the following command in cmd: > show dbs To launch the MongoDB server, we just need to follow and fulfill the requirements on the terminal. We can also get the collection of “Mongo” in the current default database which is “test” with records already in it. Only databases that contain some data or records are listed by this command. Example: Connecting MongoDB in C++ Here, we connect to this NoSQL MongoDB database from C++. We first need to connect to your system’s MongoDB server. Make sure that the C++ setup and MongoDB are active in the system and that the MongoDB C++ driver library is installed. We include the essential headers of the MongoDB C++ driver in our code: “mongocxx/client.hpp” and “mongocxx/instance.hpp”. From the driver library, we create a “client” object with the URI “mongodb://localhost:27017”. If this URI is correct, we display the message “connected to MongoDB”. A MongoDB server that runs locally listens on port “27017” by default, as displayed in the previous MongoDB screenshot. Maintain the CRUD in MongoDB CRUD covers the main operations that are needed in a database management system, and almost everything we do with MongoDB from C++ relies on it. In a database, CRUD means creating, reading, updating, and deleting records. Insert the Data in the MongoDB Database C++ We can easily add records to any new or existing database. We can also create new collections in the database from C++ by including the essential MongoDB headers to connect with the database. After that, we write the connection code in C++ and then write the insert query in C++ to insert the records in the database.
MongoDB provides a C++ driver, the “MongoDB C++ driver”, and its library that handles the database operations is named “mongocxx”. Using this library, we create an instance of the C++ driver. Using the insert_one() method, we add the data to the NoSQL database. Delete the Data from the Database At every step, make sure that the MongoDB connection is established and working. We access the MongoDB database using the “mongocxx” library and the methods it provides to delete data from the database in C++. We can access the database and its collection easily using the mongocxx types: “mongocxx::database” with the “db” alias and “mongocxx::collection” with the “colle” alias. After that, create a filter that matches the document that you want to delete, which specifies the criteria for deletion. Pass the filter to a delete method such as delete_one() to remove the record from the database. Update the Records in the Database An update means that we change existing records in the database. We can easily update a record in the database using an update method such as update_one() that is defined on the collection in the MongoDB C++ driver. Conclusion At the end of the article, we can say that the usage of NoSQL MongoDB is increasing rapidly because of its efficiency and performance. MongoDB has developed the MongoDB C++ driver so that C++ programs can work with the database. With its help, users can easily add, delete, update, and show documents, collections, and databases. The driver handles the interaction with C++ through its special-purpose libraries. Hopefully, this article is helpful and easy to follow. Remember to choose suitable techniques and databases when building new programs and applications to make the system more reliable. View the full article

amazon rds Your MySQL 5.7 and PostgreSQL 11 databases will be automatically enrolled into Amazon RDS Extended Support

Amazon Web Services posted a topic in Databases, Data Engineering & Data Science

Today, we are announcing that your MySQL 5.7 and PostgreSQL 11 database instances running on Amazon Aurora and Amazon Relational Database Service (Amazon RDS) will be automatically enrolled into Amazon RDS Extended Support starting on February 29, 2024. This will help avoid unplanned downtime and compatibility issues that can arise with automatically upgrading to a new major version. This provides you with more control over when you want to upgrade the major version of your database. This automatic enrollment may mean that you will experience higher charges when RDS Extended Support begins. You can avoid these charges by upgrading your database to a newer DB version before the start of RDS Extended Support. What is Amazon RDS Extended Support? In September 2023, we announced Amazon RDS Extended Support, which allows you to continue running your database on a major engine version past its RDS end of standard support date on Amazon Aurora or Amazon RDS at an additional cost. Until community end of life (EoL), the MySQL and PostgreSQL open source communities manage common vulnerabilities and exposures (CVE) identification, patch generation, and bug fixes for the respective engines. The communities release a new minor version every quarter containing these security patches and bug fixes until the database major version reaches community end of life. After the community end of life date, CVE patches or bug fixes are no longer available and the community considers those engines unsupported. For example, MySQL 5.7 and PostgreSQL 11 are no longer supported by the communities as of October and November 2023 respectively. We are grateful to the communities for their continued support of these major versions and a transparent process and timeline for transitioning to the newest major version. With RDS Extended Support, Amazon Aurora and RDS takes on engineering the critical CVE patches and bug fixes for up to three years beyond a major version’s community EoL. 
For those 3 years, Amazon Aurora and RDS will work to identify CVEs and bugs in the engine, generate patches and release them to you as quickly as possible. Under RDS Extended Support, we will continue to offer support, such that the open source community’s end of support for an engine’s major version does not leave your applications exposed to critical security vulnerabilities or unresolved bugs. You might wonder why we are charging for RDS Extended Support rather than providing it as part of the RDS service. It’s because the engineering work for maintaining security and functionality of community EoL engines requires AWS to invest developer resources for critical CVE patches and bug fixes. This is why RDS Extended Support charges only customers who need the additional flexibility to stay on a version past community EoL. RDS Extended Support may be useful to help you meet your business requirements for your applications if you have particular dependencies on a specific MySQL or PostgreSQL major version, such as compatibility with certain plugins or custom features. If you are currently running on-premises database servers or self-managed Amazon Elastic Compute Cloud (Amazon EC2) instances, you can migrate to Amazon Aurora MySQL-Compatible Edition, Amazon Aurora PostgreSQL-Compatible Edition, Amazon RDS for MySQL, or Amazon RDS for PostgreSQL beyond the community EoL date, and continue to use these versions with RDS Extended Support while benefiting from a managed service. If you need to migrate many databases, you can also utilize RDS Extended Support to split your migration into phases, ensuring a smooth transition without overwhelming IT resources. In 2024, RDS Extended Support will be available for RDS for MySQL major versions 5.7 and higher, RDS for PostgreSQL major versions 11 and higher, Aurora MySQL-compatible version 2 and higher, and Aurora PostgreSQL-compatible version 11 and higher.
For a list of all future supported versions, see Supported MySQL major versions on Amazon RDS and Amazon Aurora major versions in the AWS documentation. Community major version RDS/Aurora version Community end of life date End of RDS standard support date Start of RDS Extended Support pricing End of RDS Extended Support MySQL 5.7 RDS for MySQL 5.7 October 2023 February 29, 2024 March 1, 2024 February 28, 2027 Aurora MySQL 2 October 31, 2024 December 1, 2024 PostgreSQL 11 RDS for PostgreSQL 11 November 2023 March 31, 2024 April 1, 2024 March 31, 2027 Aurora PostgreSQL 11 February 29, 2024 RDS Extended Support is priced per vCPU per hour. Learn more about pricing details and timelines for RDS Extended Support at Amazon Aurora pricing, RDS for MySQL pricing, and RDS for PostgreSQL pricing. For more information, see the blog posts about Amazon RDS Extended Support for MySQL and PostgreSQL databases in the AWS Database Blog. Why are we automatically enrolling all databases to Amazon RDS Extended Support? We had originally informed you that RDS Extended Support would provide the opt-in APIs and console features in December 2023. In that announcement, we said that if you decided not to opt your database in to RDS Extended Support, it would automatically upgrade to a newer engine version starting on March 1, 2024. For example, you would be upgraded from Aurora MySQL 2 or RDS for MySQL 5.7 to Aurora MySQL 3 or RDS for MySQL 8.0 and from Aurora PostgreSQL 11 or RDS for PostgreSQL 11 to Aurora PostgreSQL 15 and RDS for PostgreSQL 15, respectively. However, we heard lots of feedback from customers that these automatic upgrades may cause their applications to experience breaking changes and other unpredictable behavior between major versions of community DB engines. For example, an unplanned major version upgrade could introduce compatibility issues or downtime if applications are not ready for MySQL 8.0 or PostgreSQL 15. 
Automatic enrollment in RDS Extended Support gives you additional time and more control to organize, plan, and test your database upgrades on your own timeline, providing you flexibility on when to transition to new major versions while continuing to receive critical security and bug fixes from AWS. If you’re worried about increased costs due to automatic enrollment in RDS Extended Support, you can avoid RDS Extended Support and associated charges by upgrading before the end of RDS standard support.

How to upgrade your database to avoid RDS Extended Support charges

Although RDS Extended Support helps you schedule your upgrade on your own timeline, sticking with older versions indefinitely means missing out on the best price-performance for your database workload and incurring additional costs from RDS Extended Support.

MySQL 8.0 on Aurora MySQL, also known as Aurora MySQL 3, unlocks support for popular Aurora features, such as Global Database, Amazon RDS Proxy, Performance Insights, Parallel Query, and Serverless v2 deployments. Upgrading to RDS for MySQL 8.0 provides up to three times higher performance versus MySQL 5.7, along with features such as Multi-AZ cluster deployments, Optimized Reads, Optimized Writes, and support for AWS Graviton2 and Graviton3-based instances.

PostgreSQL 15 on Aurora PostgreSQL supports the Aurora I/O Optimized configuration, Aurora Serverless v2, Babelfish for Aurora PostgreSQL, pgvector extension, Trusted Language Extensions for PostgreSQL (TLE), and AWS Graviton3-based instances as well as community enhancements. Upgrading to RDS for PostgreSQL 15 provides features such as Multi-AZ DB cluster deployments, RDS Optimized Reads, HypoPG extension, pgvector extension, TLEs for PostgreSQL, and AWS Graviton3-based instances.

Major version upgrades may make database changes that are not backward-compatible with existing applications. You should manually modify your database instance to upgrade to the major version.
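A manual major version upgrade of an RDS DB instance can be sketched in Python as follows. This is a minimal sketch, not the post's own procedure: it assumes the boto3 RDS client's `modify_db_instance` call, and the helper names, instance identifier, and target version are hypothetical. Aurora clusters are modified at the cluster level and are not covered here.

```python
def major_upgrade_request(instance_id: str, target_version: str,
                          apply_immediately: bool = False) -> dict:
    """Build keyword arguments for the RDS ModifyDBInstance API call."""
    return {
        "DBInstanceIdentifier": instance_id,
        "EngineVersion": target_version,
        # Major version changes are rejected unless this flag is set.
        "AllowMajorVersionUpgrade": True,
        # False defers the change to the next maintenance window.
        "ApplyImmediately": apply_immediately,
    }


def upgrade_instance(rds_client, instance_id: str, target_version: str) -> dict:
    """Send the request with any client exposing modify_db_instance,
    for example boto3.client("rds")."""
    return rds_client.modify_db_instance(
        **major_upgrade_request(instance_id, target_version)
    )
```

Keeping the request builder separate from the call makes it possible to review or log the parameters against a non-production instance before anything is sent.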
It is strongly recommended that you thoroughly test any major version upgrade on non-production instances before applying it to production to ensure compatibility with your applications. For more information about an in-place upgrade from MySQL 5.7 to 8.0, see the incompatibilities between the two versions, Aurora MySQL in-place major version upgrade, and RDS for MySQL upgrades in the AWS documentation. For the in-place upgrade from PostgreSQL 11 to 15, you can use the pg_upgrade method.

To minimize downtime during upgrades, we recommend using Fully Managed Blue/Green Deployments in Amazon Aurora and Amazon RDS. With just a few steps, you can use Amazon RDS Blue/Green Deployments to create a separate, synchronized, fully managed staging environment that mirrors the production environment. This involves launching a parallel green environment whose replicas run a higher engine version than your lower-version production databases. After validating the green environment, you can shift traffic over to it. Then, the blue environment can be decommissioned. To learn more, see Blue/Green Deployments for Aurora MySQL and Aurora PostgreSQL or Blue/Green Deployments for RDS for MySQL and RDS for PostgreSQL in the AWS documentation. In most cases, Blue/Green Deployments are the best option to reduce downtime, except for limited cases in Amazon Aurora or Amazon RDS.

For more information on performing a major version upgrade in each DB engine, see the following guides in the AWS documentation.

- Upgrading the MySQL DB engine for Amazon RDS
- Upgrading the PostgreSQL DB engine for Amazon RDS
- Upgrading the Amazon Aurora MySQL DB cluster
- Upgrading Amazon Aurora PostgreSQL DB clusters

Now available

Amazon RDS Extended Support is now available for all customers running Amazon Aurora and Amazon RDS instances using MySQL 5.7, PostgreSQL 11, and higher major versions in AWS Regions, including the AWS GovCloud (US) Regions, beyond the end of the standard support date in 2024.
You don’t need to opt in to RDS Extended Support, and you get the flexibility to upgrade your databases and continued support for up to 3 years. Learn more about RDS Extended Support in the Amazon Aurora User Guide and the Amazon RDS User Guide. For pricing details and timelines for RDS Extended Support, see Amazon Aurora pricing, RDS for MySQL pricing, and RDS for PostgreSQL pricing. Please send feedback to AWS re:Post for Amazon RDS and Amazon Aurora or through your usual AWS Support contacts. — Channy View the full article
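The Blue/Green workflow described in the post above (create a green environment, validate it, shift traffic) can be sketched in Python. This is an illustrative sketch under stated assumptions: it relies on the boto3 RDS client methods `create_blue_green_deployment` and `switchover_blue_green_deployment`, while the function name `blue_green_upgrade`, the `validate` callback, and all argument values are hypothetical.

```python
from typing import Callable


def blue_green_upgrade(rds_client, source_arn: str, target_version: str,
                       name: str, validate: Callable[[str], bool]) -> str:
    """Create a green environment, validate it, then switch over.

    rds_client is any object with the boto3 RDS client methods used below.
    validate is a caller-supplied check; it receives the deployment ID.
    Returns the deployment identifier, or raises if validation fails.
    """
    created = rds_client.create_blue_green_deployment(
        BlueGreenDeploymentName=name,
        Source=source_arn,                   # ARN of the blue (production) database
        TargetEngineVersion=target_version,  # engine version for the green side
    )
    deployment_id = created["BlueGreenDeployment"]["BlueGreenDeploymentIdentifier"]

    # In real use, poll describe_blue_green_deployments until the
    # deployment reports it is available before validating.
    if not validate(deployment_id):
        raise RuntimeError("green environment failed validation; not switching over")

    rds_client.switchover_blue_green_deployment(
        BlueGreenDeploymentIdentifier=deployment_id,
    )
    return deployment_id
```

Decommissioning the blue environment is deliberately left out of the sketch, so the old environment remains available until someone has confirmed the switchover.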

December 21, 2023
- mysql
- postgresql
- (and 1 more)

AWS announces Amazon Redshift integration with Visual Studio Code

Amazon Web Services posted a topic in Databases, Data Engineering & Data Science

AWS announces support for Amazon Redshift in Visual Studio Code (VS Code), a free and open-source code editor. The integration enables Amazon Redshift customers to author and run their SQL queries in a notebook interface and to view the schema objects in their Redshift data warehouses. View the full article

October 17, 2023
- redshift
- databases
- (and 1 more)

Getting to know Systems insights, a simplified database system monitoring tool

Google Cloud Platform posted a topic in Logging, Monitoring & Observability

Getting to know Systems insights, a simplified database system monitoring tool. View the full article

October 13, 2023
- databases
- gcp
- (and 1 more)
  Tagged with:
  - databases
  - gcp
  - monitoring

Google Cloud Next for data professionals: analytics, databases and business intelligence

Google Cloud Platform posted a topic in Databases, Data Engineering & Data Science

Google Cloud Next kicks off tomorrow, and we’ve prepared a wealth of content (keynotes, customer panels, technical breakout sessions) designed for data professionals. If you haven’t already, now is the perfect time to register and build out your schedule. Here’s a sampling of data-focused breakout sessions:

1. ANA204: What's next for data analysts and data scientists
   Join this session to learn how Google's Data Cloud can transform your decision making and turn data into action by operationalizing Data Analytics and AI. Google Cloud brings together Google's most advanced Data and AI technology to help you train, deploy, and manage ML faster at scale. You will learn about the latest product innovations for BigQuery and Vertex AI to bring intelligence everywhere to analyze and activate your data. You will also hear from industry leading organizations who have realized tangible value with data analytics and AI using Google Cloud.

2. DSN100: What's next for data engineers
   Organizations are facing increased pressure to deliver new, transformative user experiences in an always-on, global economy. Learn how Google’s data cloud unifies your data across analytical and transactional systems for increased agility and simplicity. You'll also hear about the latest product innovations across Spanner, AlloyDB, Cloud SQL and BigQuery.

3. ANA101: What's new in BigQuery
   In the new digital-first era, data analytics continues to be at the core of driving differentiation and innovation for businesses. In this session, you’ll learn how BigQuery is fueling transformations and helping organizations build data ecosystems. You’ll hear about the latest product announcements, upcoming innovations, and strategic roadmap.

4. ANA100: What's new in Looker and Data Studio
   Business intelligence (BI) is more than dashboards and reports, and we make it easy to deliver insights to your users and customers in the places where it’ll make the most difference. In this session, we’ll discuss the future of our BI products, as well as go through recent launches and the roadmap for Looker and Google Data Studio. Hear how you can use both products, today and in the future, to get insights from your data, including self-service visualization, modeling of data, and embedded analytics.

5. ANA102: So long, silos: How to simplify data analytics across cloud environments
   Data often ends up in distributed environments like on-premises data centers and cloud service providers, making it incredibly difficult to get 360-degree business insights. In this session, we’ll share how organizations can get a complete view of their data across environments through a single pane of glass without building huge data pipelines. You’ll learn directly from Accenture and L’Oréal about their cross-cloud analytics journeys and how they overcame challenges like data silos and duplication.

6. ANA104: How Boeing overcame their on-premises implementation challenges with data & AI
   Learn how leading aerospace company Boeing transformed its data operations by migrating hundreds of applications across multiple business groups and aerospace products to Google Cloud. This session will explore the use of data analytics, AI, and machine learning to design a data operating system that addresses the complexity and challenges of traditional on-premises implementations to take advantage of the scalability and flexibility of the cloud.

7. ANA106: How leading organizations are making open source their super power
   Open source is no longer a separate corner of the data infrastructure. Instead, it needs to be integrated into the rest of your data platform. Join this session to learn how Walmart uses data to drive innovation and has built one of the largest hybrid clouds in the world, leveraging the best of cloud-native and open source technologies. Hear from Anil Madan, Corporate Vice President of Data Platform at Walmart, about the key principles behind their platform architecture and his advice to others looking to undertake a similar journey.

Build your data playlist today

One of the coolest things about the Next ’22 website is the ability to create your own playlist, and share it with people. To explore the full catalog of breakout sessions and labs designed for data scientists and engineers, check out the Analyze and Design tracks in the Next ’22 Catalog.

October 10, 2022
- google cloud next
- gcp
- (and 3 more)
  Tagged with:
  - google cloud next
  - gcp
  - bi
  - databases
  - analytics

Announcing new digital curriculum: Moving to Managed Databases on AWS

Amazon Web Services posted a topic in Amazon Web Services

This free new digital training curriculum contains modules that explain the benefits of and process for moving from self-managed databases to fully-managed database solutions in the cloud. The four-hour fundamental curriculum includes eight self-paced courses with video demonstrations and is designed for data platform engineers, database developers, and solutions architects. View the full article
