Data Engineering & Data Science
Data Engineering
Data Pipelines (ETL/ELT)
Big Data Technologies
Cloud Computing for Data
Data Governance & Quality
Data Science
Machine Learning (ML)
Statistical Analysis
Data Visualization
Natural Language Processing (NLP)
1,046 topics in this forum
-
Data is the fuel for AI, and organizations are racing to leverage enterprise data to build AI agents, intelligent search, and AI-powered analytics for productivity, deeper insights, and a competitive edge. To power their data clouds, tens of thousands of organizations already choose BigQuery and its integrated AI capabilities. This decade requires AI-native, multimodal, and agentic data-to-AI platforms, with BigQuery leading the way as the autonomous data-to-AI platform. Finally, we have a platform that infuses AI, makes unstructured data a first-class citizen, accelerates open lakehouses and embeds governance... View the full article
-
- 0 replies
- 32 views
-
-
For decades, businesses have wrestled with unlocking the true potential of their data for real-time operations. Bigtable, Google Cloud's pioneering NoSQL database, has been the engine behind massive-scale, low-latency applications that operate at a global scale. It was purpose-built for the challenges faced in real-time applications, and remains a key piece of Google infrastructure, including YouTube and Ads. This week at Google Cloud Next, we announced continuous materialized views, an expansion of Bigtable’s SQL capabilities. Bigtable SQL and continuous materialized views enable users to build fully managed, real-time application backends using familiar SQL syntax, incl…
-
- 0 replies
- 25 views
-
-
Today, we are announcing the Data Architect learning pathway, a dedicated learning track that equips data architects with the required resources and skills for success. View the full article
-
- 0 replies
- 33 views
-
-
Access to high-quality, real-world data is crucial for developing effective machine learning models. However, when this data contains sensitive information, organizations face a significant hurdle... View the full article
-
- 0 replies
- 27 views
-
-
Databricks Secures Google Cloud Technology Partner of the Year Award for Data & Analytics - Smart Analytics! We’re excited to announce that Databricks has been... View the full article
-
- 0 replies
- 141 views
-
-
-
Summary: LLMs have revolutionized software development by increasing the productivity of programmers. However, despite off-the-shelf LLMs being trained on a significant amount of code, they are not... View the full article
-
- 0 replies
- 75 views
-
-
In modern data architectures, Apache Iceberg has emerged as a popular table format for data lakes, offering key features including ACID transactions and concurrent write support. Although these capabilities are powerful, implementing them effectively in production environments presents unique challenges that require careful consideration. Consider a common scenario: A streaming pipeline continuously writes data to an Iceberg table while scheduled maintenance jobs perform compaction operations. Although Iceberg provides built-in mechanisms to handle concurrent writes, certain conflict scenarios—such as between streaming updates and compaction operations—can lead to transac…
-
- 0 replies
- 58 views
-
-
Learn how to strike the right balance between real-time and warehouse-gated customer data architecture. View the full article
-
- 0 replies
- 58 views
-
-
At Databricks, we help our customers solve their problems by leveraging data and AI. To pursue this mission, we are continuing to expand our presence... View the full article
-
- 0 replies
- 69 views
-
-
-
Learn how Zoopla transformed real estate experiences in the UK with data-driven personalization and RudderStack's customer data infrastructure. View the full article
-
- 0 replies
- 53 views
-
-
Today, AWS announces the general availability of AWS Glue G.4X and G.8X workers in the US West (N. California), Asia Pacific (Seoul), Asia Pacific (Mumbai), Europe (London), Europe (Spain), and South America (São Paulo) AWS regions. Glue G.4X and G.8X workers enable you to run your most demanding serverless data integration workloads in these additional regions. AWS Glue is a serverless, scalable data integration service that makes it simple to discover, prepare, move, and integrate data from multiple sources. AWS Glue G.4X and G.8X workers provide higher compute, memory, and storage resources than current Glue workers. These new types of workers help you scale and r…
-
- 0 replies
- 93 views
-
-
The distinctions and intersections between Data Science, Machine Learning, and Artificial Intelligence can be complex and controversial. View the full article
-
- 0 replies
- 26 views
-
-
Are you a startup building core, customer-facing B2B products on Databricks? Then we have a Challenge for you! On the heels of our Generative AI... View the full article
-
- 0 replies
- 115 views
-
-
With today’s launch, AWS Clean Rooms provides additional privacy-enhancing controls to support aggregation and list analysis rules using the Spark analytics engine. Using AWS Clean Rooms Spark SQL, you and your partners can now manage how your data is used with aggregation, list, and custom analysis rules, running SQL queries with configurable resources based on your performance, scale, and cost requirements. For example, advertisers can use list analysis rules to create targeted audience segments from collective advertiser and publisher data sets without sharing the raw data used to create the segments. Similarly, publishers and their partners can run media planning a…
-
- 0 replies
- 59 views
-
-
Since our launch on Google Cloud Platform (GCP) in 2021, Databricks on Google Cloud has provided more than 1,500 joint customers with a tightly integrated... View the full article
-
- 0 replies
- 111 views
-
-
We’re excited to announce the General Availability of Lakeflow Connect for Salesforce and Workday. Lakeflow Connect introduces no-code ingestion connectors for popular SaaS applications, databases,... View the full article
-
- 0 replies
- 88 views
-
-
Introduction: Game developers have always looked to build ongoing relationships with their players to maximize the play they bring to the world, and the success... View the full article
-
- 0 replies
- 121 views
-
-
Understanding GraphRAG: What is a Knowledge Graph? To understand why one may use a Knowledge Graph (KG) instead of another structured data representation, it’s important... View the full article
-
- 0 replies
- 76 views
-
-
Learn about the importance of building a strong data foundation in the Starter Stage of the data maturity journey. View the full article
-
- 0 replies
- 46 views
-
-
Discover how the Canadian Football League transformed its fan engagement strategy by unifying ticketing, e-commerce, and fan data with RudderStack. View the full article
-
- 0 replies
- 40 views
-
-
As more and more organizations embrace analytics, a wider range of problems are being brought forward to be solved. While data science teams are often... View the full article
-
- 0 replies
- 88 views
-
-
We’re excited to announce the Public Preview of the Microsoft Power BI task type in Databricks Workflows, available on Azure, AWS, and GCP. With this... View the full article
-
- 0 replies
- 80 views
-
-
Within the big data and analytics space there are two names at the forefront of conversation: Apache Spark and Databricks. While they’re closely related, they serve very different purposes in the data ecosystem. Understanding their core differences is critical for architects, developers, and data engineers looking to build scalable, high-performance data solutions in the cloud. […] The article Databricks vs Apache Spark: Key Differences and When to Use Each was originally published on Build5Nines. View the full article
-
- 0 replies
- 178 views
-
-
Willis Towers Watson (WTW) is a multinational company that provides a wide range of services in commercial insurance brokerage, risk management, employee benefits, and actuarial... View the full article
-
- 0 replies
- 84 views
-
-
Qwen models, developed by Alibaba, have shown strong performance in both code completion and instruction tasks. In this blog, we’ll show how you can register... View the full article
-
- 0 replies
- 87 views
-
-
Databricks enables organizations to securely share data, AI models, and analytics across teams, partners, and platforms without duplication or vendor lock-in. With Delta Sharing, Databricks... View the full article
-
- 0 replies
- 84 views
-
-
Last year, Databricks introduced Databricks Apps, completing its suite of tools that allows users to create and deploy applications directly on the Databricks Platform. With... View the full article
-
- 0 replies
- 88 views
-
-
We’re excited to announce that Anthropic Claude 3.7 Sonnet is now natively available in Databricks across AWS, Azure, and GCP. For the first time, you... View the full article
-
- 0 replies
- 89 views
-
-
Prisma Cloud is the leading Cloud Security platform that provides comprehensive code-to-cloud visibility into your risks and incidents, offering key remediation capabilities to manage and... View the full article
-
- 0 replies
- 62 views
-
-
Large language models are challenging to adapt to new enterprise tasks. Prompting is error-prone and achieves limited quality gains, while fine-tuning requires large amounts of... View the full article
-
- 0 replies
- 61 views
-
-
Databricks Apps provide a robust platform for building and hosting interactive applications. React is great for building modern, dynamic web applications that need to update... View the full article
-
- 0 replies
- 49 views
-
-
Driving Sustainable Aluminum Production: How to Calculate the Material Recovery Ratio with GraphFrames. Sustainable production has become an imperative in today’s manufacturing market. According to... View the full article
-
- 0 replies
- 41 views
-
-
The journey to data maturity is about taking the right steps at the right time to unlock value from the data you have. Learn how RudderStack can help. View the full article
-
- 0 replies
- 30 views
-
-
We’re excited to announce the General Availability of Explore in Tableau, a new integration that lets you create Tableau Cloud visualizations directly from Unity Catalog... View the full article
-
- 0 replies
- 28 views
-
-
We’re making it easier than ever for Databricks customers to run secure, scalable Apache Spark™ workloads on Unity Catalog Compute with Unity Catalog Lakeguard. In... View the full article
-
- 0 replies
- 42 views
-
-
Training AI models for real-world applications requires vast amounts of labeled data, which can be costly, time-consuming, and difficult to obtain at scale. Synthetic data... View the full article
-
- 0 replies
- 34 views
-
-
We’re excited to announce the General Availability of Hive Metastore (HMS) and AWS Glue Federation in Unity Catalog! This new capability enables Unity Catalog to... View the full article
-
- 0 replies
- 39 views
-
-
In today’s fast-paced digital landscape, data is being generated at an unprecedented rate. View the full article
-
- 0 replies
- 16 views
-
-
Introduction: In this blog, we share the journey of building a Serverless optimized Artifact Registry from the ground up. The main goals are to ensure... View the full article
-
- 0 replies
- 29 views
-
-
Learn how modern companies are rethinking data governance to create competitive advantages while maintaining customer trust. View the full article
-
- 0 replies
- 29 views
-
-
At Home Trust, we measure success in terms of relationships. Whether we’re working with individuals or businesses, we strive to help them stay “Ready for... View the full article
-
- 0 replies
- 28 views
-
-
In AWS data engineering, Extract, Transform, and Load (ETL) processes are pivotal, as they allow you to prepare raw data sets for analytical purposes. This blog provides a detailed exploration of data engineering best practices specifically geared toward optimising ETL workflows, enhanced with relevant keywords and concepts for AWS Certified Data Engineer Associate Certification (DEA-C01)... View the full article
-
- 0 replies
- 54 views
-
-
Learn why clean, accurate event data is the foundation of understanding the full customer journey. View the full article
-
- 0 replies
- 28 views
-
-
Generative AI is transforming how organizations interact with their data, and batch LLM processing has quickly become one of Databricks' most popular use cases. Last... View the full article
-
- 0 replies
- 28 views
-
-
In today’s dynamic retail environment, staying connected to customer sentiments is more crucial than ever. With shoppers sharing their experiences across countless platforms, retailers are... View the full article
-
- 0 replies
- 31 views
-
-
Earlier this week, we announced new agent development capabilities on Databricks. After speaking with hundreds of customers, we've noticed two common challenges to advancing beyond... View the full article
-
- 0 replies
- 27 views
-
-
DLT offers a robust platform for building reliable, maintainable, and testable data processing pipelines within Databricks. By leveraging its declarative framework and automatically provisioning optimal... View the full article
-
- 0 replies
- 20 views
-
-
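As enterprises make active use of data-driven insights in their strategic decision-making, the latest trends in data intelligence platforms are moving toward more sophisticated, scalable, and secure solutions... View the full article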
-
- 0 replies
- 25 views
-