Data Engineering & Data Science
Data Engineering
Data Pipelines (ETL/ELT)
Big Data Technologies
Cloud Computing for Data
Data Governance & Quality
Data Science
Machine Learning (ML)
Statistical Analysis
Data Visualization
Natural Language Processing (NLP)
1,007 topics in this forum
Executive Summary: In this blog post, we explore how private equity (PE) firms can leverage data intelligence to enhance portfolio returns. We highlight... View the full article
- 0 replies
- 18 views
We are excited to announce the Public Preview of Cross-Platform View Sharing. Available today, it allows data providers to share views across different... View the full article
- 0 replies
- 14 views
Databricks is a well-known cloud-based data engineering, processing, and analytics platform. One of its key functions is DATEDIFF (date_diff()), widely used by data professionals. The DATEDIFF function is very helpful for analyzing time-based data, such as finding the time difference between two date values. It is used […] View the full article
- 0 replies
- 7 views
Data is everywhere. We generate huge amounts of data every day, from our social media interactions to the things we buy online. According to expert predictions, global data volume will surpass 175 zettabytes by 2025, a figure that is nearly unfathomable. But having data isn’t enough; you need to use it in the right way to […] View the full article
- 0 replies
- 8 views
Choosing an ETL tool has become a critical decision for organizations today, driven by the ever-growing importance of data integration. Matillion and Talend are among the most used ETL tools, each providing functionality suited to different business needs. Irrespective of whether it is a small business or an enterprise, what […] View the full article
- 0 replies
- 10 views
For today’s manufacturers, streamlined and automated workflows are crucial for overcoming challenges such as manual data management and equipment downtime. By leveraging automated... View the full article
- 0 replies
- 14 views
Large language models are revolutionizing how we interact with technology by leveraging advanced natural language processing to perform complex tasks. In recent years... View the full article
- 0 replies
- 22 views
In today's rapidly changing digital world, consumer data protection and privacy regulations are reshaping how businesses interact with their customers. These changes can... View the full article
- 0 replies
- 18 views
The Databricks Serverless compute infrastructure launches and manages millions of virtual machines (VMs) each day across three major cloud providers, and it is... View the full article
- 0 replies
- 16 views
Election betting volume reveals challenges for gaming platforms in collecting player data. Learn how RudderStack's warehouse-native CDP scales cost-effectively. View the full article
- 0 replies
- 11 views
We are delving deeper into the capabilities of MLflow tracing. This functionality will be instrumental in diagnosing performance issues and enhancing the quality... View the full article
- 0 replies
- 18 views
Introduction: Data is power. But in retail banking, it’s about turning that power into actionable insights while carefully navigating data security risks. Financial... View the full article
- 0 replies
- 12 views
We are thrilled to announce the winners of the Generative AI World Cup! This event brought together over 1500 data scientists and AI... View the full article
- 0 replies
- 14 views
In the rapidly evolving landscape of AI, organizations across all industries are eager to harness its transformational power. However, successful AI utilization and... View the full article
- 0 replies
- 17 views
In modern data analytics, proper data management is key to maximizing performance while minimizing costs. Google BigQuery, one of the leading cloud-based data warehouses, excels at managing huge datasets through partitioning and clustering. Understanding the differences between BigQuery partitioning and clustering is essential for the data engineer or […] View the full article
- 0 replies
- 6 views
Are you fascinated by Databricks architecture but lost among all the terms being used out there? Let’s break it down simply! Whether you are just getting familiar with cloud computing or just need a refresher, this blog distills the key aspects of Databricks architecture into simple, easy-to-understand concepts. […] View the full article
- 0 replies
- 7 views
While large language models (LLMs) are increasingly adept at solving general tasks, they can often fall short on specific domains that are dissimilar... View the full article
- 0 replies
- 16 views
We’re excited to announce that the Databricks Assistant, now fully hosted and managed within Databricks, is available in public preview! This version... View the full article
- 0 replies
- 17 views
We are excited to announce that Azure Private Link is now Generally Available (GA) for Databricks serverless and Mosaic AI Model Serving workloads... View the full article
- 0 replies
- 16 views
Today, information has become one of the most important resources of a company. Businesses now create more data in their systems, including customer sales, web traffic and activity, CRM, and much more. However, raw data alone doesn’t deliver benefits; it is useful only when collated effectively and efficiently into a structured, accessible […] View the full article
- 0 replies
- 7 views
In today’s data-driven world, choosing the right schema to store data is as important as collecting it. Schema design plays a crucial role in the performance, scalability, and usability of your data systems. Different data use cases call for different schema designs. The choice can depend on various factors like the complexity of your […] View the full article
- 0 replies
- 5 views
According to Gartner, poor data quality costs a company an average of $12.9 million annually in resources and expenses for operational inefficiencies, missed sales, and unrealized new opportunities. Many companies, even today, struggle with balancing the high cost of computational resources against their often unpredictable needs. Autoscale Databricks is a practical answer to this […] View the full article
- 0 replies
- 5 views
We’re excited to announce a new integration between Databricks Notebooks and AI/BI Dashboards, enabling you to effortlessly transform insights from your notebooks into... View the full article
- 0 replies
- 18 views
Effective data governance is crucial for organizations to harness their data assets. Learn how bp uses Databricks Unity Catalog to enhance their data governance framework, highlighting challenges, strategies, and benefits. View the full article
- 0 replies
- 13 views
We are thrilled to unveil the finalists for the Databricks Generative AI Startup Challenge, a competition designed to spotlight innovative early-stage startups... View the full article
- 0 replies
- 17 views
We are excited to introduce the gated Public Preview of Predictive Optimization for statistics. Announced at the Data + AI Summit, Predictive Optimization... View the full article
- 0 replies
- 17 views
While GenAI is the focus today, most enterprises have been working for a decade or longer to make data intelligence a reality within... View the full article
- 0 replies
- 17 views
As organizations increasingly leverage the Databricks Data Intelligence Platform for data and AI needs, upgrading to Unity Catalog is a key step in... View the full article
- 0 replies
- 17 views
The Data + AI Skills Gap: The “skills gap” has been a concern for CEOs and leaders for many years, and the gap... View the full article
- 0 replies
- 18 views
Databricks is turning up the heat at AWS re:Invent 2024, and we’re bringing more than just data and AI solutions to the... View the full article
- 0 replies
- 15 views
Fivetran and Azure Data Factory (ADF) are two popular names in data integration. Both are powerful platforms for moving data from sources to your warehouse or cloud storage. However, Fivetran and ADF differ in their features, ease of use, and flexibility. We will do a detailed […] View the full article
- 0 replies
- 5 views
Enterprise data management has reached an important inflection point thanks to the Medallion Architecture, which was introduced by Databricks and adopted by Microsoft in its Fabric platform. This architecture is intended to simplify structuring and handling data in data lakes and data lakehouses. The blog deals with […] View the full article
- 0 replies
- 6 views
Accessing and processing large volumes of data is crucial in data analytics and engineering. As datasets grow larger and more complex, executing the same queries repeatedly can become a bottleneck, slowing down data analysis and decision-making. One solution to this challenge is materialized views, a powerful database object designed to optimize query performance and store the […] View the full article
- 0 replies
- 6 views
In an era where data is the lifeblood of medical advancement, the clinical trial industry finds itself at a critical crossroads. The current... View the full article
- 0 replies
- 16 views
Providence Health's extensive network spans 50+ hospitals and numerous other facilities across multiple states, presenting many challenges in predicting patient volume and daily... View the full article
- 0 replies
- 16 views
When the Generative AI boom first ignited, every enterprise rushed to deploy the technology. For many, that excitement remains. But companies are also... View the full article
- 0 replies
- 16 views
Many AI use cases now depend on transforming unstructured inputs into structured data. Developers are increasingly relying on LLMs to extract structured data... View the full article
- 0 replies
- 15 views
Databricks Lakehouse is an open data management architecture that combines the scalability, cost-effectiveness, and flexibility of data lakes with the data management and ACID transactions of data warehouses, offering the best of both worlds. It enables machine learning and business intelligence on all data with more reliability. […] View the full article
- 0 replies
- 5 views
Managing today’s flood of data is no small task. Every organization is balancing a constant stream of new information with the need to meet regulatory standards, keep data clean and accurate, and avoid using too much storage. The more data you have, the harder it gets to modify or delete. That’s why Deletion Vectors […] View the full article
- 0 replies
- 7 views
The Future: From Rules Engines to Instruction-Following AI Agent Systems. In sectors such as banking and insurance, rules engines have long played a... View the full article
- 0 replies
- 16 views
Whether you’re coming from healthcare, aerospace, manufacturing, government, or any other industry, the term big data is no foreign concept; however, how that... View the full article
- 0 replies
- 17 views
Monolithic to Modular: The proof of concept (POC) of any new technology often starts with large, monolithic units that are difficult to characterize... View the full article
- 0 replies
- 16 views
The most recent wave of artificial intelligence (AI), spearheaded by the advent and mass adoption of large language models (LLMs), showed the potential... View the full article
- 0 replies
- 18 views
Over the last few years, we've seen tremendous growth and adoption of Databricks SQL, our intelligent data warehouse purpose-built on the Data... View the full article
- 0 replies
- 17 views
Learn more about clickstream data here. We define what it is, explore its benefits, and reveal how RudderStack can streamline your clickstream data pipeline. View the full article
- 0 replies
- 11 views
Event streaming involves collecting and analyzing data in real time, increasing efficiency and agility. Learn how RudderStack makes event streaming easy here. View the full article
- 0 replies
- 13 views
This is a joint post with the Hugging Face Gradio team; read their announcement here! You can find the full report with all of the detailed findings from our security audit of Gradio 5 here. Hugging Face hired Trail of Bits to audit Gradio 5, a popular open-source library that provides a web interface that […] The post Auditing Gradio 5, Hugging Face’s ML GUI framework appeared first on Security Boulevard. View the full article
- 0 replies
- 3 views
Today, we are excited to announce the general availability of Databricks Assistant Autocomplete on all cloud platforms. Assistant Autocomplete provides personalized AI-powered code... View the full article
- 0 replies
- 17 views
With the pace of modern business and the competitive need for more and more data, organizations now correctly ask whether their data management... View the full article
- 0 replies
- 14 views
In industries like finance and retail, vast data is leveraged to generate billions in profits. Yet, in healthcare, the struggle to access critical... View the full article
- 0 replies
- 16 views