Data Engineering & Data Science

  1. There are thousands of datasets available to institutional investors, each dataset promising to unlock significant insights in investment decisioning. Across the thousands of... View the full article

  2. Welcome to the blog series covering product advancements in 2023 for Databricks SQL, the serverless data warehouse from Databricks. This is part 2... View the full article

  3. Quantization is a technique for making machine learning models smaller and faster. We quantize Llama2-70B-Chat, producing an equivalent-quality model that generates 2.2x more... View the full article

  4. At Databricks, we believe that AI will change the way that enterprises interact with their data. That’s why today, we're excited to welcome t... View the full article

  5. The post highlights real-world examples of NLP use cases across industries. It also covers NLP's objectives, challenges, and latest research developments. View the full article

  6. Databricks recently announced the Data Intelligence Platform, a natural evolution of the lakehouse architecture we pioneered. The idea of a Data Intelligence Platform... View the full article

  7. Reliable, accurate and trusted data is the most critical requirement for any data application in an enterprise. As Databricks customers increasingly rely on... View the full article

  8. Today, we are announcing the industry's first Generative AI Engineer learning pathway and certification to help ensure that data and AI practitioners have... View the full article

  9. Started by Databricks,

    This post is part of a series. Check out Part 1: The Data + AI Trifecta: People, Process, and Platform In the current... View the full article

  10. This is part 1 of a blog series where we look back at the major areas of progress for Databricks SQL in 2023... View the full article

  11. Since COVID, countless articles have been written about the "Great Resignation", including in-depth analysis by the World Economic Forum. One key thing this... View the full article

  12. Started by TDS,

    My personal take on justifying the existence of Data Mesh. A senior stakeholder at one of my projects mentioned that they wanted to decentralise their data platform architecture and democratise data across the organisation. When I heard the words ‘decentralised data architecture’, I was utterly confused at first! In my then-limited experience as a Data Engineer, I had only come across centralised data architectures, and they seemed to be working very well. So I was left wondering what it was that we wanted to solve using a decentralised data architecture. Or were we creating a new problem that had never existed in the first place? .. A Prequel to Data Mesh was originally…

    • 0 replies
    • 6.4k views
  13. Cyber threats and the tools to combat them have become more sophisticated. SIEM is over 20 years old and has evolved significantly in... View the full article

  14. “Short cuts make long delays.” ― J.R.R. Tolkien, The Fellowship of the Ring The lakehouse pattern, in which you store all of your struc... View the full article

  15. Data engineers rely on math and statistics to coax insights out of complex, noisy data. Among the most important domains is calculus, which... View the full article

  16. Ray is an open-source unified compute framework that simplifies scaling AI and Python workloads in a distributed environment. Since we introduced support for... View the full article

  17. The communications industry is undergoing one of the most significant periods of growth (and change) in its 100+ year history. The dramatic increase... View the full article

  18. Back in July, we released the public preview of the new Databricks Assistant, a context-aware AI assistant available in Databricks Notebooks, SQL editor... View the full article

  19. In today's interconnected digital landscape, data sharing and collaboration across organizations and platforms are crucial for modern business operations. Delta Sharing, an innovative... View the full article

  20. At Databricks, we want to help our customers build and deploy generative AI applications on their own data without sacrificing data privacy or... View the full article

  21. Started by Databricks,

    PySpark has always provided wonderful SQL and Python APIs for querying data. As of Databricks Runtime 12.1 and Apache Spark 3.4, parameterized queries... View the full article

  22. Introduction Anomaly detection is widely applied across various industries, playing a significant role in the enterprise sector. This blog focuses on its application... View the full article

  23. Over the past six months, we've been working with NVIDIA to get the most out of their new TensorRT-LLM library. TensorRT-LLM provides an easy-to-use Python interface to integrate with a web server for fast, efficient inference performance with LLMs. In this post, we're highlighting some key areas where our collaboration with NVIDIA has been particularly important. View the full article

  24. We are excited to announce that Gartner has recognized Databricks as a Leader for a third consecutive year in the 2023 Gartner® Magic... View the full article

  25. Today, Databricks is excited to announce support for Mixtral 8x7B in Model Serving. Mixtral 8x7B is a sparse Mixture of Experts (MoE) open... View the full article

  26. Started by Databricks,

    Governance ensures data and AI products are consistently developed and maintained, adhering to precise guidelines and standards. It's the blueprint for architects, bringing... View the full article

  27. We are excited to share new identity and access management features to help simplify the set-up and scale of Databricks for admins. Unity... View the full article

    • 0 replies
    • 1.4k views
  28. Request a meeting with Databricks executives/thought leaders at NRF! Each January, thousands of leaders from retailers around the globe gather at Javits Center... View the full article

  29. An effective campaign can help improve a company's revenue by increasing the sales of its products, clearing out more stock, bringing in more... View the full article

  30. Started by Databricks,

    "AFROTECH was not only insightful, but also greatly heightened my sense of belonging in the tech space! It was amazing to both make... View the full article

    • 0 replies
    • 2.8k views
  31. We’re excited to announce that the latest release of sparklyr on CRAN introduces support for Databricks Connect. R users now have seamless access t... View the full article

  32. Introduction Databricks Lakehouse Monitoring allows you to monitor all your data pipelines – from data to features to ML models – without additional too... View the full article

  33. As businesses grow, data volumes scale from GBs to TBs (or more), and latency demands go from hours to minutes (or less), making... View the full article

  34. Retrieval Augmented Generation (RAG) is an efficient mechanism to provide relevant data as context in Gen AI applications. Most RAG applications typically use... View the full article

  35. Enterprise leaders are turning to the Databricks Data Intelligence Platform to create a centralized source of high-quality data that business teams can leverage... View the full article

  36. Following the announcement we made yesterday around Retrieval Augmented Generation (RAG), today, we’re excited to announce the public preview of Databricks Vector Search. W... View the full article

  37. Retrieval-Augmented-Generation (RAG) has quickly emerged as a powerful way to incorporate proprietary, real-time data into Large Language Model (LLM) applications. Today we are... View the full article

  38. This was written in collaboration with Andrew Mullins, Director of Data Science at Kin + Carta. With the rise of new technologies from... View the full article

  39. We’re excited to announce the launch of Azure Qatar. With the expanded availability of Azure Databricks, it is now easier than ever for o... View the full article

  40. Recent data show that the number of recall campaigns caused by product deficiencies keeps increasing, while each known recorded case is a multi-million... View the full article

  41. To mark the announcement of Databricks listing in Guidewire Marketplace, Marcela Granados, our GTM Director for Insurance, Justin Fenton, Senior Director, Alliances, sat... View the full article

  42. This blog was written in collaboration with Ben Eisenberg, VP of Innovation at People Data Labs, and Tom Ashenmacher, Chief Revenue Officer at... View the full article

  43. We are excited to introduce five new integrations in Databricks Partner Connect—a one-stop portal enabling you to use partner solutions with your Databricks D... View the full article

  44. Background: Modernizing Data Delivery Today's enterprise data estates are vastly different from 10 years ago. Industries have transitioned their analytics from monolithic data... View the full article

  45. Defining what a data culture is can vary by organization. A data culture is the shared values, attitudes, and behaviors that enable organizations... View the full article

  46. Want to support the behavior of built-in functions and method calls in your Python classes? Magic methods in Python let you do just that! So let’s uncover the method behind the magic. View the full article

  47. Building on the momentum of Databricks Assistant, the context-aware AI assistant integrated within Databricks Notebooks, SQL editor, and file editor, and now powering... View the full article

  48. The costs of fraud are staggering. In 2022, just one type of fraud, card-not-present fraud, resulted in almost $6bn in losses in the... View the full article

  49. Started by KDnuggets,

    This article is about the four key soft skills every data scientist needs, and how to work on them. View the full article

  50. We recently announced our AI-generated documentation feature, which uses large language models (LLMs) to automatically generate documentation for tables and columns in Unity... View the full article
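
As a small illustration of the magic methods mentioned in item 46, here is a minimal sketch. The `Vector2D` class and its fields are hypothetical and not taken from the linked article; the sketch only shows how defining a few special methods lets built-in functions and operators work on a user-defined class.

```python
class Vector2D:
    """Hypothetical 2D vector, used only to illustrate magic methods."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Used by repr() and by the interactive prompt.
        return f"Vector2D({self.x}, {self.y})"

    def __eq__(self, other):
        # Used by the == operator.
        if not isinstance(other, Vector2D):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __add__(self, other):
        # Used by the + operator.
        if not isinstance(other, Vector2D):
            return NotImplemented
        return Vector2D(self.x + other.x, self.y + other.y)

    def __len__(self):
        # Used by len(); a 2D vector always has two components.
        return 2

    def __getitem__(self, index):
        # Used by indexing (v[0]) and, as a fallback, by iteration.
        return (self.x, self.y)[index]


v = Vector2D(1, 2) + Vector2D(3, 4)
print(repr(v), len(v), v[0], list(v))
```

Returning `NotImplemented` for unsupported operand types, rather than raising, lets Python try the other operand's reflected method before giving up.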