Data Engineering

  1. BigQuery, now with first-party support for Delta Lake, grows Delta Lake’s vibrant connector ecosystem and simplifies its integration with Databricks. View the full article

    • 0 replies
    • 33 views
  2. Started by Databricks

    Introduction: The Internet of Things (IoT) is generating an unprecedented amount of data. IBM estimates that annual IoT data volume will reach approximately... View the full article

    • 0 replies
    • 33 views
  3. We are excited to announce that Forrester has recognized Databricks as a Leader in The Forrester Wave™: AI Foundation Models for Language, Q2... View the full article

    • 0 replies
    • 39 views
  4. Data + AI Summit 2024 will be held in person and virtually on June 10-13, 2024, with a highly anticipated lineup of keynotes... View the full article

    • 0 replies
    • 42 views
  5. Started by Databricks

    Introduction: Today, manufacturers’ field maintenance is often more reactive than proactive, which can lead to costly downtime and repairs. Historically, data warehouses have... View the full article

    • 0 replies
    • 35 views
  6. The annual Data Team Awards spotlight data teams and the pivotal role they play in business operations across industries and markets. By continually... View the full article

    • 0 replies
    • 36 views
  7. Over the last year, we’ve been listening to feedback and iterating on new ideas with a single goal: to build the best data-focused... View the full article

    • 0 replies
    • 41 views
  8. Started by Databricks

    We are excited to announce that we have agreed to acquire Tabular, Inc, a data management company founded by Ryan Blue, Daniel Weeks... View the full article

    • 0 replies
    • 27 views
  9. Businesses are making remarkable progress on their data and AI journeys. They’re advancing from a few pilot projects confined to use cases likely... View the full article

    • 0 replies
    • 26 views
  10. Started by Databricks

    The secret to good AI is great data. As AI adoption soars, the data platform is the most important component of any enterprise's... View the full article

    • 0 replies
    • 38 views
  11. Delta Lake UniForm, now in GA, enables customers to benefit from Delta Lake’s industry-leading price-performance when connecting to tools in the Iceberg ecosystem. View the full article

    • 0 replies
    • 33 views
  12. We are excited to announce a new data type called variant for semi-structured data. Variant provides an order of magnitude performance improvements compared... View the full article

    • 0 replies
    • 37 views
  13. With more and more customer interactions moving into the digital domain, it's increasingly important that organizations develop insights into online customer behaviors. In... View the full article

    • 0 replies
    • 54 views
  14. Special thanks to Caleb Benningfield and Sam Malissa at Amperity for their valuable insights and contributions to this blog. Today, businesses face a... View the full article

    • 0 replies
    • 38 views
  15. We are excited to announce the general availability of Row Filters and Column Masks in Unity Catalog on AWS, Azure and GCP... View the full article

    • 0 replies
    • 46 views
  16. Salesforce and Databricks are excited to announce an expanded strategic partnership that delivers a powerful new integration - Salesforce Bring Your Own Model... View the full article

    • 0 replies
    • 37 views
  17. Whether you are working on a live title, pre/post production, ongoing maintenance, future releases, another version of a game, or a brand new... View the full article

    • 0 replies
    • 43 views
  18. This blog is authored by Michael Ewins, Director of Engineering at Skyscanner. At Skyscanner, we're more than just a flight search engine... View the full article

    • 0 replies
    • 60 views
  19. Over the last few years, Large Language Models (LLMs) have been reshaping the field of natural language, thanks to their transformer-based architectures and... View the full article

    • 0 replies
    • 36 views
  20. The annual Data Team Awards celebrate the critical contributions of data teams to various sectors, spotlighting their role in driving progress and positive... View the full article

    • 0 replies
    • 59 views
  21. The annual Data Team Awards showcase the remarkable efforts of top global enterprise data teams committed to tackling some of today's toughest business... View the full article

    • 0 replies
    • 70 views
  22. In today's digital landscape, secure data sharing is critical to operational efficiency and innovation. Databricks and the Linux Foundation developed Delta Sharing as... View the full article

    • 0 replies
    • 36 views
  23. The Data Team Awards annually recognize the indispensable roles of enterprise data teams across industries, celebrating their resilience and innovation from around the... View the full article

    • 0 replies
    • 41 views
  24. If you’ve been following the world of industry-grade LLM technology for the last year, you’ve likely observed a plethora of frameworks and tools... View the full article

    • 0 replies
    • 36 views
  25. We’re excited to announce the General Availability of Delta Lake Liquid Clustering in the Databricks Data Intelligence Platform. Liquid Clustering is an innovative... View the full article

    • 0 replies
    • 33 views
  26. Generative AI (GenAI) is moving incredibly fast. So much so, that in less than two years, GenAI has emerged as one of the... View the full article

    • 0 replies
    • 35 views
  27. We're excited to announce native support in Databricks for ingesting XML data. XML is a popular file format for representing complex data... View the full article

    • 0 replies
    • 37 views
  28. In the last year, the Databricks Money Engineering Team has embarked on an exhilarating journey, achieving nearly double our operational efficiency. We are... View the full article

    • 0 replies
    • 36 views
  29. Following the announcement we made around a suite of tools for Retrieval Augmented Generation, today we are thrilled to announce the general availability... View the full article

    • 0 replies
    • 36 views
  30. Started by Databricks

    We’re excited to announce the Databricks AI Fund, showcasing our commitment to supporting a new generation of founders and startups. View the full article

    • 0 replies
    • 31 views
  31. We are excited to introduce Databricks Assistant Autocomplete now in Public Preview. This feature brings the AI-powered assistant to you in real-time, providing... View the full article

    • 0 replies
    • 31 views
  32. We are thrilled to announce an exciting new feature on the Databricks Marketplace that simplifies the process of setting up private exchanges for... View the full article

    • 0 replies
    • 46 views
  33. In the semiconductor industry, research and development tasks, manufacturing processes, and enterprise planning systems produce an array of data artifacts that can be fused to create an intelligent semiconductor enterprise. Through intelligent data use, an intelligent semiconductor enterprise accelerates time to market, increases manufacturing yield, and enhances product reliability. View the full article

    • 0 replies
    • 26 views
  34. Databricks is pleased to announce we are ranked #2 in the inaugural annual Glassdoor Award List of Best-Led Companies in 2024! At... View the full article

    • 0 replies
    • 30 views
  35. Successfully building GenAI applications means going beyond just leveraging the latest cutting-edge models. It requires the development of compound AI systems that integrate... View the full article

    • 0 replies
    • 30 views
  36. In the fast-paced landscape of data science and engineering, integrating Artificial Intelligence (AI) has become integral for enhancing productivity. We’ve seen many tools... View the full article

    • 0 replies
    • 26 views
  37. We recently introduced DBRX: an open, state-of-the-art, general-purpose LLM. DBRX was trained, fine-tuned, and evaluated using Mosaic AI Training, scaling training to... View the full article

    • 0 replies
    • 33 views
  38. How we reached 79.9% on the Spider dev dataset with Llama3 8B through savvy prompting and fine-tuning on Databricks. View the full article

    • 0 replies
    • 34 views
  39. The annual Data Team Awards highlight how diverse enterprise data teams are tackling some of the most prevalent and complex issues facing the... View the full article

    • 0 replies
    • 42 views
  40. Last year, we launched foundation model support in Databricks Model Serving to enable enterprises to build secure and custom GenAI apps on a... View the full article

    • 0 replies
    • 34 views
  41. In December, we announced a new suite of tools to get Generative AI applications to production using Retrieval Augmented Generation (RAG). Since then... View the full article

    • 0 replies
    • 35 views
  42. The Data Team Awards celebrate enterprise data teams' essential role in helping businesses across sectors face their most pressing challenges. With more than... View the full article

    • 0 replies
    • 52 views
  43. Introduction: Organizations aiming to become AI and data-driven often need to provide their internal teams with high-quality and trusted data products. Building... View the full article

    • 0 replies
    • 36 views
  44. Data, analytics and AI governance is perhaps the most important yet challenging aspect of any data and AI democratization effort. For your data... View the full article

    • 0 replies
    • 34 views
  45. Moving generative AI applications from the proof of concept stage into production requires control, reliability and data governance. Organizations are turning to open... View the full article

    • 0 replies
    • 39 views
  46. In the fast-paced world of sports, where every second and every play can make a difference, the need for advanced analytics and real-time... View the full article

    • 0 replies
    • 43 views
  47. The generative AI revolution is transforming the way that teams work, and Databricks Assistant leverages the best of these advancements. It allows you... View the full article

    • 0 replies
    • 84 views
  48. The Databricks Data Intelligence Platform offers unparalleled flexibility, allowing users to access nearly instant, horizontally scalable compute resources. This ease of creation can... View the full article

    • 0 replies
    • 59 views
  49. The modern data stack is designed to address the difficulties with data collection, storage, and analysis as the volume and complexity of data... View the full article

    • 0 replies
    • 61 views
  50. A good benchmark is one that clearly shows which models are better and which are worse. The Databricks Mosaic Research team is dedicated... View the full article

  51. We are excited to announce that Databricks on AWS GovCloud is now in public preview and that we recently earned our first FedRAMP®... View the full article

  52. We are proud to announce that Forrester has recognized Databricks as a Leader with the highest scores in both current offering and strategy... View the full article

  53. We are thrilled to announce Unity Catalog Lakeguard, which allows you to run Apache Spark™ workloads in SQL, Python, and Scala with... View the full article

  54. Data democratization may sound like just another technology buzzword, but with organizations collecting more and more data every day, the accuracy, trustworthiness, and... View the full article

  55. We're thrilled to announce the General Availability (GA) of Databricks Asset Bundles (DABs). With DABs you can easily bundle resources like jobs... View the full article

  56. For a limited time, we're offering 50% off training and certification at Data + AI Summit with the following code: TRAIN50FOTY. This offer... View the full article

  57. The next generation of Databricks SQL (DBSQL) dashboards, also known as Lakeview Dashboards, is now generally available on AWS and Azure. This new... View the full article

  58. We recently made significant improvements to the underlying algorithms supporting AI-generated comments in Unity Catalog and we’re excited to share our results. Through... View the full article

  59. Introduction: In this blog post we dive into inference with DBRX, the open state-of-the-art large language model (LLM) created by Databricks (see Introducing... View the full article

  60. We released Ray support in public preview last year and since then, hundreds of Databricks customers have been using it for a variety of use... View the full article

  61. Are you ready to discover how one of the world's leading tech giants is transforming its data analytics to stay ahead of the... View the full article

  62. Fostering a paradigm shift towards a smarter, cleaner & reliable energy system. Electricity is the new oil. Sources of energy are becoming more... View the full article

    • 0 replies
    • 41 views
  63. Started by Databricks

    Large language models (LLMs) have generated interest in effective human-AI interaction through optimizing prompting techniques. “Prompt engineering” is a growing methodology for tailoring... View the full article

    • 0 replies
    • 355 views
  64. We're excited to announce that Databricks has been honored with the 2024 Google Cloud Technology Partner of the Year award for Data -... View the full article

    • 0 replies
    • 55 views
  65. Introduction: The ability for organizations to adopt machine learning, AI, and large language models (LLMs) has accelerated in recent years thanks to the... View the full article

    • 0 replies
    • 55 views
  66. Innovation in the Power and Utilities industry is all but a necessary step to move forward with the evolution of the national power... View the full article

    • 0 replies
    • 65 views
  67. Databricks Unity Catalog ("UC") provides a single unified governance solution for all of a company's data and AI assets across clouds and data... View the full article

    • 0 replies
    • 65 views
  68. Today, we are excited to announce the general availability of Databricks Notebooks on SQL warehouses. Databricks SQL warehouses are SQL-optimized compute that provide... View the full article

    • 0 replies
    • 62 views
  69. Overview: In the competitive world of professional hockey, NHL teams are always seeking to optimize their performance. Advanced analytics has become increasingly important... View the full article

    • 0 replies
    • 64 views
  70. Databricks Runtime 14.3 includes a new capability that allows users to access and analyze Structured Streaming's internal state data: the State Reader... View the full article

    • 0 replies
    • 46 views
  71. Today, we are excited to introduce DBRX, an open, general-purpose LLM created by Databricks. Across a range of standard benchmarks, DBRX sets a... View the full article

    • 0 replies
    • 62 views
  72. Databricks’ mission is to deliver data intelligence to every enterprise by allowing organizations to understand and use their unique data to build their... View the full article

    • 0 replies
    • 58 views
  73. By Steve Sobel - Global Industry Leader; Communications, Media & Entertainment. Today Databricks and Adobe are excited to announce a strategic partnership focused... View the full article

    • 0 replies
    • 80 views
  74. As new Generative AI capabilities continue to emerge with heightened customer expectations, data modernization and migration to the cloud have become critical success... View the full article

    • 0 replies
    • 50 views
  75. With the releases of Apache Spark 3.4 and 3.5 in 2023, we focused heavily on improving PySpark performance, flexibility, and ease of use... View the full article

    • 0 replies
    • 63 views
  76. The GGUF file format is a binary file format used for storing and loading model weights for the GGML library. The library documentation... View the full article

    • 0 replies
    • 1.1k views
  77. In the previous blog, we discussed how to securely access Azure Data Services from Azure Databricks using Virtual Network Service Endpoints or... View the full article

    • 0 replies
    • 38 views
  78. Introduction: After a whirlwind year of developments in 2023, many enterprises are eager to adopt increasingly capable generative AI models to supercharge their... View the full article

    • 0 replies
    • 40 views
  79. Next-generation customer experiences are built upon data and insights derived from various touchpoints. Through these, marketers can detect subtle differences in customer needs... View the full article

    • 0 replies
    • 48 views
  80. Today, we are thrilled to announce that Lilac is joining Databricks. Lilac is a scalable, user-friendly tool for data scientists to search, cluster... View the full article

    • 0 replies
    • 87 views
  81. On ecommerce platforms, a good product description can make an item stand out and drive sales. A good product description should not only... View the full article

    • 0 replies
    • 50 views
  82. Game development is a multifaceted journey that stretches from the initial concept to post-launch support and live operations. At the heart of this... View the full article

    • 0 replies
    • 67 views
  83. Artificial Intelligence is top-of-mind with every C-suite in Retail & Consumer Goods. Companies see the potential to deliver better customer service, derive faster... View the full article

    • 0 replies
    • 52 views
  84. Today, we are excited to announce the general availability of Feature Serving. Features play a pivotal role in AI Applications, typically requiring considerable... View the full article

    • 0 replies
    • 49 views
  85. "Building vehicles that are more like smartphones is the future. We're about to change the ride just like Apple and all the smartphone... View the full article

    • 0 replies
    • 51 views
  86. This post was written in collaboration with Jason Labonte, Chief Executive Officer, Veritas Data Research. In the realm of healthcare and life sciences... View the full article

    • 0 replies
    • 47 views
  87. Today, we're excited to announce the launch of Brickbuilder Unity Catalog Accelerators. This is an expansion to the Brickbuilder Accelerator program, which pairs... View the full article

    • 0 replies
    • 50 views
  88. The DataFrame equality test functions were introduced in Apache Spark™ 3.5 and Databricks Runtime 14.2 to simplify PySpark unit testing. The full set o... View the full article

    • 0 replies
    • 73 views
  89. This blog continues our series looking at advancements from 2023 to the serverless data warehouse Databricks SQL. The best data warehouse is... View the full article

    • 0 replies
    • 39 views
  90. KX and Databricks have partnered to develop time series analytics solutions for the capital markets sector to support many use cases including quant... View the full article

    • 0 replies
    • 79 views
  91. Check out our LLM Solution Accelerators for Retail for more details and to download the notebooks. Product recommendations are a core feature of... View the full article

    • 0 replies
    • 75 views
  92. StreamNative, a leading Apache Pulsar-based real-time data platform solutions provider, and Databricks, the Data Intelligence Platform, are thrilled to announce the enhanced Pulsar-Spark... View the full article

    • 0 replies
    • 60 views
  93. We are thrilled to announce major improvements to the search capabilities in your Databricks workspace. These enhancements build on DatabricksIQ, the Data Intelligence... View the full article

    • 0 replies
    • 48 views
  94. Special thanks to Barb MacLean, SVP, Head of Technology Operations and Implementation at Coastal Community Bank (Coastal) and Rob Cavallo, President at Cavallo... View the full article

    • 0 replies
    • 1.3k views
  95. Artificial Intelligence (AI) is going to be embedded in every product and service a business produces and customers interact with. With Generative AI... View the full article

    • 0 replies
    • 50 views
  96. With Game Developers Conference a week away, we're thrilled to present the 2nd Edition of Databricks' Ultimate Guide to Game Data and AI... View the full article

    • 0 replies
    • 63 views
  97. This post is the second part of our two-part series on the latest performance improvements of stateful pipelines. The first part of this... View the full article

  98. Started by Databricks

    (This post was written in collaboration with Zeqiu (Ellen) Wu and Yushi Hu, both PhD students affiliated with the University of Washington, and... View the full article

  99. This blog was written in collaboration with Tim Sedlak, Senior Solutions Architect at Stardog. In healthcare and life sciences, accuracy is everything. That's... View the full article

    • 0 replies
    • 1.7k views
  100. Started by Databricks

    Introduction: On January 4th, a new era in digital marketing began as Google initiated the gradual removal of third-party cookies, marking a seismic... View the full article

    • 0 replies
    • 2.5k views