Data Engineering & Data Science

  1. RudderStack Profiles enables data teams to power their businesses with complete customer profiles and now supports the Databricks Data Intelligence Platform. View the full article

  2. Generative AI fever shows no signs of cooling off. As pressure and excitement build to execute strong GenAI strategies, data leaders and practitioners... View the full article

  3. We're excited to announce the General Availability of Databricks Predictive Optimization. This capability intelligently optimizes your table data layouts for faster queries and... View the full article

  4. The Databricks Partner Ecosystem, comprising over 3,800 partners worldwide, plays a pivotal role in building and delivering premier data and AI solutions globally... View the full article

  5. In the dynamic, innovative landscape of the San Francisco Bay Area, Databricks stands out not just for our groundbreaking data and AI solutions... View the full article

  6. This is a collaborative post from Databricks and Google Cloud. We thank Nicole Huynh, Partner Marketing Manager - Data Cloud, for her... View the full article

    • 0 replies
    • 86 views
  7. In June 2023, we launched Databricks Marketplace as an open marketplace for all your data, analytics, and AI needs, powered by the open... View the full article

    • 0 replies
    • 55 views
  8. This is a collaborative post from Databricks and Microsoft. We thank Mohini Verma, Senior Product Marketing Manager, for her contributions. Data +... View the full article

    • 0 replies
    • 68 views
  9. This blog is authored by Bhaskar Palit, Senior Director, Data & Analytics, PepsiCo, and Sudipta Das, Data Architect Senior Manager, PepsiCo... View the full article

    • 0 replies
    • 48 views
  10. BigQuery, now with first-party support for Delta Lake, grows Delta Lake’s vibrant connector ecosystem and simplifies its integration with Databricks. View the full article

    • 0 replies
    • 52 views
  11. Started by Databricks,

    Introduction: The Internet of Things (IoT) is generating an unprecedented amount of data. IBM estimates that annual IoT data volume will reach approximately... View the full article

    • 0 replies
    • 56 views
  12. We are excited to announce that Forrester has recognized Databricks as a Leader in The Forrester Wave™: AI Foundation Models for Language, Q2... View the full article

    • 0 replies
    • 65 views
  13. Data + AI Summit 2024 will be held in person and virtually on June 10-13, 2024, with a highly anticipated lineup of keynotes... View the full article

    • 0 replies
    • 61 views
  14. Started by Databricks,

    Introduction: Today, manufacturers’ field maintenance is often more reactive than proactive, which can lead to costly downtime and repairs. Historically, data warehouses have... View the full article

    • 0 replies
    • 60 views
  15. The annual Data Team Awards spotlight data teams and the pivotal role they play in business operations across industries and markets. By continually... View the full article

    • 0 replies
    • 59 views
  16. Over the last year, we’ve been listening to feedback and iterating on new ideas with a single goal: to build the best data-focused... View the full article

    • 0 replies
    • 63 views
  17. Started by Databricks,

    We are excited to announce that we have agreed to acquire Tabular, Inc, a data management company founded by Ryan Blue, Daniel Weeks... View the full article

    • 0 replies
    • 52 views
  18. Businesses are making remarkable progress on their data and AI journeys. They’re advancing from a few pilot projects confined to use cases likely... View the full article

    • 0 replies
    • 47 views
  19. Started by Databricks,

    The secret to good AI is great data. As AI adoption soars, the data platform is the most important component of any enterprise's... View the full article

    • 0 replies
    • 57 views
  20. Delta Lake UniForm, now in GA, enables customers to benefit from Delta Lake’s industry-leading price-performance when connecting to tools in the Iceberg ecosystem. View the full article

    • 0 replies
    • 51 views
  21. We are excited to announce a new data type called variant for semi-structured data. Variant provides an order of magnitude performance improvements compared... View the full article

    • 0 replies
    • 56 views
  22. With more and more customer interactions moving into the digital domain, it's increasingly important that organizations develop insights into online customer behaviors. In... View the full article

    • 0 replies
    • 77 views
  23. As data continues to grow at an unprecedented rate, the need for an efficient and scalable open-source ETL solution becomes increasingly pressing. However, with every organisation’s varying needs and the cluttered market for ETL tools, finding and choosing the right tool can be strenuous. I have curated a list of open-source ETL tools, ranked by popularity […] View the full article

    • 0 replies
    • 56 views
  24. Secure data sharing is a prominent feature of Snowflake that allows you to collaborate with others in the same region. This feature enables the secure transfer of specific database objects between Snowflake accounts. As a consumer, you can access the data without worrying about local storage; you pay only for the computing resources used. Since […] View the full article

    • 0 replies
    • 44 views
  25. Have you ever wondered how Snowflake’s date and time functions can enhance your organization’s data analysis capabilities? Consider a scenario where you must prepare a financial report by calculating the days between the order and delivery dates for all products sold within a week. While a Snowflake minus (-) operator might suffice to subtract these […] View the full article

    • 0 replies
    • 57 views
  26. Exponential data growth has increased the demand for tools that make data processes, such as data collection, integration, and transformation, as smooth as possible. These tools and technologies can help you evolve your methods of handling organizational data. Among these tools, you can leverage Snowflake and dbt to address some of these crucial data […] View the full article

    • 0 replies
    • 55 views
  27. How often does your data team use third-party data to make business decisions? Snowflake data exchange is a good step toward making your business data-driven. It creates a data-sharing hub where organizations can become data providers or consume data from others, depending on the business use case. In this article, I will […] View the full article

    • 0 replies
    • 58 views
  28. “According to Statista, the total volume of data was 64.2 zettabytes in 2020; it’s predicted to reach 181 zettabytes by 2025.” In this day and age, good data collection and efficient data cleansing have become vital for better analysis. The reason is straightforward: A data-driven decision is as good as […] View the full article

    • 0 replies
    • 52 views
  29. MySQL is a relational database management system (RDBMS). This open-source tool is one of the best RDBMSs available on the market and is used to develop web-based software applications, among others. MySQL is scalable, intuitive, and swift when compared to its contemporaries. It uses a client-server architecture. At the core of the MySQL Database lies […] View the full article

    • 0 replies
    • 53 views
  30. With the huge volumes of data being generated by enterprises today, businesses are looking for modernized means of data storage. On-premise storage options come with many limitations, including inadequate scalability, poor accessibility, and a high maintenance burden. That’s why organizations are moving their data from on-premise storage to the Cloud. This article […] View the full article

    • 0 replies
    • 48 views
  31. Data plays an important role in most decision-making processes, whether they relate to business or to engineering. In the past, the analytics stack was difficult to set up and maintain, so it was seldom used and rarely played a pivotal role in every process. However, the advent of cloud […] View the full article

    • 0 replies
    • 48 views
  32. Apache Airflow is a tool that can create, organize, and monitor workflows. It is open-source, so it is free to use and has a wide range of support as well. It is one of the most trusted platforms for orchestrating workflows and is widely used and recommended by top data engineers. This tool provides […] View the full article

    • 0 replies
    • 53 views
  33. Special thanks to Caleb Benningfield and Sam Malissa at Amperity for their valuable insights and contributions to this blog. Today, businesses face a... View the full article

    • 0 replies
    • 62 views
  34. We are excited to announce the general availability of Row Filters and Column Masks in Unity Catalog on AWS, Azure and GCP... View the full article

    • 0 replies
    • 68 views
  35. Salesforce and Databricks are excited to announce an expanded strategic partnership that delivers a powerful new integration - Salesforce Bring Your Own Model... View the full article

    • 0 replies
    • 54 views
  36. Whether you are working on a live title, pre/post production, ongoing maintenance, future releases, another version of a game, or a brand new... View the full article

    • 0 replies
    • 62 views
  37. This blog is authored by Michael Ewins, Director of Engineering at Skyscanner. At Skyscanner, we're more than just a flight search engine... View the full article

    • 0 replies
    • 82 views
  38. Over the last few years, Large Language Models (LLMs) have been reshaping the field of natural language, thanks to their transformer-based architectures and... View the full article

    • 0 replies
    • 61 views
  39. The annual Data Team Awards celebrate the critical contributions of data teams to various sectors, spotlighting their role in driving progress and positive... View the full article

    • 0 replies
    • 75 views
  40. The annual Data Team Awards showcase the remarkable efforts of top global enterprise data teams committed to tackling some of today's toughest business... View the full article

    • 0 replies
    • 87 views
  41. In today's digital landscape, secure data sharing is critical to operational efficiency and innovation. Databricks and the Linux Foundation developed Delta Sharing as... View the full article

    • 0 replies
    • 56 views
  42. Two examples of how we’re experimenting with practical customer data use cases for LLMs: making customer success more efficient and unlocking 1:1 personalization. View the full article

  43. The Data Team Awards annually recognize the indispensable roles of enterprise data teams across industries, celebrating their resilience and innovation from around the... View the full article

    • 0 replies
    • 60 views
  44. If you’ve been following the world of industry-grade LLM technology for the last year, you’ve likely observed a plethora of frameworks and tools... View the full article

    • 0 replies
    • 53 views
  45. We’re excited to announce the General Availability of Delta Lake Liquid Clustering in the Databricks Data Intelligence Platform. Liquid Clustering is an innovative... View the full article

    • 0 replies
    • 54 views
  46. Generative AI (GenAI) is moving incredibly fast. So much so, that in less than two years, GenAI has emerged as one of the... View the full article

    • 0 replies
    • 52 views
  47. We're excited to announce native support in Databricks for ingesting XML data. XML is a popular file format for representing complex data... View the full article

    • 0 replies
    • 55 views
  48. In the last year, the Databricks Money Engineering Team has embarked on an exhilarating journey, achieving nearly double our operational efficiency. We are... View the full article

    • 0 replies
    • 58 views
  49. Following the announcement we made around a suite of tools for Retrieval Augmented Generation, today we are thrilled to announce the general availability... View the full article

    • 0 replies
    • 54 views
  50. Started by Databricks,

    We’re excited to announce the Databricks AI Fund, showcasing our commitment to supporting a new generation of founders and startups. View the full article

    • 0 replies
    • 50 views