Search the Community
Showing results for tags 'snowflake cortex'.
-
In March, Snowflake announced exciting releases, including advances in AI and ML with new features in Snowflake Cortex, new governance and privacy features in Snowflake Horizon, and broader developer support with the Snowflake CLI. Read on to learn more about everything we announced last month. Snowflake Cortex LLM Functions – in public preview Snowflake Cortex is an intelligent, fully managed service that delivers state-of-the-art large language models (LLMs) as serverless SQL/Python functions; there are no integrations to set up, data to move or GPUs to provision. In Snowflake Cortex, there are task-specific functions that teams can use to quickly and cost-effectively execute complex tasks, such as translation, sentiment analysis and summarization. Additionally, to build custom apps, teams can use the complete function to run custom prompts using LLMs from Mistral AI, Meta and Google. Learn more. Streamlit Streamlit 1.26 – in public preview We’re excited to announce support for Streamlit version 1.26 within Snowflake. This update, in preview, expands your options for building data apps directly in Snowflake’s secure environment. Now you can leverage the latest features and functionalities available in Streamlit 1.26.0 — including st.chat_input and st.chat_message, two powerful primitives for creating conversational interfaces within your data apps. This addition allows users to interact with your data applications using natural language, making them more accessible and user-friendly. You can also utilize the new features of Streamlit 1.26.0 to create even more interactive and informative data visualizations and dashboards. To learn more and get started, head over to the Snowflake documentation. Snowflake Horizon Sensitive Data Custom Classification – in public preview In addition to using standard classifiers in Snowflake, customers can now also write their own classifiers using SQL with custom logic to define what data is sensitive to their organization. 
This is an important enhancement to data classification and provides the necessary extensibility that customers need to detect and classify more of their data. Learn more. Data Quality Monitoring – in public preview Data Quality Monitoring is a built-in solution with out-of-the-box metrics, like null counts, time since the object was last updated and count of rows inserted into an object. Customers can even create custom metrics to monitor the quality of data. They can then effectively monitor and report on data quality by defining the frequency it is automatically measured and configure alerts to receive email notifications when quality thresholds are violated. Learn more. Snowflake Data Clean Rooms – generally available in select regions Snowflake Data Clean Rooms allow customers to unlock insights and value through secure data collaboration. Launched as a Snowflake Native App on Snowflake Marketplace, Snowflake Data Clean Rooms are now generally available to customers in AWS East, AWS West and Azure West. Snowflake Data Clean Rooms make it easy to build and use data clean rooms for both technical and non-technical users, with no additional access fees set by Snowflake. Find out more in this blog. DevOps on Snowflake Snowflake CLI – public preview The new Snowflake CLI is an open source tool that empowers developers with a flexible and extensible interface for managing the end-to-end lifecycle of applications across various workloads (Snowpark, Snowpark Container Services, Snowflake Native Applications and Streamlit in Snowflake). It offers features such as user-defined functions, stored procedures, Streamlit integration and direct SQL execution. Learn more. Snowflake Marketplace Snowflake customers can tap into Snowflake Marketplace for access to more than 2,500 live and ready-to-query third-party data, apps and AI products all in one place (as of April 10, 2024). 
Here are all the providers who launched on Marketplace in March: AI/ML Products Brillersys – Time Series Data Generator Atscale, Inc. – Semantic Modeling Data paretos GmbH – Demand Forecasting App Connectors/SaaS Data HALitics – eCommerce Platform Connector Developer Tools DataOps.live – CI/CD, Automation and DataOps Data Governance, Quality and Cost Optimization Select Labs US Inc. – Snowflake Performance & Cost Optimization Foreground Data Solutions Inc – PII Data Detector CareEvolution – Data Format Transformation Merse, Inc – Snowflake Performance & Cost Optimization Qbrainx – Snowflake Performance & Cost Optimization Yuki – Snowflake Performance Optimization DATAN3RD LLC – Data Quality App Third-Party Data Providers Upper Hand – Sports Facilities & Athletes Data Sporting Group – Sportsbook Data Quiet Data – UK Company Data Manifold Data Mining – Demographics Data in Canada SESAMm – ESG Controversy Data KASPR Datahaus – Internet Quality & Anomaly Data Blitzscaling – Blockchain Data Starlitics – ETF and Mutual Fund Data SFR Analytics – Geographic Data SignalRank – Startup Data GfK SE – Purchasing Power Data —- Forward-Looking Statement This post contains express and implied forward-looking statements, including statements regarding (i) Snowflake’s business strategy, (ii) Snowflake’s products, services, and technology offerings, including those that are under development or not generally available, (iii) market growth, trends, and competitive considerations, and (iv) the integration, interoperability, and availability of Snowflake’s products with and on third-party platforms. These forward-looking statements are subject to a number of risks, uncertainties, and assumptions, including those described under the heading “Risk Factors” and elsewhere in the Quarterly Reports on Form 10-Q and Annual Reports of Form 10-K that Snowflake files with the Securities and Exchange Commission. 
In light of these risks, uncertainties, and assumptions, actual results could differ materially and adversely from those anticipated or implied in the forward-looking statements. As a result, you should not rely on any forward-looking statements as predictions of future events. © 2024 Snowflake Inc. All rights reserved. Snowflake, the Snowflake logo, and all other Snowflake product, feature, and service names mentioned herein are registered trademarks or trademarks of Snowflake Inc. in the United States and other countries. All other brand names or logos mentioned or used herein are for identification purposes only and may be the trademarks of their respective holder(s). Snowflake may not be associated with, or be sponsored or endorsed by, any such holder(s). The post New Snowflake Features Released in March 2024 appeared first on Snowflake. View the full article
-
Today, enterprises are focused on enhancing decision-making with the power of AI and machine learning (ML). But the complexity of ML models and data science techniques often leaves behind organizations without data scientists or with limited data science resources. And for those organizations with strong data analyst resources, complex ML models and frameworks may seem overwhelming, potentially preventing them from driving faster, higher-quality insights. That’s why Snowflake Cortex ML Functions were developed: to abstract away the complexity of ML frameworks and algorithms, automate much of the data science process, and democratize ML for everyone. These functions make activities such as data quality monitoring through anomaly detection, or retail sales forecasting through time series forecasting, faster, easier and more robust — especially for data analysts, data engineers, and citizen data scientists. As a continuation of this suite of functions, Snowflake Cortex ML Classification is now in public preview. It enables data analysts to categorize data into predefined classes or labels, and both binary classification (two classes) and multi-class classification (more than two classes) are supported. All of this can be done with a simple SQL command, for use cases such as lead scoring or churn prediction. How ML Classification works Imagine you are a data analyst on a marketing team and want to ensure your team takes quick action on the highest-priority sales leads, optimizing the value from investments in sales and marketing. With ML Classification, you can easily classify certain leads as having a higher likelihood to convert, and thus give them a higher priority for follow-up. And for those with a low likelihood to convert, your marketing team can choose to nurture those or contact them less frequently. 
ML Classification can be accomplished in two simple steps: First, train a machine learning model using your CRM data for all leads you’ve pursued in the past and labeled as either “Converted” or “Not converted.” Then, use that model to classify your new set of leads as likely to convert or not. When you generate your Snowflake ML Classification predictions, you’ll get not only the predicted “class” (likely to convert vs. not likely), but also the probability of that prediction. That way, you can prioritize outreach and marketing to leads that have the highest probability of converting, even within all leads that are likely to convert.

Here’s how to use Classification with just a few lines of SQL:

    -- Train a model on all historical leads.
    CREATE OR REPLACE SNOWFLAKE.ML.CLASSIFICATION my_lead_model(
        INPUT_DATA => SYSTEM$REFERENCE('TABLE', 'historical_leads'),
        TARGET_COLNAME => 'CONVERT'
    );

    -- Generate predictions.
    CREATE TABLE my_predictions AS
    SELECT my_lead_model!PREDICT(object_construct(*)) as prediction
    FROM new_leads;

The above SQL generates an ML model you can use repeatedly to assess whether new leads are likely to convert. It also generates a table of predictions that includes not only the expected class (likely to convert vs. not likely) but also the probability of each class.

If you’re interested in pulling out just the predicted class and probability of that class, you can use the following SQL to parse the results:

    CREATE TABLE my_predictions AS
    SELECT
        prediction:class as convert_or_not,
        prediction['probability']['"1"'] as convert_probability
    FROM (
        SELECT my_lead_model!PREDICT(object_construct(*)) as prediction
        FROM new_leads
    );

To support your assessment of the model (“Is this good enough for my team to use?”) and understanding of the model (“What parts of the data I’ve trained the model on are most useful to the model?”), this classification function produces evaluation metrics and feature importance data.
    -- Get evaluation metrics
    CALL my_lead_model!SHOW_EVALUATION_METRICS();
    CALL my_lead_model!SHOW_GLOBAL_EVALUATION_METRICS();
    CALL my_lead_model!SHOW_CONFUSION_MATRIX();

    -- Get feature importances
    CALL my_lead_model!SHOW_FEATURE_IMPORTANCE();

ML Classification can be used for other use cases as well, such as churn prediction. For example, customers classified as having a high likelihood to churn can be targeted with special offers, personalized communication or other retention efforts.

The two problems we describe above (churn prediction and lead scoring) are binary classification problems, where the value we’re predicting takes on just two values. This classification function can also solve multi-class problems, where the value we’re predicting takes on three or more values. For example, say your marketing team segments customers into three groups (Bronze, Silver, and Gold) based on their purchasing habits and their demographic and psychographic characteristics. This classification function could help you bucket new customers and prospects into those three value-based segments with ease.

    -- Train a model on all existing customers.
    CREATE OR REPLACE SNOWFLAKE.ML.CLASSIFICATION my_marketing_model(
        INPUT_DATA => SYSTEM$REFERENCE('TABLE', 'customers'),
        TARGET_COLNAME => 'value_grouping'
    );

    -- Generate predictions for prospects.
    CREATE TABLE my_value_predictions AS
    SELECT my_marketing_model!PREDICT(object_construct(*)) as prediction
    FROM prospects;

    -- Parse results: look up the probability of the predicted class.
    CREATE TABLE my_predictions_parsed AS
    SELECT
        prediction:class as value_grouping,
        GET(prediction:probability, prediction:class::STRING) as probability
    FROM my_value_predictions;

How Faraday uses Snowflake Cortex ML Classification

Faraday, a customer behavior prediction platform, has been using ML Classification during private preview. For Faraday, having classification models right next to their customers’ Snowflake data accelerates their use of next-generation AI/ML and drives value for their customers.
“Snowflake Cortex ML Functions allow our data engineering team to run complex ML models where our customers’ data lives. This provides us out-of-the-box data science resources and means we don’t have to move our customers’ data to run this analysis,” said Seamus Abshere, Co-Founder and CTO at Faraday. “The public release of Cortex ML Classification is a big unlock; it disrupts a long tradition of separating data engineering and data science.” What’s next? To continue improving the ML Classification experience, we plan to release support for text and timestamps in training and prediction data. We are also continuously improving the amount of data that can be used in training and prediction and the speed of training and prediction – as well as model accuracy. Not only do we want to put AI and ML in the hands of all data analysts and data engineers, but we want to empower business users, too. That’s why the Snowflake Cortex UI is now in private preview. This clickable user interface helps our Snowflake customers discover Snowflake Cortex functions from Snowsight and guides users through the process of selecting data, setting parameters and scheduling recurring training and prediction for AI and ML models — all through an easy-to-use interface. To learn more about Snowflake Cortex ML functions, visit Snowflake documentation or try out this Quickstart. The post Predict Known Categorical Outcomes with Snowflake Cortex ML Classification, Now in Public Preview appeared first on Snowflake. View the full article
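The parsing step in the classification post above (pull out the predicted class, then the probability attached to that class) can also be illustrated outside SQL. The following is a minimal sketch, assuming each prediction arrives as a JSON object with a `class` field and a `probability` map keyed by class label, which is the shape the post's SQL implies. The function names and sample rows are hypothetical and are not part of any Snowflake API.

```python
import json


def parse_prediction(raw: str) -> tuple[str, float]:
    """Return (predicted_class, probability_of_that_class) from one JSON prediction."""
    prediction = json.loads(raw)
    predicted_class = str(prediction["class"])
    # Look up the probability that belongs to the predicted class itself.
    return predicted_class, float(prediction["probability"][predicted_class])


def rank_leads(rows, positive_class="1"):
    """Sort (lead_id, raw_prediction) pairs by positive-class probability, highest first."""
    scored = []
    for lead_id, raw in rows:
        probabilities = json.loads(raw)["probability"]
        scored.append((lead_id, float(probabilities[positive_class])))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical sample data; the exact output shape is an assumption.
SAMPLE_ROWS = [
    ("lead_a", '{"class": "1", "probability": {"0": 0.35, "1": 0.65}}'),
    ("lead_b", '{"class": "0", "probability": {"0": 0.80, "1": 0.20}}'),
    ("lead_c", '{"class": "1", "probability": {"0": 0.05, "1": 0.95}}'),
]

if __name__ == "__main__":
    for lead_id, probability in rank_leads(SAMPLE_ROWS):
        print(lead_id, probability)
```

The same `parse_prediction` helper works for the multi-class case (for example Bronze, Silver and Gold), because it looks up the probability by whatever class label was predicted rather than by a fixed key.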
-
Snowflake is committed to helping our customers unlock the power of artificial intelligence (AI) to drive better decisions, improve productivity and reach more customers using all types of data. Large Language Models (LLMs) are a critical component of generative AI applications, and multimodal models are an exciting category that allows users to go beyond text and incorporate images and video into their prompts to get a better understanding of the context and meaning of the data.

Today we are excited to announce we’re furthering our partnership with Reka to support its suite of highly capable multimodal models in Snowflake Cortex. This includes Flash, an optimized model for everyday questions, and we are developing support for Core, Reka’s largest and most performant model. This will allow our customers to seamlessly unlock value from more types of data with the power of multimodal AI in the same environment where their data lives, protected by the built-in security and governance of the Snowflake Data Cloud.

Reka’s latest testing reveals that both Flash and Core are highly capable, with Core’s capabilities approaching GPT-4 and Gemini Ultra, making it one of the most capable LLMs available today.

Together with our expanded partnership with NVIDIA to power gen AI applications and enhance model performance and scalability, our partnerships with Reka and other LLM providers are the latest examples of how Snowflake is accelerating our AI capabilities for customers. Snowflake remains steadfast in our commitment to make AI secure, easy to use and quick to implement, for both business and technical users. Taken together, our partnerships and investments in AI ensure we continue to provide customers with maximum choice around the tools and technologies they need to build powerful AI applications.

The post Snowflake Brings Gen AI to Images, Video and More With Multimodal Language Models from Reka in Snowflake Cortex appeared first on Snowflake. View the full article
-
Because human-machine interaction using natural language is now possible with large language models (LLMs), more data teams and developers can bring AI to their daily workflows. To do this efficiently and securely, teams must decide how they want to combine the knowledge of pre-trained LLMs with their organization’s private enterprise data in order to deal with the hallucinations (that is, incorrect responses) that LLMs can generate due to the fact that they’ve only been trained on data available up to a certain date. To reduce these AI hallucinations, LLMs can be combined with private data sets via processes that either don’t require LLM customization (such as prompt engineering or retrieval augmented generation) or that do require customization (like fine-tuning or retraining). To decide where to start, it is important to make trade-offs between the resources and time it takes to customize AI models and the required timelines to show ROI on generative AI investments. While every organization should keep both options on the table, to quickly deliver value, the key is to identify and deploy use cases that can deliver value using prompt engineering and retrieval augmented generation (RAG), as these can be fast and cost-effective approaches to get value from enterprise data with LLMs. To empower organizations to deliver fast wins with generative AI while keeping data secure when using LLMs, we are excited to announce Snowflake Cortex LLM functions are now available in public preview for select AWS and Azure regions. With Snowflake Cortex, a fully managed service that runs on NVIDIA GPU-accelerated compute, there is no need to set up integrations, manage infrastructure or move data outside of the Snowflake governance boundary to use the power of industry-leading LLMs from Mistral AI, Meta and more. So how does Snowflake Cortex make AI easy, whether you are doing prompt engineering or RAG? Let’s dive into the details and check out some code along the way. 
To prompt or not to prompt

In Snowflake Cortex, there are task-specific functions that work out of the box without the need to define a prompt. Specifically, teams can quickly and cost-effectively execute tasks such as translation, sentiment analysis and summarization. All that an analyst or any other user familiar with SQL needs to do is point the specific function below to a column of a table containing text data and voila! Snowflake Cortex functions take care of the rest: no manual orchestration, data formatting or infrastructure to manage. This is particularly useful for teams constantly working with product reviews, surveys, call transcripts and other long-text data sources traditionally underutilized within marketing, sales and customer support teams.

    SELECT SNOWFLAKE.CORTEX.SUMMARIZE(review_text)
    FROM reviews_table
    LIMIT 10;

Of course, there are going to be many use cases where customization via prompts becomes useful. For example:

- Custom text summaries in JSON format
- Turning email domains into rich data sets
- Building data quality agents using LLMs

All of these and more can quickly be accomplished with the power of industry-leading foundation models from Mistral AI (Mistral Large, Mixtral 8x7B, Mistral 7B), Google (Gemma-7b) and Meta (Llama2 70B). All of these foundation LLMs are accessible via the complete function, which just like any other Snowflake Cortex function can run on a table with multiple rows without any manual orchestration or LLM throughput management.

Figure 1: Multi-task accuracy of industry-leading LLMs based on the MMLU benchmark.

    SELECT SNOWFLAKE.CORTEX.COMPLETE(
        'mistral-large',
        CONCAT('Summarize this product review in less than 100 words. Put the product name, defect and summary in JSON format: <review>', content, '</review>')
    )
    FROM reviews
    LIMIT 10;

For use cases such as chatbots on top of documents, it may be costly to put all the documents as context in the prompt.
In such a scenario, a different approach may be more cost effective by minimizing the volume of tokens (a general rule of thumb is that 75 words approximately equals 100 tokens) going into the LLM. A popular framework to solve this problem without having to make changes to the LLM is RAG, which is easy to do in Snowflake.

What is RAG?

Let’s go over the basics of RAG before jumping into how to do this in Snowflake. RAG is a popular framework in which an LLM gets access to a specific knowledge base with the most up-to-date, accurate information available before generating a response. Because there is no need to retrain the model, this extends the capability of any LLM to specific domains in a cost-effective way.

To deploy this retrieval, augmentation and generation framework, teams need a combination of:

- Client / app UI: This is where the end user, such as a business decision-maker, is able to interact with the knowledge base, typically in the form of a chat service.
- Context repository: This is where relevant data sources are aggregated, governed and continuously updated as needed to provide an up-to-date knowledge repository. This content needs to be inserted into an automated pipeline that chunks (that is, breaks documents into smaller pieces) and embeds the text into a vector store.
- Vector search: This requires the combination of a vector store, which maintains the numerical or vector representation of the knowledge base, and semantic search to provide easy retrieval of the chunks most relevant to the question.
- LLM inference: The combination of these enables teams to embed the question and the context to find the most relevant information and generate contextualized responses using a conversational LLM.

Figure 2: Generalized RAG framework from question to contextualized answer.

From RAG to rich LLM apps in minutes with Snowflake Cortex

Now that we understand how RAG works in general, how can we apply it to Snowflake?
Using the Snowflake platform’s rich foundation for data governance and management, which includes the VECTOR data type (in private preview), developing and deploying an end-to-end AI app using RAG is possible without integrations, infrastructure management or data movement using three key features:

Figure 3: Key Snowflake features needed to build end-to-end RAG in Snowflake.

Here is how these features map to the key architecture components of a RAG framework:

- Client / app UI: Use Streamlit in Snowflake out-of-the-box chat elements to quickly build and share user interfaces all in Python.
- Context repository: The knowledge repository can be easily updated and governed using Snowflake stages. Once documents are loaded, all of your data preparation, including generating chunks (smaller, contextually rich blocks of text), can be done with Snowpark. For the chunking in particular, teams can seamlessly use LangChain as part of a Snowpark User Defined Function.
- Vector search: Thanks to the native support of VECTOR as a data type in Snowflake, there is no need to integrate and govern a separate store or service. Store VECTOR data in Snowflake tables and execute similarity queries with system-defined similarity functions (L2, cosine, or inner-product distance).
- LLM inference: Snowflake Cortex completes the workflow with serverless functions for embedding and text completion inference (using either Mistral AI, Llama or Gemma LLMs).

Figure 4: End-to-end RAG framework in Snowflake.

Show me the code

Ready to try Snowflake Cortex and its tightly integrated ecosystem of features that enable fast prototyping and agile deployment of AI apps in Snowflake?
Get started with one of these resources:

- Snowflake Cortex LLM functions documentation
- Run 3 useful LLM inference jobs in 10 minutes with Snowflake Cortex
- Build a chat-with-your-documents LLM app using RAG with Snowflake Cortex

To watch live demos and ask questions of Snowflake Cortex experts, sign up for one of these events:

- Snowflake Cortex Live Ask Me Anything (AMA)
- Snowflake Cortex RAG hands-on lab

Want to network with peers and learn from other industry and Snowflake experts about how to use the latest generative AI features? Make sure to join us at Snowflake Data Cloud Summit in San Francisco this June!

The post Easy and Secure LLM Inference and Retrieval Augmented Generation (RAG) Using Snowflake Cortex appeared first on Snowflake. View the full article
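To make the RAG flow described in this post more concrete, here is a minimal, self-contained sketch of the generic steps: estimate token volume with the post's rule of thumb (75 words is roughly 100 tokens), chunk a document, retrieve the most similar chunks by cosine similarity, and assemble a prompt. It does not call Snowflake. The `toy_embed` word-count embedding is a stand-in assumption for a real embedding model, the prompt wording is invented for illustration, and all function names and default sizes are hypothetical.

```python
import math
from collections import Counter


def estimate_tokens(text: str) -> int:
    """Rule of thumb from the post: 75 words is roughly 100 tokens."""
    return math.ceil(len(text.split()) * 100 / 75)


def chunk_words(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (sizes are illustrative)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(text: str) -> Counter:
    """Stand-in embedding (assumption): lowercase word counts, not a real model."""
    return Counter(word.strip(".,!?").lower() for word in text.split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the question."""
    query = toy_embed(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query, toy_embed(chunk)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Put only the retrieved chunks, not the whole corpus, into the prompt."""
    context = "\n".join(context_chunks)
    return (
        "Answer using only this context:\n"
        f"<context>\n{context}\n</context>\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    docs = [
        "Refunds are issued within 30 days of purchase.",
        "Our office is closed on public holidays.",
        "Shipping takes five business days.",
    ]
    question = "How many days do refunds take?"
    prompt = build_prompt(question, retrieve(question, docs, top_k=1))
    print(prompt)
    print("estimated tokens:", estimate_tokens(prompt))
```

Sending only the retrieved chunks keeps the prompt small, which is the cost argument the post makes for RAG over placing every document in the context.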
-
With Snowflake Cortex, Snowflake users now have access to a set of serverless functions that easily accelerate everyday analytics and AI app development. With just a single line of SQL or Python, analysts can instantly access specialized ML and LLM models tuned for specific tasks. They can also leverage more general purpose models for prompt engineering and in-context learning. Since these are fully hosted and managed by Snowflake Cortex, users always have access to them without the need to bring up and manage expensive GPU infrastructure. They can also use and leverage Snowflake’s unified governance framework to seamlessly secure and manage access to their data. These functions include the ones listed below... View the full article