Search the Community

Showing results for tags 'amazon opensearch'.

Found 20 results

  1. Amazon OpenSearch Service adds support for Hebrew and HanLP (Chinese NLP) language analyzer plugins. These are now available as optional plugins that you can associate with your Amazon OpenSearch Service clusters. View the full article
  2. Amazon OpenSearch Service recently introduced the OpenSearch Optimized Instance family (OR1), which delivers up to 30% price-performance improvement over existing memory-optimized instances in internal benchmarks, and uses Amazon Simple Storage Service (Amazon S3) to provide 11 9s of durability. With this new instance family, OpenSearch Service uses OpenSearch innovation and AWS technologies to reimagine how data is indexed and stored in the cloud. Today, customers widely use OpenSearch Service for operational analytics because of its ability to ingest high volumes of data while also providing rich and interactive analytics. In order to provide these benefits, OpenSearch is designed as a high-scale distributed system with multiple independent instances indexing data and processing requests. As the velocity and volume of your operational analytics data grow, bottlenecks may emerge. To sustainably support high indexing volume and provide durability, we built the OR1 instance family. In this post, we discuss how the reimagined data flow works with OR1 instances and how it can provide high indexing throughput and durability using a new physical replication protocol. We also dive deep into some of the challenges we solved to maintain correctness and data integrity.

Designing for high throughput with 11 9s of durability

OpenSearch Service manages tens of thousands of OpenSearch clusters. We’ve gained insights into typical cluster configurations that customers use to meet high throughput and durability goals. To achieve higher throughput, customers often choose to drop replica copies to save on replication latency; however, this configuration sacrifices availability and durability. Other customers require high durability and as a result need to maintain multiple replica copies, resulting in higher operating costs for them. The OpenSearch Optimized Instance family provides additional durability while also keeping costs lower by storing a copy of the data on Amazon S3. With OR1 instances, you can configure multiple replica copies for high read availability while maintaining indexing throughput. The following diagram illustrates an indexing flow involving a metadata update in OR1. During indexing operations, individual documents are indexed into Lucene and also appended to a write-ahead log, also known as a translog. Before sending back an acknowledgement to the client, all translog operations are persisted to the remote data store backed by Amazon S3. If any replica copies are configured, the primary copy performs checks to detect the possibility of multiple writers (control flow) on all replica copies for correctness reasons. The following diagram illustrates the segment generation and replication flow in OR1 instances. Periodically, as new segment files are created, the OR1 instances copy those segments to Amazon S3. When the transfer is complete, the primary publishes new checkpoints to all replica copies, notifying them of a new segment being available for download. The replica copies subsequently download the newer segments and make them searchable. This model decouples the data flow, which happens using Amazon S3, from the control flow (checkpoint publication and term validation), which happens over inter-node transport communication. The following diagram illustrates the recovery flow in OR1 instances. OR1 instances persist not only the data, but also the cluster metadata, like index mappings, templates, and settings, in Amazon S3. 
This makes sure that in the event of a cluster-manager quorum loss, which is a common failure mode in non-dedicated cluster-manager setups, OpenSearch can reliably recover the last acknowledged metadata. In the event of an infrastructure failure, an OpenSearch domain can end up losing one or more nodes. In such an event, the new instance family guarantees recovery of both the cluster metadata and the index data up to the latest acknowledged operation. As new replacement nodes join the cluster, the internal cluster recovery mechanism bootstraps the new set of nodes and then recovers the latest cluster metadata from the remote cluster metadata store. After the cluster metadata is recovered, the recovery mechanism starts to hydrate the missing segment data and translog from Amazon S3. Then all uncommitted translog operations, up to the last acknowledged operation, are replayed to reinstate the lost copy. The new design doesn’t modify the way searches work. Queries are processed normally by either the primary or replica copy of each shard in the index. You may see longer delays (in the 10-second range) before all copies are consistent to a particular point in time, because the data replication uses Amazon S3. A key advantage of this architecture is that it serves as a foundational building block for future innovations, like separation of readers and writers, and helps segregate the compute and storage layers.

How redefining the replication strategy boosts the indexing throughput

OpenSearch supports two replication strategies: logical (document) and physical (segment) replication. In the case of logical replication, the data is indexed on all the copies independently, leading to redundant computation on the cluster. The OR1 instances use the new physical replication model, where data is indexed only on the primary copy and additional copies are created by copying data from the primary. With a high number of replica copies, the node hosting the primary copy requires significant network bandwidth to replicate the segments to all the copies. The new OR1 instances solve this problem by durably persisting the segments to Amazon S3, which is configured as a remote storage option. They also help with scaling replicas without bottlenecking on the primary. After the segments are uploaded to Amazon S3, the primary sends out a checkpoint request, notifying all replicas to download the new segments. The replica copies then only need to download the incremental segments. Because this process frees up the compute resources on replicas that would otherwise be required to redundantly index data, and removes the network overhead incurred on primaries to replicate data, the cluster can sustain more throughput. If replicas aren’t able to process the newly created segments, due to overload or slow network paths, they are eventually marked as failed to prevent them from returning stale results.

Why high durability is a good idea, but hard to do well

Although all committed segments are durably persisted to Amazon S3 whenever they get created, one of the key challenges in achieving high durability is synchronously writing all uncommitted operations to a write-ahead log on Amazon S3, before acknowledging the request back to the client, without sacrificing throughput. 
The new semantics introduce additional network latency for individual requests, but the way we’ve made sure there is no impact to throughput is by batching and draining requests on a single thread for up to a specified interval, while making sure other threads continue to index requests. As a result, you can drive higher throughput with more concurrent client connections by optimally batching your bulk payloads. Other challenges in designing a highly durable system include enforcing data integrity and correctness at all times. Although some events, like network partitions, are rare, they can break the correctness of the system, so the system needs to be prepared to deal with these failure modes. Therefore, while switching to the new segment replication protocol, we also introduced a few other protocol changes, like detecting multiple writers on each replica. The protocol makes sure that an isolated writer can’t acknowledge a write request while another newly promoted primary, based on the cluster-manager quorum, is concurrently accepting newer writes. The new instance family automatically detects the loss of a primary shard while recovering data, and performs extensive checks on network reachability before the data can be re-hydrated from Amazon S3 and the cluster is brought back to a healthy state. For data integrity, all files are extensively checksummed to make sure we are able to detect and prevent network or file system corruption that may result in data being unreadable. Furthermore, all files, including metadata, are designed to be immutable, providing additional safety against corruption, and are versioned to prevent accidental mutating changes.

Reimagining how data flows

The OR1 instances hydrate copies directly from Amazon S3 in order to perform recovery of lost shards during an infrastructure failure. By using Amazon S3, we are able to free up the primary node’s network bandwidth, disk throughput, and compute, and therefore provide a more seamless in-place scaling and blue/green deployment experience by orchestrating the entire process with minimal primary node coordination. OpenSearch Service provides automatic data backups called snapshots at hourly intervals, which means that in case of accidental modifications to data, you have the option to go back to a previous point-in-time state. However, with the new OpenSearch instance family, the data is already durably persisted on Amazon S3. So how do snapshots work when we already have the data present on Amazon S3? With the new instance family, snapshots serve as checkpoints, referencing the already present segment data as it exists at a point in time. This makes snapshots more lightweight and faster because they don’t need to re-upload any additional data. Instead, they upload metadata files that capture the view of the segments at that point in time, which we call shallow snapshots. The benefit of shallow snapshots extends to all operations, namely creation, deletion, and cloning of snapshots. You still have the option to snapshot an independent copy with manual snapshots for other administrative operations.

Summary

OpenSearch is open source, community-driven software. Most of the foundational changes, including the replication model, remote-backed storage, and remote cluster metadata, have been contributed to open source; in fact, we follow an open-source-first development model. Efforts to improve throughput and reliability are a never-ending cycle as we continue to learn and improve. 
The new OpenSearch Optimized Instances serve as a foundational building block, paving the way for future innovations. We are excited to continue our efforts in improving reliability and performance, and to see what new and existing solutions builders can create using OpenSearch Service. We hope this post leads to a deeper understanding of the new OpenSearch instance family, how this offering achieves high durability and better throughput, and how it can help you configure clusters based on the needs of your business. If you’re excited to contribute to OpenSearch, open up a GitHub issue and let us know your thoughts. We would also love to hear about your success stories achieving high throughput and durability on OpenSearch Service. If you have other questions, please leave a comment.

About the Authors

Bukhtawar Khan is a Principal Engineer working on Amazon OpenSearch Service. He is interested in building distributed and autonomous systems. He is a maintainer and an active contributor to OpenSearch. Gaurav Bafna is a Senior Software Engineer working on OpenSearch at Amazon Web Services. He is fascinated by solving problems in distributed systems. He is a maintainer and an active contributor to OpenSearch. Sachin Kale is a senior software development engineer at AWS working on OpenSearch. Rohin Bhargava is a Sr. Product Manager with the Amazon OpenSearch Service team. His passion at AWS is to help customers find the correct mix of AWS services to achieve success for their business goals. Ranjith Ramachandra is a Senior Engineering Manager working on Amazon OpenSearch Service. He is passionate about highly scalable, high-performance, and resilient distributed systems. View the full article
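To make the OR1 discussion above concrete, here is a minimal provisioning sketch using boto3. It is illustrative only: the domain name, OR1 instance type string, engine version, node counts, and volume size are assumptions, so check the OpenSearch Service documentation for the values supported in your Region.

import boto3

opensearch = boto3.client("opensearch")

# Create a domain on OR1 data nodes with dedicated cluster manager nodes.
# Every value below is a placeholder, not a recommendation.
response = opensearch.create_domain(
    DomainName="or1-demo",                      # hypothetical domain name
    EngineVersion="OpenSearch_2.11",            # assumed engine version with OR1 support
    ClusterConfig={
        "InstanceType": "or1.large.search",     # assumed OR1 instance type string
        "InstanceCount": 3,
        "DedicatedMasterEnabled": True,
        "DedicatedMasterType": "m6g.large.search",
        "DedicatedMasterCount": 3,
    },
    EBSOptions={
        "EBSEnabled": True,
        "VolumeType": "gp3",
        "VolumeSize": 200,                      # GiB per data node
    },
)
print(response["DomainStatus"]["ARN"])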
  3. Amazon OpenSearch Service is now extending the ability to update the number of data nodes without requiring a blue/green deployment for clusters without dedicated cluster manager (master) nodes. This change will allow you to make node count changes faster. Clusters with dedicated cluster manager nodes already supported updating the data node count without a blue/green deployment. View the full article
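A node count change like the one described above is a single configuration update call. The following is a hedged boto3 sketch; the domain name and target instance count are placeholders.

import boto3

client = boto3.client("opensearch")

# Update only the data node count; other cluster settings are left untouched.
client.update_domain_config(
    DomainName="my-domain",                     # hypothetical domain name
    ClusterConfig={"InstanceCount": 6},
)

# Optionally track how the configuration change is progressing.
progress = client.describe_domain_change_progress(DomainName="my-domain")
print(progress["ChangeProgressStatus"]["Status"])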
  4. OpenSearch is an open-source search and analytics suite, derived from Elasticsearch 7.10.2 and Kibana 7.10.2. It’s designed to provide distributed search, analytics, and visualization capabilities across large volumes of data in near real-time. OpenSearch was created following changes in licensing for Elasticsearch and Kibana by Elastic, which prompted AWS (Amazon Web Services) to fork these projects to maintain an open-source version under the Apache 2.0 license. It comprises two main components: OpenSearch: The core search and analytics engine that offers scalable search, document indexing, and deep analytics capabilities. OpenSearch Dashboards: A visualization tool in the suite that allows for creating and sharing dashboards to visualize and explore data stored in OpenSearch. OpenSearch provides a highly scalable system for providing fast access and response to large volumes of data with an integrated visualization tool, OpenSearch Dashboards, that makes it easy for users to explore their data. OpenSearch is powered by the Apache Lucene search library, and it supports a number of search and analytics capabilities such as k-nearest neighbors (KNN) search, SQL, Anomaly Detection, Machine Learning Commons, Trace Analytics, full-text search, and more. Use Cases of OpenSearch OpenSearch is versatile and caters to a wide range of applications, including: Log Analytics: Aggregating, monitoring, and analyzing system and application logs to understand behavior, troubleshoot issues, and monitor infrastructure. Full-Text Search: Providing powerful search capabilities across websites, applications, and documents with support for complex queries and search operations. Real-Time Analytics: Analyzing and visualizing data in real time to gain insights into operations, performance, and trends. Security Information and Event Management (SIEM): Collecting, normalizing, and analyzing security event data to detect and respond to threats. Application Performance Monitoring (APM): Monitoring application performance and tracking anomalies or issues affecting user experience. Geo-Spatial Search: Enabling search capabilities based on geographical location and distances, useful for location-based services and applications. Key Use Cases of OpenSearch: Real-time Application Monitoring: Gain insights into application performance, identify errors or bottlenecks quickly, and optimize resource utilization. Log Analytics: Efficiently analyze and explore log data to understand application behavior, troubleshoot issues, and ensure system health. Website Search: Implement robust and scalable full-text search capabilities for your website, delivering a seamless user experience. Security and Threat Detection: Analyze security logs to detect anomalies, investigate potential threats, and enhance overall security posture. Business Intelligence and Analytics: Uncover valuable insights from various data sources through powerful search and visualization tools to inform critical business decisions. Similar Tools to OpenSearch Several tools and platforms offer functionality similar to OpenSearch, catering to various aspects of search and analytics: Elasticsearch: The original search and analytics engine from which OpenSearch was forked. It remains a popular choice for distributed search and analytics, especially when paired with Kibana for visualization. Apache Solr: An open-source search platform built on Apache Lucene, providing robust full-text search, faceted search, real-time indexing, and more. 
Splunk: A commercial product that specializes in searching, monitoring, and analyzing machine-generated big data via a web-style interface. Apache Lucene: A high-performance, full-featured text search engine library written entirely in Java. It’s a technology suitable for nearly any application that requires full-text search, especially cross-platform. Graylog: An open-source log management tool that focuses on log aggregation, search, and analysis. It’s often used for monitoring and troubleshooting IT infrastructure issues.

OpenSearch vs. Elasticsearch

Feature | OpenSearch | Elasticsearch
License | Apache License 2.0 (open source) | Elastic License (custom, with paid options)
Governance | Community-driven, vendor-neutral | Elastic company-driven
Cost | Free and open-source | Free tier with paid features and support
Feature Parity | Aims for feature parity with Elasticsearch | May have additional features not in OpenSearch
Performance | Generally performs slightly slower than Elasticsearch | May be faster in some scenarios
Security Features | Full suite of security features included by default | Basic security in free tier, advanced features paid
Integrations | May require adjustments for existing Elasticsearch integrations | More integrations readily available due to longer history
Community Support | Growing community, active development | Larger, established community

How OpenSearch works

Choosing the Right Tool: The best tool for you depends on your specific needs and priorities. Consider factors like:

Scale: How much data do you need to handle? Do you anticipate significant growth?
Community: How important is a strong community for support and development?
Licensing: Are you comfortable with a permissive open-source license like Apache 2.0 (OpenSearch) or do you have specific licensing requirements?
Feature Set: Does the tool offer the necessary features for your use case (e.g., security analytics, machine learning integrations)?
Ease of Use: How important is a user-friendly interface and deployment process?

Reference

https://opensearch.org
https://aws.amazon.com/what-is/opensearch
https://github.com/opensearch-project/OpenSearch

The post What is OpenSearch? appeared first on DevOpsSchool.com. View the full article
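To make the core concepts in the article above concrete, here is a minimal, generic sketch (not taken from the post) that indexes a log document and runs a full-text search with the opensearch-py client. The host, credentials, index name, and fields are placeholders.

from opensearchpy import OpenSearch

# Connect to a cluster; replace the endpoint and credentials with your own.
client = OpenSearch(
    hosts=[{"host": "localhost", "port": 9200}],
    http_auth=("admin", "admin"),
    use_ssl=True,
    verify_certs=False,
)

# Index a document, a typical log analytics use case from the list above.
client.index(
    index="app-logs",
    id="1",
    body={"service": "checkout", "level": "ERROR", "message": "payment gateway timeout"},
    refresh=True,
)

# Full-text search scored by the underlying Lucene engine.
results = client.search(index="app-logs", body={"query": {"match": {"message": "timeout"}}})
for hit in results["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["message"])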
  5. Amazon OpenSearch Ingestion now enables you to enrich events with geographical location data derived from an IP address, allowing you to add additional context to your observability and security data in real time. Additionally, you can configure mapping templates in Amazon OpenSearch clusters to automatically display these enriched events on a geographical map using OpenSearch Dashboards (a minimal mapping sketch follows below). View the full article
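The mapping-template part of that announcement amounts to declaring the enriched coordinates as a geo_point so that Dashboards can plot them on a map. The sketch below is an assumption-laden illustration using opensearch-py: the template name, index pattern, and field names (client_geo.location, client_geo.country_name) are hypothetical and should match whatever fields your ingestion pipeline actually writes.

from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=[{"host": "localhost", "port": 9200}],  # replace with your domain endpoint
    http_auth=("admin", "admin"),
    use_ssl=True,
    verify_certs=False,
)

# Index template so that new indexes map the enriched location as a geo_point.
client.indices.put_index_template(
    name="web-logs-template",                     # hypothetical template name
    body={
        "index_patterns": ["web-logs-*"],         # hypothetical index pattern
        "template": {
            "mappings": {
                "properties": {
                    "client_geo": {
                        "properties": {
                            "location": {"type": "geo_point"},
                            "country_name": {"type": "keyword"},
                        }
                    }
                }
            }
        },
    },
)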
  6. OR1, the OpenSearch Optimized Instance family, now doubles the maximum allowed storage per instance. OR1 also expands availability to four additional Regions: Canada (Central), EU (London), and Asia Pacific (Hyderabad, Seoul). OR1 delivers up to 30% price-performance improvement over existing instances (based on internal benchmarks), and uses Amazon S3 to provide 11 9s of durability. The new OR1 instances are best suited for indexing-heavy workloads, and offer better indexing performance compared to the existing memory-optimized instances available on OpenSearch Service. View the full article
  7. Amazon OpenSearch Service is an Apache-2.0-licensed distributed search and analytics suite offered by AWS. This fully managed service allows organizations to secure data, perform keyword and semantic search, analyze logs, alert on anomalies, explore interactive log analytics, implement real-time application monitoring, and gain a more profound understanding of their information landscape. OpenSearch Service provides the tools and resources needed to unlock the full potential of your data. With its scalability, reliability, and ease of use, it’s a valuable solution for businesses seeking to optimize their data-driven decision-making processes and improve overall operational efficiency. This post delves into the transformative world of search templates. We unravel the power of search templates in revolutionizing the way you handle queries, providing a comprehensive guide to help you navigate through the intricacies of this innovative solution. From optimizing search processes to saving time and reducing complexities, discover how incorporating search templates can elevate your query management game. Search templates Search templates empower developers to articulate intricate queries within OpenSearch, enabling their reuse across various application scenarios, eliminating the complexity of query generation in the code. This flexibility also grants you the ability to modify your queries without requiring application recompilation. Search templates in OpenSearch use the mustache template, which is a logic-free templating language. Search templates can be reused by their name. A search template that is based on mustache has a query structure and placeholders for the variable values. You use the _search API to query, specifying the actual values that OpenSearch should use. You can create placeholders for variables that will be changed to their true values at runtime. Double curly braces ({{}}) serve as placeholders in templates. Mustache enables you to generate dynamic filters or queries based on the values passed in the search request, making your search requests more flexible and powerful. In the following example, the search template runs the query in the “source” block by passing in the values for the field and value parameters from the “params” block: GET /myindex/_search/template { "source": { "query": { "bool": { "must": [ { "match": { "{{field}}": "{{value}}" } } ] } } }, "params": { "field": "place", "value": "sweethome" } } You can store templates in the cluster with a name and refer to them in a search instead of attaching the template in each request. You use the PUT _scripts API to publish a template to the cluster. Let’s say you have an index of books, and you want to search for books with publication date, ratings, and price. You could create and publish a search template as follows: PUT /_scripts/find_book { "script": { "lang": "mustache", "source": { "query": { "bool": { "must": [ { "range": { "publish_date": { "gte": "{{gte_date}}" } } }, { "range": { "rating": { "gte": "{{gte_rating}}" } } }, { "range": { "price": { "lte": "{{lte_price}}" } } } ] } } } } } In this example, you define a search template called find_book that uses the mustache template language with defined placeholders for the gte_date, gte_rating, and lte_price parameters. To use the search template stored in the cluster, you can send a request to OpenSearch with the appropriate parameters. 
For example, you can search for products that have been published in the last year with ratings greater than 4.0, and priced less than $20: POST /books/_search/template { "id": "find_book", "params": { "gte_date": "now-1y", "gte_rating": 4.0, "lte_price": 20 } } This query will return all books from the books index that have been published in the last year, with a rating of at least 4.0, and a price less than $20.

Default values in search templates

Default values are values that are used for search parameters when the query that engages the template doesn’t specify values for them. In the context of the find_book example, you can set default values for the from, size, and gte_date parameters in case they are not provided in the search request. To set default values, you can use the following mustache template: PUT /_scripts/find_book { "script": { "lang": "mustache", "source": { "query": { "bool": { "filter": [ { "range": { "publish_date": { "gte": "{{gte_date}}{{^gte_date}}now-1y{{/gte_date}}" } } }, { "range": { "rating": { "gte": "{{gte_rating}}" } } }, { "range": { "price": { "lte": "{{lte_price}}" } } } ] }, "from": "{{from}}{{^from}}0{{/from}}", "size": "{{size}}{{^size}}2{{/size}}" } } } } In this template, the {{from}}, {{size}}, and {{gte_date}} parameters are placeholders that can be filled in with specific values when the template is used in a search. If no value is specified for {{from}}, {{size}}, and {{gte_date}}, OpenSearch uses the default values of 0, 2, and now-1y, respectively. This means that if a user searches for products without specifying from, size, and gte_date, the search will return just two products matching the search criteria from the last year. You can also use the render API as follows if you have a stored template and want to validate it: POST _render/template { "id": "find_book", "params": { "gte_date": "now-1y", "gte_rating": 4.0, "lte_price": 20 } }

Conditions in search templates

A conditional statement allows you to control the flow of your search template based on certain conditions. It’s often used to include or exclude certain parts of the search request based on certain parameters. The syntax is as follows: {{#Any condition}} ... code to execute if the condition is true ... {{/Any}} The following example searches for books based on the gte_date, gte_rating, and lte_price parameters and an optional is_available parameter. The condition is used to include the term query on in_stock only if the is_available parameter is present in the search request; if it is not present, that query is skipped. GET /books/_search/template { "source": """{ "query": { "bool": { "must": [ {{#is_available}} { "term": { "in_stock": "{{is_available}}" } }, {{/is_available}} { "range": { "publish_date": { "gte": "{{gte_date}}" } } }, { "range": { "rating": { "gte": "{{gte_rating}}" } } }, { "range": { "price": { "lte": "{{lte_price}}" } } } ] } } }""", "params": { "gte_date": "now-3y", "gte_rating": 4.0, "lte_price": 20, "is_available": true } } By using a conditional statement in this way, you can make your search requests more flexible and efficient by only including the necessary filters when they are needed. To make the query valid inside the JSON, it needs to be escaped with triple quotes (""") in the payload.

Loops in search templates

A loop is a feature of mustache templates that allows you to iterate over an array of values and run the same code block for each item in the array. 
It’s often used to generate a dynamic list of filters or queries based on the values passed in the search request. The syntax is as follows: {{#list item in array}} ... code to execute for each item ... {{/list}} The following example searches for books based on a query string ({{query}}) and an array of categories to filter the search results. The mustache loop is used to generate a match filter for each item in the categories array. GET books/_search/template { "source": """{ "query": { "bool": { "must": [ {{#list}} { "match": { "category": "{{list}}" } } {{/list}} { "match": { "title": "{{name}}" } } ] } } }""", "params": { "name": "killer", "list": ["Classics", "comics", "Horror"] } } The search request is rendered as follows: { "query": { "bool": { "must": [ { "match": { "title": "killer" } }, { "match": { "category": "Classics" } }, { "match": { "category": "comics" } }, { "match": { "category": "Horror" } } ] } } } The loop has generated a match filter for each item in the categories array, resulting in a more flexible and efficient search request that filters by multiple categories. By using the loops, you can generate dynamic filters or queries based on the values passed in the search request, making your search requests more flexible and powerful. Advantages of using search templates The following are key advantages of using search templates: Maintainability – By separating the query definition from the application code, search templates make it straightforward to manage changes to the query or tune search relevancy. You don’t have to compile and redeploy your application. Consistency – You can construct search templates that allow you to design standardized query patterns and reuse them throughout your application, which can help maintain consistency across your queries. Readability – Because templates can be constructed using a more terse and expressive syntax, complicated queries are straightforward to test and debug. Testing – Search templates can be tested and debugged independently of the application code, facilitating simpler problem-solving and relevancy tuning without having to re-deploy the application. You can easily create A/B testing with different templates for the same search. Flexibility – Search templates can be quickly updated or adjusted to account for modifications to the data or search specifications. Best practices Consider the following best practices when using search templates: Before deploying your template to production, make sure it is fully tested. You can test the effectiveness and correctness of your template with example data. It is highly recommended to run the application tests that use these templates before publishing. Search templates allow for the addition of input parameters, which you can use to modify the query to suit the needs of a particular use case. Reusing the same template with varied inputs is made simpler by parameterizing the inputs. Manage the templates in an external source control system. Avoid hard-coding values inside the query—instead, use defaults. Conclusion In this post, you learned the basics of search templates, a powerful feature of OpenSearch, and how templates help streamline search queries and improve performance. With search templates, you can build more robust search applications in less time. If you have feedback about this post, submit it in the comments section. If you have questions about this post, start a new thread on the Amazon OpenSearch Service forum or contact AWS Support. 
Stay tuned for more exciting updates and new features in OpenSearch Service. About the authors Arun Lakshmanan is a Search Specialist with Amazon OpenSearch Service based out of Chicago, IL. He has over 20 years of experience working with enterprise customers and startups. He loves to travel and spend quality time with his family. Madhan Kumar Baskaran works as a Search Engineer at AWS, specializing in Amazon OpenSearch Service. His primary focus involves assisting customers in constructing scalable search applications and analytics solutions. Based in Bengaluru, India, Madhan has a keen interest in data engineering and DevOps. View the full article
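As a follow-up to the search template walkthrough above, the stored find_book template can also be invoked from application code. Here is a hedged opensearch-py sketch; it assumes the client exposes a search_template helper and reuses the index, template ID, and parameters from the post.

from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=[{"host": "localhost", "port": 9200}],  # replace with your domain endpoint
    http_auth=("admin", "admin"),
    use_ssl=True,
    verify_certs=False,
)

# Run the stored template with runtime parameters instead of a hand-built query.
response = client.search_template(
    index="books",
    body={
        "id": "find_book",
        "params": {"gte_date": "now-1y", "gte_rating": 4.0, "lte_price": 20},
    },
)
for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"])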
  8. Knowledge Bases for Amazon Bedrock is a fully managed Retrieval-Augmented Generation (RAG) capability that allows you to connect foundation models (FMs) to internal company data sources to deliver more relevant, context-specific, and accurate responses. We are excited to announce that Knowledge Bases now supports private network policies for Amazon OpenSearch Serverless (OSS). View the full article
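For context on what a private network policy for OpenSearch Serverless can look like, here is a heavily hedged boto3 sketch (not from the announcement). The policy document structure, collection name, and VPC endpoint ID are assumptions made for illustration; verify the exact schema in the OpenSearch Serverless documentation before using it.

import json
import boto3

aoss = boto3.client("opensearchserverless")

# Assumed network policy shape: block public access and allow one VPC endpoint.
policy = [{
    "Rules": [
        {"ResourceType": "collection", "Resource": ["collection/kb-vectors"]},  # hypothetical collection
        {"ResourceType": "dashboard", "Resource": ["collection/kb-vectors"]},
    ],
    "AllowFromPublic": False,
    "SourceVPCEs": ["vpce-0123456789abcdef0"],   # hypothetical VPC endpoint ID
}]

aoss.create_security_policy(
    name="kb-private-network",
    type="network",
    policy=json.dumps(policy),
)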
  9. Amazon OpenSearch Service has been a long-standing supporter of both lexical and semantic search, facilitated by its utilization of the k-nearest neighbors (k-NN) plugin. By using OpenSearch Service as a vector database, you can seamlessly combine the advantages of both lexical and vector search. The introduction of the neural search feature in OpenSearch Service 2.9 further simplifies integration with artificial intelligence (AI) and machine learning (ML) models, facilitating the implementation of semantic search. Lexical search using TF/IDF or BM25 has been the workhorse of search systems for decades. These traditional lexical search algorithms match user queries with exact words or phrases in your documents. Lexical search is more suitable for exact matches, provides low latency, and offers good interpretability of results and generalizes well across domains. However, this approach does not consider the context or meaning of the words, which can lead to irrelevant results. In the past few years, semantic search methods based on vector embeddings have become increasingly popular to enhance search. Semantic search enables a more context-aware search, understanding the natural language questions of user queries. However, semantic search powered by vector embeddings requires fine-tuning of the ML model for the associated domain (such as healthcare or retail) and more memory resources compared to basic lexical search. Both lexical search and semantic search have their own strengths and weaknesses. Combining lexical and vector search improves the quality of search results by using their best features in a hybrid model. OpenSearch Service 2.11 now supports out-of-the-box hybrid query capabilities that make it straightforward for you to implement a hybrid search model combining lexical search and semantic search. This post explains the internals of hybrid search and how to build a hybrid search solution using OpenSearch Service. We experiment with sample queries to explore and compare lexical, semantic, and hybrid search. All the code used in this post is publicly available in the GitHub repository. Hybrid search with OpenSearch Service In general, hybrid search to combine lexical and semantic search involves the following steps: Run a semantic and lexical search using a compound search query clause. Each query type provides scores on different scales. For example, a Lucene lexical search query will return a score between 1 and infinity. On the other hand, a semantic query using the Faiss engine returns scores between 0 and 1. Therefore, you need to normalize the scores coming from each type of query to put them on the same scale before combining the scores. In a distributed search engine, this normalization needs to happen at the global level rather than shard or node level. After the scores are all on the same scale, they’re combined for every document. Reorder the documents based on the new combined score and render the documents as a response to the query. Prior to OpenSearch Service 2.11, search practitioners would need to use compound query types to combine lexical and semantic search queries. However, this approach does not address the challenge of global normalization of scores as mentioned in Step 2. OpenSearch Service 2.11 added the support of hybrid query by introducing the score normalization processor in search pipelines. Search pipelines take away the heavy lifting of building normalization of score results and combination outside your OpenSearch Service domain. 
Search pipelines run inside the OpenSearch Service domain and support three types of processors: search request processors, search response processors, and search phase results processors. In a hybrid search, the search phase results processor runs between the query phase and fetch phase at the coordinator node (global) level. The following diagram illustrates this workflow. The hybrid search workflow in OpenSearch Service contains the following phases:

Query phase – The first phase of a search request is the query phase, where each shard in your index runs the search query locally and returns the document IDs matching the search request, with relevance scores for each document.
Score normalization and combination – The search phase results processor runs between the query phase and fetch phase. It uses the normalization processor to normalize scoring results from the BM25 and k-NN subqueries. The search processor supports min_max and L2-Euclidean distance normalization methods. The processor combines all scores, compiles the final list of ranked document IDs, and passes them to the fetch phase. The processor supports arithmetic_mean, geometric_mean, and harmonic_mean to combine scores.
Fetch phase – The final phase is the fetch phase, where the coordinator node retrieves the documents that match the final ranked list and returns the search query result.

Solution overview

In this post, you build a web application where you can search through a sample image dataset in the retail space, using a hybrid search system powered by OpenSearch Service. Let’s assume that the web application is a retail shop and you as a consumer need to run queries to search for women’s shoes. For a hybrid search, you combine a lexical and semantic search query against the text captions of images in the dataset. The end-to-end search application high-level architecture is shown in the following figure. The workflow contains the following steps:

1. You use an Amazon SageMaker notebook to index image captions and image URLs from the Amazon Berkeley Objects Dataset stored in Amazon Simple Storage Service (Amazon S3) into OpenSearch Service using the OpenSearch ingest pipeline. This dataset is a collection of 147,702 product listings with multilingual metadata and 398,212 unique catalog images. You only use the item images and item names in US English. For demo purposes, you use approximately 1,600 products.
2. OpenSearch Service calls the embedding model hosted in SageMaker to generate vector embeddings for the image caption. You use the GPT-J-6B variant embedding model, which generates 4,096-dimensional vectors.
3. Now you can enter your search query in the web application hosted on an Amazon Elastic Compute Cloud (Amazon EC2) instance (c5.large). The application client triggers the hybrid query in OpenSearch Service.
4. OpenSearch Service calls the SageMaker embedding model to generate vector embeddings for the search query.
5. OpenSearch Service runs the hybrid query, combines the semantic search and lexical search scores for the documents, and sends back the search results to the EC2 application client.

Let’s look at Steps 1, 2, 4, and 5 in more detail.

Step 1: Ingest the data into OpenSearch

In Step 1, you create an ingest pipeline in OpenSearch Service using the text_embedding processor to generate vector embeddings for the image captions. After you define a k-NN index with the ingest pipeline, you run a bulk index operation to store your data into the k-NN index. 
In this solution, you only index the image URLs, text captions, and caption embeddings where the field type for the caption embeddings is k-NN vector. Step 2 and Step 4: OpenSearch Service calls the SageMaker embedding model In these steps, OpenSearch Service uses the SageMaker ML connector to generate the embeddings for the image captions and query. The blue box in the preceding architecture diagram refers to the integration of OpenSearch Service with SageMaker using the ML connector feature of OpenSearch. This feature is available in OpenSearch Service starting from version 2.9. It enables you to create integrations with other ML services, such as SageMaker. Step 5: OpenSearch Service runs the hybrid search query OpenSearch Service uses the search phase results processor to perform a hybrid search. For hybrid scoring, OpenSearch Service uses the normalization, combination, and weights configuration settings that are set in the normalization processor of the search pipeline. Prerequisites Before you deploy the solution, make sure you have the following prerequisites: An AWS account Familiarity with the Python programming language Familiarity with AWS Identity and Access Management (IAM), Amazon EC2, OpenSearch Service, and SageMaker Deploy the hybrid search application to your AWS account To deploy your resources, use the provided AWS CloudFormation template. Supported AWS Regions are us-east-1, us-west-2, and eu-west-1. Complete the following steps to launch the stack: On the AWS CloudFormation console, create a new stack. For Template source, select Amazon S3 URL. For Amazon S3 URL, enter the path for the template for deploying hybrid search. Choose Next. Name the stack hybridsearch. Keep the remaining settings as default and choose Submit. The template stack should take 15 minutes to deploy. When it’s done, the stack status will show as CREATE_COMPLETE. When the stack is complete, navigate to the stack Outputs tab. Choose the SagemakerNotebookURL link to open the SageMaker notebook in a separate tab. In the SageMaker notebook, navigate to the AI-search-with-amazon-opensearch-service/opensearch-hybridsearch directory and open HybridSearch.ipynb. If the notebook prompts to set the kernel, Choose the conda_pytorch_p310 kernel from the drop-down menu, then choose Set Kernel. The notebook should look like the following screenshot. Now that the notebook is ready to use, follow the step-by-step instructions in the notebook. With these steps, you create an OpenSearch SageMaker ML connector and a k-NN index, ingest the dataset into an OpenSearch Service domain, and host the web search application on Amazon EC2. Run a hybrid search using the web application The web application is now deployed in your account and you can access the application using the URL generated at the end of the SageMaker notebook. Copy the generated URL and enter it in your browser to launch the application. Complete the following steps to run a hybrid search: Use the search bar to enter your search query. Use the drop-down menu to select the search type. The available options are Keyword Search, Vector Search, and Hybrid Search. Choose GO to render results for your query or regenerate results based on your new settings. Use the left pane to tune your hybrid search configuration: Under Weight for Semantic Search, adjust the slider to choose the weight for semantic subquery. Be aware that the total weight for both lexical and semantic queries should be 1.0. 
The closer the weight is to 1.0, the more weight is given to the semantic subquery, and 1.0 minus this setting goes as the weight for the lexical query. For Select the normalization type, choose the normalization technique (min_max or L2). For Select the Score Combination type, choose the score combination technique: arithmetic_mean, geometric_mean, or harmonic_mean.

Experiment with Hybrid Search

In this post, you run four experiments to understand the differences between the outputs of each search type. As a customer of this retail shop, you are looking for women’s shoes, and you don’t know yet what style of shoes you would like to purchase. You expect that the retail shop should be able to help you decide according to the following parameters: Not to deviate from the primary attributes of what you search for. Provide versatile options and styles to help you understand your preference of style and then choose one. As your first step, enter the search query “women shoes” and choose 5 as the number of documents to output. Next, run the following experiments and review the observations for each search type.

Experiment 1: Lexical search

For a lexical search, choose Keyword Search as your search type, then choose GO. The keyword search runs a lexical query, looking for the same words between the query and image captions. In the first four results, two are women’s boat-style shoes identified by common words like “women” and “shoes.” The other two are men’s shoes, linked by the common term “shoes.” The last result is of style “sandals,” and it’s identified based on the common term “shoes.” In this experiment, the keyword search provided three relevant results out of five—it doesn’t completely capture the user’s intention to have shoes only for women.

Experiment 2: Semantic search

For a semantic search, choose Semantic search as the search type, then choose GO. The semantic search provided results that all belong to one particular style of shoes, “boots.” Even though the term “boots” was not part of the search query, the semantic search understands that terms “shoes” and “boots” are similar because they are found to be nearest neighbors in the vector space. In this experiment, when the user didn’t mention any specific shoe styles like boots, the results limited the user’s choices to a single style. This hindered the user’s ability to explore a variety of styles and make a more informed decision on their preferred style of shoes to purchase. Let’s see how hybrid search can help in this use case.

Experiment 3: Hybrid search

Choose Hybrid Search as the search type, then choose GO. In this example, the hybrid search uses both lexical and semantic search queries. The results show two “boat shoes” and three “boots,” reflecting a blend of both lexical and semantic search outcomes. In the top two results, “boat shoes” directly matched the user’s query and were obtained through lexical search. In the lower-ranked items, “boots” was identified through semantic search. In this experiment, the hybrid search gave equal weights to both lexical and semantic search, which allowed users to quickly find what they were looking for (shoes) while also presenting additional styles (boots) for them to consider.

Experiment 4: Fine-tune the hybrid search configuration

In this experiment, set the weight of the vector subquery to 0.8, which means the keyword search query has a weight of 0.2. Keep the normalization and score combination settings set to default. Then choose GO to generate new results for the preceding query. 
Providing more weight to the semantic search subquery resulted in higher scores to the semantic search query results. You can see a similar outcome as the semantic search results from the second experiment, with five images of boots for women. You can further fine-tune the hybrid search results by adjusting the combination and normalization techniques. In a benchmark conducted by the OpenSearch team using publicly available datasets such as BEIR and Amazon ESCI, they concluded that the min_max normalization technique combined with the arithmetic_mean score combination technique provides the best results in a hybrid search. You need to thoroughly test the different fine-tuning options to choose what is the most relevant to your business requirements. Overall observations From all the previous experiments, we can conclude that the hybrid search in the third experiment had a combination of results that looks relevant to the user in terms of giving exact matches and also additional styles to choose from. The hybrid search matches the expectation of the retail shop customer. Clean up To avoid incurring continued AWS usage charges, make sure you delete all the resources you created as part of this post. To clean up your resources, make sure you delete the S3 bucket you created within the application before you delete the CloudFormation stack. OpenSearch Service integrations In this post, you deployed a CloudFormation template to host the ML model in a SageMaker endpoint and spun up a new OpenSearch Service domain, then you used a SageMaker notebook to run steps to create the SageMaker-ML connector and deploy the ML model in OpenSearch Service. You can achieve the same setup for an existing OpenSearch Service domain by using the ready-made CloudFormation templates from the OpenSearch Service console integrations. These templates automate the steps of SageMaker model deployment and SageMaker ML connector creation in OpenSearch Service. Conclusion In this post, we provided a complete solution to run a hybrid search with OpenSearch Service using a web application. The experiments in the post provided an example of how you can combine the power of lexical and semantic search in a hybrid search to improve the search experience for your end-users for a retail use case. We also explained the new features available in version 2.9 and 2.11 in OpenSearch Service that make it effortless for you to build semantic search use cases such as remote ML connectors, ingest pipelines, and search pipelines. In addition, we showed you how the new score normalization processor in the search pipeline makes it straightforward to establish the global normalization of scores within your OpenSearch Service domain before combining multiple search scores. Learn more about ML-powered search with OpenSearch and set up hybrid search in your own environment using the guidelines in this post. The solution code is also available on the GitHub repo. About the Authors Hajer Bouafif is an Analytics Specialist Solutions Architect at Amazon Web Services. She focuses on Amazon OpenSearch Service and helps customers design and build well-architected analytics workloads in diverse industries. Hajer enjoys spending time outdoors and discovering new cultures. Praveen Mohan Prasad is an Analytics Specialist Technical Account Manager at Amazon Web Services and helps customers with pro-active operational reviews on analytics workloads. Praveen actively researches on applying machine learning to improve search relevance. View the full article
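To connect the experiments above to the underlying API, here is a hedged sketch of the search pipeline and hybrid query they rely on, expressed with opensearch-py. The pipeline name, index, field names, weights, and model ID are illustrative; the hybrid query and normalization-processor syntax follow OpenSearch 2.11, but verify them against the documentation for your version.

from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=[{"host": "localhost", "port": 9200}],  # replace with your domain endpoint
    http_auth=("admin", "admin"),
    use_ssl=True,
    verify_certs=False,
)

# Search pipeline with the score normalization processor (weights: lexical, semantic).
client.transport.perform_request(
    "PUT",
    "/_search/pipeline/hybrid-pipeline",
    body={
        "phase_results_processors": [{
            "normalization-processor": {
                "normalization": {"technique": "min_max"},
                "combination": {
                    "technique": "arithmetic_mean",
                    "parameters": {"weights": [0.2, 0.8]},
                },
            }
        }]
    },
)

# Hybrid query combining a lexical match with a neural (semantic) subquery.
response = client.search(
    index="products",                             # hypothetical index of image captions
    params={"search_pipeline": "hybrid-pipeline"},
    body={
        "query": {
            "hybrid": {
                "queries": [
                    {"match": {"caption": "women shoes"}},
                    {"neural": {"caption_embedding": {
                        "query_text": "women shoes",
                        "model_id": "<model-id>",  # placeholder for the deployed embedding model
                        "k": 5,
                    }}},
                ]
            }
        }
    },
)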
  10. Improving the relevance of your LLM application by leveraging Charmed Opensearch’s vector database Large Language Models (LLMs) fall under the category of Generative AI (GenAI), an artificial intelligence type that produces content based on user-defined context. These models undergo training using an extensive dataset composed of trillions of combinations of words from natural language, enabling them to empower interactive and conversational applications across various scenarios. Renowned LLMs like GPT, BERT, PaLM, and LLaMa can experience performance improvements by gaining access to additional structured and unstructured data. This additional data may include public or internal documents, websites, and various text forms and content. This methodology, termed retrieval-augmented generation (RAG), ensures that your conversational application generates accurate results with contextual relevance and domain-specific knowledge, even in areas where the pertinent facts were not part of the initial training dataset. RAG can drastically improve the accuracy of an LLM’s responses. See the example below: “What is PRO?” response without RAG Pro is a subscription-based service that offers additional features and functionality to users. For example, Pro users can access exclusive content, receive priority customer support, and more. To become a Pro user, you can sign up for a Pro subscription on our website. Once you have signed up, you can access all of the Pro features and benefits. “What is PRO?” response with RAG Ubuntu Pro is an additional stream of security updates and packages that meet compliance requirements, such as FIPS or HIPAA, on top of an Ubuntu LTS. It provides an SLA for security fixes for the entire distribution (‘main and universe’ packages) for ten years, with extensions for industrial use cases. Ubuntu Pro is free for personal use, offering the full suite of Ubuntu Pro capabilities on up to 5 machines. This article guides you on leveraging Charmed OpenSearch to maintain a relevant and up-to-date LLM application. What is OpenSearch? OpenSearch is an open-source search and analytics engine. Users can extend the functionality of OpenSearch with a selection of plugins that enhance search, security, performance analysis, machine learning, and more. This previous article we wrote provides additional details on the comprehensive features of OpenSearch. We discussed the capability of enabling enterprise-grade solutions through Charmed OpenSearch. This blog will emphasise a specific feature pertinent to RAG: utilising OpenSearch as a vector database. What is a vector database? Vector databases allow you to store and index, for example, text documents, rich media, audio, geospatial coordinates, tables, and graphs into vectors. These vectors represent points in N-dimensional spaces, effectively encapsulating the context of an asset. Search tools can look into these spaces using low-latency queries to find similar assets in neighbouring data points. These search tools typically do this by exploiting the efficiency of different methods for obtaining, for example, the k-nearest neighbours (k-NN) from an index of vectors. In particular, OpenSearch enables this feature with the k-NN plugin and augments this functionality by providing your conversational applications with other essential features, such as fault tolerance, resource access controls, and a powerful query engine. 
Using the OpenSearch k-NN plugin for RAG

In this section, we provide a practical example of using Charmed OpenSearch in the RAG process as a retrieval tool, with an experiment that uses a Jupyter notebook on top of Charmed Kubeflow to run inference on an LLM.

1. Deploy Charmed OpenSearch and enable the k-NN plugin. Follow the Charmed OpenSearch tutorial, which is a good starting point. At the end, verify that the plugin is enabled (it is enabled by default):

$ juju config opensearch plugin_opensearch_knn
true

2. Get your credentials. The easiest way to create and retrieve your first administrator credentials is to add a relation between Charmed OpenSearch and the Data Integrator charm, which is also part of the tutorial.

3. Create a vector index. Now we can create an index for your additional documents, with the embeddings stored in the knn_vector data type. For simplicity, we will use the opensearch-py client:

from opensearchpy import OpenSearch

os_host = "10.56.118.209"
os_port = 9200
os_url = "https://10.56.118.209:9200"
os_auth = ("opensearch-client_7", "sqlKjlEK7ldsBxqsOHNcFoSXayDudf30")

os_client = OpenSearch(
    hosts=[{'host': os_host, 'port': os_port}],
    http_compress=True,
    http_auth=os_auth,
    use_ssl=True,
    verify_certs=False,
    ssl_assert_hostname=False,
    ssl_show_warn=False
)

os_index_name = "rag-index"

settings = {
    "settings": {
        "index": {
            "knn": True,
            "knn.space_type": "cosinesimil"
        }
    }
}
os_client.indices.create(index=os_index_name, body=settings)

properties = {
    "properties": {
        "vector_field": {
            "type": "knn_vector",
            "dimension": 384
        },
        "text": {
            "type": "keyword"
        }
    }
}
os_client.indices.put_mapping(index=os_index_name, body=properties)

4. Aggregate source documents. In this example, we will select a list of web content that we want our application to use as relevant information to provide accurate answers:

content_links = [
    "https://discourse.ubuntu.com/t/ubuntu-pro-faq/34042"
]

5. Load document contents into memory and split the content into chunks that are small enough to embed:

from langchain.document_loaders import WebBaseLoader
from langchain.text_splitter import CharacterTextSplitter

loader = WebBaseLoader(content_links)
htmls = loader.load()

text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=0, separator="\n")
docs = text_splitter.split_documents(htmls)

6. Create embeddings for the text chunks and store the embeddings in the vector index. This creates the embeddings from the selected documents and uploads them to the index we created:

from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import OpenSearchVectorSearch

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L12-v2",
    encode_kwargs={'normalize_embeddings': False})

docsearch = OpenSearchVectorSearch.from_documents(
    docs, embeddings, ef_construction=256, engine="faiss",
    space_type="innerproduct", m=48, opensearch_url=os_url,
    index_name=os_index_name, http_auth=os_auth, verify_certs=False)

7. Use similarity search to retrieve the documents that provide context for your query. The search engine will perform an approximate k-NN search, for example using the cosine similarity formula, and return the relevant documents in the context of your question:

query = "What is Pro?"
similar_docs = docsearch.similarity_search(
    query, k=2, raw_response=True,
    search_type="approximate_search", space_type="cosinesimil")

8. Prepare your LLM. 
We used a simple example of a Hugging Face pipeline to load an LLM:

from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from langchain.llms import HuggingFacePipeline

model_name = "TheBloke/Llama-2-7B-Chat-GPTQ"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    cache_dir="model",
    device_map='auto'
)
tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir="llm/tokenizer")
pl = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=2048
)
llm = HuggingFacePipeline(pipeline=pl)

9. Create a prompt template. It defines the expectations of the response and specifies that we will provide context for an accurate answer:

from langchain import PromptTemplate

question_prompt_template = """
You are a friendly chatbot assistant that responds in a conversational manner to user's questions.
Respond in short but complete answers unless specifically asked by the user to elaborate on something.
Use History and Context to inform your answers.
Context:
---------
{context}
---------
Question: {question}
Helpful Answer:"""

QUESTION_PROMPT = PromptTemplate(
    template=question_prompt_template, input_variables=["context", "question"]
)

10. Invoke the LLM to answer your question, using the context documents retrieved from OpenSearch:

from langchain.chains.question_answering import load_qa_chain

question = "What is Pro?"
chain = load_qa_chain(llm, chain_type="stuff", prompt=QUESTION_PROMPT)
chain.run(input_documents=similar_docs, question=question)

Conclusion

Retrieval-augmented generation (RAG) is a method that enables users to converse with data repositories. It’s a tool that can revolutionise how you access and utilise data, as we showed in our tutorial. With RAG, you can improve data retrieval, enhance knowledge sharing, and enrich the results of your LLMs to give more contextually relevant, insightful responses that better reflect the most up-to-date information in your organisation. The benefits of better LLMs that can access your knowledge base are as obvious as they are alluring: you gain better customer support, employee training and developer productivity. On top of that, you ensure that your teams get LLM answers and results that reflect accurate, up-to-date policy and information rather than generalised or even outright useless answers. As we showed, Charmed OpenSearch is a simple and robust technology that can enable RAG capabilities. With it (and our helpful tutorial), any business can leverage RAG to transform their technical or policy manuals and logs into comprehensive knowledge bases.

Enterprise-grade and fully supported OpenSearch solution

Charmed OpenSearch is available for the open-source community. Canonical’s team of experts can help you get started with it as the vector database to leverage the power of k-NN search for your LLM applications at any scale. Contact Canonical if you have questions. Watch the webinar: Future-proof AI applications with OpenSearch as a vector database View the full article
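For readers curious about what the similarity_search call in step 7 does under the hood, here is an illustrative raw approximate k-NN query against the same index, reusing the os_client, os_index_name, vector_field, and embeddings objects defined earlier in the tutorial. The exact fields written by OpenSearchVectorSearch may differ, so treat this as a sketch rather than a drop-in replacement.

# Embed the question with the same all-MiniLM-L12-v2 model (384-dimensional vector).
query_embedding = embeddings.embed_query("What is Pro?")

knn_query = {
    "size": 2,
    "query": {
        "knn": {
            "vector_field": {          # field defined in the mapping from step 3
                "vector": query_embedding,
                "k": 2,
            }
        }
    },
}

response = os_client.search(index=os_index_name, body=knn_query)
for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("text", "")[:120])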
  11. We are excited to announce that Amazon OpenSearch Serverless can now scan and search up to 10 TB of time series data spanning one or more indexes within a collection. OpenSearch Serverless is a serverless deployment option for Amazon OpenSearch Service that makes it simple for you to run search and analytics workloads without having to think about infrastructure management. With support for much larger datasets, you can unlock more valuable operational insights and make data-driven decisions to troubleshoot application downtime, improve system performance, or identify fraudulent activity. View the full article
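For context, a time series collection is created like any other OpenSearch Serverless collection, just with the TIMESERIES type. Below is a minimal boto3 sketch; the collection name and description are placeholders, not taken from the announcement.

import boto3

# Create a time series collection in OpenSearch Serverless.
# The collection name and description are illustrative placeholders.
aoss = boto3.client("opensearchserverless")
response = aoss.create_collection(
    name="app-logs",
    type="TIMESERIES",
    description="Time series collection for operational analytics",
)
print(response)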
  12. We are excited to announce that Amazon OpenSearch Serverless is enhancing access controls for VPC endpoints. With this feature, administrators can attach endpoint policies to control which AWS principals are allowed or denied access to OpenSearch resources through their VPC endpoint(s). In a VPC endpoint policy, users can also combine actions with AWS principals and resources for finer-grained control over the traffic allowed or denied through their VPC endpoint(s). View the full article
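As a rough illustration, an endpoint policy is a JSON document in the familiar IAM policy shape. The sketch below (written in Python for consistency with the other examples on this page) shows one possible policy; the principal, action, and resource ARNs are placeholders and are not taken from the announcement.

import json

# Hypothetical endpoint policy: allow a single IAM role to reach the
# OpenSearch Serverless data plane through the VPC endpoint.
# All ARNs below are illustrative placeholders.
endpoint_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": ["arn:aws:iam::111122223333:role/AnalyticsRole"]},
            "Action": ["aoss:APIAccessAll"],
            "Resource": ["arn:aws:aoss:us-east-1:111122223333:collection/*"],
        }
    ],
}

# The resulting document would then be attached to the VPC endpoint
# through the console or API.
print(json.dumps(endpoint_policy, indent=2))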
  13. We are pleased to announce that Amazon OpenSearch Serverless now supports Transport Layer Security (TLS) version 1.3, offering improved security for your workloads. OpenSearch Serverless is the serverless option for Amazon OpenSearch Service that makes it simpler for you to run search and analytics workloads without having to think about infrastructure management. View the full article
  14. Amazon OpenSearch Service now lets you update cluster volume size, volume type, IOPS and throughput without requiring a blue/green deployment. This makes it easier for you to make changes to your EBS settings without having to plan upfront for a blue/green deployment. View the full article
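A minimal boto3 sketch of such an in-place EBS update follows; the domain name and the gp3 values are illustrative placeholders.

import boto3

# Update EBS settings on an existing domain. With this change, OpenSearch
# Service can apply the update without a blue/green deployment.
# Domain name and values are illustrative placeholders.
client = boto3.client("opensearch")
response = client.update_domain_config(
    DomainName="my-domain",
    EBSOptions={
        "EBSEnabled": True,
        "VolumeType": "gp3",
        "VolumeSize": 200,   # GiB
        "Iops": 6000,
        "Throughput": 500,   # MiB/s
    },
)
print(response["DomainConfig"]["EBSOptions"])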
  15. Amazon OpenSearch Service now provides improved visibility into the progress of domain updates. You can see granular status values representing different stages of an update, simplifying monitoring and automation of configuration changes. View the full article
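One way to observe these granular status values programmatically is the domain change-progress API. A short boto3 sketch, assuming a domain named my-domain (a placeholder):

import boto3

# Poll the progress of the most recent configuration change on a domain.
client = boto3.client("opensearch")
progress = client.describe_domain_change_progress(DomainName="my-domain")
status = progress["ChangeProgressStatus"]
print(status["Status"])  # overall state, e.g. PENDING / PROCESSING / COMPLETED
for stage in status.get("ChangeProgressStages", []):
    # Each stage reports its own name and status.
    print(stage["Name"], stage["Status"])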
  16. OpenSearch Service 2.11 now supports hybrid query score normalization. It is now easier than ever for search practitioners to leverage a combination of lexical and semantic search to improve their search relevance with OpenSearch. View the full article
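As a rough sketch of what this looks like with the opensearch-py client: first a search pipeline with a normalization processor, then a hybrid query that combines a lexical match clause with a neural clause. The connection details, index name, field names, and model ID below are placeholders, and the neural clause assumes a deployed text-embedding model.

from opensearchpy import OpenSearch

# Placeholder connection details for an OpenSearch 2.10+ cluster.
client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}],
                    http_auth=("admin", "admin"),
                    use_ssl=True, verify_certs=False)

# Search pipeline that normalizes and combines lexical and semantic scores.
client.transport.perform_request(
    "PUT", "/_search/pipeline/norm-pipeline",
    body={
        "phase_results_processors": [
            {
                "normalization-processor": {
                    "normalization": {"technique": "min_max"},
                    "combination": {"technique": "arithmetic_mean"},
                }
            }
        ]
    },
)

# Hybrid query: a lexical match plus a neural (semantic) clause,
# scored through the pipeline above.
response = client.transport.perform_request(
    "POST", "/rag-index/_search",
    params={"search_pipeline": "norm-pipeline"},
    body={
        "query": {
            "hybrid": {
                "queries": [
                    {"match": {"text": "ubuntu pro subscription"}},
                    {"neural": {"vector_field": {
                        "query_text": "ubuntu pro subscription",
                        "model_id": "<model-id>",
                        "k": 5}}},
                ]
            }
        }
    },
)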
  17. Amazon OpenSearch Service now offers support for the Amazon Graviton2 instance family in six additional regions: Africa (Cape Town), Asia Pacific (Osaka), Europe (Zurich), Middle East (Bahrain), Israel (Tel Aviv), and AWS GovCloud (US-West). Graviton-based instances (C6g/M6g/R6g) in OpenSearch Service provide up to 30% better price-performance than comparable x86-based (C5/M5/R5) Amazon Elastic Compute Cloud instances. View the full article
  18. Amazon OpenSearch Service now supports Neural Search on OpenSearch 2.9, enabling builders to create and operationalize semantic search applications with less undifferentiated heavy lifting. For years, customers have been building semantic search applications on OpenSearch k-NN, but they have been burdened with building middleware to integrate text embedding models into search and ingest pipelines. Amazon OpenSearch Service customers can power Neural Search through integrations with Amazon SageMaker and Amazon Bedrock, enabling semantic search pipelines that run on-cluster. View the full article
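A sketch of the generic OpenSearch neural search flow, assuming a text-embedding model is already deployed and you have its model ID (all connection details, index and field names below are placeholders): an ingest pipeline embeds documents at write time, and a neural query embeds the query text with the same model at search time.

from opensearchpy import OpenSearch

# Placeholder connection and model details.
client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}],
                    http_auth=("admin", "admin"),
                    use_ssl=True, verify_certs=False)
model_id = "<your-deployed-embedding-model-id>"

# Ingest pipeline that turns the "text" field into a vector at index time.
# The target field ("text_embedding") must be mapped as knn_vector in the index.
client.transport.perform_request(
    "PUT", "/_ingest/pipeline/nlp-pipeline",
    body={
        "processors": [
            {"text_embedding": {"model_id": model_id,
                                "field_map": {"text": "text_embedding"}}}
        ]
    },
)

# Neural query: the same model embeds the query text at search time.
response = client.transport.perform_request(
    "POST", "/my-semantic-index/_search",
    body={
        "query": {
            "neural": {
                "text_embedding": {
                    "query_text": "how do I renew my subscription?",
                    "model_id": model_id,
                    "k": 10,
                }
            }
        }
    },
)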
  19. Amazon Personalize launches a new integration with self-managed OpenSearch that enables customers to personalize search results for each user and assists in predicting their search needs. The Amazon Personalize Search Ranking plugin within OpenSearch helps customers to leverage the deep learning capabilities offered by Amazon Personalize and add personalization to OpenSearch search results, without any ML expertise. View the full article
  20. Amazon OpenSearch Service now supports managed VPC endpoints (powered by AWS PrivateLink) to connect to your Amazon OpenSearch Service VPC-enabled domain in a Virtual Private Cloud (VPC). With an Amazon OpenSearch Service managed endpoint, you can now privately access your OpenSearch Service domain within your VPC from your client applications in other VPCs, within the same or across AWS accounts, without using public IPs or requiring traffic to traverse the Internet. View the full article
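A minimal boto3 sketch of creating such a managed endpoint for an existing domain; the domain ARN, subnet ID, and security group ID are placeholders.

import boto3

# Create a managed (PrivateLink-backed) VPC endpoint for an existing domain.
# The domain ARN, subnet ID, and security group ID are placeholders.
client = boto3.client("opensearch")
response = client.create_vpc_endpoint(
    DomainArn="arn:aws:es:us-east-1:111122223333:domain/my-domain",
    VpcOptions={
        "SubnetIds": ["subnet-0abc1234def567890"],
        "SecurityGroupIds": ["sg-0abc1234def567890"],
    },
)
print(response["VpcEndpoint"]["Status"])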