Search the Community

Showing results for tags 'cloud'.

Found 2 results

Inference: The future of AI in the cloud

TechRadar posted a topic in Artificial Intelligence

Now that it’s 2024, we can’t overlook the profound impact that Artificial Intelligence (AI) is having on operations across businesses and market sectors. Government research has found that one in six UK organizations has embraced at least one AI technology within its workflows, and that number is expected to grow through to 2040.

With increasing AI and Generative AI (GenAI) adoption, the future of how we interact with the web hinges on our ability to harness the power of inference. Inference happens when a trained AI model uses real-time data to make a prediction or complete a task, testing its ability to apply the knowledge gained during training. It's the AI model’s moment of truth: the point at which it shows how well it can apply what it has learned. Whether you work in healthcare, ecommerce or technology, the ability to tap into AI insights and achieve true personalization will be crucial to customer engagement and future business success.

Inference: the key to true personalization

The key to personalization lies in the strategic deployment of inference: scaling out inference clusters closer to the geographical location of the end user. This approach ensures that AI-driven predictions for inbound user requests are accurate and delivered with minimal latency. Businesses must embrace GenAI’s potential to provide tailored and personalized user experiences.

Businesses that haven’t anticipated the importance of the inference cloud will get left behind in 2024. It is fair to say that 2023 was the year of AI experimentation, but the inference cloud will enable the realization of actual outcomes with GenAI in 2024. Enterprises can unlock innovation in open-source Large Language Models (LLMs) and make true personalization a reality with cloud inference.

A new web app

Before the arrival of GenAI, the focus was on serving pre-existing, non-personalized content from locations close to the end user.
Now, as more companies undergo the GenAI transformation, we’ll see the emergence of inference at the edge, where compact LLMs can create personalized content according to users’ prompts.

Some businesses still lack a strong edge strategy, much less a GenAI edge strategy. They need to understand the importance of training centrally, inferring locally, and deploying globally. Serving inference at the edge requires organizations to have a distributed Graphics Processing Unit (GPU) stack to train and fine-tune models against localized datasets. Once the models are fine-tuned, they are deployed globally across data centers to comply with local data sovereignty and privacy regulations. By using this process to integrate inference into their web applications, companies can provide a better, more personalized customer experience.

GenAI requires GPU processing power, but high costs put GPUs out of reach for most companies. When deploying GenAI, businesses should look to smaller, open-source LLMs rather than relying on large hyperscale data centers, to ensure flexibility, accuracy and cost efficiency. That way, companies can avoid complex and unnecessary services, a take-it-or-leave-it approach that limits customization, and vendor lock-in that makes it difficult to migrate workloads to other environments.

GenAI in 2024: Where we are and where we're heading

The industry can expect a shift in the web application landscape by the end of 2024 with the emergence of the first applications powered by GenAI models.

Training AI models centrally allows for comprehensive learning from vast datasets. Centralized training ensures that models are well-equipped to understand complex patterns and nuances, providing a solid foundation for accurate predictions. Its true potential will be seen when these models are deployed globally, allowing businesses to tap into a diverse range of markets and user behaviors. The crux lies in the local inference component.
Inferring locally involves bringing the processing power closer to the end user, a critical step in minimizing latency and optimizing the user experience. As we witness the rise of edge computing, local inference aligns seamlessly with distributing computational tasks closer to where they are needed, ensuring real-time responses and improving efficiency.

This approach has significant implications for various industries, from e-commerce to healthcare. Consider an e-commerce platform that uses GenAI for personalized product recommendations. By inferring locally, the platform analyzes user preferences in real time, delivering tailored suggestions that match users' immediate needs. The same concept applies to healthcare applications, where local inference enhances diagnostic accuracy by providing rapid and precise insights into patient data.

This move towards local inference also addresses data privacy and compliance concerns. By processing data closer to the source, businesses can adhere to regulatory requirements while ensuring sensitive information remains within the geographical boundaries set out by data protection laws.

The Age of Inference has arrived

The journey towards the future of AI-driven web applications is marked by three strategies: central training, global deployment, and local inference. This approach not only enhances AI model capabilities but is also vendor-agnostic, working regardless of cloud computing platform or AI service provider. As we enter a new era of the digital age, businesses must recognize the pivotal role of inference in shaping the future of AI-driven web applications. While there's a tendency to focus on training and deployment, bringing inference closer to the end user is just as important. Together, the three strategies will offer unprecedented opportunities for innovation and personalization across diverse industries.
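As a rough illustration of the "infer locally" idea described in the article, the sketch below picks the closest inference region for a user while only considering regions inside the user's data-protection jurisdiction. Everything here is an assumption made for illustration: the `Region` type, the `REGIONS` list, the jurisdiction labels and the `pick_region` helper are hypothetical and do not come from the article or from any particular cloud provider. Great-circle distance is used only as a simple stand-in for network latency.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A hypothetical inference cluster location (illustrative only)."""
    name: str
    lat: float
    lon: float
    jurisdiction: str  # assumed data-residency label, e.g. "UK" or "EU"


# Hypothetical deployment footprint; names and coordinates are placeholders.
REGIONS = [
    Region("london", 51.5074, -0.1278, "UK"),
    Region("paris", 48.8566, 2.3522, "EU"),
    Region("frankfurt", 50.1109, 8.6821, "EU"),
    Region("newark", 40.7357, -74.1724, "US"),
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * earth_radius_km * math.asin(math.sqrt(min(1.0, a)))


def pick_region(user_lat, user_lon, jurisdiction, regions=REGIONS):
    """Return the nearest region that is allowed for the user's jurisdiction."""
    allowed = [r for r in regions if r.jurisdiction == jurisdiction]
    if not allowed:
        raise ValueError(f"no inference region available for {jurisdiction!r}")
    return min(allowed, key=lambda r: haversine_km(user_lat, user_lon, r.lat, r.lon))


if __name__ == "__main__":
    # A user in Dublin is geographically closest to London, but the
    # jurisdiction filter restricts the choice to EU regions.
    print(pick_region(53.3498, -6.2603, "EU").name)
```

In this sketch the residency filter is applied before the distance comparison, so a request is never routed to a closer region outside the user's jurisdiction. A real deployment would more likely rely on measured latency and on the routing features of its own platform.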
This article was produced as part of TechRadarPro's Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing, find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro
- March 13
- - ai
  - cloud
Conf42.com: Cloud Native 2021

James posted an event in DevOps Events

Apr 28

Wednesday 28 April 2021, 11:00 PM
Are you cloud native? Do you love Kubernetes? Have you gotten rid of the monolith and adopted microservices... and then moved back to the monolith? Is your pet dog named Helm, your pet cat Prometheus or your goldfish Envoy? If any of this applies to you, we need you!

Come and talk to like-minded people about all the things cloud and cloud native:
- running in the cloud
- adopting Kubernetes and related technologies
- microservices
- service meshes
- lessons learned from production failures

Details: https://www.papercall.io/conf42-cloud-native-2021
- October 24, 2020
- - event
  - cloud native
  - cloud
  - k8s
  - microservices
  - service meshes
  - production failures
  - lessons learnt

Forum Statistics

43.3k
Total Topics

42.7k
Total Posts
