Newsletter: March 2024
The latest Deeper Insights blogs
Will AI Kill Music?
AI and music intersect to redefine creativity and production, highlighting breakthroughs in audio tech and innovation. This exploration invites readers to envision a future in which technology and artistic expression merge, reshaping the musical landscape. [Read more]
Beyond The Cloud: Reinventing AI with Local GPUs
Discover the shift from cloud to local GPUs for advanced AI work, boosting control, creativity, and cost efficiency. This strategic move enhances experimentation and secures a competitive edge in AI development. [Read more]
The Convergence of AI and Genomics: Deciphering Life’s Blueprint
AI's potential to revolutionise genetics promises groundbreaking advancements in healthcare, enabling precise disease prediction, personalised medicine, and deeper understanding of genetic data's impact on health. [Read more]
Exploring The Mind: AI’s Journey Into Human Thought
Discover how AI and neuroscience are merging to read minds and interpret brain activity, through innovations like Latent Diffusion Models and brain-computer interfaces, pushing the boundaries of technology and cognition. [Read more]
Featured GenAI news
NHS AI Outsmarts Doctors - BBC.co.uk
An AI tool tested by the NHS has successfully identified signs of breast cancer missed by doctors in a trial involving over 10,000 mammograms. This innovation underscores AI's potential in enhancing early cancer detection and improving patient outcomes, showcasing its capacity to augment medical diagnostics and streamline patient care by potentially reducing waiting times for results. [Read more]
Sora meets Hollywood - Bloomberg.com
OpenAI is poised to enter Hollywood, with meetings scheduled in Los Angeles to discuss partnerships with film industry leaders on integrating its AI video generator into movie production. Following an impressive debut in February, the AI firm showcased Sora, a service capable of creating realistic videos from text, to industry professionals. Despite its potential, the adoption of AI in filmmaking raises concerns over job security and compensation for content use, even as OpenAI navigates a competitive landscape of tech giants and AI startups. [Read more]
AI Act Passes Major EU Vote - TechCrunch.com
The EU's AI Act has passed a significant vote, moving closer to adoption. The regulation categorises AI applications by risk, banning certain uses, imposing governance on high-risk applications, and requiring transparency for AI chatbots. The unanimous vote by EU Member State ambassadors overcomes opposition and sets the stage for final approval by the European Parliament. [Read more]
Humans Hold Exclusive Rights to Patent Ownership - ArsTechnica.com
The United States Patent and Trademark Office (USPTO) clarified that only humans can be listed as inventors on patents, not AI models. This guidance emphasises the importance of significant human contributions in the creation of inventions, even those assisted by AI. [Read more]
Shell Harnesses AI for Emissions Monitoring - FinancialTimes.com
Energy firms are progressively utilising artificial intelligence to enhance the efficiency and sustainability of their operations. This advancement is creating new roles, such as AI ethics specialists and data engineers, while potentially transforming traditional jobs. Shell, among other companies, employs AI for tasks like emissions monitoring. The advancements in AI technology promise a more reliable energy supply and enhanced safety in asset management, leading to an increased demand for specialised skills in AI and cybersecurity within the sector. [Read more]
GenAI news snapshots - Industry report
- An article examines the impact of AI, deepfakes, and social media on misinformation, comparing modern challenges to historical instances of gossip influencing politics. It outlines concerns over democracy's vulnerability due to fake news, influential social media figures, and state-led disinformation campaigns. The article calls for vigilant efforts to combat misinformation and preserve democratic integrity in an era of pervasive digital manipulation. [Read more]
- AI is transforming accountancy by automating mundane tasks, allowing professionals to focus on strategic decisions. While AI increases efficiency and may affect jobs, particularly in auditing, it also prompts changes in training for AI management and client relations. [Read more]
- OpenAI is enhancing DALL-E 3 images with new watermarks to authenticate AI-generated content, addressing digital misinformation. This effort, in collaboration with the C2PA, embeds visible and invisible marks in image metadata, though challenges remain in maintaining these watermarks across social media platforms. [Read more]
- Google rebrands Bard to Gemini, launching a mobile app and the advanced Ultra 1.0 AI model. Available through the Google One AI Premium Plan for $19.99/month, Gemini Advanced offers enhanced capabilities for complex tasks. The mobile app, supporting Android and iOS, aims to provide easy access to AI assistance. [Read more]
- Google also unveils Gemini 1.5, a significant upgrade from Gemini 1.0 Ultra, featuring enhanced performance and long-context understanding across modalities. This new model can process up to 1 million tokens, facilitating advanced analysis of extensive datasets. Developed with efficiency and safety in mind, Gemini 1.5 Pro offers comparable capabilities to 1.0 Ultra but with greater compute efficiency. It is currently available for early testing to select developers and enterprise customers, with an emphasis on responsible AI development and extensive safety testing. [Read more]
- BUD-E is an open-source project aimed at improving AI voice assistants to deliver more natural and empathetic conversations. Developed by LAION and partners, it focuses on real-time responses, multi-speaker interactions, and maintaining long-term conversational context on consumer hardware. With achievements in low-latency responses, BUD-E's roadmap includes enhancing speech naturalness, conversation memory, and multi-modal understanding. The project invites collaboration to further refine these conversational AI capabilities. [Read more]
- NVIDIA introduces "Chat with RTX", a chatbot that runs locally on a PC and accesses personal files for queries and summaries, utilising models like Mistral or Llama 2. It supports various file formats and YouTube content integration for enhanced queries. The chatbot offers privacy benefits by processing data locally, but requires NVIDIA GeForce RTX 30 Series GPUs or higher. This launch underscores NVIDIA's focus on advancing AI technologies while prioritising user privacy. [Read more]
- MagiScan for Business offers a platform for adding 3D models and AR experiences of products to online stores, aiming to improve customer engagement and sales metrics. It integrates with Shopify and supports exporting to the metaverse, with plans varying by model count and views, including a trial option. [Read more]
- KardsAI is a mobile app that quickly converts PDFs, text, or prompts into flashcards for both iOS and Android users. It leverages AI for creating flashcards efficiently, employs a spaced repetition algorithm for better memory retention, and is available in both free and Pro versions. The app is designed to make studying from various materials more effective. [Read more]
GenAI tools: LLM models
- Google's Gemma introduces a new range of open AI models for developers and researchers, building on Gemini model technology. It offers models in various sizes, a toolkit for responsible AI development, and compatibility with major AI frameworks. Optimised for diverse hardware, Gemma aims for high performance and responsible usage, reflecting Google's commitment to the open community. [Read more]
- OLMo 7B, an open-source large language model created through collaboration between Databricks and the Allen Institute for AI (AI2), aims to democratise AI development. Leveraging Databricks' Mosaic AI Model Training Platform, it enables users to build and fine-tune AI models with their data. This initiative reflects a shared commitment to open scientific research and making advanced AI tools accessible to the wider community. [Read more]
- Qwen1.5, the latest in the Qwen series, offers open-source base and chat models across sizes from 0.5B to 72B, including quantised versions for optimised deployment. Integrated into Hugging Face transformers for easier access, it marks a significant step in enhancing developer experience. [Read more]
- BGE-M3 by BAAI excels in Multi-Functionality, Multi-Linguality, and Multi-Granularity, supporting dense, multi-vector, and sparse retrieval across 100+ languages and handling documents up to 8192 tokens. It's integrated with Hugging Face transformers for easy access and facilitates a hybrid retrieval approach for enhanced accuracy. [Read more]
- Smaug-72B by Abacus AI has become the top-ranked open-source language model on Hugging Face, surpassing GPT-3.5 and Mistral Medium. As a fine-tuned version of "Qwen-72B" from Alibaba's Qwen team, it achieves an unprecedented average score above 80 across major benchmarks. This marks a significant milestone in open-source AI, showcasing advancements especially in reasoning and math tasks. [Read more]
GenAI tools: Visual models
- "InteractiveVideo" is a tool for generating controllable videos through user inputs like text and images, featuring a Synergistic Multimodal Instruction system for high-quality, customised video creation. This approach ensures detailed customisation in video generation by adjusting visual and motion aspects based on user preferences. [Read more]
- Stable Cascade, introduced by Stability AI, is a new text-to-image model that uses a three-stage approach for improved image quality and efficient fine-tuning on consumer hardware. It utilises hierarchical image compression to enhance quality while reducing training costs and is available for non-commercial use, with tools for customisation. [Read more]
- The Large World Models project combines large-scale video and language data to enhance AI's understanding of human knowledge and the physical world. It features the RingAttention technique, enabling the processing of multimodal sequences up to a million tokens. The project aims to improve AI's ability to comprehend and generate content related to complex real-world scenarios. [Read more]
- Lumiere is a text-to-video diffusion model by Google Research, designed for creating realistic and diverse motion videos. It employs a Space-Time U-Net architecture to generate videos in a single pass, ensuring global temporal consistency. This model facilitates a range of content creation and video editing applications, such as image-to-video conversion, video inpainting, and stylised generation, demonstrating state-of-the-art results in text-to-video generation. [Read more]
GenAI tools: Everything else
- Reka AI has launched "Reka Flash", a 21B model that excels in language and vision tasks, outperforming larger models. It includes "Reka Edge", a compact 7B variant, with both models showcasing superior performance in multimodal benchmarks and multilingual reasoning. In chat applications, Reka Flash surpasses competitors like GPT-3.5 and Claude 2.1. [Read more]
- Lag-Llama is the first open-source foundation model for time series forecasting, providing zero-shot forecasting capabilities. Recent updates include an enhanced Colab Demo for various time series formats. Future plans include fine-tuning scripts, an online demo for predictions and fine-tuning, and pretraining features, aiming to improve real-world application usability. [Read more]
- LoRA Land features 25 fine-tuned Mistral-7b models. Released by Predibase, these models surpass base models by 70% and GPT-4 by 4-15% in performance, depending on the task. Fine-tuned for under $8 each on average, these models demonstrate a cost-effective and efficient approach to deploying highly performant AI systems, leveraging LoRAX to serve hundreds of models on a single GPU. [Read more]
- Direct Principle Feedback (DPF) is a simplified approach to fine-tuning language models for specific control tasks, exemplified by the "Pink Elephant Problem". Using DPF, the authors significantly enhance a 13B LLaMA 2 model's ability to avoid unwanted topics (like "Pink Elephants") and focus on preferred subjects (like "Grey Elephants"), outperforming base models and achieving performance comparable to GPT-4 on curated tests. [Read more]
- Cog is an open-source tool that simplifies deploying machine learning models by packaging them into Docker containers. It automates dependency management, simplifies setup, and provides a RESTful HTTP API for model integration, enabling deployment across any Docker-supported platform. [Read more]
- RAGs is a Streamlit app designed to create a RAG (Retrieval-Augmented Generation) pipeline from a data source using natural language. It allows users to describe tasks, alter generated parameters, and query the RAG agent over data. This project, inspired by GPT models from OpenAI, offers a detailed setup process, including installation instructions, and supports various LLMs and embedding models. [Read more]
- llmware is an enterprise-grade framework for developing LLM-based application patterns such as RAG. The GitHub repository provides the framework, tools, and fine-tuned models as an integrated set for building knowledge-based enterprise LLM applications. It emphasises easy integration of open-source specialised models and secure connection to enterprise knowledge. [Read more]
- The "Hallucination-leaderboard" is a GitHub project that benchmarks LLMs on their propensity to "hallucinate", or generate false information, when summarising short documents. It serves as a comparative leaderboard for evaluating the accuracy and reliability of different LLMs on summarisation tasks, focusing specifically on how well they avoid including non-factual details. [Read more]
Let us solve your impossible problem
Speak to one of our industry specialists about how Artificial Intelligence can help solve your impossible problem