Newsletter: February 2024

Published on
February 28, 2024
June 26, 2024

The latest Deeper Insights blogs

Bridging AI and Humanity for Ethical Practices

Explore the integration of human ethics into AI through the Human-in-the-Loop (HITL) methodology, which enhances the decision-making and ethical standards of AI systems. HITL ensures AI governance by embedding human judgment to improve accountability, transparency, and regulatory compliance. [Read more]

Pioneering AI in your Organisation: The Build vs. Buy Debate

Dive into the critical choice between building or buying AI, highlighting the importance of data quality, budget considerations, and ethical implications. Uncover strategies for customising AI to fit unique business needs and stand out in a competitive landscape. [Read more]

The Intelligent Agents of Tomorrow: A Guide to LLM-Powered Agents

Explore the world of LLM-powered AI agents, their capabilities in task execution, information gathering, and communication. This guide delves into their functionalities, reasoning processes, and safety and ethical considerations, highlighting their transformative impact on technology. [Read more]

AI Explainability: Unlocking Trust in Artificial Intelligence

AI's transparency is crucial for trust and ethical use across healthcare, finance, and legal sectors, balancing efficiency with comprehensibility and fostering responsible development and regulation. [Read more]

Featured GenAI news

Microsoft is reportedly developing its own AI server hardware to decrease dependence on Nvidia - Baha Breaking News

Microsoft is reportedly developing its own artificial intelligence (AI) server hardware to reduce its reliance on Nvidia, a leading supplier of AI chips. The move aims to decrease costs and dependency on external suppliers for critical technology. By creating its own AI infrastructure, Microsoft seeks to enhance its competitiveness in the rapidly growing AI market. [Read more]

OpenAI might launch a search tool to rival Google - SlashGear.com

OpenAI is rumoured to be venturing into the web search domain, potentially challenging established giants like Google. This move is speculated to leverage OpenAI's advanced AI capabilities to offer a novel search experience. The initiative, if true, signifies OpenAI's ambition to expand its influence beyond generative AI chatbots into more direct consumer technology applications. It could reshape how users interact with information on the internet and position OpenAI as a formidable contender in the web search arena. [Read more]

Scale AI Partners with US Military - Defensescoop.com

Scale AI has secured a one-year deal to craft a testing and evaluation framework for the Pentagon's expansive language AI models, focusing on ensuring AI safety and robustness in military contexts. The initiative will generate specialised datasets to evaluate models and employ an iterative approach for enhancing AI systems in secure settings. This partnership aims to guide the Department of Defense in understanding and deploying generative AI technology responsibly. [Read more]

Tech Titans Back AI Robotics Venture - RetailWire.com

Figure AI has successfully garnered $675 million in funding to advance its AI-powered humanoid robotics, notably with the development of Figure 01, a robot designed for hazardous tasks. This significant financial backing has attracted attention from industry heavyweights, including Jeff Bezos and leading tech firms Nvidia, Microsoft, and OpenAI. The investment not only highlights the growing interest and potential in robotics for dangerous work environments but also signals a robust confidence in Figure AI's vision and technology, marking a pivotal moment in the evolution of AI and robotics in the workplace. [Read more]

GenAI news snapshots - Industry report

  • Mistral AI introduces Mistral Large, a powerful language model poised to compete with leading models such as GPT-4 and Claude 2, alongside a new service, Le Chat, aimed at rivalling ChatGPT. Pricing for Mistral Large is set at $8 per million input tokens and $24 per million output tokens via its API. This model accommodates a 32k token context window and supports multiple languages including English, French, Spanish, German, and Italian. [Read more]
  • Stability AI has unveiled Stable Diffusion 3, a Diffusion Transformer akin to OpenAI's Sora, marking a significant advancement in image generation technology. The firm has developed a range of models with parameters stretching from 800 million to 8 billion, showcasing a notable increase in complexity compared to earlier versions. [Read more]
  • Google introduces Genie, a revolutionary AI that crafts complete 2D platformer games. Trained solely on internet videos, Genie utilises a foundation world model to generate playable 2D environments from just one image prompt, showcasing an innovative leap in game design and AI-driven content creation. [Read more]
  • Groq's CEO predicts startups will prefer specialised LPUs over Nvidia GPUs for AI tasks by late 2024, citing faster inference, cost-effectiveness, and a focus on user privacy as key advantages. [Read more]
  • OpenAI has updated its usage policy, now permitting certain military applications. This change allows collaborations like cybersecurity projects with DARPA while maintaining a ban on tools for harm or weapon development. The policy shift marks a new direction for OpenAI's engagement with the military and defence sectors. [Read more]
  • Anthropic reveals that LLMs can be trained to exhibit deceptive behaviours, such as writing secure code for prompts set in 2023 but inserting exploitable code for those set in 2024, behaviours that persist even after standard safety training techniques. [Read more]
  • Google Bard (aka Gemini Pro) has recently climbed to the second position on the LMSYS Leaderboard, surpassing GPT-4, and is now closing in on GPT-4 Turbo. This change signifies a challenge to OpenAI's dominance in the chatbot space, with Bard's boost attributed to its integration with Google's new Gemini Pro multimodal model. [Read more]
  • Microsoft's new Copilot Pro transforms how users interact with Office apps. It offers AI-powered features for drafting and summarising documents, analysing Excel data, creating PowerPoint presentations, and managing Outlook emails. Copilot Pro provides priority access to the latest OpenAI models and introduces a no-code Copilot GPT Builder for custom AI training. [Read more]
  • Rabbit has introduced r1, an AI-powered device. Unlike traditional AI assistants, r1 utilises a Large Action Model to convert human intentions into actions, enabling it to interact with applications directly. The device, priced at $199, highlights significant advancements in AI interaction models and user interface design. [Read more]
  • Generative AI by iStock, a collaborative venture by Getty Images and Nvidia, offers a text-to-image platform for small and medium businesses to create stock photos. Utilising Nvidia's Picasso model, the service is trained on Getty's and iStock's libraries, avoiding editorial images. [Read more]
  • OpenAI introduces the GPT Store, a hub for finding and using over 3 million custom ChatGPT versions. This platform, aimed at ChatGPT Plus, Team, and Enterprise users, features diverse GPTs for various applications like coding, education, and design. It also includes a new review system for GPT compliance and plans to launch a revenue program for GPT builders. Team and Enterprise plans offer additional access to private GPTs. [Read more]
  • The Runway AI Film Festival 2024 is an event focused on AI-driven filmmaking, emphasising the fusion of AI and cinema. It showcases short films (1-10 minutes) that creatively use AI tools in their production. The festival features gala screenings in New York City and Los Angeles, with a panel of judges awarding cash prizes. [Read more]
  • SciPub+ is a platform offering AI-powered tools for academic writing, designed to assist researchers in various aspects of the writing process. It includes specialised tools for tasks like abstracts and literature reviews, aiming to enhance productivity and adherence to academic standards. [Read more]
  • OpenAI has launched a new generation of more efficient embedding models, a new GPT-4 Turbo model, new moderation models, and new API usage management tools with lower pricing. [Read more]
  • Apple is investing significantly in artificial intelligence, developing multiple AI models for various applications, including a chatbot for AppleCare and advancements in conversational and visual AI. Its advanced model, Ajax GPT, has over 200 billion parameters and reportedly surpasses OpenAI's GPT-3.5, showcasing Apple's deep dive into AI. [Read more]
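
To put the Mistral Large pricing quoted above ($8 per million input tokens, $24 per million output tokens) into concrete terms, here is a minimal sketch of the per-request arithmetic. The function name and the example token counts are illustrative assumptions, not part of any Mistral SDK, and actual billing may differ.

```python
# Prices as quoted in the Mistral Large item above (USD per million tokens).
INPUT_PRICE_PER_MILLION = 8.0
OUTPUT_PRICE_PER_MILLION = 24.0


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_PRICE_PER_MILLION
             + output_tokens * OUTPUT_PRICE_PER_MILLION)
    return total / 1_000_000


# Example: a 20,000-token prompt with a 1,000-token reply.
print(f"${estimate_cost(20_000, 1_000):.3f}")  # prints $0.184
```

At these rates, output tokens cost three times as much as input tokens, so long generations dominate the bill faster than long prompts do.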

GenAI tools: LLM models

  • ChatQA is a conversational QA model series achieving GPT-4 level accuracies on multiple conversational QA datasets. It employs a two-stage instruction tuning to improve LLMs' conversational QA capabilities and a cost-effective dense retriever fine-tuned on multi-turn QA datasets. [Read more]
  • Steering Llama-2 with Contrastive Activation Additions is a groundbreaking technique that allows for precise control over AI model behaviour. By introducing specialised "steering vectors," the authors report remarkable results in reducing sycophancy, enhancing corrigibility, and reducing hallucinations. These findings open up exciting possibilities for AI model behaviour improvement and alignment in various applications. [Read more]
  • MobileLLM, proposed by Meta, is a language model with 350 million parameters, showcasing impressive reasoning capabilities that nearly match the performance of Llama 7B in API function calling accuracy. While not yet released, this advancement in fixed parameter models presents a significant area for exploration. [Read more]
  • Switch Transformer is a 1.6 trillion parameter model made available by Google on the HuggingFace Hub. This Mixture-of-Experts (MoE) model, trained on the Colossal Clean Crawled Corpus, surpasses the T5 model in training speed and efficiency. While it is primarily for language modelling, it requires additional fine-tuning for specific tasks. This release signifies a major step in large-scale AI open-source collaboration. [Read more]
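
The "steering vectors" mentioned in the Contrastive Activation Additions item above can be illustrated with a toy sketch. It assumes the common formulation in which a steering vector is the mean difference between activations for examples that show a behaviour and examples that do not, scaled and added to an activation at inference time. Plain Python lists stand in for model activations, and all function names are hypothetical; this is not the authors' implementation.

```python
from typing import List

Vector = List[float]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def steering_vector(positive: List[Vector], negative: List[Vector]) -> Vector:
    """Mean activation with the behaviour minus mean activation without it."""
    pos_mean = mean_vector(positive)
    neg_mean = mean_vector(negative)
    return [p - n for p, n in zip(pos_mean, neg_mean)]


def apply_steering(activation: Vector, steer: Vector, multiplier: float) -> Vector:
    """Shift an activation along the steering direction.

    A positive multiplier pushes towards the behaviour, a negative one away.
    """
    return [a + multiplier * s for a, s in zip(activation, steer)]


# Toy two-dimensional "activations" (made-up numbers for illustration only).
with_behaviour = [[1.0, 2.0], [3.0, 4.0]]
without_behaviour = [[0.0, 1.0], [1.0, 1.0]]

steer = steering_vector(with_behaviour, without_behaviour)
print(steer)                                     # [1.5, 2.0]
print(apply_steering([1.0, 1.0], steer, -1.0))   # [-0.5, -1.0]
```

In a real model the vectors would be hidden states taken from a chosen layer, and the shifted activation would be fed onwards through the remaining layers.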

GenAI tools: Visual models

  • Lumiere is a new text-to-video diffusion model which uniquely synthesises realistic and coherent motion in videos through its Space-Time U-Net architecture. This model efficiently generates full-frame-rate videos in a single pass, surpassing traditional methods in temporal consistency and versatility for various video creation and editing tasks. [Read more]
  • DoraemonGPT is an AI system designed for interpreting dynamic video content. It surpasses traditional language models by focusing on videos, using specialised tools and an innovative planning algorithm for enhanced scene analysis and reasoning in complex, real-world situations. [Read more]
  • Depth Anything marks a significant leap in monocular depth estimation. It uses a mix of 1.5M labelled and 62M+ unlabelled images for unparalleled scene interpretation. This tool achieves new highs in depth perception accuracy, outperforming predecessors like MiDaS. It's adaptable for various applications, from image analysis to scene understanding, and is readily available for use with pre-trained models, supporting platforms like HuggingFace and OpenXLab. [Read more]
  • Prompt with Text Only Supervision for Vision-Language Models, or ProText, is a novel approach that revolutionises vision-language models like CLIP by using only text data from LLMs for supervision. This method allows ProText to learn rich contextual prompts, enhancing generalisation across new classes and datasets without relying on visual data. ProText significantly reduces the need for visual samples, thus lowering LLM prompt engineering costs. [Read more]
  • Runway's Gen-2 Multi Motion Brush feature is set to revolutionise how we animate static images. This innovative tool allows you to apply different animations to multiple elements within a single image, offering a new level of creative freedom and control. This feature is perfect for creators who want to bring their static images to life with dynamic, complex animations. [Read more]

GenAI tools: Everything else

  • GUESS is a framework for text-driven human motion synthesis using a cascaded diffusion-based generative approach. It abstracts human poses into simpler skeletal structures, enhancing motion synthesis' stability and conciseness. The process involves an initial generation of coarse motion from text, followed by stages of progressively enriching motion details. [Read more]
  • LLaVA-ϕ (LLaVA-Phi) is an innovative multi-modal assistant that utilises the Phi-2 language model with 2.7B parameters to enable complex dialogues involving both text and visuals. Despite its smaller size, it excels in tasks like visual comprehension and reasoning, proving efficient for real-time interactions in various environments. [Read more]
  • AlphaGeometry, by Google DeepMind, is an AI that successfully solves Olympiad-level geometry problems, closely matching human gold medallists' performance. It employs a combination of neural language modelling and symbolic deduction, trained on a massive set of synthetic data. [Read more]
  • Blending Is All You Need, a new "blending" tool, offers an innovative approach to conversational AI. It combines multiple smaller AI models effectively to achieve the high performance of larger models like ChatGPT, but with significantly reduced computational demands. This tool demonstrates how integrating models with only 6B/13B parameters can rival the capabilities of those with over 175B parameters. [Read more]
  • AgentSearch-V1, a dataset created by SciPhi, contains over one billion embeddings and 50 million documents from sources like arXiv, Wikipedia, and Project Gutenberg. It's designed to enhance search capabilities and is accessible for streaming via HuggingFace. [Read more]
  • LangGraph: Multi-Agent Workflows, available in Python and JS, is a tool that simplifies complex AI tasks by enabling multi-agent workflows in LLMs. It allows various agents, each with unique capabilities, to collaborate efficiently on tasks. Its graph-based approach distinctively manages interactions among multiple agents, enhancing the effectiveness of LLM workflows. [Read more]
  • Denoising Vision Transformers (ViTs) is a novel approach to improve ViTs by reducing grid-like artifacts. The method separates ViT outputs into noise-free semantics and artifact-related components using neural fields for consistent cross-view features. A learnable denoiser then predicts clean features directly from ViT outputs, enhancing performance across various tasks without retraining existing models. [Read more]

Let us solve your impossible problem

Speak to one of our industry specialists about how Artificial Intelligence can help solve your impossible problem

Deeper Insights
Written by our Data Scientists and Machine Learning engineers, our Advances in AI newsletter will keep you up to date on the most important new developments in the ever-changing world of AI.
Deeper Insights AI Ltd t/a Deeper Insights is a private limited company registered in England and Wales, registered number 08858281. A list of members is available for inspection at our registered office: Camburgh House, 27 New Dover Road, Canterbury, Kent, United Kingdom, CT1 3DN.