Newsletter: January 2024
The latest Deeper Insights blogs
Decoding 'Outperform' in LLM Comparisons
Explore the dynamic world of Large Language Models (LLMs) as we dissect what 'outperform' really means in AI. Delve into the complexities of evaluating AI models, the intricacies of benchmark leakage, and the evolving criteria for measuring LLM performance. [Read more]
ChatGPT: Technical Innovations and Business Impacts in 2023
Marking one year of ChatGPT, this post reviews its evolution and business impact in 2023, highlighting its technical advancements, open-source accessibility, and diverse industry applications, and how it has revolutionised AI interaction across sectors. [Read more]
Ready for AI: A Business Roadmap for Integrating Artificial Intelligence
AI integration is transforming businesses, enhancing operational efficiency and innovation. This guide offers a roadmap for AI adoption, covering AI and Machine Learning basics, their industry impact, and a five-phase approach for successful integration and mastery. [Read more]
Navigating AI-powered Image Creation: A Stable Diffusion Primer
Stable Diffusion marks a paradigm shift in generative AI, offering efficient, high-quality image synthesis. Its open-source nature democratises AI, fostering innovation across industries while balancing ethical AI practices. [Read more]
Featured GenAI news
Mark Zuckerberg’s new goal is creating artificial general intelligence - The Verge
Mark Zuckerberg recently disclosed intentions to develop artificial general intelligence (AGI). Although he hasn't specified a timeframe or a precise definition for AGI, he has moved Meta's AI research division, FAIR, into the part of the company that builds generative AI applications across its platforms. This reorganisation is designed to bring Meta's AI advancements into its user experience more quickly. Furthermore, Meta is set to acquire over 340,000 Nvidia H100 GPUs by year's end. [Read more]
AlphaFold found thousands of possible psychedelics. Will its predictions help drug discovery? - Nature.com
Researchers have utilized AlphaFold, an advanced protein-structure-predicting tool, to uncover numerous potential new psychedelics. This showcases AlphaFold's capacity for accelerating drug discovery, a process traditionally taking years. While there's some skepticism, evidence points to its significant promise in aiding new drug identification. Isomorphic Labs, a branch of DeepMind, has even entered agreements potentially worth $2.9 billion for drug exploration using machine learning technologies like AlphaFold. [Read more]
Now OpenAI CEO Sam Altman wants billions for AI chip fabs - The Register
OpenAI's CEO, Sam Altman, is reportedly raising billions to establish a network of AI chip factories globally. This ambitious project aims to construct and operate semiconductor facilities dedicated to producing neural network accelerators, addressing the surging demand. The goal is to create a sufficient number of production lines to meet the growing need for AI processors, a concern Altman has previously highlighted due to the current scarcity of these specialized processors. [Read more]
AI Startup ElevenLabs Hits Unicorn Status, Innovates with Monetisable Voice Generation Technology - Yahoo Finance
ElevenLabs, a Voice AI startup, has reached unicorn status with a valuation of $1.1 billion after a successful $80 million Series B funding round. The London-based company, known for its AI-generated voices, has seen a substantial increase in value since 2023 and is planning significant expansion. With a diverse range of products and a growing global team, ElevenLabs is positioned to cater to a wide array of clients with its innovative voice technology solutions. [Read more]
GenAI news snapshots - Industry report
- At DevDay, OpenAI announced GPT-4 Turbo with a 128K context window, GPT-4 Turbo with Vision, the DALL·E 3 API, and the Assistants API for building AI apps, along with lower prices. A key highlight was the Custom Models program, offering tailored GPT-4 model development for organisations with extensive proprietary datasets, ensuring exclusivity and data privacy. [Read more]
- Microsoft's Reading Coach, an AI-driven, free tool to improve reading, will soon be available for Windows and online. It enhances learning, like with Canva, and extends Teams for Education's Reading Progress. The tool supports educators in tracking student development and offers AI-based "choose your own story" options and interactive language skills tools, aiming for a more immersive and individualised reading learning journey. [Read more]
- Cohere has launched Embed v3, an advanced embeddings model excelling in accuracy and efficiency. It stands out in handling noisy datasets and reducing operational costs, making it ideal for search applications and retrieval-augmented generation systems. [Read more]
- NVIDIA has announced the HGX H200, a new AI processor aimed at large AI workloads, featuring advanced memory technology for improved performance in AI applications. It's compatible with existing H100 systems and will be deployed on major cloud platforms starting in 2024. Additionally, the GH200 Superchip will power exascale supercomputers, focusing on AI-driven scientific research. [Read more]
- ChatGPT, with 100 million weekly users and over 2 million developers, remains one of the fastest-growing services, widely used by Fortune 500 companies. OpenAI's new features include custom ChatGPT versions and an enhanced GPT-4 Turbo. Despite competitive services, ChatGPT's user growth continues to outpace other major internet apps. [Read more]
- YouTube will require labels on AI-generated videos, especially those that appear realistic. Non-compliant creators risk content removal and suspension from the YouTube Partner Program. The platform is also addressing AI-generated music, allowing music partners to request removal of AI-created songs that mimic real artists. [Read more]
- GitHub announced the launch of GitHub Copilot Chat and Copilot Enterprise for AI-assisted coding, along with a partner program to expand Copilot's capabilities. New AI security features were also introduced to bolster code safety. [Read more]
- AI Revolutionises Music with New Technologies - 2023 saw significant advancements in generative music technologies, such as Google's MusicLM and Meta's MusicGen. These new AI tools are simplifying music creation, enabling artists and hobbyists alike to produce professional-grade music more easily, and fostering a new wave of innovation in the music industry. [Read more]
- OpenAI is developing a new method named Q* (Q-Star), which is generating buzz as a potential breakthrough in the quest for artificial general intelligence (AGI). Although it currently solves maths problems only at an elementary level, Q* has shown promising results, sparking optimism among researchers about its future potential. [Read more]
GenAI tools: LLM models
- MEDITRON revolutionises medical AI by offering open-source large language models with 7B and 70B parameters, surpassing existing models in accuracy and accessibility. It leverages Nvidia's Megatron-LM for enhanced medical understanding, and its superior performance is validated on major medical benchmarks. [Read more]
- Apple's AIM is a suite of autoregressive image models that brings large language model training techniques to vision models. It features scalable, unsupervised pre-training with specialised advancements for various applications, and its largest model has 7 billion parameters and was trained on 2 billion images. That model achieves 84.0% accuracy on ImageNet-1k without hitting a performance ceiling, which opens avenues for further enhancements and prolonged training. [Read more]
- Claude 2.1, Anthropic's latest AI model, features a 200K token context window and significantly reduced rates of hallucination for enhanced accuracy and reliability. It excels in processing complex, lengthy documents and offers a new tool use feature for improved API integration. This model is now powering the Claude chat experience and is available via Anthropic's API. [Read more]
- Elon Musk's xAI launched Grok - an AI chatbot positioned as a more unfiltered alternative to AI models like ChatGPT. Designed with humor and a rebellious nature, Grok is part of Musk's critique of other AI ventures as too "woke". It will initially be available to X platform's paid subscribers. [Read more]
GenAI tools: Visual models
- Stable Video Diffusion (SVD) Image-to-Video, a cutting-edge diffusion model, transforms a single image into a video sequence. Building on its 14-frame predecessor, this model produces 25-frame videos at 576x1024 resolution and employs an f8-decoder for seamless frame transitions. [Read more]
- I2VGen-XL, by Alibaba Group, is a novel image-to-video synthesis model built on diffusion models. It improves semantic accuracy, clarity, and continuity in videos through a two-stage process: the base stage ensures semantic coherence and content preservation, and the refinement stage enhances detail. Trained on extensive text-video and text-image data, it excels in generating high-quality videos. [Read more]
- Runway Research released new features for enhanced video generation - including Motion Brush for directed movement, Gen-2 Style Presets for easy styling, and updated Director Mode for precise camera control. These tools, along with an improved Image Model, offer greater control and artistic expression in AI-assisted media creation, aligning with Runway's goal of advancing storytelling tools. [Read more]
GenAI tools: Everything else
- Emu Video introduces an efficient two-step approach to text-to-video generation using diffusion models: it first creates an image from a text prompt, then builds a video based on this image and the prompt. This streamlined process enables the production of high-quality, 512px resolution, 4-second videos at 16fps with just two diffusion models, bypassing the need for complex model cascades. [Read more]
- CogVLM is a visual language model with 17 billion parameters that achieves top performance on cross-modal benchmarks. It combines vision and language processing capabilities, offering features like detailed image descriptions and Q&A. CogVLM provides multi-GPU inference and fine-tuning options for specialised tasks. [Read more]
- StyleTTS 2, an advanced text-to-speech model, introduces style diffusion and adversarial training with large speech language models for unparalleled human-like speech synthesis. It innovatively uses latent variables for style adaptation, eliminating the need for reference speech, and incorporates WavLM discriminators for enhanced naturalness. [Read more]
- Giskard is a Python library for detecting vulnerabilities in AI models, including biases and errors. It streamlines the testing process, offering features like automated test suite generation and integration with various tools. Aimed at enhancing model reliability, Giskard supports Python 3.9-3.11 and is installable via pip. It also offers premium features for advanced testing and model comparison through the Giskard hub. [Read more]
- Langroid, a Python framework, streamlines LLM-powered application development with a multi-agent system for collaborative task-solving. It supports tools such as function-calling, offering flexibility and ease for developers. Compatible with OpenAI's Assistants API, Langroid enhances application functionality and user experience. [Read more]
- VimGPT combines GPT-4V's vision with Vimium for keyboard-based web browsing, aiming to enhance model-web interaction. It explores using higher resolution images for better model performance and plans to integrate APIs for improved context handling and interaction capabilities. Future enhancements include speech-to-text features for accessibility and potential applications for assisting visually impaired users. [Read more]
- OpenGPTs is an open-source initiative offering a customisable alternative to OpenAI's GPTs and Assistants API. Utilising LangChain, LangServe, and LangSmith, it allows users to select their LLM, prompts, tools, and vector databases. The project supports various language models and includes features like sandbox testing, custom actions, and analytics. OpenGPTs focuses on providing more control to users in creating LLM-powered applications. [Read more]
- MockGPT is a tool for simulating OpenAI APIs. It helps developers avoid high costs and long wait times during app development and testing. It uses WireMock to create mock API responses, allowing for efficient and cost-effective testing. Available for free, MockGPT supports detailed testing scenarios via a WireMock Cloud account, streamlining the development of GPT-powered applications. [Read more]
Let us solve your impossible problem
Speak to one of our industry specialists about how Artificial Intelligence can help solve your impossible problem