Newsletter: August 2023

Published on
August 30, 2023
July 3, 2024
Newsletter: August 2023
Authors
No items found.
Advancements in AI Newsletter
Subscribe to our Weekly Advances in AI newsletter now and get exclusive insights, updates and analysis delivered straight to your inbox.
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.

The latest Deeper Insights blogs

Embracing the AI Revolution: A Personal Journey Toward Efficiency and Productivity

Read about one person's experience embracing AI and how it has helped them to improve their efficiency and productivity. They discuss the different ways that AI can be used to automate tasks, make decisions, and generate insights. [Read more]

The Synergy Between Natural Language Processing and Computer Vision

Explore how two powerful technologies, natural language processing (NLP) and computer vision (CV), can be used together to create even more powerful solutions. [Read more]

AI for Scope 3 Life Cycle Assessment

Discover how AI can be used to improve life cycle assessment (LCA), a process for evaluating the environmental impact of products and services. This blog post explores the use of AI for scope 3 emissions, which are indirect emissions that occur in the supply chain. [Read more]

Featured GenAI news

Google's Answer to AI in Healthcare

Google is intensifying its efforts in healthcare AI to compete with Amazon and Microsoft. Partnering with HCA Healthcare, Google aims to improve patient handoff procedures using generative AI. The tech giant is also launching Med-PaLM 2, a healthcare-specific large language model, in September. [Read more]

McKinsey Unveils Generative AI's Contribution to Economic Growth

McKinsey Global Institute's report highlights the transformative power of Generative AI. It could impact various fields such as banking, life sciences, and marketing. Early adopters stand to gain a competitive edge in the global economy. AI's potential extends to white-collar work, customer service, and software development. In research and development, it leads to better products and more alternatives. [Read more]

GenAI Content Does Not Have Copyright Protection

In a recent decision, a U.S. judge ruled that content created by Generative AI is not protected by copyright law. This means that GenAI-created content can be freely used by anyone, without permission from the creator. [Read more]

GenAI news snapshots - Industry report

  • GPT-3.5 Turbo Enhanced with Fine-Tuning for Tailored AI. This allows developers to customise models for specific applications, achieving better performance in instruction-following, output formatting, and tone consistency. This customisation potential, coupled with improved efficiency, offers a refined AI solution. [Read more]
  • Do Machine Learning Models Memorise or Generalise? Explore the grokking phenomenon observed in machine learning models – a shift from memorisation to generalisation after extended training. Through analysis of small models trained on simple tasks, researchers reveal insights into the complex learning dynamics of neural networks, contributing to the field of mechanistic interpretability and the ongoing debate of memorization versus true generalisation in AI models. [Read more]
  • Meta Open-Sources Llama 2. Discover the reasons behind this decision as we delve into the significance of commercially-available open-source models like Llama 2. Get an in-depth analysis of Meta's strategic move in our newsletter. [Read more]
  • Custom Instructions for ChatGPT Plus users enable tailored preferences for personalised responses, enhancing seamless and personalised communication. [Read more]
  • Apple is currently using an internal chatbot called "Apple GPT" to assist its employees. Utilisation is on tasks like prototyping future features and answering questions based on trained data.The company is also exploring the possibility of expanding the use of generative AI within its organisation. However, any customer-facing deployment is likely to be cautious and deliberate. [Read more]
  • Wix introduces an AI Site Generator tool that allows users to create entire websites by describing their intent. The tool generates homepages, inner pages, and content, including business-specific sections like events and bookings. [Read more]
  • OpenAI's Trademark Move: "GPT-5" Signals What's Next? OpenAI sparks speculation with a trademark application for "GPT-5" on July 18, 2023. The scope covers artificial speech, audio-to-text conversion, voice recognition, language processing, and software for neural networks. Could this mark the dawn of AGI? [Read more]
  • Faster, Smarter Text Generation! Anthropic, the AI startup founded by former OpenAI executives, is making waves with the latest iteration of its lightning-fast, cost-effective text-generating model, Claude Instant. Accessible via an API, Claude Instant opens the door to seamless and efficient text generation. [Read more]
  • Devar, a tech front-runner, introduces the world's first AR-focused generative AI neural network. Craft 3D objects and assets seamlessly, transforming the AR landscape. The cutting-edge Generative AR Platform merges neural networks and cloud solutions, ushering in AI-driven AR content like MyWebAR. [Read more]
  • Google's Bard is challenging ChatGPT in conversational AI, now handling text and visual inputs. A recent study across 13 scenarios exposed limitations in Bard's vision-based comprehension. This highlights the path for future enhancements to bridge this gap and advance multi-modal Generative models, which seamlessly merge language and vision. [Read more]
  • Researchers work on a unified agent with foundation models. This research investigates the integration of language models and vision language models into Reinforcement Learning agents to enhance their reasoning abilities. This framework demonstrates significant performance improvements in exploration efficiency, data reuse, and skill learning. [Read more]
  • Interactive simulacra of human behaviour as generative agents. Stanford's groundbreaking research group has unveiled an open-source code for their fully simulated village, brimming with realistic inhabitants. They've engineered prompts for individuals and the system, seamlessly integrating with the gpt-3.5-turbo API. [Read more]

GenAI tools: LLM models

  • DoctorGPT - A model fine-tuned on medical data to provide a private, offline medical dialogue experience. Available on iOS, Android, and Web, this open-source project offers expertise without data privacy concerns. Its comprehensive medical knowledge stems from a blend of supervised training and reinforcement learning, revolutionising medical AI accessibility. [Read more]
  • MetaGPT: The Multi-Agent Framework - MetaGPT transforms prompts into user stories, analyses, and more, with a dynamic team orchestrating the software process, including carefully crafted SOPs. [Read more]
  • BrainyPDF - Summarise and answer questions for your PDFs using ChatGPT - BrainyPDF unlocks the power of ChatGPT for your PDFs by effortlessly summarising and getting answers from your documents. [Read more]
  • ToolBench: An open platform for training, serving, and evaluating LLMs for tool learning - Empower open-source LLMs with versatile tool-use skills using the top-quality instruction tuning SFT dataset. Elevate your potential with ToolLLaMA, fine-tuned on ToolBench, and master real-world APIs. [Read more]
  • EasyLLM - This tool is an open-source Python package for streamlined interaction with open LLMs, offering tools and methods for seamless navigation through the realm of LLMs. [Read more]
  • AgentBench: A Comprehensive Benchmark to Evaluate LLMs as Agents - The first benchmark tailored to assess LLM-as-Agent performance across 8 diverse environments. Experience a comprehensive evaluation across 5 newly crafted domains, redefining LLM assessment. [Read more]
  • PanGu-Coder2: Boosting LLMs for Code with Ranking Feedback - PanGu-Coder2 uses ranking feedback and Evol-Instruct prompts to achieve 62.20% SOTA on HumanEval, bridging Code LLMs to larger models like GPT-4. [Read more]
  • FalconLite - Efficiently process long sequences (11K tokens) with 4x less GPU memory than Falcon 40B. Using 4-bit GPTQ quantisation and dynamic NTK RotaryEmbedding, it optimises latency, accuracy, and memory. Handle contexts 5x longer for summarisation, QA, deploy on AWS g5.12x + TGI 0.9.2 for peak performance. [Read more]
  • UC Berkeley's Vicuna - An open LLM alternative to ChatGPT, introduces new business-ready versions based on Llama 2. The model has 7B & 13B parameters, up to 16k token context, and trained on 125k ShareGPT conversations. All models are now commercially available and excel in benchmarks and human preference. [Read more]
  • Tongyi Qianwen - Alibaba's latest language model with AI content creation capabilities in English and Chinese, it comes in various sizes, including the 7B parameter Qwen-7B. Alibaba's groundbreaking step? Open-sourcing Qwen-7B and introducing Qwen-7B-Chat for conversational apps. [Read more]
  • LISA - The Large-language Instructed Segmentation Assistant. This dynamic model inherits the prowess of the multi-modal LLM while seamlessly generating segmentation masks. [Read more]
  • InstructBLIP - A visual instruction-tuned version of BLIP-2, achieving remarkable zero-shot and fine-tuning performance, surpassing BLIP-2 and larger Flamingo models. [Read more]
  • Llama 2 - The next-gen open-source language model, now available for research and commercial use. This release offers pretrained and fine-tuned models with parameters ranging from 7B to 70B, along with model weights and starting code. Trained on 40% more data than Llama 1 and boasting double the context length. Llama 2 opens up new horizons for language processing [Read more]
  • LLaMA-2-7B-32K - A 32K context model, powered by Position Interpolation and Together AI's innovation, including FlashAttention-2. Fine-tune this model for multi-document understanding, summarisation, and QA, with 3x speedup on 32K context inference. [Read more]
  • LLaMa2-Accessory - An Open-source Toolkit for LLM Development. Explore the possibilities of LLMs and multimodal LLMs with the open-source toolkit, LLaMA2-Accessory. This dynamic repository, an evolution of LLaMA-Adapter, comes packed with advanced features for seamless pre-training, fine-tuning, and deployment [Read more]
  • Stable Beluga 1 and Stable Beluga 2 - Introduced by Stability AI, these are two open access LLMs demonstrating exceptional reasoning ability across various benchmarks. [Read more]
  • AutoChain: Build lightweight, extensible, and testable LLM Agents - Simplify and accelerate generative agent development with this lightweight framework. Automatically evaluate user scenarios through simulated conversations, making customisation a breeze. Experience rapid iteration with AutoChain and large language models for seamless agent building. [Read more]
  • LangSmith - LangChain has announced LangSmith, a unified platform designed to assist developers debug, test, evaluate, and monitor their LLM applications. It provides deep visibility into model inputs and outputs, making it easier for teams to experiment. [Read more]
  • DemoGPT - Auto Gen-AI App Generator, powered by GPT-3.5-turbo, auto-generates LangChain code, transforming it into a Streamlit application, optimising development, and democratising the process. [Read more]
  • Ollama - A tool for running, building, and sharing LLMs on macOS, focusing on Apple Silicon devices. It simplifies using LLMs like Llama2, eliminating compatibility or performance concerns. [Read more]
  • RAGstack - This tool enables a private ChatGPT alternative within an organization's VPC, using Retrieval Augmented Generation (RAG) to augment LLMs, ideal for enterprises [Read more]

GenAI tools: Visual models

  • BoxDiff - A training-free method for text-to-image synthesis using simple user-provided conditions like boxes or scribbles. The proposed method seamlessly integrates spatial constraints into diffusion models. This allows control over object and context placement in synthesized images without the need for additional training or extensive annotated layout data. [Read more]
  • StableVideo - Introduces a text-driven video editing technique using ControlNet, showcased at ICCV 2023. The approach ensures stable edits in videos like boat, car, and black swan. Installation involves cloning the repo, creating an environment, and downloading models and samples. The tool generates edited videos and keyframes in the "log" directory after rendering. [Read more]
  • Stable Diffusion XL (SDXL) 1.0 - Released by Stability AI SDXL is the latest and most advanced text-to-image model in its suite. SDXL is available as open access to developers and featured on Amazon Bedrock. The model offers vibrant and accurate colours, quicker results, and a fine-tuning beta feature. [Read more]
  • CFSum - A Coarse-to-Fine contribution network for multimodal summarisation, aiming to address the unclear contribution of the visual modality in the process. CFSum incorporates a pre-filter module to eliminate irrelevant images and two levels of visual complement modules to accurately utilise useful images. It showcases the ability of useful images to influence the generation of non-visual words implicitly represented in the image. [Read more]
  • PromeAI - An AI-powered design assistant offering a wide range of creative tools. The tools include sketch rendering, photo-to-sketch conversion, image variation generation, AI supermodel rendering, etc. [Read more]

GenAI tools: Everything else models

  • CM3leon - A highly efficient multimodal generative model for text and images. It achieve state-of-the-art performance with just a fraction of the computing power used by previous methods. [Read more]
  • SeamlessM4T - An all-in-one multimodal translation model, pioneers speech-to-speech and speech-to-text translation with unprecedented language coverage and accuracy. It supports translation of nearly 100 input languages and 35 output languages. This breakthrough model simplifies communication across languages, delivering high-quality results and reducing reliance on multiple models. [Read more]
  • PlayHT1.0 - A text-to-speech model that creates remarkably lifelike and emotion-infused speech, including laughter. With its unique self-supervised approach, the model offers diverse voices and styles, even enabling voice cloning from just 30 seconds of audio. This advancement holds potential for applications across industries, revolutionising content creation and voice production. [Read more]
  • AI Academy’s FlagEmbedding - Bge-large-en is here! With a robust 1024 embedding dimension, this open-source model is your ticket to enhanced commercial applications. Plus, it's already claimed the leaderboard throne, surpassing OpenAI ada-text-002! [Read more]

Let us solve your impossible problem

Speak to one of our industry specialists about how Artificial Intelligence can help solve your impossible problem

Deeper Insights
Sign up to get our Weekly Advances in AI newsletter delivered straight to your inbox
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.
Written by our Data Scientists and Machine Learning engineers, our Advances in AI newsletter will keep you up to date on the most important new developments in the ever changing world of AI
Email us
Call us
Deeper Insights AI Ltd t/a Deeper Insights is a private limited company registered in England and Wales, registered number 08858281. A list of members is available for inspection at our registered office: Camburgh House, 27 New Dover Road, Canterbury, Kent, United Kingdom, CT1 3DN.