ChatGPT is not an LLM

Disclaimer: I work for Dell Technology Services as a Workforce Transformation Solutions Principal. It is my passion to help guide organizations through the current technology transition, specifically as it relates to Workforce Transformation. Visit the Dell Technologies site for more information. Opinions are my own and not the views of my employer.

This is something that I have heard many times over the last few months that needs a bit of explanation…

ChatGPT (Chat Generative Pre-Trained Transformer) is an application of a Large Language Model (LLM). It is an AI-powered application that enables human-like conversations and other text-based tasks. ChatGPT and other bots like it (see below) are examples of LLM-enabled applications, and they represent a substantial advancement in the state of the art when it comes to AI’s ability to understand and generate text, compose ideas, and perform reasoning.

Specifically, ChatGPT is an AI chatbot developed by OpenAI. It was originally built on a model from the GPT-3.5 series, a refinement of GPT-3 (Generative Pre-trained Transformer 3), and it allows users to hold interactive conversations and to refine and steer them towards desired parameters such as length, format, style, and language.

Unlike traditional Large Language Models (LLMs), ChatGPT is optimized for chat-based applications. It aims to create a seamless user experience during dynamic, back-and-forth interactions, and it excels at maintaining context, providing helpful responses, and adapting to user prompts. ChatGPT builds upon and extends OpenAI’s GPT-3 model to create a tailored large language model better suited for conversational AI applications.
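One way to picture the difference between a bare LLM and a chat application: the model is a function that takes text and returns text, and the application around it keeps the conversation history and feeds it back in on every turn. The sketch below is a minimal illustration under that assumption, not how ChatGPT is actually implemented. `fake_llm` is a hypothetical stand-in for a real model, and `ChatSession` is a name invented for this example.

```python
from typing import Callable, List, Tuple


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM: text in, text out, no memory."""
    # A real model would generate a continuation. This stub only reports how
    # many lines of text it was shown, so the effect of history is visible.
    return f"(reply based on {len(prompt.splitlines())} lines of context)"


class ChatSession:
    """Chat layer that stores the conversation and replays it to the model."""

    def __init__(self, llm: Callable[[str], str]) -> None:
        self.llm = llm
        self.history: List[Tuple[str, str]] = []  # (speaker, text) pairs

    def send(self, user_message: str) -> str:
        self.history.append(("User", user_message))
        # The model itself is stateless, so the whole transcript is sent
        # again on every turn.
        prompt = "\n".join(f"{who}: {text}" for who, text in self.history)
        reply = self.llm(prompt + "\nAssistant:")
        self.history.append(("Assistant", reply))
        return reply


session = ChatSession(fake_llm)
first = session.send("What is an LLM?")
second = session.send("Give me an example.")
print(first)
print(second)
```

The second reply sees more context than the first only because the session object resent the earlier turns. Swapping `fake_llm` for a call to a real model would not change the structure of the chat layer.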

GPT-3 is the powerful language model that OpenAI unveiled in 2020. It was trained on a massive dataset of internet text and achieved unprecedented performance on many natural language processing tasks. However, GPT-3 as originally released had some limitations: it was primarily a text-in, text-out model with no additional training for dialogue or instruction following. ChatGPT takes the GPT-3 base model and augments it in several key ways:

  1. It was further trained using Reinforcement Learning from Human Feedback (RLHF) on a large corpus of conversational data.
  2. It adds capabilities such as keeping track of context across a conversation and following complex instructions.
  3. Techniques were applied to make its outputs more focused and coherent over long exchanges.

So, in essence, ChatGPT takes the powerful GPT-3 language model as its foundation but adds specialized training and architectural tweaks to optimize it for open-ended dialogue and interactive assistance.

ChatGPT represents a glimpse of the future, or an interim step towards Artificial General Intelligence (AGI), that blends the best of LLMs with safety and ethical constraints. Either way, it’s an important advancement taking LLMs in a new, more practical direction.

Was ChatGPT the first Chatbot?

No. The first chatbot is widely considered to be ELIZA, created by Joseph Weizenbaum at MIT in 1966. ELIZA was one of the first programs capable of attempting the Turing Test, a test of a machine’s ability to exhibit human-like intelligence. It simulated conversation using a pattern-matching and substitution methodology, giving users the illusion of understanding.

ELIZA’s most famous script, DOCTOR, acted as a psychotherapist, reflecting users’ words back to them and responding with non-directive questions. Although ELIZA couldn’t truly understand anything, many early users attributed human-like feelings to it, surprising its creator.
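The pattern-matching and substitution approach can be sketched in a few lines. This is a minimal illustration in Python, not Weizenbaum’s original program or the actual DOCTOR script. The rules, the reflection table, and the names `RULES`, `reflect`, and `respond` are all invented for this example.

```python
import re

# Swap first- and second-person words so the reply "reflects" the user.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are",
               "you": "I", "your": "my"}

# (pattern, response template) pairs, tried in order. Invented for this sketch.
RULES = [
    (re.compile(r"\bi feel (.*)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.*)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (.*)", re.IGNORECASE), "Tell me more about your {0}."),
]

DEFAULT_REPLY = "Please go on."


def reflect(fragment: str) -> str:
    """Lowercase the fragment and swap pronouns word by word."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(user_input: str) -> str:
    """Return the first matching rule's template, filled with reflected text."""
    text = user_input.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    # No rule matched: fall back to a content-free, non-directive prompt.
    return DEFAULT_REPLY


print(respond("I feel ignored by my boss."))
print(respond("Hello"))
```

Nothing here models meaning. The program only rearranges the user’s own words, which is why the sense of being understood is an illusion.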

ELIZA laid the groundwork for subsequent chatbots and remains a significant milestone in the history of artificial intelligence and human-computer interaction. Its impact continues to resonate even in the era of advanced language models.

More on ELIZA here: ELIZA – Wikipedia

Are there any other ChatGPT-like applications?

Yes! While they may share the core language modeling technology, each of these AI systems has been customized by its creators with unique training data, fine-tuning approaches, and intended use cases – whether open-domain chat, writing, coding, search, or domain-specific tasks. Some, like you.com and Claude.com, let you switch the back-end LLM.

Examples of AI chatbots and assistants that are similar in nature to ChatGPT:

  1. Claude (Anthropic): An AI assistant created by Anthropic and trained using the company’s Constitutional AI approach, which gives it some distinct traits and capabilities compared with ChatGPT.
  2. Bard (Google, since renamed Gemini): Google’s conversational AI, which aims to combine the breadth of its language model with intelligent discourse and factual grounding.
  3. Microsoft Copilot (formerly known as Bing Chat): Built on OpenAI’s GPT models, the same family that powers ChatGPT. Microsoft, which has made a multi-billion-dollar investment in OpenAI, adapted them to create its own conversational search assistant.
  4. You.com (YouChat): An AI assistant that is conversational and continuously learning. It enhances web search, writing, coding, digital art creation, and complex problem solving.
  5. ChatSonic: Often cited as an alternative to ChatGPT, it is known for its conversational abilities and is considered a strong contender in the AI chatbot space.
  6. Specialized Commercial Assistants: Many tech companies like Salesforce, Amazon, and Nvidia offer their own proprietary conversational AI assistants either publicly or to enterprise customers.

Prompt Engineering

Prompt engineering is the process of carefully crafting the prompts or instructions given to large language models (LLMs) and AI systems in order to guide and optimize their outputs. It is key to getting good responses from AI applications like ChatGPT, and it is poised to become a vital skill for IT and business professionals.
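As a rough illustration, a prompt can be assembled from a task plus explicit constraints of the kind mentioned earlier (length, format, style, and language). The helper below is a hypothetical sketch. `build_prompt` and its parameter names are invented for this example and are not part of any ChatGPT or OpenAI API.

```python
def build_prompt(task: str, *, length: str = "", output_format: str = "",
                 style: str = "", language: str = "") -> str:
    """Assemble an instruction prompt from a task plus optional constraints."""
    # Only constraints that were actually supplied end up in the prompt.
    labelled = [("Length", length), ("Format", output_format),
                ("Style", style), ("Language", language)]
    constraints = [f"{label}: {value}" for label, value in labelled if value]

    lines = [f"Task: {task}"]
    if constraints:
        lines.append("Constraints:")
        lines.extend(f"- {item}" for item in constraints)
    return "\n".join(lines)


vague = build_prompt("Summarize the quarterly report")
specific = build_prompt(
    "Summarize the quarterly report",
    length="three bullet points",
    output_format="plain text",
    style="executive summary",
    language="English",
)
print(specific)
```

The vague version leaves length, format, style, and language up to the model, while the specific version states them up front. In practice the wording of each constraint still takes experimentation.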

Are Chatbots like ChatGPT the ultimate human interface to LLMs and AI?

I do not think so… Chatbots like ChatGPT are not the ultimate interface for large language models (LLMs). They represent an important step forward, but they have limitations. While they are an accessible way to interface with LLMs today, the ultimate interface may involve tighter cognitive integration and multimodal grounding as the technology matures. We will eventually interact with AI and LLMs in ways that feel as natural as human conversations, including visual cues, multi-modality, recall and memory of past conversations, recall of personal preferences, and specialized multi-agent collaboration to arrive at solutions.

That last item, specialized multi-agent collaboration, is what I personally look forward to in the next few years (or perhaps months!).

See Also:

What Are Large Language Models (LLM)
