 {"id":510814,"date":"2024-10-25T09:13:25","date_gmt":"2024-10-25T16:13:25","guid":{"rendered":"https:\/\/jorgep.com\/blog\/?p=510814"},"modified":"2025-01-17T12:02:57","modified_gmt":"2025-01-17T19:02:57","slug":"what-makes-the-many-llms-different","status":"publish","type":"post","link":"https:\/\/jorgep.com\/blog\/what-makes-the-many-llms-different\/","title":{"rendered":"What Makes the many LLMs different?"},"content":{"rendered":"\n<div class=\"wp-block-columns has-theme-palette-7-background-color has-background is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<p>Part of: <strong> <a href=\"https:\/\/jorgep.com\/blog\/series-ai-learnings\/\">AI Learning Series Here<\/a><\/strong><\/p>\n\n\n<style>.kadence-column395113_43ef2d-d5 > .kt-inside-inner-col,.kadence-column395113_43ef2d-d5 > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column395113_43ef2d-d5 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column395113_43ef2d-d5 > .kt-inside-inner-col{flex-direction:column;}.kadence-column395113_43ef2d-d5 > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column395113_43ef2d-d5 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column395113_43ef2d-d5{position:relative;}@media all and (max-width: 1024px){.kadence-column395113_43ef2d-d5 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column395113_43ef2d-d5 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column395113_43ef2d-d5\"><div class=\"kt-inside-inner-col\"><style>.wp-block-kadence-advancedheading.kt-adv-heading510545_6813a5-28, 
.wp-block-kadence-advancedheading.kt-adv-heading510545_6813a5-28[data-kb-block=\"kb-adv-heading510545_6813a5-28\"]{font-size:var(--global-kb-font-size-sm, 0.9rem);font-style:normal;}.wp-block-kadence-advancedheading.kt-adv-heading510545_6813a5-28 mark.kt-highlight, .wp-block-kadence-advancedheading.kt-adv-heading510545_6813a5-28[data-kb-block=\"kb-adv-heading510545_6813a5-28\"] mark.kt-highlight{font-style:normal;color:#f76a0c;-webkit-box-decoration-break:clone;box-decoration-break:clone;padding-top:0px;padding-right:0px;padding-bottom:0px;padding-left:0px;}.wp-block-kadence-advancedheading.kt-adv-heading510545_6813a5-28 img.kb-inline-image, .wp-block-kadence-advancedheading.kt-adv-heading510545_6813a5-28[data-kb-block=\"kb-adv-heading510545_6813a5-28\"] img.kb-inline-image{width:150px;vertical-align:baseline;}<\/style>\n<p class=\"kt-adv-heading510545_6813a5-28 wp-block-kadence-advancedheading\" data-kb-block=\"kb-adv-heading510545_6813a5-28\">Quick Links:&nbsp;<a href=\"https:\/\/jorgep.com\/blog\/resources-for-learning-ai\/\">Resources for Learning AI<\/a> | <a href=\"https:\/\/jorgep.com\/blog\/keeping-up-with-ai\/\">Keep up with AI<\/a> | <a href=\"https:\/\/jorgep.com\/blog\/list-of-ai-tools\/\" data-type=\"post\" data-id=\"402818\">List of AI Tools<\/a><\/p>\n<\/div><\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\"><div class=\"wp-block-template-part\"><style>.wp-block-kadence-advancedheading.kt-adv-heading395113_c650df-47, .wp-block-kadence-advancedheading.kt-adv-heading395113_c650df-47[data-kb-block=\"kb-adv-heading395113_c650df-47\"]{text-align:center;font-size:var(--global-kb-font-size-md, 1.25rem);line-height:60px;font-style:normal;background-color:#f5a511;}.wp-block-kadence-advancedheading.kt-adv-heading395113_c650df-47 mark.kt-highlight, .wp-block-kadence-advancedheading.kt-adv-heading395113_c650df-47[data-kb-block=\"kb-adv-heading395113_c650df-47\"] 
mark.kt-highlight{font-style:normal;color:#f76a0c;-webkit-box-decoration-break:clone;box-decoration-break:clone;padding-top:0px;padding-right:0px;padding-bottom:0px;padding-left:0px;}.wp-block-kadence-advancedheading.kt-adv-heading395113_c650df-47 img.kb-inline-image, .wp-block-kadence-advancedheading.kt-adv-heading395113_c650df-47[data-kb-block=\"kb-adv-heading395113_c650df-47\"] img.kb-inline-image{width:150px;vertical-align:baseline;}<\/style>\n<p class=\"kt-adv-heading395113_c650df-47 wp-block-kadence-advancedheading\" data-kb-block=\"kb-adv-heading395113_c650df-47\">Subscribe to <a href=\"https:\/\/go.35s.be\/jtb\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>JorgeTechBits  newsletter<\/strong><\/a><\/p>\n<\/div><\/div>\n<\/div>\n\n\n\n<div style=\"font-family: Verdana, Geneva, sans-serif; font-size: 11px; line-height: 1.6; color: #333;\">\n    <p>\n        <strong>Disclaimer:<\/strong> \n        <em>I personally love to share my learnings, thoughts, and ideas; I get great satisfaction knowing someone has read and benefited from an article. This content is created entirely on my own time and in a personal capacity. The views expressed here are mine alone and do not represent the positions or opinions of my employer.<\/em>\n    <\/p>\n    <p>\n        In my professional role, I serve as a Workforce Transformation Solutions Principal for \n        <a href=\"https:\/\/www.dell.com\/en-us\/work\/learn\/by-service-type-deployment\" style=\"color: #007db8; font-weight: bold; text-decoration: none;\" target=\"_blank\" rel=\"noopener noreferrer nofollow\">Dell Technology Services<\/a>. \n        I am passionate about guiding organizations through complex technology transitions and \n        <a href=\"https:\/\/www.delltechnologies.com\/en-us\/what-we-do\/workforce-transformation.htm\" style=\"color: #007db8; font-weight: bold; text-decoration: none;\" target=\"_blank\" rel=\"noopener noreferrer nofollow\">Workforce Transformation<\/a>. 
\n        <a href=\"https:\/\/www.delltechnologies.com\/en-us\/index.htm\" style=\"color: #007db8; font-weight: bold; text-decoration: none;\" target=\"_blank\" rel=\"noopener noreferrer nofollow\">Learn more at Dell Technologies<\/a>.\n    <\/p>\n    <hr style=\"border: 0; border-top: 1px solid #ddd; margin: 12px 0;\">\n<\/div>\n\n\n\n<p>I did a general &#8220;understanding AI&#8221; session yesterday, and one of the participants asked me an interesting question, which I do not think I have been asked before&#8230;  <\/p>\n\n\n\n<p><em><strong>What is the difference between LLMs, and what makes them unique and different from each other?<\/strong><\/em><\/p>\n\n\n\n<p>I thought it was a very valid question, as Hugging Face alone has over 1 million models in its library (although many of them are already outdated).  <\/p>\n\n\n\n<p>Hugging Face hosts a vast number of models because it aims to democratize access to state-of-the-art machine learning models for a wide range of tasks. The platform provides a centralized hub where developers and researchers can share, discover, and use models for various applications, including natural language processing (NLP), computer vision, and more.<\/p>\n\n\n\n<p>The models are different because they are designed to address specific tasks and use different architectures and training methods. 
For example, Hugging Face groups its models into different categories (see Hugging Face: <a href=\"https:\/\/huggingface.co\/docs\/transformers\/v4.18.0\/en\/model_summary\" target=\"_blank\" rel=\"noopener noreferrer nofollow\">Summary of the models<\/a>):<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Autoregressive Models<\/strong>: These models, like GPT, are trained to predict the next token in a sequence, making them suitable for text generation tasks<a href=\"https:\/\/huggingface.co\/docs\/transformers\/v4.18.0\/en\/model_summary\" target=\"_blank\" rel=\"noopener noreferrer nofollow\">1<\/a>.<\/li>\n\n\n\n<li><strong>Autoencoding Models<\/strong>: Models like BERT fall into this category. They are trained to reconstruct the original input from a corrupted version, making them ideal for tasks like sentence classification and token classification<a href=\"https:\/\/huggingface.co\/docs\/transformers\/v4.18.0\/en\/model_summary\" target=\"_blank\" rel=\"noopener noreferrer nofollow\">1<\/a>.<\/li>\n\n\n\n<li><strong>Sequence-to-Sequence Models<\/strong>: These models use both an encoder and a decoder, making them suitable for tasks like translation, summarization, and question answering. T5 is an example of such a model<a href=\"https:\/\/huggingface.co\/docs\/transformers\/v4.18.0\/en\/model_summary\" target=\"_blank\" rel=\"noopener noreferrer nofollow\">1<\/a>.<\/li>\n\n\n\n<li><strong>Multimodal Models<\/strong>: These models can handle multiple types of input, such as text and images, and are designed for specific tasks that require this capability<a href=\"https:\/\/huggingface.co\/docs\/transformers\/v4.18.0\/en\/model_summary\" target=\"_blank\" rel=\"noopener noreferrer nofollow\">1<\/a>.<\/li>\n\n\n\n<li><strong>Retrieval-Based Models<\/strong>: These models are designed to retrieve relevant information from a large corpus of data, making them useful for tasks like information retrieval and question answering. 
<\/li>\n<\/ol>\n\n\n\n<p>Each model is optimized for different tasks and use cases, which is why there are so many models available on Hugging Face. This diversity allows users to find the best model for their specific needs and applications.<\/p>\n\n\n\n<p>The following table is my first attempt at providing guidance on which models fit which tasks: <\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Category<\/strong><\/td><td><strong>Basic Description<\/strong><\/td><td><strong>Model<\/strong><\/td><\/tr><tr><td><strong>Autoregressive Models<\/strong><\/td><td>A powerful model for text generation, capable of producing human-like text.<\/td><td>GPT-4, GPT-3, Mistral, Llama 3<\/td><\/tr><tr><td><strong>Autoencoding Models<\/strong><\/td><td>Designed for tasks like sentence classification and token classification.&nbsp; RoBERTa is an optimized version of BERT with better performance on NLP tasks.<\/td><td>BERT, RoBERTa<\/td><\/tr><tr><td><strong>Sequence-to-Sequence<\/strong><\/td><td>Suitable for translation, summarization, and question answering.<\/td><td>T5, BART<\/td><\/tr><tr><td><strong>Multimodal Models<\/strong><\/td><td>Handles text, images, videos, and audio, suitable for various complex tasks.<\/td><td>Gemini, GPT-4, CLIP<\/td><\/tr><tr><td><strong>Image Creation<\/strong><\/td><td>Generates images from textual descriptions, combining text and image modalities.<\/td><td>DALL-E, Stable Diffusion, MidJourney<\/td><\/tr><tr><td><strong>Retrieval-Based Models<\/strong><\/td><td>Optimized for retrieving relevant information from large datasets.<\/td><td>DPR, BM25<\/td><\/tr><tr><td><strong>Financial Forecasting<\/strong><\/td><td>Designed to handle various financial forecasting tasks and provide valuable insights for financial institutions.<\/td><td>FinGPT, BloombergGPT, LLM finance<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>Again &#8211; this is an initial post, which I will be exploring 
more in the future. Great question &#8211; thank you!<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">See Also: <\/h2>\n\n\n\n<p><\/p>\n\n\n\n<p><a href=\"https:\/\/jorgep.com\/blog\/what-are-large-language-models-llm\/\">What Are Large Language Models (LLM) <\/a><\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>I did a general &#8220;understanding AI&#8221; session yesterday, and one of the participants asked me an interesting question, which I do not think I have been asked before&#8230; What is the difference between LLMs, and what makes them unique and different from each other? I thought it was a very valid question, as Hugging Face&#8230;<\/p>\n","protected":false},"author":2,"featured_media":427864,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_kad_blocks_custom_css":"","_kad_blocks_head_custom_js":"","_kad_blocks_body_custom_js":"","_kad_blocks_footer_custom_js":"","ngg_post_thumbnail":0,"_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"_kad_post_classname":"","footnotes":""},"categories":[441],"tags":[471,930,842,871,876],"class_list":["post-510814","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tech-talk","tag-ai","tag-ai-series","tag-chatgpt","tag-genai","tag-llm"],"taxonomy_info":{"category":[{"value":441,"label":"Tech Talk"}],"post_tag":[{"value":471,"label":"AI"},{"value":930,"label":"AI Series"},{"value":842,"label":"ChatGPT"},{"value":871,"label":"GenAi"},{"value":876,"label":"LLM"}]},"featured_image_src_large":["https:\/\/jorgep.com\/blog\/wp-content\/uploads\/FeaturedImage-Topic-AI-1024x512.png",1024,512,true],"author_info":{"display_name":"Jorge 
Pereira","author_link":"https:\/\/jorgep.com\/blog\/author\/jorge\/"},"comment_info":0,"category_info":[{"term_id":441,"name":"Tech Talk","slug":"tech-talk","term_group":0,"term_taxonomy_id":451,"taxonomy":"category","description":"","parent":0,"count":688,"filter":"raw","cat_ID":441,"category_count":688,"category_description":"","cat_name":"Tech Talk","category_nicename":"tech-talk","category_parent":0}],"tag_info":[{"term_id":471,"name":"AI","slug":"ai","term_group":0,"term_taxonomy_id":481,"taxonomy":"post_tag","description":"","parent":0,"count":154,"filter":"raw"},{"term_id":930,"name":"AI Series","slug":"ai-series","term_group":0,"term_taxonomy_id":940,"taxonomy":"post_tag","description":"","parent":0,"count":157,"filter":"raw"},{"term_id":842,"name":"ChatGPT","slug":"chatgpt","term_group":0,"term_taxonomy_id":852,"taxonomy":"post_tag","description":"","parent":0,"count":19,"filter":"raw"},{"term_id":871,"name":"GenAi","slug":"genai","term_group":0,"term_taxonomy_id":881,"taxonomy":"post_tag","description":"","parent":0,"count":84,"filter":"raw"},{"term_id":876,"name":"LLM","slug":"llm","term_group":0,"term_taxonomy_id":886,"taxonomy":"post_tag","description":"","parent":0,"count":18,"filter":"raw"}],"_links":{"self":[{"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/posts\/510814","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/comments?post=510814"}],"version-history":[{"count":0,"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/posts\/510814\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/media\/427864"}],"wp:attachment":[{"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/media?
parent=510814"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/categories?post=510814"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/tags?post=510814"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}