 {"id":510814,"date":"2024-10-25T09:13:25","date_gmt":"2024-10-25T16:13:25","guid":{"rendered":"https:\/\/jorgep.com\/blog\/?p=510814"},"modified":"2025-01-17T12:02:57","modified_gmt":"2025-01-17T19:02:57","slug":"what-makes-the-many-llms-different","status":"publish","type":"post","link":"https:\/\/jorgep.com\/blog\/what-makes-the-many-llms-different\/","title":{"rendered":"What Makes the many LLMs different?"},"content":{"rendered":"\n<div class=\"wp-block-columns has-theme-palette-7-background-color has-background is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<p>Part of: <strong> <a href=\"https:\/\/jorgep.com\/blog\/series-ai-learnings\/\">AI Learning Series Here<\/a><\/strong><\/p>\n\n\n<style>.kadence-column395113_43ef2d-d5 > .kt-inside-inner-col,.kadence-column395113_43ef2d-d5 > .kt-inside-inner-col:before{border-top-left-radius:0px;border-top-right-radius:0px;border-bottom-right-radius:0px;border-bottom-left-radius:0px;}.kadence-column395113_43ef2d-d5 > .kt-inside-inner-col{column-gap:var(--global-kb-gap-sm, 1rem);}.kadence-column395113_43ef2d-d5 > .kt-inside-inner-col{flex-direction:column;}.kadence-column395113_43ef2d-d5 > .kt-inside-inner-col > .aligncenter{width:100%;}.kadence-column395113_43ef2d-d5 > .kt-inside-inner-col:before{opacity:0.3;}.kadence-column395113_43ef2d-d5{position:relative;}@media all and (max-width: 1024px){.kadence-column395113_43ef2d-d5 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}@media all and (max-width: 767px){.kadence-column395113_43ef2d-d5 > .kt-inside-inner-col{flex-direction:column;justify-content:center;}}<\/style>\n<div class=\"wp-block-kadence-column kadence-column395113_43ef2d-d5\"><div class=\"kt-inside-inner-col\"><style>.wp-block-kadence-advancedheading.kt-adv-heading510545_6813a5-28, 
.wp-block-kadence-advancedheading.kt-adv-heading510545_6813a5-28[data-kb-block=\"kb-adv-heading510545_6813a5-28\"]{font-size:var(--global-kb-font-size-sm, 0.9rem);font-style:normal;}.wp-block-kadence-advancedheading.kt-adv-heading510545_6813a5-28 mark.kt-highlight, .wp-block-kadence-advancedheading.kt-adv-heading510545_6813a5-28[data-kb-block=\"kb-adv-heading510545_6813a5-28\"] mark.kt-highlight{font-style:normal;color:#f76a0c;-webkit-box-decoration-break:clone;box-decoration-break:clone;padding-top:0px;padding-right:0px;padding-bottom:0px;padding-left:0px;}.wp-block-kadence-advancedheading.kt-adv-heading510545_6813a5-28 img.kb-inline-image, .wp-block-kadence-advancedheading.kt-adv-heading510545_6813a5-28[data-kb-block=\"kb-adv-heading510545_6813a5-28\"] img.kb-inline-image{width:150px;vertical-align:baseline;}<\/style>\n<p class=\"kt-adv-heading510545_6813a5-28 wp-block-kadence-advancedheading\" data-kb-block=\"kb-adv-heading510545_6813a5-28\">Quick Links:&nbsp;<a href=\"https:\/\/jorgep.com\/blog\/resources-for-learning-ai\/\">Resources for Learning AI<\/a> | <a href=\"https:\/\/jorgep.com\/blog\/keeping-up-with-ai\/\">Keep up with AI<\/a> | <a href=\"https:\/\/jorgep.com\/blog\/list-of-ai-tools\/\" data-type=\"post\" data-id=\"402818\">List of AI Tools<\/a><\/p>\n<\/div><\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\"><div class=\"wp-block-template-part\"><style>.wp-block-kadence-advancedheading.kt-adv-heading395113_c650df-47, .wp-block-kadence-advancedheading.kt-adv-heading395113_c650df-47[data-kb-block=\"kb-adv-heading395113_c650df-47\"]{text-align:center;font-size:var(--global-kb-font-size-md, 1.25rem);line-height:60px;font-style:normal;background-color:#f5a511;}.wp-block-kadence-advancedheading.kt-adv-heading395113_c650df-47 mark.kt-highlight, .wp-block-kadence-advancedheading.kt-adv-heading395113_c650df-47[data-kb-block=\"kb-adv-heading395113_c650df-47\"] 
mark.kt-highlight{font-style:normal;color:#f76a0c;-webkit-box-decoration-break:clone;box-decoration-break:clone;padding-top:0px;padding-right:0px;padding-bottom:0px;padding-left:0px;}.wp-block-kadence-advancedheading.kt-adv-heading395113_c650df-47 img.kb-inline-image, .wp-block-kadence-advancedheading.kt-adv-heading395113_c650df-47[data-kb-block=\"kb-adv-heading395113_c650df-47\"] img.kb-inline-image{width:150px;vertical-align:baseline;}<\/style>\n<p class=\"kt-adv-heading395113_c650df-47 wp-block-kadence-advancedheading\" data-kb-block=\"kb-adv-heading395113_c650df-47\">Subscribe to <a href=\"https:\/\/go.35s.be\/jtb\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>JorgeTechBits  newsletter<\/strong><\/a><\/p>\n<\/div><\/div>\n<\/div>\n\n\n\n<div style=\"font-family: Verdana, Geneva, sans-serif; font-size: 11px;\"><b>Disclaimer<\/b>: \u00a0I work for <a href=\"https:\/\/www.dell.com\/en-us\/work\/learn\/by-service-type-deployment\">Dell Technology Services<\/a> as a Workforce Transformation Solutions Principal.\u00a0 \u00a0 It is my passion to help guide organizations\u00a0through the current technology transition specifically as it relates to <a href=\"https:\/\/www.delltechnologies.com\/en-us\/what-we-do\/workforce-transformation.htm\">Workforce Transformation<\/a>.\u00a0 Visit <a href=\"https:\/\/www.delltechnologies.com\/en-us\/index.htm\">Dell Technologies<\/a>\u00a0site for more information.\u00a0 Opinions are my own and not the views of my employer.<\/div>\n<div><\/div><br>\n\n\n\n<p>I did a general &#8220;understanding AI&#8221; session yesterday and one of the participants asked me an interesting question, which I do not think I have been asked before&#8230;  <\/p>\n\n\n\n<p><em><strong>What is the difference between LLMs, and what makes them unique and different from each other?<\/strong><\/em><\/p>\n\n\n\n<p>I thought it was a very valid question, as Hugging Face alone has over 1 million models in its library (although a lot of them are already old)  
<\/p>\n\n\n\n<p>Hugging Face hosts a vast number of models because it aims to democratize access to state-of-the-art machine learning models for a wide range of tasks. The platform provides a centralized hub where developers and researchers can share, discover, and use models for various applications, including natural language processing (NLP), computer vision, and more.<\/p>\n\n\n\n<p>The models are different because they are designed to address specific tasks and use different architectures and training methods. For example, Hugging Face groups its models into several categories (see Hugging Face: <a href=\"https:\/\/huggingface.co\/docs\/transformers\/v4.18.0\/en\/model_summary\">Summary of the models<\/a>):<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Autoregressive Models<\/strong>: These models, like GPT, are trained to predict the next token in a sequence, making them suitable for text generation tasks<a href=\"https:\/\/huggingface.co\/docs\/transformers\/v4.18.0\/en\/model_summary\">1<\/a>.<\/li>\n\n\n\n<li><strong>Autoencoding Models<\/strong>: Models like BERT fall into this category. They are trained to reconstruct the original input from a corrupted version, making them ideal for tasks like sentence classification and token classification<a href=\"https:\/\/huggingface.co\/docs\/transformers\/v4.18.0\/en\/model_summary\">1<\/a>.<\/li>\n\n\n\n<li><strong>Sequence-to-Sequence Models<\/strong>: These models use both an encoder and a decoder, making them suitable for tasks like translation, summarization, and question answering. 
T5 is an example of such a model<a href=\"https:\/\/huggingface.co\/docs\/transformers\/v4.18.0\/en\/model_summary\">1<\/a>.<\/li>\n\n\n\n<li><strong>Multimodal Models<\/strong>: These models can handle multiple types of input, such as text and images, and are designed for specific tasks that require this capability<a href=\"https:\/\/huggingface.co\/docs\/transformers\/v4.18.0\/en\/model_summary\">1<\/a>.<\/li>\n\n\n\n<li><strong>Retrieval-Based Models<\/strong>: These models are designed to retrieve relevant information from a large corpus of data, making them useful for tasks like information retrieval and question answering.<\/li>\n<\/ol>\n\n\n\n<p>Each model is optimized for particular tasks and use cases, which is why there are so many models available on Hugging Face. This diversity allows users to find the best model for their specific needs and applications.<\/p>\n\n\n\n<p>The following table is my first attempt at matching models to the task at hand:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Category<\/strong><\/td><td><strong>Basic Description<\/strong><\/td><td><strong>Example Models<\/strong><\/td><\/tr><tr><td><strong>Autoregressive Models<\/strong><\/td><td>Powerful models for text generation, capable of producing human-like text.<\/td><td>GPT-4, GPT-3, Mistral, Llama 3<\/td><\/tr><tr><td><strong>Autoencoding Models<\/strong><\/td><td>Designed for tasks like sentence classification and token classification.&nbsp; RoBERTa is a variant of BERT with an optimized training procedure for better performance on NLP tasks.<\/td><td>BERT, RoBERTa<\/td><\/tr><tr><td><strong>Sequence-to-Sequence<\/strong><\/td><td>Suitable for translation, summarization, and question answering.<\/td><td>T5, BART<\/td><\/tr><tr><td><strong>Multimodal Models<\/strong><\/td><td>Handles text, images, video, and audio; suitable for various complex tasks.<\/td><td>Gemini, GPT-4, CLIP<\/td><\/tr><tr><td><strong>Image Creation<\/strong><\/td><td>Generates 
images from textual descriptions, combining text and image modalities.<\/td><td>DALL-E, Stable Diffusion, Midjourney<\/td><\/tr><tr><td><strong>Retrieval-Based Models<\/strong><\/td><td>Optimized for retrieving relevant information from large datasets.<\/td><td>DPR, BM25<\/td><\/tr><tr><td><strong>Financial Forecasting<\/strong><\/td><td>Designed to handle various financial forecasting tasks and provide valuable insights for financial institutions.<\/td><td>FinGPT, BloombergGPT, LLM finance<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>Again &#8211; this is an initial post on a topic I will be exploring more in the future. GREAT question, THANK YOU!<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">See Also: <\/h2>\n\n\n\n<p><\/p>\n\n\n\n<p><a href=\"https:\/\/jorgep.com\/blog\/what-are-large-language-models-llm\/\">What Are Large Language Models (LLM)<\/a><\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>I did a general &#8220;understanding AI&#8221; session yesterday and one of the participants asked me an interesting question, which I do not think I have been asked before&#8230; What is the difference between LLMs, and what makes them unique and different from each other? 
I thought it was a very valid question, as Hugging Face&#8230;<\/p>\n","protected":false},"author":2,"featured_media":427864,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_kad_blocks_custom_css":"","_kad_blocks_head_custom_js":"","_kad_blocks_body_custom_js":"","_kad_blocks_footer_custom_js":"","ngg_post_thumbnail":0,"episode_type":"","audio_file":"","podmotor_file_id":"","podmotor_episode_id":"","cover_image":"","cover_image_id":"","duration":"","filesize":"","filesize_raw":"","date_recorded":"","explicit":"","block":"","itunes_episode_number":"","itunes_title":"","itunes_season_number":"","itunes_episode_type":"","_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"_kad_post_classname":"","footnotes":""},"categories":[441],"tags":[471,930,842,871,876],"class_list":["post-510814","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tech-talk","tag-ai","tag-ai-series","tag-chatgpt","tag-genai","tag-llm"],"taxonomy_info":{"category":[{"value":441,"label":"Tech Talk"}],"post_tag":[{"value":471,"label":"AI"},{"value":930,"label":"AI Series"},{"value":842,"label":"ChatGPT"},{"value":871,"label":"GenAi"},{"value":876,"label":"LLM"}]},"featured_image_src_large":["https:\/\/jorgep.com\/blog\/wp-content\/uploads\/Topic-ArtificialIntelligenceBanner-900x200-1.png",900,200,false],"author_info":{"display_name":"Jorge Pereira","author_link":"https:\/\/jorgep.com\/blog\/author\/jorge\/"},"comment_info":0,"category_info":[{"term_id":441,"name":"Tech Talk","slug":"tech-talk","term_group":0,"term_taxonomy_id":451,"taxonomy":"category","description":"","parent":0,"count":670,"filter":"raw","cat_ID":441,"category_count":670,"category_description":"","cat_name":"Tech 
Talk","category_nicename":"tech-talk","category_parent":0}],"tag_info":[{"term_id":471,"name":"AI","slug":"ai","term_group":0,"term_taxonomy_id":481,"taxonomy":"post_tag","description":"","parent":0,"count":141,"filter":"raw"},{"term_id":930,"name":"AI Series","slug":"ai-series","term_group":0,"term_taxonomy_id":940,"taxonomy":"post_tag","description":"","parent":0,"count":144,"filter":"raw"},{"term_id":842,"name":"ChatGPT","slug":"chatgpt","term_group":0,"term_taxonomy_id":852,"taxonomy":"post_tag","description":"","parent":0,"count":18,"filter":"raw"},{"term_id":871,"name":"GenAi","slug":"genai","term_group":0,"term_taxonomy_id":881,"taxonomy":"post_tag","description":"","parent":0,"count":78,"filter":"raw"},{"term_id":876,"name":"LLM","slug":"llm","term_group":0,"term_taxonomy_id":886,"taxonomy":"post_tag","description":"","parent":0,"count":14,"filter":"raw"}],"_links":{"self":[{"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/posts\/510814","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/comments?post=510814"}],"version-history":[{"count":0,"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/posts\/510814\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/media\/427864"}],"wp:attachment":[{"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/media?parent=510814"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/categories?post=510814"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jorgep.com\/blog\/wp-json\/wp\/v2\/tags?post=510814"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}