Best Free Large Language Models

A User-Friendly LLM Comparison

Written by

Lili Marocsik

October 25th, 2024

Last updated


Large language models are like super-smart text generators, using vast amounts of written text to understand and generate human-like language.

They are trained on massive amounts of data, and the more they learn, the better they understand and generate text.

We made sure to keep this review non-technical and user-friendly, as we assume that you are using the models rather than building them.

All large language models had to master the same tasks, so we could directly compare the results:

  • For understanding context we asked: Which model am I using right now?

  • For measuring accuracy we asked for an impossible task: How do I connect the app Splitwise to my Revolut account? (The correct answer would have been that it is not possible to connect these two accounts.)

  • To test its language capabilities, we asked for an introduction about aitoolssme.com (the prompt was: You are a copywriter, and you explain the benefits of the site in two simple sentences to the audience. This audience is non-technical, and the tone of voice is friendly, informal, and warm.)

  • And last and also least, we checked whether LLMs have developed humour in the meantime, which sadly deserves a definite 'NO'.

The biggest surprises were:

  1. Gemini is still extremely careful to not say anything wrong and 'isn't able to access such information' when asked about the model. 

  2. ChatGPT's language capabilities have gotten much better, and I have gotten better at prompting it as well. Claude (aka the copywriter's darling), on the other hand, did not sound like a copywriter at all.

  3. The biggest difference between ChatGPT 3.5 and ChatGPT 4 is that model 3.5 hallucinates much more. 

LLMs have gotten much better since I first reviewed them in March 2024, but I'm also a bit relieved that my first love, ChatGPT, is still leading the LLM comparison.

And even though these tools are amazing at text generation, I promise this was written 100% by me. 

For our big LLM comparison, we grouped the tools into the following categories:

  • Top 5 free LLMs (with ChatGPT-4 taking the lead: although it is generally a paid tool, OpenAI provides limited free daily access to it)

  • Paid tools (such as Gemini Advanced or Copilot, further below)

  • Older models (like Llama 2 and Claude 3 Opus, at the very end)

Top 5 free LLMs:

Tool Name         | Great For                       | Access via        | Price Per Month
ChatGPT 4         | Context Awareness               | chatgpt.com       | $20
Claude 3.5 Sonnet | High Accuracy                   | Claude.ai         | Free
ChatGPT 3.5       | Context Awareness               | chatgpt.com       | Free
Gemini 1.5 Pro    | Accuracy (fewer hallucinations) | gemini.google.com | Free
Llama 3           | Fluency                         | groq.com          | Free

Older Models

Llama 2

Claude 3 Opus

The Free LLM Comparison:

1

Setup/Onboarding

5/5

User Experience

5/5

Tool Performance

4/5

Overall score:

4.75/5
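The review does not say how the overall score is derived from the three sub-scores. One reading that happens to match the score cards on this page is the plain average, rounded to the nearest quarter point. The sketch below assumes exactly that; the function name `overall_score` is made up for illustration and is not the author's stated method.

```python
def overall_score(setup: float, user_experience: float, performance: float) -> float:
    """Average three sub-scores (each out of 5), rounded to the nearest 0.25.

    Assumption: the article never states its formula; this rule is only
    inferred from the published numbers.
    """
    mean = (setup + user_experience + performance) / 3
    return round(mean * 4) / 4


# ChatGPT 4 card: 5, 5 and 4 average to about 4.67, which rounds to 4.75.
print(overall_score(5, 5, 4))
```

Rounding to quarter points would explain why a card with 5, 5 and 4 lands on 4.75 rather than 4.67.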

Use Cases

Use ChatGPT 4 for information retrieval, coding, image generation, translation, and summaries. In short, for everything that you don’t want to spend much time on. The language capabilities of most AI copywriting tools hardly have an advantage over ChatGPT.

Support

Multi-Language Support

Key Features

  • Advanced Voice Mode
  • Background session feature for voice mode
  • Creative inspiration
  • Tailored advice
  • Learning opportunities
  • Professional input
  • Conversation continuity
  • Improved helpfulness
  • Memory management
  • In-app features: image recognition
  • Image generation
  • Dall E 3: Editing Images
  • Custom settings
  • Improved Understanding
  • Greater Knowledge Base
  • Advanced Language Skills & Broader Language Coverage
  • Enhanced Multitasking
  • Reduced Biases
  • Higher Efficiency

Pros & Cons

Pros:

  • smoother and more accurate answers than version 3.5
  • image generation superb (Dall E 3 integration)
  • formulas and code are much better with this version
  • great context understanding
  • high accuracy
  • surprisingly good language when prompted well
  • sub-version 4o can read and interpret documents, presentations and images accurately

Cons:

  • even with the web access Chrome extension, the system claims not to have real-time data
  • Very annoying pop-up (maxai.me) that blocks almost any query
  • struggling with humor
  • unsure of free scope
  • multimodality (in model 4o) was first advertised as including audio and video

Price Point

ChatGPT Plus
$20
Free Trial
When asked about the free trial scope, ChatGPT didn't want to give a definite answer: occasionally OpenAI offers trials or temporary access to GPT-4.

Conclusion

When Gemini is insecure and Llama hallucinates, ChatGPT 4 will deliver reliably. The only issue I have is that I'm never sure how many of the free tokens I've used up already and when my little daily love relationship will end. Also, if you decide to sign up, Dall E is still in the race for the best image generator! When I tested the new advanced voice mode, ChatGPT mistakenly thought I was prompting and it was replying via text, when we were actually communicating via the new voice feature. Just a small hiccup; in general it works great. And when asked about the free trial scope, ChatGPT didn't want to give a definite answer: occasionally OpenAI offers trials or temporary access to GPT-4. I feel very spoiled complaining about little things like this. What I want to say is that version 3.5 is doing a decent enough job too. Maybe that’s enough for you. Also very cool: the custom settings, where you can add things you would like the system to always consider, plus information about yourself and your demands.

ChatGPT 4

2

Setup/Onboarding

3/5

User Experience

4/5

Tool Performance

4/5

Overall score:

3.75/5

Use Cases

Strong performance in reasoning, summarizing information and solving complex coding tasks

Support

Helpdesk, Message Request

Key Features

  • Good reasoning
  • High undergraduate-level knowledge
  • Advanced coding proficiency
  • Quality writing capabilities
  • Natural tone in communications
  • Enhanced understanding of humor and complex instructions
  • Highest intelligence among Claude 3 models
  • Multimodal (text & images, no video or audio)
  • Custom Styles
  • Profile Preferences
  • Google Docs Integration

Pros & Cons

Pros:

  • High accuracy
  • Good understanding of context
  • Anthropic offers their newest version for free

Cons:

  • Requires a lot of personal data to log in
  • language weaker than expected

Price Point

Claude 3.5 Sonnet
Free
Free Trial
Free, but scope unsure

Conclusion

Good work Anthropic, 2nd place in our free LLM comparison is pretty good! But even though copywriters always praise the language capabilities of Claude, I didn't like how it did on the language test (an introduction about this website). It was too much fluff and not enough focus on the benefits. The good part is that Claude 3.5 Sonnet understands me very well and doesn't hallucinate (at least not in this test). The login process is a bit annoying: on top of my Google login, it requires extra data such as phone number and birth date, and it also sends a verification code to my mobile. It also makes you acknowledge its usage policy before logging in. Why does it need so much of my data?

3

Setup/Onboarding

5/5

User Experience

3/5

Tool Performance

3/5

Overall score:

3.75/5

Use Cases

Use ChatGPT as a chatbot on your website (developer required), to plan your next vacation or simply to ask it anything. But be aware of hallucinations.

Support

Multi-Language Support

Key Features

  • Natural Language Understanding
  • Conversational Context
  • Multilingual Support
  • Customization
  • Versatility
  • API Integration
  • Scalability
  • 24/7 Availability

Pros & Cons

Pros:

  • understands queries well
  • feels most human
  • decent context understanding
  • free always

Cons:

  • answers not always accurate, hallucinations
  • struggles with providing correct formulas and code

Price Point

Version 3.5
free
Free Trial
Version 3.5 is completely free

Conclusion

You will mostly not be using this model anymore, as ChatGPT 4 is available on the free plan. Well, mostly. OpenAI is not very clear about the free usage scope of GPT 4; every now and then I receive a message that the limit is reached. To me, the biggest differences between the two models are the many hallucinations of the older model and its unreliable handling of coding tasks. Also, I feel like it has been hallucinating a lot lately, almost more than before.

ChatGPT 3.5

4

Setup/Onboarding

5/5

User Experience

3/5

Tool Performance

2.75/5

Overall score:

3.5/5

Use Cases

Can generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way. Research and free image generation.

Support

Help Center

Key Features

  • AI chatbot
  • Advanced AI
  • Interpreting user prompts
  • Generating accurate answers
  • Conversational capabilities
  • Responding to queries

Pros & Cons

Pros:

  • will say if unsure (fewer hallucinations)
  • no login required

Cons:

  • Responses take longer to generate
  • Responses often generic
  • Unable to answer many requests

Price Point

Gemini
free
Free Trial
Completely free. The free tier allows you to use Gemini 1.5 Pro with a limit of 2 requests per minute (RPM) and 50 requests per day.
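
For anyone scripting against that free tier, the two limits can be tracked on the client side. A minimal sketch, assuming the limits quoted above (2 requests per minute, 50 per day) and a rolling 24-hour window; the `FreeTierLimiter` class is hypothetical and not part of any Google library.

```python
import time
from collections import deque
from typing import Optional


class FreeTierLimiter:
    """Hypothetical client-side tracker for per-minute and per-day request caps."""

    def __init__(self, per_minute: int = 2, per_day: int = 50):
        self.per_minute = per_minute
        self.per_day = per_day
        self.sent = deque()  # timestamps (seconds) of requests in the last 24 hours

    def allow(self, now: Optional[float] = None) -> bool:
        """Return True and record the request if both limits still have room."""
        now = time.monotonic() if now is None else now
        # Forget requests that are older than 24 hours.
        while self.sent and now - self.sent[0] >= 86_400:
            self.sent.popleft()
        in_last_minute = sum(1 for t in self.sent if now - t < 60)
        if in_last_minute >= self.per_minute or len(self.sent) >= self.per_day:
            return False
        self.sent.append(now)
        return True


limiter = FreeTierLimiter()
# Third call within the same minute is refused; a slot frees up after 60 seconds.
print([limiter.allow(now=t) for t in (0, 10, 20, 61)])  # [True, True, False, True]
```

Checking `allow()` before each request means a script waits instead of running into the limit message.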

Conclusion

Since I signed up for ChatGPT 4, I haven't thought about Gemini for an instant! Also because when I last used Gemini (in March 2024), it always answered ‘I’m a language model, I can’t help you with that’, even for the simplest things like a text summary. It looks like the little scandal about their too-woke image generator (it generated black Vikings to be unbiased) hit them hard. Now Gemini is afraid to say anything. Let’s see what Google will do next to shake things up; I’m pretty sure they’ll throw something new on the market soon.

What can I say, Gemini is still the shy little bird that is afraid to chirp. When I last tested it (March 2024), it often referred to itself as a language model that was not able to access such information. I am surprised it's still the same in November 2024. I don't understand how Google is losing the LLM game so easily. It completely misunderstood the language task: it thought the website was about copywriting and pitched getting compelling copy. (I can't hear the word compelling anymore, that is sooo ChatGPT 3.) Gemini is still pretty weak compared to ChatGPT or Claude. It was the only LLM that was a bit funny, but sadly it didn't follow the prompt and the joke was not about AI. (A man walks into a library and asks the librarian for books about paranoia. The librarian whispers, “They’re right behind you!”)

Gemini 1.5 Pro (formerly Bard)

Did you know that you can directly access Llama 3 via www.groq.com?

5

Setup/Onboarding

3/5

User Experience

4/5

Tool Performance

1.75/5

Overall score:

3/5

Use Cases

  • Content creation
  • Research assistance
  • Language translation
  • Writing assistance
  • Chatbots
  • Customer service
  • Education
  • Marketing
  • Technical documentation
  • Brainstorming

Support

The bot itself

Key Features

  • Language understanding
  • Sentiment analysis
  • Conversational flow
  • Error tolerance
  • Limited domain knowledge
  • No personal opinions
  • Continuous improvement
  • Entity recognition
  • Knowledge graph
  • Contextual understanding
  • 128K Context Length
  • 8-Language Support
  • 35% Faster Response Speed
  • Advanced Llama Guard Security

Pros & Cons

Pros:

  • cut-off date for internet access about mid 2022 (according to the tool itself)
  • free
  • writing style is more approachable and sounds better than ChatGPT's
  • very fast

Cons:

  • doesn't have a direct interface, so either use groq.com or install the program
  • not 100% accurate information (I asked about canvas background remover)
  • understands only about 70% of my queries correctly
  • many hallucinations
  • weak language capabilities
  • can't access directly

Price Point

Free
Free Trial
via groq.com

Conclusion

At times Meta's newest language model is extremely fast, other times we are queuing a bit. I thought it was really funny that the LLM itself has listed 'Limited domain knowledge' and 'No personal opinions' as its key features. Maybe this is a little punch at Google's woke Gemini. When I first tested Llama 3 a few months ago, I was much more impressed. It was very fast and I liked that the language sounded better than the then-current version of ChatGPT. Now I'm disappointed, because it failed the accuracy test and the language was very complicated and sounded overblown. Looks like Mark Zuckerberg has to do his homework to keep playing among his top LLM homies.

Large Language Model Use Cases and Anything Beyond ChatGPT

Lili, who calls ChatGPT her “new best friend”.

I admit I might be a bit biased as ChatGPT was the first LLM I used and with that a huge revelation for me.

It's like my first digital teddy bear: obviously hard to criticize, and hard to favor another over it. In fact, it is still the tool I use every day for research or to generate beautiful pictures via Dall E 3 (only with the paid version).

But it’s not the best LLM for every use case, so let’s take a look at what Llama 3, Gemini, and Claude can do better.

LLM Alternatives to ChatGPT

Google's Gemini got off to a bumpy start: the responses were very generic, and the response time was long. And let’s not even talk about their funny bias debacle: overly woke answers and image generations made the internet community joke about the Google team. So much so that they switched off the image generation entirely for now.

Today, Gemini's response times have improved a lot. It’s still the first LLM to admit when it isn’t sure about the answer and would rather opt out of answering. This can be annoying, as some of the questions are very simple, but if you want to be absolutely sure and have been burned by ChatGPT’s hallucinations, Gemini might be a good alternative.

I’ve tried Gemini Advanced in Google Workspace apps like Gmail, Docs, etc. It’s supposed to organize your sheets and contacts and automatically open the appropriate Google apps. It didn’t work for me at all; for example, I asked it to open Maps and show me the way to a location, but it didn’t understand the intent. Reorganising a sheet caused absolute chaos, so I cannot recommend it yet. I’m sure it will improve soon.

Beyond Google And OpenAI

I was mocking Meta for Llama 2’s complicated setup (you have to run the model locally on your computer, which is not easy for non-coders). I’m pacified with the newest version, Llama 3, because www.groq.com lets you access the tool easily via an interface. So test and use the tool there if you like. It’s super fast and mostly accurate, and therefore a true alternative to ChatGPT. I also hear of more and more people using it.

Llama 3 Access: via www.groq.com for free

Anthropic’s Claude is the marketer’s and copywriter’s favorite LLM and rightfully so: it sounds much more natural than ChatGPT.

Claude 3.5 Sonnet Access: via www.poe.com for free

Microsoft’s Copilot is built on top of ChatGPT 4, and the free version is limited to 30 prompts per month.

Perplexity.ai is as if Google Search and ChatGPT had a baby: it provides you with the most accurate answer to your question, including videos, pictures, and generated text. The more specific the question, the better Perplexity.ai can provide a relevant answer, so just like with ChatGPT, give the tool some background information if you want it to excel.

Perplexity.ai Access: via www.perplexity.ai

Microsoft’s Copilot is based on ChatGPT 4 but offers ChatGPT Turbo access to paying premium users.

It bridges Microsoft and ChatGPT through these features and integrations:

  • AI assistance in Word, Excel, PowerPoint, OneNote, Outlook
  • Unified AI experience across Windows 11, Edge, Bing
  • Image Generation: DALL-E 3 for AI image creation and editing
  • AI-powered multimodal results from text, voice, and images

Paid Large Language Models

-

Setup/Onboarding

5/5

User Experience

3/5

Tool Performance

3/5

Overall score:

3.75/5

Use Cases

Use Gemini Advanced via the Google app on your phone and it will manage all applications.

Support

Google One customer support

Key Features

  • One million token window (vs. standard 128,000 token window)
  • Access to the standard Gemini model in Gmail, Docs, and Sheets
  • 2TB of cloud storage
  • Stronger Reasoning and Comprehension
  • Improved Creative Performance
  • Enhanced Mobile Accessibility: features a dedicated mobile app
  • Smart Scan
  • Merging Previous Data Values

Pros & Cons

Pros:

  • Fewer hallucinations than ChatGPT (tool will more likely tell if it doesn't know)
  • Much better at copywriting than ChatGPT

Cons:

  • Often doesn't have an answer
  • Integration into other apps is not as seamless as promised
  • Rather slow
  • At times answers are quite unspecific

Price Point

Gemini Advanced (part of Google One Premium)
$20
Free Trial
first 2 months

Conclusion

Similar to ChatGPT 4o, Gemini Advanced comes with good intentions but fails to deliver. Their claim 'Use Gemini Advanced via the Google app on your phone and it will manage all applications' doesn't work seamlessly in reality. For example, I told Gemini Advanced that I wanted to go home and it opened the Google Maps app for me, but it gave me directions for driving by car. I asked it to remember that I don't have a car and to consider this in the future. But the next time around, it made the same mistake again. Its language skills are slightly smoother than ChatGPT's and it sounds more straightforward.

-

Setup/Onboarding

4/5

User Experience

4/5

Tool Performance

4/5

Overall score:

4/5

Use Cases

Great for coding, which is why web developers love Copilot. The technology behind Copilot is powered by OpenAI's Codex model, which assists developers by suggesting complete lines or blocks of code as they type. On top of that, you can download provided data immediately. It also suggests follow-up questions so you don't have to type them in.

Support

The tool itself ;)

Key Features

  • Answers complex questions
  • Suggests and autocompletes lines or blocks of code
  • Generates creative content
  • Multilingual
  • Can access third party services via plugins
  • Summarises information
  • Rewrites content
  • Research
  • Enhanced Integration with Excel
  • Improved File Handling
  • Enhanced Data Protection
  • Copilot in Outlook
  • Loop Integration

Pros & Cons

Pros:

  • Answers complex questions
  • Suggests and autocompletes lines or blocks of code
  • Generates creative content
  • Multilingual
  • Can access third party services via plugins
  • Summarises information
  • Rewrites content
  • Research

Cons:

  • Microsoft login required
  • limited to 30 queries per month (free plan)
  • no easy option to change display language
  • formulations still sound clunky
  • weak for social media content generation

Price Point

Bing Chat Enterprise
$5/month/user
Free Trial
30 free prompts per month

Conclusion

If you are on a budget and don’t want to spend $20 on a monthly subscription, this is your (limited) ticket into the advanced LLM party. I like the interface better than ChatGPT's, and it’s really helpful to get follow-up questions from the system. If you are wondering whether this is the GitHub ‘Copilot’: yes, it is. GitHub is a subsidiary of Microsoft, and the tool was first released there.

Copilot (Microsoft)


Do you have any tips, tricks or hacks for using AI?


Fedor Pak, CEO Chatfuel: Recognize the potential of AI and consider it like a young, inexperienced, yet brilliant employee who can significantly enhance or even replace your entire team. Don't set high expectations immediately, and start utilizing it as soon as possible in areas where it can already be beneficial.

Krish Ramineni, CEO Fireflies.AI: Be as objective as possible when interacting with AI. AI picks up on your biases and can sometimes BS answers. It's very good at making things seem accurate. Over time this will get better, but we have to be mindful in the way we provide instructions.

Thomas Bornheim, CEO 42 Heilbronn: Be super friendly, and your results will turn out better. It's surely a psychological effect - but it is also proven that these tools work better when you bribe them.

Older Large Language Models

-

Setup/Onboarding

3/5

User Experience

3/5

Tool Performance

2/5

Overall score:

2.75/5

Use Cases

According to the tool itself:

  • Providing informative and substantive responses to questions across diverse topics
  • Ability to understand and analyze complex queries or prompts
  • Offering clear and well-structured answers formatted as needed
  • Adapting communication style to the user's needs and preferences (not in our eyes though)
  • Maintaining an objective, impartial and helpful stance
  • Constantly expanding knowledge base through training on high-quality data

Support

Official Troubleshooting Guide, Community Forums

Key Features

  • Markdown formatting
  • Content creation
  • Prompt engineering
  • Model variants
  • Self-awareness
  • Capabilities overview

Pros & Cons

Pros:

  • rather fast
  • Formulations sound a bit better than ChatGPT (no adjective stuffing)

Cons:

  • Requires phone number and age verification despite Google login
  • Quite long answers for simple questions
  • Has memory, but doesn't build on it
  • Displays system prompt and clutters the interface with it

Price Point

Claude 3
Limited to approx. 10 prompts daily
Claude 3 Opus / Pro
€18/month + VAT
Claude for Enterprise
Tailored Pricing
Free Trial
daily message limit based on demand (I had about 10 messages per day)

Conclusion

I wasn't able to test Claude until now, because it hadn't launched in Europe and it didn't sound promising enough to get a VPN just for that. In my opinion it's much weaker than ChatGPT or Llama 3. I don't understand how it clearly has a memory and knows what it answered last but refuses to build on it. It also really had issues understanding my questions, where other LLMs no longer would. I assume that with new data flowing in from Europe, where the LLM was newly released, the performance may decline in the short term but will improve again over time.

Reviewed May 16th, 2024

-

Setup/Onboarding

1/5

User Experience

3/5

Tool Performance

4/5

Overall score:

2.75/5

Use Cases

Llama 2 is designed for research and commercial purposes. It is able to run without internet and can access your local computer data. Interesting to know: Microsoft provides resources and supports Llama.

Support

Tutorials, knowledge base

Key Features

  • AI chatbot
  • Advanced AI
  • Interpreting user prompts
  • Generating accurate answers
  • Conversational capabilities
  • Responding to queries
  • Enhanced AI Capabilities
  • Advanced Query Handling
  • Adaptive Learning

Pros & Cons

Pros:

  • also works offline

Cons:

  • requires signup
  • issues with download

Price Point

Llama 2
free
Free Trial
Completely free

Conclusion

The only reason you might prefer the complicated setup of Llama 2 over ChatGPT or Gemini is if you are experienced and able to code; then maybe Llama can add value. For everyone starting out with LLMs, there is no need for the extra hassle.

Llama 2 (from Meta)