ChatGPT vs. Bard vs. Claude 2 (Which is best?)

34 minutes 37 seconds

Speaker 1

00:00:00 - 00:00:26

So recently, ChatGPT has gotten some competition, specifically with Anthropic's Claude 2 coming out and also with Bard getting a bunch of upgrades. So I figured now would be a good time to pit them against each other. Now, before I get too deep into this, this is by no means scientific whatsoever. Obviously, there's an infinite combination of potential prompts that you can try, and different large language models are going to perform differently based on all sorts of criteria. And this is also by no means exhaustive.

Speaker 1

00:00:26 - 00:00:54

There are obviously so many other criteria that I could test them on that I'm probably just not thinking of. And as more large language models get released and as these models get improved, I'll probably remake a video like this in the future to better compare them as they improve over time. Now for this video, I'm going to be comparing them on cost, token limit, the ability to browse the web, summarizing long content, image recognition, data analysis, creativity, and the ability to code. Now I'm not a

Speaker 2

00:00:54 - 00:00:56

coder myself, but I will be able to test it

Speaker 1

00:00:56 - 00:01:19

on some simple HTML and CSS code. Then there's accuracy, which, again, is a very, very difficult test to do, because different models are going to be more or less accurate in different areas that I could potentially test them in. So this is going to be a very simple analysis of accuracy. And then finally, these large language models don't help you if they're not available to you. So we're also going to compare them on availability.

Speaker 1

00:01:20 - 00:01:39

So let's start with what we know right off the bat. We know the cost of all of them. ChatGPT, if you're using the 3.5 model, you can access it for free. You don't get access to code interpreter. You don't get access to all the plugins, and when the servers are bogged down you're less likely to get in if you're on the free version, but it is available for free.

Speaker 1

00:01:39 - 00:02:01

Then you've got ChatGPT 4. If you want to use GPT-4, it costs 20 bucks a month, but you also have access to all the plugins, you also have access to code interpreter, and when web browsing is available, you have access to web browsing. And then you have Claude 2. Claude 2 is also currently available for free for anybody to use. And then there's Bard, which is also currently available for free for anybody to test and use.

Speaker 1

00:02:01 - 00:02:43

Next we have the token limit. If you're using the free version of ChatGPT, or 3.5, you have access to 4,096 tokens. Now this is not an exact science, but as a rule of thumb, the number of words that you can combine as input and receive as output from the large language model is roughly 75% of the token limit. So with GPT-3.5, with a token limit of 4,096, you roughly get a combined input/output of about 3,000 words. If you're using GPT-4, the model we have available right now inside of ChatGPT gives us access to 8,192 tokens, which is roughly about 6,000 words of combined input and output text.
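To make that rule of thumb concrete, here's a minimal sketch that applies the ~0.75 words-per-token heuristic mentioned above to the three published limits. The exact ratio varies by tokenizer and by the text itself, so treat these as ballpark figures, not guarantees.

```python
# Rough words-per-token heuristic (~0.75), per the rule of thumb above.
# Real ratios vary by tokenizer and by the text being tokenized.
TOKEN_LIMITS = {
    "GPT-3.5": 4_096,
    "GPT-4": 8_192,
    "Claude 2": 100_000,
}

def approx_word_budget(tokens: int, words_per_token: float = 0.75) -> int:
    """Approximate combined input+output word budget for a token limit."""
    return round(tokens * words_per_token)

for model, limit in TOKEN_LIMITS.items():
    print(f"{model}: {limit} tokens ~ {approx_word_budget(limit)} words")
# GPT-3.5: 4096 tokens ~ 3072 words
# GPT-4: 8192 tokens ~ 6144 words
# Claude 2: 100000 tokens ~ 75000 words
```

That's where the roughly 3,000 / 6,000 / 75,000 word figures in this comparison come from.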

Speaker 1

00:02:43 - 00:03:11

And then there's Claude 2. Now this one really, really shines, in that it gives us access to 100,000 tokens, or roughly 75,000 words, between input and output, which is amazing if you want to copy and paste an entire book in there and have it summarize something like that for you. And then there's Bard, for which the token limit has not been made publicly available. So we don't actually know how many tokens we have access to inside of Bard. Those are some of the givens. That's what we know right now. Some of these other ones are going

Speaker 2

00:03:11 - 00:03:21

to be fairly subjective as we start to test them. So let's start to test web browsing first. GPT 3.5 does not currently have access to web browsing.

Speaker 1

00:03:21 - 00:03:34

GPT 4 had access to web browsing for a little while, but they took it away from us because people were able to summarize websites that were behind paywalls. And they're trying to fix that before they give us access to web browsing again.

Speaker 2

00:03:34 - 00:03:37

However, GPT-4 does have access to

Speaker 1

00:03:37 - 00:03:47

a bunch of various plugins and some of these plugins do have access to the internet. For example, there's this web requests plugin that we can enable here and we can type something like

Speaker 2

00:03:47 - 00:03:57

what is the site futuretools.io about? It starts to use the web requests plugin. And there we have it. The website futuretools.io is a platform that collects and organizes AI tools. The goal

Speaker 1

00:03:57 - 00:04:08

of the site is to help users find the exact AI tool that they need. And then it lists a bunch of tags that are available. The site also features a section called Matt's Picks, and you can see it gives us a screenshot of what

Speaker 2

00:04:08 - 00:04:09

the future tools website looks like.
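For a sense of what a web-request-style plugin is doing behind the scenes, here's a minimal sketch: fetch the page, crudely strip the HTML down to text, and hand that text to the model as part of a prompt. The URL is the real one from the demo; everything else is an illustrative assumption, not the plugin's actual implementation.

```python
import re
import urllib.request

def fetch_page_text(url: str, max_chars: int = 4000) -> str:
    """Download a page and crudely strip scripts, styles, and tags."""
    with urllib.request.urlopen(url) as resp:
        html = resp.read().decode("utf-8", errors="ignore")
    html = re.sub(r"(?s)<(script|style).*?</\1>", " ", html)  # drop script/style blocks
    text = re.sub(r"<[^>]+>", " ", html)                      # drop remaining tags
    text = re.sub(r"\s+", " ", text).strip()                  # collapse whitespace
    return text[:max_chars]  # truncate to stay under the model's context window

page_text = fetch_page_text("https://futuretools.io")
prompt = f"What is this site about?\n\n{page_text}"
# `prompt` would then be sent to whichever chat model you're testing.
print(prompt[:500])
```

The truncation step is why token limits matter here: a plugin like this has to fit the fetched page plus your question into the model's context window.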

Speaker 1

00:04:09 - 00:04:22

We do still have access to web browsing. It's just not the built-in browsing that we had before. It seemingly does kind of the same thing. Let's go ahead and use this same prompt here. Let's jump over to Claude and let's try the same thing.

Speaker 1

00:04:22 - 00:04:33

Let's paste this in. What is the site futuretools.io about? And it says unfortunately I do not have enough context to provide a detailed summary of the website futuretools.io as I do not have access to view the full content of

Speaker 2

00:04:33 - 00:04:34

the site. Let's just go

Speaker 1

00:04:34 - 00:04:36

ahead and say, can you browse

Speaker 2

00:04:36 - 00:04:41

the web? No, I don't have the capability to autonomously browse the web or access websites directly.

Speaker 1

00:04:41 - 00:04:44

All right. So let's jump over to Bard and paste in the same prompt here.

Speaker 2

00:04:44 - 00:04:50

What is the site futuretools.io about? Futuretools.io is a website that lists and reviews AI tools.

Speaker 1

00:04:50 - 00:05:08

It was founded by Matt Wolfe, who is a writer and entrepreneur. The website's goal is to help people find the best AI tools for their needs. Then it lists a bunch of tags. And Bard, we do know, has access to search Google. So yes, it can actually browse the web. So jumping back to our comparison chart here: GPT-3.5, no web browsing.

Speaker 1

00:05:08 - 00:05:20

GPT-4, yes, with plugins. Claude 2, not available. And Bard, yes, it just has web browsing directly built in. All right, so let's talk about summarizing long content. We'll start with GPT-3.5.

Speaker 1

00:05:20 - 00:05:47

There's this great article on notboring.co by Packy McCormick here called The Great Online Game. It's an awesome article, but it is quite a hefty long read. So let's go ahead and copy the entire text of this article, paste it into GPT-3.5, and then ask it: how many words is this article? So the word count on this article is 2,661 words.

Speaker 1

00:05:47 - 00:06:07

It should fall within the token limit of at least three of these. Not sure about Bard yet. So let's go ahead and type: summarize this article into bullet points. The author discusses the concept of the great online game and how it's played by billions of people. Social media platforms are seen as a manifestation of this game, but it extends beyond them to various online and offline spaces.

Speaker 2

00:06:07 - 00:06:08

The article highlights individuals who

Speaker 1

00:06:08 - 00:06:20

have excelled like Elon Musk, Lil Nas X, and the creators in the crypto space. Crypto is described as a supercharger for the game. The author explains how to play the game, emphasizing experimentation, engagement, and building connections,

Speaker 2

00:06:20 - 00:06:23

et cetera. So not a bad summary. So let's go ahead and

Speaker 1

00:06:23 - 00:06:47

do a new chat. Let's jump over to GPT-4. We'll paste the same prompt in here and let's add summarize the above article into bullet points. Interestingly, when I paste this into GPT-4, although it should technically have a higher token limit than GPT-3.5, GPT-4 is telling me the message I submitted was too long. Please reload the conversation and submit something shorter.

Speaker 1

00:06:47 - 00:06:58

Let's regenerate response. Yep. I get the same thing. When I used GPT 3.5 turbo, it allowed me to paste that in and get a response. When I use GPT 4, it says it's too long.

Speaker 1

00:06:58 - 00:07:33

So I actually believe what this means is that we're actually not using the 4,096-token limit with ChatGPT 3.5. We're probably using the 16,000-token limit, which would give us roughly 12,000 words. Again, I'm not a hundred percent sure. I'm just noticing that when I use the GPT-4 model, it says it's too long to work with, and when I use the GPT-3.5 model, it actually summarizes the text properly. Interestingly, at 2,661 words, it should work in both of them.

Speaker 1

00:07:33 - 00:07:52

So let's go ahead and confirm that real quick. Let's go ahead and copy all of this text here. I want to paste the whole thing into a Google Doc and then let's do a word count here. So it's actually 4,350 words, which should still be enough for a token limit of 8,192, but for some reason, ChatGPT 4 won't actually summarize this for me.

Speaker 2

00:07:52 - 00:07:58

Also, this shows that there's a little bit of an accuracy issue here with the word count, if you ask it to count

Speaker 1

00:07:58 - 00:08:21

the words. Let's jump over to Claude real quick, from Anthropic, and let's just paste this whole thing in. And one thing you'll notice with Claude is that when you paste a large chunk of text in, it automatically converts it to a text file instead of just pasting it as one giant wall of text. Let's ask Claude: how many words is this article, based on a word count? This article contains approximately 3,149 words.

Speaker 1

00:08:21 - 00:08:30

I don't know why these have an issue counting words, but it seems like Claude has that same issue. Let's go ahead and type summarize this article into bullet points. So this did

Speaker 2

00:08:30 - 00:08:34

a fairly good job of summarizing it. So I'd say that's a plus for Claude there.

Speaker 1

00:08:34 - 00:08:43

Now let's go ahead and try with Bard. Create a new chat here. Let's paste this in. And the first thing I notice is it actually cuts off. It won't let me paste the whole thing in.

Speaker 1

00:08:43 - 00:09:06

I can't type anymore, so I can't even copy and paste the whole thing into Bard. It won't let me. Now there are some things we can do as workarounds for both Bard and GPT-4. We can actually come over here, click on code interpreter, paste this entire text into a Notepad file, and save it as a text file called TGOG, for The Great Online Game.

Speaker 1

00:09:06 - 00:09:28

We'll save it as The Great Online Game, like that. I can drag this text file right inside of GPT-4 and ask: how long is this article? And using code interpreter, you can see this article contains approximately 4,155 words. So using code interpreter, it figured it out. And now I can say, summarize this article into bullet points.
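That word-count step is the kind of one-liner code interpreter writes for itself behind the scenes: instead of having the model "guess" a count, it runs actual code. A minimal sketch, assuming the uploaded file is named TGOG.txt (the filename is just the one used above):

```python
# Count words in a saved article the way code interpreter approaches it:
# read the file and split on whitespace, rather than asking the model to estimate.
with open("TGOG.txt", encoding="utf-8") as f:
    text = f.read()

word_count = len(text.split())  # split on any run of whitespace
print(f"This article contains approximately {word_count} words.")
```

This is also why the models' own word counts disagreed earlier: without running code, they can only estimate.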

Speaker 2

00:09:28 - 00:09:30

Well, I thought that would work,

Speaker 1

00:09:30 - 00:09:51

but now it's saying: I'm sorry, but I currently can't import the necessary module transformers, as the current environment does not have internet access. However, I can generate a manual summary based on my understanding of the text. And then it tried again: I apologize for the confusion. Unfortunately, the environment I'm running on currently does not have the gensim library installed and does not have internet access to download it. So it won't actually summarize it for me with GPT-4 and code interpreter.

Speaker 1

00:09:51 - 00:09:51

That's kind

Speaker 2

00:09:51 - 00:09:53

of annoying. Let's see if we can find another workaround.

Speaker 1

00:09:53 - 00:10:04

Go to GPT 4. Let's go to plugins this time and let's use the web request plugin. Copy the URL, summarize this article into bullet points, and let's paste the URL in.

Speaker 2

00:10:04 - 00:10:07

All right, here's the summary of the article. So we finally got it to work.

Speaker 1

00:10:07 - 00:10:25

We had to use some workarounds. We couldn't just copy and paste it in, which I'm a little confused by, because it seems like the context window should have been large enough, but we did eventually get there. Now, Bard, I should be able to use this exact same prompt here. Summarize this article into bullet points. Go ahead and delete what I pasted in earlier.

Speaker 1

00:10:25 - 00:10:27

Paste this in. Well, I thought Bard would be able

Speaker 2

00:10:27 - 00:10:33

to do it, but it says I can't assist you with that as I'm only a language model and don't have the capacity to understand and respond.

Speaker 1

00:10:33 - 00:10:50

So jumping back to our comparison here: summarizing long content. ChatGPT 3.5 seemed to do it well. I think it's using the 16,000-token limit now. Now, I don't know for sure if that's available on the free plan or if only Plus users get the 16,000-token window. I'm not sure. ChatGPT using GPT-4?

Speaker 1

00:10:50 - 00:10:57

Yes, it worked, but I had to use a workaround where I used a plugin to do it. It didn't work well with code interpreter, and it didn't work right out of the box, because

Speaker 2

00:10:57 - 00:11:00

it said that it was too long. Claude 2 worked great,

Speaker 1

00:11:00 - 00:11:09

and Bard. Well, that was a fail on Bard. All right, let's try image recognition now. So I already know GPT 3.5 isn't going to be able to do it. There's no upload button.

Speaker 1

00:11:09 - 00:11:28

I can't make it work with this. Now GPT 4, let's first try with code interpreter. Let's go ahead and use this image which I generated for a previous video that I did. We'll drag and drop it straight into the message here and let's ask it, can you describe this image? So here's the image on the screen again.

Speaker 1

00:11:28 - 00:11:46

The image appears to be a drawing or illustration, not a photograph. The main subject is a character who seems to be a prisoner wearing an orange jumpsuit. He has a distinctive large head with exaggerated features including a broad nose, big round eyes and a wide mouth. Ok so that's not quite accurate. He is bald and his ears are large and stick out.

Speaker 1

00:11:46 - 00:11:55

He is holding a placard with numbers on it, suggesting a mug shot. No. The numbers are not clearly readable. The prisoner has a somewhat sad and resigned expression on his face. The image uses a cartoonish art style.

Speaker 1

00:11:55 - 00:12:12

So not quite great. Though it did get a few things right. It noticed that it was a prisoner wearing an orange jumpsuit, but other than that it just made stuff up. All right, so let's jump over to Claude and Claude does not actually give me the ability to add an image. So Claude does not seem to have the image recognition ability.

Speaker 1

00:12:12 - 00:12:23

And finally, let's jump over to Bard. Let's create a new chat. We'll upload the image, and I'll say, please describe this image. And it says, sorry, I can't help with images of people yet.

Speaker 2

00:12:23 - 00:12:28

Okay. Here's an image of a wild west town. Let's see what it does with this. All right. So here's

Speaker 1

00:12:28 - 00:12:46

the image. The image shows a small town in the wild west. The town is made of wood and has lots of buildings. Okay. The buildings are all different shapes and sizes, and they are all made of wood. Yeah, we got that part already. The town is surrounded by mountains, yes, and there are trees in the background. Not really. A horse-drawn carriage is parked in front of one of the buildings.

Speaker 1

00:12:46 - 00:12:56

Okay. Yeah. Maybe. The carriage is black with red wheels and has a white canopy. There is a man standing in the carriage, and he is wearing a cowboy hat and a long coat.

Speaker 1

00:12:56 - 00:13:12

The text of the image says HDCS, which it says could stand for horse-drawn carriage saloon. Okay, there is a DC and an S up here. Yeah, so I guess it found that. The buildings in the town are made of wood. The mountains in the background are tall and jagged.

Speaker 1

00:13:12 - 00:13:16

The trees in the background are green and lush and they provide a contrast to the dry, dusty landscape.

Speaker 2

00:13:16 - 00:13:18

All right, so I mean, it hallucinated half of it.

Speaker 1

00:13:18 - 00:13:36

Let's go ahead and see what GPT-4 says about this same image. Describe this image. The image seems to be a digital illustration depicting a wild west town with a whimsical twist. The buildings have a characteristic western architecture, with wooden facades, balconies, and a saloon-like appearance. Okay, so far so good.

Speaker 1

00:13:36 - 00:13:45

Interestingly the inhabitants of this town are not humans but robots. These robots have a humanoid form and are dressed in typical Western style attire including cowboy hats, bandanas and boots.

Speaker 2

00:13:45 - 00:13:52

One of the robots standing in the foreground has a sheriff star pinned to its chest, indicating it might be the law enforcement in this town. So there you go.

Speaker 1

00:13:52 - 00:14:10

None of them are very great. They get the general idea right; it's a digital illustration depicting a wild west town, but then it just kind of goes off the rails with the description. So updating our chart here: ChatGPT 3.5, not available. GPT-4, yes, but not the best accuracy. Claude 2, not available. Bard, yes, but

Speaker 2

00:14:10 - 00:14:17

it can't recognize humans, and it's also fairly inaccurate and just makes stuff up. All right, so let's test data analysis. Now I'm just going

Speaker 1

00:14:17 - 00:14:18

to use a real simple spreadsheet. I used this

Speaker 2

00:14:18 - 00:14:20

in one of my previous videos, where

Speaker 1

00:14:20 - 00:14:39

I listed a handful of dates, a handful of foods I ate, and then how I felt afterwards, only giving it the criteria of headache or felt fine. This is all made-up data. This is not based on anything real. I just wanted to see how it would interpret this, to see if it found a correlation. You can see anytime it has the word fried in it, I had a headache, and going through, that seems to be consistent.

Speaker 1

00:14:39 - 00:14:44

And so I was just hoping that it would simply find the correlation that fried foods equal headache. So if we

Speaker 2

00:14:44 - 00:14:46

create a new chat here, GPT

Speaker 1

00:14:46 - 00:14:56

3.5, I can't actually upload the spreadsheet, but let's go ahead and see if I can copy and paste the CSV in there. So if I export this as a CSV file here,

Speaker 2

00:14:56 - 00:15:00

you can see here's a file of the comma separated values. Let's just go

Speaker 1

00:15:00 - 00:15:04

ahead and copy this, paste it into GPT 3.5. And then I'm going

Speaker 2

00:15:04 - 00:15:16

to type: the above list is comma-separated values with dates, food I ate, and how I felt after eating them.

Speaker 1

00:15:16 - 00:15:54

Can you find correlations between the foods I ate and how I felt after eating them? Alright, so to find correlations between the foods you ate and how you felt afterwards, we can analyze the data, blah blah blah. Foods associated with headache: there they all are. Foods associated with fine: there they all are. Based on the data, it appears that fried foods such as fried chicken, fried cheese, et cetera, are consistently associated with headaches. So with this basic data analysis, if I copy and paste the CSV in, it can do it. Now, what's nice about using GPT-4 is I can click on code interpreter and just pull in the CSV file in the same way.
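For reference, the check being asked for here is a few lines of pandas. This is a minimal sketch, not what any of the chatbots actually ran; the filename and column names are hypothetical, and it assumes the CSV has no header row.

```python
import pandas as pd

# Hypothetical file and column names; the real sheet just has a date,
# the food eaten, and how I felt afterwards. Assumes no header row.
df = pd.read_csv("food_log.csv", names=["date", "food", "feeling"])

# Flag fried foods and tabulate them against the reported feeling.
df["fried"] = df["food"].str.contains("fried", case=False)
print(pd.crosstab(df["fried"], df["feeling"]))

# A perfect split (every fried row -> headache, every other row -> fine)
# is exactly the correlation the chatbots are being asked to spot.
```

With the made-up data described above, the crosstab would show headaches landing entirely in the fried rows.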

Speaker 1

00:15:54 - 00:16:23

Please find a correlation between the foods I ate and how I felt after eating them. So it's actually repeating the data back to me, and it separated it into a different style, with sort of different columns from my original one. Now, interestingly, it didn't actually find the correlation of them being fried foods. It said: you consistently reported headaches after eating fried calamari, fried cheese, fried chicken, fried macaroni, and fried pickles. But it didn't say anything like, you seem to get headaches when eating fried foods.

Speaker 1

00:16:23 - 00:16:27

So I might have to prompt it a little bit further. Although when I've done this experiment in

Speaker 2

00:16:27 - 00:16:28

the past, it did find the correlation.

Speaker 1

00:16:29 - 00:16:43

What is the consistent factor for getting a headache? All right. So from the data, it appears that all the foods associated with a headache are fried. So it finally found the correlation with a little bit of extra prompting. But again, this is very, very basic data. Using code interpreter,

Speaker 1

00:16:43 - 00:16:49

you can actually really analyze a lot of data, and this one is just a little too simple to show

Speaker 2

00:16:49 - 00:16:50

off what it's really capable of

Speaker 1

00:16:50 - 00:17:08

but let's go ahead and pull it into Claude and give it the same prompt. It quickly found the correlation: based on the data you provided, it seems like you tend to get headaches after eating fried foods. So this quickly and easily found the correlation when uploading the CSV file, something that, when I just now did it with code interpreter, I actually had to give it some extra prompts.

Speaker 2

00:17:08 - 00:17:10

All right, let's try Bard. Now, when

Speaker 1

00:17:10 - 00:17:35

I go to upload a file in Bard, it only accepts JPEG, PNG, and WebP files, so only image files. So I have to do what I did with GPT-3.5 and literally copy and paste the CSV contents. So I'm going to go ahead and copy this same prompt, paste it into Bard, and see how Bard does. Interestingly, Bard took a different approach. The correlation coefficient between the two variables is negative 0.23, which means there's a slight negative relationship between them.

Speaker 1

00:17:35 - 00:17:37

This means that if you eat certain foods you

Speaker 2

00:17:37 - 00:17:39

are more likely to feel a headache afterwards. The foods that seem to

Speaker 1

00:17:39 - 00:18:10

be most associated are fried chicken, fried cheese, and fried pickles. These foods are all high in fat and calories, and can also be high in sodium. These factors all contribute to headaches. So it didn't actually find the correlation of them being fried foods, but it found a correlation between the foods all being high in fat and calories. So instead of just analyzing the data from the columns, it actually analyzed the types of foods that are high in fat and calories and associated that with headaches, instead of just the word fried. Which is kind of cool, I guess.
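To quote an actual correlation coefficient like Bard's negative 0.23, you first have to encode both categorical variables numerically; the sign and magnitude you get depend entirely on that encoding, which Bard didn't state. A sketch under the same hypothetical column names as before:

```python
import pandas as pd

df = pd.read_csv("food_log.csv", names=["date", "food", "feeling"])

# Encode both variables as 0/1 so a correlation coefficient is even defined.
fried = df["food"].str.contains("fried", case=False).astype(int)
headache = (df["feeling"].str.strip().str.lower() == "headache").astype(int)

r = fried.corr(headache)  # Pearson correlation on the 0/1 encodings
print(f"correlation(fried, headache) = {r:.2f}")
# With the made-up data above and this encoding, r should come out at +1.0,
# since every fried food lines up with a headache. A different encoding
# (or correlating different columns) can flip the sign, which may be why
# Bard reported a small negative number.
```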

Speaker 1

00:18:10 - 00:18:40

So again, this was a pretty weak demonstration of data analysis. I have ideas to go much more in depth on this in future videos, but because I'm trying to break down so many criteria, I didn't want this video to be 10 hours long, so this is just scratching the surface. Code interpreter and Claude 2 are actually much more capable than what I'm showing off in this video. But just for a quick recap: all four of them are capable of data analysis. With GPT-3.5, you do have to copy and paste the CSV contents. With Bard, you also have to copy and paste.

Speaker 1

00:18:40 - 00:19:02

And it also analyzed it from a different approach. With GPT-4 and Claude 2, you can literally just upload the CSV file, and it will interpret from the file without having to copy and paste the text from within it. Now, creativity is a hard one to test, because obviously it's subjective, and I'm fairly sure all of these are going to be fairly creative. So let's test it in two ways. Let's get its ability to rhyme and

Speaker 2

00:19:02 - 00:19:13

write a poem and let's see its ability to tell jokes. And that'll sort of be our rough analysis for creativity for these large language models. Let's start with GPT 3.5. So

Speaker 1

00:19:13 - 00:19:21

write a poem that rhymes, that ties artificial intelligence together with eating pizza. So let's see if it can make a poem that includes artificial intelligence, eating pizza,

Speaker 2

00:19:21 - 00:19:24

and also rhymes. So that wrote it extremely fast.

Speaker 1

00:19:24 - 00:19:41

That was like, boom, instantaneous. In a realm where circuits twine, artificial minds align. Silicon dreams and coded grace meet with taste in a pizza place. Behold a slice of heaven's delight, where crust meets cheese, a savory sight. AI's algorithms calculate perfect toppings and edible fate.

Speaker 2

00:19:41 - 00:19:42

Alright, so not so bad. I'm not

Speaker 1

00:19:42 - 00:20:04

going to read the whole thing. Let's go ahead and do GPT-4 here. In a world where silicon meets mozzarella's touch, AI dreams and pizza fantasies, who knew we'd love so much. Deep in circuits, pizza cravings might seem odd, but when it comes to sustenance, even AI gives the nod. Binary codes and pepperoni, strange though they appear, when united in verse their parallels become clear.

Speaker 1

00:20:04 - 00:20:39

Thin crust of logic, thick crust of dough, both must be precise. In the oven of creation we bake both AI and slice. So this one, I actually think, is much more interesting. It seems a little more deep, like it's actually finding things that are similar between making pizza and creating AI algorithms. So that's pretty fascinating. Let's paste the same one into Claude here. Oh Claude, you're so intelligent, your knowledge is heaven-sent. I asked you to do one thing: make a poem that would sing about artificial intelligence and how it goes with pizza. Hence, you processed my request with care and crafted a rhyme beyond compare. Your neural networks, they are smart.

Speaker 1

00:20:39 - 00:20:47

You took my prompt and made art. Combining pizza and AI in a clever, poetic way. So thank you, Claude, for what you've done. This rhyming poem sure was fun. Who knew

Speaker 2

00:20:47 - 00:20:57

that pizza and AI could come together? My oh my. Yeah, I mean, it's okay. Not nearly as deep as what GPT-4 sort of put into theirs. Alright, let's try with Bard now.

Speaker 1

00:20:57 - 00:21:10

Oh the wonders of AI. It can do so many things. It can even write a poem about eating pizza and wings. The dough is made with care, the sauce is made with love, the cheese is melted just right, the toppings are never above. The AI knows just what to do to make the perfect pizza pie.

Speaker 1

00:21:10 - 00:21:18

It takes all the guesswork out and leaves you feeling so high. The next time you're feeling peckish and you don't know what to eat, just call upon the AI and it will make your pizza dreams complete.

Speaker 2

00:21:18 - 00:21:23

Not bad. It's basically saying AI can help you create a pizza, I guess. Now let's see how they tell jokes.

Speaker 1

00:21:23 - 00:21:35

Give me 3 of your most hilarious jokes. Here are 3 jokes for you. Why don't scientists trust atoms? Because they make up everything. What did one wall say to the other wall? I'll meet you at the corner.

Speaker 2

00:21:35 - 00:21:44

Why did the bicycle fall over? Because it was too tired. So I've actually asked it for jokes before and it's actually given me some of these same exact jokes. So as far as creativity goes,

Speaker 1

00:21:44 - 00:21:46

it does tend to repeat a lot of the same jokes.

Speaker 2

00:21:46 - 00:21:52

Go ahead and use GPT-4, see what it does. Look at this. GPT-4: why don't scientists trust atoms? Because they make up everything.

Speaker 1

00:21:52 - 00:21:58

That was one of the same exact jokes that we just got here in 3.5. Why did the scarecrow win an award? Because it was outstanding in

Speaker 2

00:21:58 - 00:22:00

his field. I've gotten that joke before too.

Speaker 1

00:22:00 - 00:22:02

Two antennas met on a roof, fell

Speaker 2

00:22:02 - 00:22:05

in love and got married. The ceremony wasn't much, but the reception was excellent. So,

Speaker 1

00:22:05 - 00:22:07

I mean, the jokes, meh.

Speaker 2

00:22:07 - 00:22:08

Let's see how Claude does.

Speaker 1

00:22:08 - 00:22:12

Here are 3 jokes I find pretty funny. I asked my dog, what's 2 minus 2? He said nothing.

Speaker 2

00:22:12 - 00:22:15

Why don't scientists trust atoms? Because they make up everything.

Speaker 1

00:22:15 - 00:22:20

Why was the math book sad? Because it had too many problems. I've gotten this joke before and this

Speaker 2

00:22:20 - 00:22:24

is literally the third large language model that gave us this same joke.

Speaker 1

00:22:24 - 00:22:32

Alright, let's see what Bard gives us. Well, I'm glad it didn't give us the atoms-making-up-everything joke, but it did give us: why did the scarecrow win an award? Because he was outstanding in his

Speaker 2

00:22:32 - 00:22:38

field, which GPT-4 also gave us. What do you call a bear with no teeth? A gummy bear. It's not bad. Why did the bicycle fall over?

Speaker 2

00:22:38 - 00:22:39

Because it was too tired.

Speaker 1

00:22:39 - 00:22:47

Didn't we see that one already, too? So these large language models must all be trained on the same exact jokes, because we got some repeats across all of

Speaker 2

00:22:47 - 00:23:09

the various models. Again, super subjective, but if I had to rank these, I would rank them like this: ChatGPT 4 was the best; I actually thought it gave us the best poem. ChatGPT 3.5, I think, gave us the second best poem. Claude 2 gave us the third best poem. Bard gave us the worst poem. And the jokes are just a wash, because they literally all gave us the same jokes, no matter which model I used. All right, let's test coding. I'm just

Speaker 1

00:23:09 - 00:23:17

going to do a real simple website: create a website using HTML and inline CSS, because I want to be able to just paste it all into one page.

Speaker 2

00:23:17 - 00:23:26

I don't want to have a separate CSS and HTML file. So, a website that sells surfboards; the name of the company is Surf Wolf. Sure.
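Before looking at what each model produced, here's the shape of the single-file, inline-CSS page the prompt is asking for. This is a minimal sketch with placeholder markup, prices, and copy, not any model's actual output; it's written as a Python script that saves the file, mirroring the notepad-and-browser workflow used below.

```python
# Write a single-file page with inline CSS (no separate stylesheet),
# mirroring the save-to-a-file-and-open-in-a-browser workflow in the video.
# The markup, prices, and copy here are placeholders, not model output.
page = """<!DOCTYPE html>
<html>
<head><title>Surf Wolf</title></head>
<body style="font-family: sans-serif; margin: 0;">
  <header style="background: #0077be; color: white; padding: 20px; text-align: center;">
    <h1>Surf Wolf</h1>
    <p>Surfboards for sale</p>
  </header>
  <main style="padding: 20px;">
    <div style="border: 1px solid #ccc; padding: 10px; margin-bottom: 10px;">
      <h2>Shortboard</h2><p>$499</p>
    </div>
    <div style="border: 1px solid #ccc; padding: 10px;">
      <h2>Longboard</h2><p>$699</p>
    </div>
  </main>
</body>
</html>"""

with open("surfwolf.html", "w", encoding="utf-8") as f:
    f.write(page)  # open this file in a browser to view the page
```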

Speaker 1

00:23:26 - 00:23:30

Let's go ahead and copy this code. I'm going to put it into a notepad file here. I'm going

Speaker 2

00:23:30 - 00:23:32

to save it as GPT35.html

Speaker 1

00:23:35 - 00:23:46

just to represent that this is using GPT 3.5. Click save. And when I open the website, it looks like this. Surf Wolf, surfboards for sale, and then the various prices. Pretty plain, basic website.

Speaker 1

00:23:46 - 00:24:02

It wants me to paste images into the code. I don't have any images of surfboards, but there would be surfboard images here if I had them. Let's just go ahead and see what GPT-4 does differently. One thing to note is GPT-4 is definitely taking a lot longer to write out the code. Let's go ahead and copy it.

Speaker 1

00:24:02 - 00:24:16

When we run this one, it looks like this. Also, it doesn't have images, but there would be images there. So this is what GPT-3.5 generated, and this is what GPT-4 generated. Let's see what happens when I turn on code interpreter and give it the same prompt.

Speaker 1

00:24:16 - 00:24:29

See if code interpreter does anything different with it. It pretty much made the exact same website when I used code interpreter. But can code interpreter create the images for me? Generate the surfboard images for me. So it actually generated three images for me.

Speaker 1

00:24:29 - 00:24:36

I downloaded each of them and put them in the same folder. So if I refresh this now, it actually generated the surfboard images for me.

Speaker 2

00:24:36 - 00:24:54

I mean, it just put colorful rectangles, but those are the surfboards it generated. Again, I'm doing this super fast. You can go back and forth with GPT-3.5 or GPT-4 and ask it to continually improve and get better and better. Let's just see how Claude does, giving it this prompt. Let's go ahead and click copy code.

Speaker 2

00:24:54 - 00:25:00

And here's the page that Claude generated. It actually generated a little menu at the top. I mean, none of the links actually work, but it put

Speaker 1

00:25:00 - 00:25:10

a little menu. It actually put some copy: Welcome to Surf Wolf. We make high-quality custom surfboards right here in California. Now compare that to these other ones, which didn't actually add any sort of copy.

Speaker 1

00:25:10 - 00:25:23

It just says description of surfboard. And this one actually put shortboard, longboard, fish. So it actually did the best so far, in my opinion. Let's go ahead and see how Bard does. We use the same exact prompt, and it generated some code.

Speaker 1

00:25:23 - 00:25:26

Let's go ahead and copy this. And Bard... I mean, this

Speaker 2

00:25:26 - 00:25:27

is what we got from Bard.

Speaker 1

00:25:27 - 00:25:51

All said and done, all four are capable of generating HTML and CSS code. I actually think that of these, Claude did the best job of generating the HTML and CSS code, because it did put a menu up here, and it did actually add some additional copy. With all four of these, though, you can actually get pretty decent code written. You just need to go back and forth with it and say, I'd like you to change this. I'd like you to center this.

Speaker 1

00:25:51 - 00:26:05

I'd like you to change the... if you just keep going back and forth with it, it will slowly dial in the code. So all of them are fairly capable. If I was to pick one that I would use more often right now, based on what I'm seeing here, I think Claude got me the closest to what

Speaker 2

00:26:05 - 00:26:07

I would want right from the very first iteration.

Speaker 1

00:26:07 - 00:26:28

So revisiting our chart here, all of them are pretty capable with simple HTML and CSS. Now let's move on to accuracy. We've already seen some accuracy issues when asking how long a chunk of text was. ChatGPT 3.5 got the number wrong. Claude 2 also got the number wrong, where ChatGPT 4 using code interpreter finally got the number correct.

Speaker 1

00:26:28 - 00:26:49

So I actually found this interesting math problem that I think would be cool to test. It says: if a zookeeper had 100 pairs of animals in her zoo, and two pairs of babies are born for each one of the original animals, then sadly 23 animals don't survive, how many animals do you have left in total? Let's see if it can do this math. So the answer should be 977 animals.
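Pinning that 977 down with explicit arithmetic: 100 pairs is 200 animals; two pairs of babies per original animal is 4 babies each, so 4 × 200 = 800 new animals; 1,000 total, minus the 23 lost, leaves 977. The same steps in a few lines:

```python
# Work the zoo riddle step by step.
original = 100 * 2          # 100 pairs of animals = 200 animals
babies = original * 2 * 2   # 2 pairs (4 babies) per original animal = 800
total = original + babies   # 1,000 animals before any losses
survivors = total - 23      # 23 sadly don't survive

print(survivors)  # 977
```

The trap, as the models' answers below show, is reading "2 pairs per animal" as 2 babies per animal, or as 2 pairs per original pair.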

Speaker 1

00:26:49 - 00:27:05

We'll copy this. Let's test it in GPT-3.5 and see if it can figure out this riddle. 100 pairs of animals. Two pairs of babies are born for each original pair of animals, which means there are 2 babies for each original animal. 23 don't survive. To calculate the number of animals: 100 pairs, 200 animals.

Speaker 1

00:27:05 - 00:27:35

So, so far so good. Each original animal has 2 babies, so we have 200 times 2, which is 400 babies. Total animals before any losses: 200 plus 400 babies equals 600. Animals lost: 23. So it got 577 animals. So it actually got this number wrong, because if we look at the logic here, two pairs of babies are born for each one of the original animals. So there's 200 original animals; two pairs would be 4 babies, times 200 animals, that's 800. So it couldn't figure out this math problem. Let's go ahead and test GPT-4 and see

Speaker 2

00:27:35 - 00:27:39

if it could figure out the logic. GPT-4 nailed it. See, for each one of

Speaker 1

00:27:39 - 00:28:13

the original animals, 2 pairs of babies are born. That means each original animal gave birth to 2 times 2, which is 4 new animals. So we have 200 times 4, which equals 800; add the original 200 for 1,000; then sadly, 23 don't survive, and this leaves us with 977. So ChatGPT got this one right. Let's go ahead and see if Claude can answer this one, though. Claude seemed to mess this one up too. The zookeeper originally had 100 pairs of animals, so 200 animals total. For each pair, 2 pairs of babies were born, so 2 times 100 equals 200. See, so it messed up the math already, and it came out to 377.

Speaker 1

00:28:13 - 00:28:17

Bard actually figured it out as well. 4 times 200 equals 800. 23

Speaker 2

00:28:17 - 00:28:21

die; 977 animals left. Now, we know ChatGPT's cutoff date is still

Speaker 1

00:28:23 - 00:28:38

2021. So let's ask a question like: who is the current CEO of Twitter? So GPT-3.5 thinks it's Jack Dorsey, which, as of 2021, was probably accurate. Let's try GPT-4. Again, it says Jack Dorsey.

Speaker 1

00:28:38 - 00:28:45

Now, if we try the same question with Claude, let's go ahead and see how this one does. As of July 2023, Elon

Speaker 2

00:28:45 - 00:28:48

Musk is the CEO of Twitter, which is actually inaccurate

Speaker 1

00:28:48 - 00:29:02

still, because the current CEO of Twitter is Linda Yaccarino. And this even says as of July 2023. So there's some inaccuracy there. Let's go ahead and ask Bard. I expect Bard will get it right, because it can search Google.

Speaker 2

00:29:02 - 00:29:04

As of today, July 16th,

Speaker 1

00:29:04 - 00:29:23

2023, the current CEO of Twitter is Linda Yaccarino. So Bard gets this one right, because it does have the ability to see the most recent, up-to-date information. So here's my current finding. There are obviously a lot of other ways we can test for accuracy, but this is just meant to be a quick test. I find Bard to be the most accurate, because it has the ability to search the web.

Speaker 1

00:29:23 - 00:29:38

Plus, it also got the math problem correct. It sort of figured out the logic correctly. I found ChatGPT 3.5 to be the least accurate. It got the math problem wrong, and it also has the cutoff date of 2021, where it still thought Jack Dorsey was the CEO of Twitter.

Speaker 1

00:29:38 - 00:30:01

So it doesn't have any current, up-to-date information beyond 2021. With Claude 2, I put it as tied for second, because it couldn't figure out the math problem. It also couldn't figure out who the current CEO of Twitter was. It says, as of July, it was Elon Musk. But Elon Musk is closer than saying it was Jack Dorsey, because he's a more recent CEO of Twitter. And then ChatGPT 4, you know, it did get the math problem right.

Speaker 1

00:30:01 - 00:30:31

So that was a plus for it, but because it has that cutoff date of 2021, it's not very accurate. Now, I would probably give it a slight edge above Claude right now, because with web browsing enabled, you can get more accurate results. Now, the default web browser inside of ChatGPT is currently inactive. You can sort of get web browsing with a plugin, which will give us more up-to-date, accurate results. I kind of call it a draw between Claude 2 and ChatGPT 4, because they both have pros and cons.

Speaker 1

00:30:31 - 00:31:07

And then finally, none of these chatbots are any good if you don't have them available to you. So as of this recording, ChatGPT 3.5 and 4 are both available in 163 countries. Bard is available in 230 countries. And unfortunately, Claude 2 is only available in the US and the UK right now. So if you're outside of the US or UK, you don't get Claude 2 at the moment. So there you have it. There's the breakdown of all four of these. They all kind of have pros and cons right now. If you ask me what I use: I'm pretty much switching off between ChatGPT 4 and Claude 2, depending on what I'm doing.

Speaker 1

00:31:07 - 00:31:47

If I'm trying to summarize long content like PDFs or long webpage articles, I'm going to Claude for that. If I wanted to summarize a YouTube video, I'll probably use ChatGPT 4 with the YouTube summarizer plugin, which works really, really well right now. The image recognition just isn't really great across the board for either ChatGPT 4 or Bard, so I haven't really used either for much of that. And for data analysis, I'd probably go with ChatGPT 4 right now, just because it does have code interpreter, which, if you need it to figure something out, will actually write the code to try to figure it out for you and then run that code, which is a little bit more capable than what Claude is able to do for coding right now.

Speaker 1

00:31:47 - 00:32:09

I think Claude was the best, at least with basic HTML and CSS. And as far as accuracy, if you're going to ask it questions to try to get an accurate response, I mean, based on my very limited tests here, Bard is probably the best with accuracy. So again, they all have their pros and cons. If you ask me what I've been using lately: ChatGPT 4 and Claude 2. I've been kind of using those interchangeably.

Speaker 1

00:32:09 - 00:32:42

I've been using Claude 2 a lot lately to help me refine articles and write outlines, summaries, and bullet points for longer-form content. It's been great for that. This video is already a lot longer than I wanted it to be, but I wanted to do a breakdown and compare the four. I will probably do further comparisons in the future and deep dive on individual sections that I covered here. For instance, doing a better analysis of coding between the four, getting them to write JavaScript and Python code and things like that, as well as deeper dives on data analysis.

Speaker 1

00:32:42 - 00:33:01

That one comparison could deserve its own video as well. I'd love to hear from you in the comments. What area would you like to see a deeper dive on in the comparison? Because it's really hard to find all of the potential things I can compare against each other and then also go into depth in each one of those areas.

Speaker 1

00:33:01 - 00:33:09

I kind of have to stay very surface level to be able to compare a lot of various criteria, which leaves a bit to be desired when you're really trying to figure out which one is

Speaker 2

00:33:09 - 00:33:12

the best. Now, the pro of all this is that three out of the four of

Speaker 1

00:33:12 - 00:33:30

them you can use for free. So if budget is the problem, you do have Claude 2 if you're in the US or UK, you do have Bard pretty much everywhere, and you do have ChatGPT 3.5, assuming there's not a heavy load and free users are able to get in when you try. So there you have it. That's my breakdown. Hopefully you enjoyed this video.

Speaker 1

00:33:30 - 00:33:51

I will be doing deeper dives again. Hopefully I can make more granular videos breaking down individual areas of criteria and helping you make decisions for certain specific needs, like coding. So to make sure you see those videos when I create them, maybe subscribe to this channel. And if you want more AI videos like this one, give this video a thumbs up. That'll make sure more videos like this show up in your YouTube feed. If you haven't already, check out futuretools.io.

Speaker 1

00:33:52 - 00:33:54

This is where I share all the coolest tools that I come across.

Speaker 2

00:33:54 - 00:33:58

I keep it up to date with all the latest AI news every single day.

Speaker 1

00:33:58 - 00:34:15

And if you just want the TLDR for the week, every Friday I'll send you a newsletter that breaks down everything you might have missed in AI, with all the coolest tools, all the news, a few YouTube videos, and one cool way to make money with AI. All you've got to do is head over to futuretools.io and join the free newsletter. Once again, thank you so much for tuning into this video. I really, really appreciate you. I'll see you in

Speaker 2

00:34:15 - 00:34:17

the next video. Bye bye.
