Large language models (often simply referred to as AI) have given marketers a wide range of opportunities to make their marketing and PR activities more effective.

But which platforms should you be using? And how do your goals or priorities shape which is the best for you?

In our on-demand webinar, ‘Which AI Should Your Boss Hire?’, we explore how different AI tools and large language models (LLMs) perform, and compare the results across a range of B2B marketing tasks. We will cover:

  • What are LLMs?
  • How LLMs work
  • The differences between LLMs
  • Our tests – a comparison of different LLMs
  • How to get the best out of AI and LLMs in marketing

Register to view our webinar on demand by clicking here, and do get in touch to let us know if our insights helped you.

Napier Webinar: ‘Which AI Should Your Boss Hire?’  Transcript

Speakers: Mike Maynard

Okay, good afternoon, and welcome to the latest Napier webinar. Thank you all for attending. It’s great to have you here today. We’re going to have a bit of fun: we’re going to talk about artificial intelligence, and in particular we’re going to try and find out which AI your boss should hire to replace you. So hopefully we’re going to enjoy this, and it’s going to be quite entertaining. I would encourage you, if you’ve got any questions, to put them into the Q&A, which is a tab by your chat window. So if you select Q&A and let me know whether you’ve got any questions, that would be perfect. Thank you. What I will do is cover the questions at the end, so if you could just let me know that you’ve got questions, we’ll pull them out from the Q&A list. Okay, so what we’ll do is kick off and start looking at AI. So this is an interesting question: what we really wanted to do was find out how close we are to replacing marketers with AI. I think there’s a lot of discussion, and certainly we see different opinions.

So some people will say AI can replace a large percentage of marketers’ activities. Other people are much more skeptical, and to be honest, what we’ve seen at Napier is that there are areas where AI works and areas where AI doesn’t work. In particular, when it comes to generating technical content, AI struggles quite a lot, particularly with content around new products where there’s no existing training data. So obviously AI is effectively coming in, if you think of it almost as a person, like someone who doesn’t know anything about a product, and then we’re asking it to write about that product without really giving it training data. So we’re going to talk a little bit about how we can address that and maybe what we can do. But I think the most important thing about this webinar is that we’re going to actually benchmark a number of different AI tools, so you can see what the level of difference between those tools is, and also which tools are working the best for us.

It’s a somewhat arbitrary test, but hopefully it’ll prove useful. So before we start, let’s have some fun. And the first thing to say is we’re not actually going to go out and sack all the Napier people who are listening to the webinar at the moment; hopefully we’re going to keep you all there. And AI is not brilliant. It can do some amazing things, and it can really struggle. So here’s a good example: for the last webinar, the digital playbook one, I wanted to create a kind of playbook image for American football. So this is a classic image I’ve taken from an image library, and we thought we’d just ask ChatGPT to create it. So I said, could you draw a picture of a play from an NFL playbook? And it came up with this, which doesn’t really look like that Os and Xs play at all. And it also has some issues, one of them being 20 players on a team, which, if you can get 20 players out there on the field for American football, is probably a great way to play. But I’m sure there’ll be lots of penalty flags thrown for that. So anyway, we don’t want the images of the players. So let’s ask ChatGPT to remove the players: make this picture much simpler, without the images of the players. Fairly simple prompt. Yeah, that didn’t work.

So then we thought, well, let’s try and explain it. So we had a prompt, draw a football play in the style of this image, and provided an image that was a good example, and it still didn’t work. So then we said, well, just use the Os and Xs style. That’s got to work; it’s got to understand that. It still didn’t work. And I think here we’re up to something like 22 players on the team. So, more and more players, less and less realistic. So eventually we start getting frustrated, and we start telling ChatGPT what we think: your diagrams are nothing like what we’re asking for. Suddenly there’s a moment of illumination, and ChatGPT responds and says, I understand, you want something more in line with the Os and Xs: clean, minimal. Yes! This is it. This is fantastic, exactly what we want. And then it created exactly the same thing again. So whilst AI is amazing, and a lot of those diagrams are quite fun and quite impressive to have been created by AI, sometimes it doesn’t get what you want. And in the end, for the last webinar, we gave up and just took a stock image as the best approach. So anyway, there are issues, but there are also some things that AI can do quite well, and one of them is generating content around topics that AI already understands, or topics that you give it information about.

So we’re going to look particularly at written content. For this exercise, we’re going to have a quick look at what LLMs are and how they work. We’ll talk a little bit about the differences between LLMs, we’re going to do some tests comparing different LLMs, and ultimately we’re going to summarize with how to get the best out of AI and LLMs. Now, LLMs, if you don’t know, are large language models. So what are large language models? Well, they’re basically AI models that are trained on vast amounts of data. I mean really huge amounts of data: literally the internet, all the public domain books in the world, social media sites, Wikipedia. All of this is fed into the neural network, so a massive amount of data, almost all the written data that’s available. But what happens is you generate a model that understands and can generate natural language, as well as other types of data.

So how do they work? It’s pretty complex, but let’s try and get a bit of an understanding. Well, the first thing to say is there is an amazing tutorial on how generative AI works on the Financial Times website. It’s not behind the paywall, so if you want to know how AI, and particularly large language models, work, I recommend going to this web address here. We’re going to pick out a couple of key concepts: we’re going to talk about tokens, we’ll talk about mapping and vectors, and we’re going to talk about prediction of the next word. And I promise this is a typo from me and not a typo from AI: it’s not the next work, it’s the next word. So tokens are really important. The first thing to say about large language models is that they actually don’t understand words, they understand tokens, and a token is quite often a word. So as an example here, we’ve typed ‘Was Mike Maynard an international speed skater’ into the tokenizer that ChatGPT uses, and it shows you how it breaks it up. And it’s very interesting, because ‘Was’, ‘Mike’, ‘international’ and ‘speed’ are all tokens in themselves.
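The splitting behaviour Mike describes can be sketched with a toy greedy tokenizer. This is not ChatGPT’s real algorithm (that uses byte-pair encoding learned from data); the tiny vocabulary below is a hypothetical one chosen purely to reproduce the splits mentioned in the talk.

```python
# Toy greedy longest-match tokenizer. VOCAB is a hypothetical vocabulary,
# not the real ChatGPT token list, picked to mirror the example on the slide.
VOCAB = {"was", "mike", "may", "nard", "an", "international", "speed", "sk", "ater"}

def tokenize(text):
    tokens = []
    for word in text.lower().split():
        i = 0
        while i < len(word):
            # take the longest vocabulary entry matching at position i
            for j in range(len(word), i, -1):
                if word[i:j] in VOCAB:
                    tokens.append(word[i:j])
                    i = j
                    break
            else:
                tokens.append(word[i])  # unknown character becomes its own token
                i += 1
    return tokens

print(tokenize("Was Mike Maynard an international speed skater"))
```

Common words survive as single tokens, while rarer words like the surname are split into familiar fragments, which is exactly the behaviour a learned subword vocabulary produces.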

So they’re all considered single tokens, but ‘May’ and ‘nard’ are split into two separate tokens, and ‘sk’ and ‘ater’ are split in two as well, which is interesting. I have no idea why it splits like that; that’s just how the algorithm works. But you can see some words are tokens in themselves, and some words are split into multiple tokens. And this tokenization makes processing easier: you can much more easily process compound words, for example, because you split them into tokens. So you build maps of tokens by taking these words and working out what’s related to what. We put the tokens into the system, and it tries to map the words: it tries to say where words are related. So this is an example of the sort of thing that might happen. We’ve put, in the section on the top right, ride, cycle, fly and drive. They’re all fairly close together.

Obviously, ride and cycle are much closer than drive is to ride, or fly is to ride and cycle. Equally, car, taxi, bus and train would all be together, because they’re all effectively the same sort of thing: car, bus, taxi and train are all forms of transportation. So you get these groups. This is how, if you like, the large language model begins to understand what things mean, because it understands how one word relates to another. So now we’ve built a model, and very quickly we understand how it works. Now the real task is to predict what’s most likely to come next. So, as an example, if you’re predicting words, ride and cycle could be options in the same situation, but also similar words would follow ride and cycle in a sentence.
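The “word map” idea is usually implemented as vector embeddings, with closeness measured by cosine similarity. A minimal sketch, using hypothetical 2-D vectors (real models use hundreds or thousands of dimensions):

```python
import math

# Hypothetical 2-D embeddings, invented for illustration; real embeddings
# are learned from data and have far more dimensions.
EMBEDDINGS = {
    "ride":  (0.9, 0.1),
    "cycle": (0.85, 0.15),
    "drive": (0.6, 0.4),
    "car":   (0.1, 0.9),
    "train": (0.15, 0.85),
}

def cosine(a, b):
    """Cosine similarity: 1.0 means pointing the same way, 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

print(cosine(EMBEDDINGS["ride"], EMBEDDINGS["cycle"]))  # high: same cluster
print(cosine(EMBEDDINGS["ride"], EMBEDDINGS["car"]))    # lower: different cluster
```

The groups Mike describes fall out of this: ride/cycle score close to 1 against each other, and car/train do too, while cross-cluster pairs score much lower.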

So as an example, we could enter into ChatGPT ‘Mike Maynard is a speed skater who’, and the question is, what would be a likely word to follow? Now don’t forget, ChatGPT is not, as such, looking up facts; it’s working with probabilities. So if you start with ‘Mike Maynard is a speed skater who’, quite often a Mike Maynard might have represented a team, and so in fact it might come up with ‘represented’. Then Canada is quite likely as somewhere to represent in speed skating. And then, if you skate for Canada, you might have represented Canada at several World Cups. So anyone who knows me knows that I’m a speed skater, but I am definitely not an international speed skater. These words all feel quite likely, but factually they’re wrong. And this is the issue we see with AI-generated content from large language models: you get what people call hallucinations. And genuinely, this is a hallucination that previous versions of ChatGPT actually had, because these algorithms don’t just produce what they think is the most likely word, there’s a little bit of randomness as well. Around 50% of the time, older versions of ChatGPT used to claim very confidently that I was an international speed skater who represented Canada. Sadly, now, with ChatGPT 4 and 4o, I just get back ‘Mike Maynard? Never heard of him’, so my opportunity for fame has gone. But you can see how we get these hallucinations, and you can also see how words are predicted, and this prediction is how content is generated. If we look at the differences between large language models, there are some things that are really key differentiators. The number of parameters is basically the size of the model, and we’re talking about many millions of different vectors used within it to create the model; these are different floating point numbers. So the parameter size is very important.
The context window size is basically the number of tokens that a large language model can take in at one go, so it represents how much data you can give it to process. The bigger the context window, the more complex the prompt you can provide, and also, generally, the longer the amount of effective content you can produce. The training data is very important too, and in particular we’re starting to see some large language models being trained on synthetic data.
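The prediction-with-randomness behaviour described above (including the Canada hallucination) can be sketched as weighted sampling over next-word probabilities. The probabilities and word table below are entirely made up for illustration:

```python
import random

# Hypothetical next-word probabilities, invented for this sketch.
# A real model computes these from billions of parameters.
NEXT_WORD = {
    ("speed", "skater", "who"): [("represented", 0.6), ("won", 0.3), ("trains", 0.1)],
}

def sample(context, temperature=1.0):
    """Pick the next word. Temperature adds the randomness Mike mentions:
    high temperature flattens the distribution, near-zero always picks the top word."""
    words, probs = zip(*NEXT_WORD[context])
    weights = [p ** (1.0 / temperature) for p in probs]
    return random.choices(words, weights=weights)[0]

random.seed(0)
print(sample(("speed", "skater", "who")))  # usually "represented", sometimes not
```

This is exactly why a confident but false continuation like “represented Canada” can appear around half the time: the model samples plausible words, not verified facts.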

So as I said, we’ve basically used all the real data in the world to train these models. So one of the approaches is that AI is being used to generate training data to train AIs, which is very bizarre and has huge potential problems. The geeky people will immediately be saying there’s a potential overfitting problem here: it’s very difficult to train AI on AI-generated content and get good results. And I also mentioned parameters. Effectively, a model is a very large number of floating point numbers, and so it requires a huge amount of storage. What people actually do is take the model they generate and compress it, or do what’s called post-training quantization. For those who are a bit mathematical, literally what it’s doing is taking a very precise floating point number and rounding it to a certain number of decimal places, so it’s getting an approximation. And this actually works: it keeps most of the information but compresses the size. So you’re not keeping exactly the same level of information in the AI model, but it still works effectively, and it means you can run it on a smaller system. And then, of course, one of the other big differences between large language models is whether they run in the cloud, like ChatGPT, or whether they run locally. The important thing to remember, if it’s running in the cloud and you’re sending data to the cloud, is that potentially the AI can train on that data. There are various opt-outs with different systems, but there is a risk, once you start providing data to the cloud, that your confidential information will be used by the AI and become part of the AI’s knowledge, and that is obviously potentially dangerous in terms of protecting trade secrets. So a lot of people like to use local AIs, and a lot of companies now are mandating the use of local AI.
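The rounding idea behind post-training quantization can be shown in a few lines. Real quantizers map weights onto int8 or int4 grids rather than decimal places, but the principle, trading a little precision for a lot of storage, is the same:

```python
# Quantization sketch: round full-precision weights to fewer levels.
# The weight values are arbitrary examples, not from any real model.
weights = [0.137264, -0.582911, 0.904452, -0.118863]

def quantize(ws, decimals=1):
    """Round each weight; real schemes snap to an int8/int4 grid instead."""
    return [round(w, decimals) for w in ws]

q = quantize(weights)
print(q)  # [0.1, -0.6, 0.9, -0.1]

# Each weight moves only slightly, so the model still behaves almost the same,
# but every value now needs far fewer bits to store.
print(max(abs(w - v) for w, v in zip(weights, q)))
```

The small per-weight error is why a quantized model “still works effectively” while fitting on a laptop.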
So that’s a really quick run-through of how large language models work. I appreciate it’s very, very simplistic; I’ll once again refer you back to the Financial Times and their very interesting interactive presentation. Let’s go on to the main bit of this webinar, which is really to talk about how we tested and compared AIs. So we’re going to look at the different AIs that we used.

So we used five different models. We used ChatGPT, the latest version, 4o. We used the latest version of Claude, which is Anthropic’s AI, and the latest version of Gemini, and all of these were run within the last week or so, so all very much up to date. They’re all cloud-based AIs, so they’re all AIs that potentially have some degree of security risk. We also ran a couple of local models, so these are models literally run on my laptop. If anyone’s interested in learning about local AI models and being able to run your own model, please do ask me. We used a tool called GPT4All, which is great, and we used two AI models: Llama, which is a Meta (Facebook) model, and Phi-3, which is a Microsoft model and a very small model. So if we look at this, the cloud-based models are much, much bigger than the local models. And the reality is that the world of agencies is not hugely profitable, so these were run on a very old laptop with pretty moderate specifications: a five or six year old Core i7, although it did have 16 gigabytes of RAM. One of the challenges is that these models, as I mentioned, are quite large. If you try and run them, for example, on a Mac mini with eight gig, you will run into trouble. So when running on a laptop with a decent specification, the memory is quite important. And as I say, the most important thing is to understand you’re running smaller models locally just because of the limited processing power. If we look at the speed, Llama ran considerably slower than the cloud-based models in terms of interaction time. Obviously those cloud models are being run on very high-performance server farms; Llama ran on my laptop, and it ran much slower. Phi-3 was actually fairly similar in terms of speed, but as we’ll see, it’s not as good in terms of what it produced.
One last thing to mention is that we did do some processing of data sheets, and one of the big issues when running locally is incorporating documents into your models. We uploaded a very large data sheet, around 300,000 words, and it took about two and a half hours to process on the laptop. So some things do take quite a while to process locally.

So let’s move on, and let’s find out which of these models is going to be replacing us tomorrow. Our first test was fairly simple: write an 800-word blog post about how variable speed motor drives will increase energy efficiency in the future. One thing worth mentioning: obviously this is a presentation about AI, so the images are all AI-generated, apart from when we get to one of our client’s images. And this was a low-voltage variable speed drive according to ChatGPT’s image generation. Those of you who know anything about variable speed drives and motor drives will know there’s a lot wrong with this image, from the size to the number of breakers to all sorts of things. So this is not a great image. It shows again that once you want to do something specific and technical, AI sometimes isn’t the greatest thing.

So anyway, we want an 800-word blog post. That’s the first thing we wanted to do, and it should be fairly simple: variable speed drives are a pretty generalized product, and all of these models will have been trained on content about them. So it should be okay. Well, the first thing we came up with was an obvious love for bullet points. Claude, which is quite often recommended as the best AI for writing content, pretty much just wrote bullet points; there was almost no prose. It really wasn’t like a blog post. Interestingly, Gemini, at the other end, actually wrote in prose from the first draft. It was the only one not to fill the blog post with bullet points, so Gemini came out best on style, more of a blog style. But there were certainly some issues with that first pass: none of the models really discussed the future, and all of the models were pretty high-level and pretty non-technical. And for something you’d imagine computers were good at, generating the right word count, only Phi-3 and Claude were within 10% of that 800-word target. Gemini was like your lazy writer on a Friday afternoon looking to get home: it gave us 477 words for an 800-word blog. So a little bit disappointing there. Phi-3 also had a habit of inserting word counts for sections that were random and completely wrong. So there were some bad artifacts going on here. So what we did was go through another two revisions.

In the first revision we asked them to remove bullet points and make the article more prose-like. The second was to take the content and target variable speed drive experts, and we had mixed results. Phi-3 basically just produced an outline, not a blog post. At the end of this, we were still way off the target word count. Gemini was still the laziest AI, producing only 60% of the words, 40% below the target count. And ChatGPT was super enthusiastic and actually produced a third more, so not necessarily good for what we want, because you asked for 800 words. So ChatGPT was again way out, but the other way. And the focus was very bizarre; it was kind of all over the place. So 20% of ChatGPT’s blog post was around harmonics, which is not a particularly major factor going forward. And the articles can also contradict themselves. Depending on which section of the article Claude wrote, you’d either read that drives that don’t need sensors are absolutely the most important thing, or you could read that the development of sensors for drives that need them was the most important thing. So again, this is something we quite often see with AI: the overall narrative is not very well structured. And I think this is a good example, where it talks about sensorless drives and also drives with sensors. Both are great, but Claude was unable to create a narrative that explained why some would go sensorless and some would use sensors, and it just put two opposing opinions in the same article. So not exactly the greatest blog post, I would say. However, the content wasn’t terrible.

So if we look at the ChatGPT one, which I personally found the best, and which was generally seen to be the best quality article, it produced a pretty coherent, pretty good article in terms of talking about some of the trends going on, and it really gave you a bit of background. Claude also did a pretty good job. As I say, overall the structure wasn’t great; it was almost as though random paragraphs had just been banged together. But the actual content of those paragraphs was pretty good, and it again identified a lot of the most important things. So we had some reasonable results. Whether anyone would want to put either of these two better blog posts out on their website without a sub-edit by someone human, to improve some of these issues, is a question. And certainly we’re seeing with SEO, particularly where people are looking to rank on queries where AI gets involved, it’s all about ranking at the top, and frankly the quality of these blog posts is okay for someone to read, but they’re not going to be seen as the most authoritative blog posts. So there are certainly issues in taking blogs directly and not editing them. One thing you can do is something called RAG, or retrieval-augmented generation. Rather than training an AI on something, RAG basically lets the AI look things up: we can provide content to an AI, and the AI will use it as a reference. So what we did was take a family of three products, the fabulous nRF54L series, which are new products from our client Nordic, and we provided the full data sheet. As I mentioned, it’s a big data sheet, around 300,000 words, and we asked the AIs to write a press release about it. Now, press releases are fairly standard in structure, but unfortunately, again, they didn’t produce a press release that could be used.
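The RAG step described here, providing a document for the model to look things up in, can be sketched as retrieve-then-prompt. The datasheet snippets below are placeholder text written for this example, not Nordic’s actual wording, and real systems score chunks with embedding similarity rather than simple word overlap:

```python
# Minimal RAG sketch: pick the datasheet chunk most relevant to the question
# and paste it into the prompt as reference material.
DATASHEET = [
    "The nRF54L series offers up to twice the processing performance of previous devices.",
    "Power consumption is reduced by up to 50 percent versus the prior generation.",
    "Security features include a hardware root of trust and tamper detection.",
]

def retrieve(question, chunks):
    # naive relevance score: count of shared lowercase words
    q_words = set(question.lower().split())
    return max(chunks, key=lambda c: len(q_words & set(c.lower().split())))

def build_prompt(question, chunks):
    context = retrieve(question, chunks)
    return f"Using this reference:\n{context}\n\nAnswer: {question}"

print(build_prompt("what security features does the series include", DATASHEET))
```

Because the model answers from the retrieved chunk rather than from its training data, this is the standard way to get it writing about brand-new products, though, as the press release test showed, it doesn’t guarantee the right messages get emphasized.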
ChatGPT and Claude went bullet crazy, and this time Gemini also absolutely flooded the press release with bullets. Now, it may be that because the data sheet has a lot of bullets in it they copied that style, I don’t know, but certainly it wasn’t the sort of thing you would say looked like a press release. There are basically three key messages that need to be pulled out, and if you look at the human-written press release that was done for this product, it pulls those out really clearly: performance, efficiency (or low power) and security.

Only one of the AIs really did a good job of identifying these, which was ChatGPT. Llama and Phi-3, the smaller, locally run ones, missed the security message completely. Claude almost missed security: it was buried towards the end, so it just about squeezed it in, but not very well. And then Gemini completely missed the performance message, which is interesting; they’re missing different things, even though they’re processing the same content. Bizarrely, Gemini also talked about the structure of the data sheet and gave a kind of summary of its table of contents, which seemed a bit weird. And then we had some hallucinations. ChatGPT and Llama invented availability of the products, and literally wrote ‘these products are available’ or ‘these products are sampling now’, with absolutely no reference to whether they were or not. So they completely made it up. And then Llama made up a spokesperson as well, which I thought was awesome: Sven Nordentoft, apparently, is their spokesperson. A completely made-up person who doesn’t exist within Nordic, but Llama decided to create this person to give quotes. So that’s hugely risky in terms of creating false information. So what we decided to do was another test, and we thought we’d do something a bit simpler: write a LinkedIn post about the launch of these products. So, taking the same nRF54L products again and writing a LinkedIn post. Interestingly, all the LLMs chose quite long posts. Gemini was the shortest, at nearly 1,000 characters, which is quite long for a LinkedIn post. So that’s kind of interesting; I would have expected some of them to produce much shorter, more succinct posts.

ChatGPT and Claude had rather dubious ‘attention electronics engineers’ type openings. I can’t remember whether those words were literally from ChatGPT or Claude, but one of them literally started a post that way, which felt a bit cringy; I’m not sure I’d be posting that on my LinkedIn. One positive thing, though, was that ChatGPT and Claude included emojis, so that did look quite cool. Llama started well and really crashed and burned at the end: it basically just wrote an ad at the end. Phi-3 obviously continued its focus on completely the wrong features, and interestingly, Llama didn’t have any hashtags; the others did. So it was okay if you wanted a longer post, and this is the ChatGPT post on the left, it’s not too bad. But they weren’t great: again, not something you’d necessarily want to use directly without some editing, but certainly you might want to pull some of this content into a post. So lastly, we thought we’d do Google search ads, because that’s got to be an easy thing to do, right? It’s shorter, it’s easier, so hopefully we can get some good results.

The first thing that happened was that all the LLMs ignored the maximum character count for Google ads. Now, this is interesting, because Gemini actually told us what the maximum character count was and then promptly ignored it in the content it generated. So that’s kind of interesting. None of them did well. To be fair, you can actually go in and ask them to recreate those headlines and descriptions under the character count, and if you re-prompt, that normally works with these AIs. But as a first pass, it wasn’t great. Claude was probably the best: the headlines were okay, but some of the descriptions were too long, and as I said, it also highlighted the maximum characters. We also got some bizarre extras. Gemini actually gave us a short tutorial on how to optimize Google ads. Claude talked about call-to-action phrases, which I would have hoped were actually in the ads it wrote rather than something separate. Llama started talking about the audience we should target, and also gave some not particularly great keywords. And poor old Phi-3, the smallest, least comprehensive model, actually gave us an ad for different Nordic products as well. I’m not entirely sure where that came from, but their power management ICs got a free ad there. And also, interestingly, Gemini and Phi-3 only gave three options for each of the headlines and descriptions, which is relatively few for Google ads.
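A simple post-check would have caught every one of these over-length ads before they reached an ad account. Google responsive search ads cap headlines at 30 characters and descriptions at 90; the example copy below is invented for illustration:

```python
# Check LLM-generated ad copy against Google's responsive search ad limits.
HEADLINE_MAX, DESCRIPTION_MAX = 30, 90

def check_ad(headlines, descriptions):
    """Return a list of problems; an empty list means the copy fits."""
    problems = []
    for h in headlines:
        if len(h) > HEADLINE_MAX:
            problems.append(f"headline too long ({len(h)} chars): {h}")
    for d in descriptions:
        if len(d) > DESCRIPTION_MAX:
            problems.append(f"description too long ({len(d)} chars): {d}")
    return problems

# Hypothetical LLM output: the headline fits, the description doesn't.
print(check_ad(
    ["Ultra-Low Power Wireless SoCs"],
    ["Discover the next generation of secure, efficient wireless connectivity for your designs today."],
))
```

This is essentially what the specialized ad-generation tools mentioned later do automatically: validate against the platform’s limits and regenerate anything that fails.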

So we’ve done that. One question I know people will ask is: okay, so you’re criticizing this, but can you genuinely tell it’s AI? Well, the answer is there are lots of tools to tell whether content is AI. Interestingly, we looked at the LinkedIn posts using a tool called Copyleaks. The Claude, Gemini, Llama and Phi-3 LinkedIn posts were all detected as being 100% AI; only 66%, so two thirds, of the ChatGPT post was detected as AI, but again, the vast majority of it was. So quite clearly, if people are penalizing AI, or if you’re submitting to a publication that’s not happy to take AI-generated content, do be warned, because people can detect this, and these tools are scarily good, much better than humans, at detecting AI. The other risk, of course, is plagiarism. Interestingly, we had very little plagiarism identified; in fact, the only plagiarism we had was in the first draft of the ChatGPT blog post, and 11% of that was identified as plagiarized. So one good thing that does seem to be happening is that these tools seem to be moving away from dumping phrases and sentences that are directly copied, and they tend to generate much more in their own words. So that is certainly a positive.

So anyway, I know I’ve covered a lot and talked about a lot of data. I think the easiest way to summarize is to score the different AIs and give them a rating. So what we did was rate based on privacy and ease of use; then on the first test, the blog post, which was quite a long exercise, we actually measured quality and whether it met the brief, and for the three subsequent tests we just gave an overall score. And it’s quite interesting: we looked at this and almost went back and changed the numbers because they were all so similar. But what we got in this particular test, as I mentioned a couple of times, was that ChatGPT did quite well in generating content, which is not that common. Generally speaking, most people testing written content find Claude generates great results. In our case it didn’t, so it may be different for you, but we found ChatGPT slightly better than the others, and Phi-3 slightly worse, which is not surprising as it’s the smallest model by far, so I wasn’t surprised that it worked less well. And then really there was nothing to choose between the others.

As I said, I almost went back and edited this to make it look less like a dead heat, but the reality was that Gemini, Claude and Llama all have pros and cons: they do well in some places and less well in others. So really the answer is that there’s not one winner in that group. So overall, what do we think? Well, it is amazing when you see content being written. It is less amazing when you try and optimize it, or when you see content from multiple LLMs that all sounds pretty much the same. It begins to feel a bit bland when you see multiple pieces of content on the same topic, from different LLMs or from the same LLM. So it can become bland, it becomes samey, but it’s still impressive, and it’s still good for ideas. That certainly is the case in terms of picking up a structure and identifying main points, without doubt. The AI tools today are incredibly powerful and useful, but if we’re really trying to get the best out of AI and LLMs, we’ve got to be honest: they’re incredibly useful, but probably not, today, for producing the final copy of technical content. And I think that’s important. For an early draft, a structure, an outline, some ideas, they are awesome. But for producing final copy, the more technical you try and get, the less good they are, because it’s more specific. They aren’t particularly creative; as I say, everything began to feel the same, and you began to see similarities when you looked at the output. Hallucinations absolutely do happen, and we were quite surprised at how much of a problem we had with hallucinations. They’re terrible at writing to word counts, which is surprising: you’d think computers would be good at that, but they are terrible at it.

There are ways to get around that, and we’ll mention them in a minute. But I think the most important thing to say is they’re incredibly fast. The speed at which they can produce drafts is many times quicker than a human, and so their use as a brainstorming tool, or an ideation tool, or basically just a tool to get over writer’s block, I think is going to be essential for a lot of people. And the last thing I’d say is that we’ve tested general-purpose large language models here; we haven’t tested specialized tools. There are specialized tools, for example, for generating Google ads, and those tools will guarantee to hit the maximum character count. So specialized tools can give better results. Also, there are tools such as MarketMate, which are designed to help you create much more complex prompts that should, in theory, produce much better quality output. And as we saw, RAG is another way to do that, where you provide files that are used as reference. That didn’t always work very well, but generally speaking, these specialized tools will produce better results. So hopefully that’s helped, and hopefully it’s given you some insight into the effectiveness of AI tools.

If you're interested in other webinars, the next webinar we're running is all about B2B research, so a completely different topic. And if you're planning next year's marketing campaigns, this could be a great webinar, because we'll be talking about how to get customer research that really helps you understand what should be guiding your marketing for the next year, so it'll help you with your annual planning. I'll leave that up there for a minute; it'd be great if you're able to attend. Just scan the QR code or put in the short code at the bottom, and hopefully I can talk to you at the next webinar. So thank you very much for listening. I know this has gone a little over the 20 to 25 minute target we normally aim for. I'm really interested to know if anyone has any comments or questions, so I'll just leave it open for a couple of seconds to see if anybody's got anything.

Okay, so the first question I've got is around cost. The question is: were any of the tools we used free? And the answer is yes. Actually, everything we used was free of charge apart from ChatGPT, and the only reason ChatGPT was paid is that I already have a paid account, so I used that existing account; all the other tools were the free-of-charge versions. So it's something you can definitely use yourself; it's not expensive at all. Okay, let me just see if there are any other questions coming through.

We've got one question here, and it's a good one: are there any tools that could potentially outperform those tested today? I think it's a great question. If you talk to anybody in AI, the answer is that there are amazing tools, so good they will terrify you, just around the corner; but this has been the case for a couple of years now. I think what happens, generally speaking, with any AI tool, and for simpler tasks this has been mapped and documented very clearly by academics, is that the improvement in performance tends to be pretty fast at first. So AI tools improve very, very quickly, then they hit a plateau, round off, and almost stay flat, just below what academics call the average person. Now, the average person writing content around, for example, variable speed motor drives is not the average person: they're clearly someone who's specialized, they've got knowledge, and they're a good writer. So just below average is still going to be scarily good. But I think what we're seeing, and most will probably agree, is that we've had this rapid ramp-up to what feels like almost human quality, but now we're not seeing a particularly big increase. So unless there's something we can do to change the way the AI models work, or somehow find masses of new training data that doesn't exist, or find training data of much better quality, I think we're going to be stuck at a bit of a plateau. My gut feel is that what happens next is going to be much less about the improvement of the AI engines themselves, and much more about how AI is embedded into specific tools.
So, as I mentioned earlier with Google Ads: a standard, general-purpose large language model like ChatGPT isn't great at writing Google ads, but if you take that engine, do some very specific coding around it, generate some very specific prompts, and embed that into a Google Ads tool, then it can become incredibly powerful. To me, that's where we're going to see some of the biggest improvements: AI that's embedded and optimized to do particular tasks. And I think that's going to be the most exciting thing for marketing over the next year.
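To make the "very specific prompts embedded in a tool" idea concrete, here is a hedged sketch of what such a wrapper might look like. All the names and the template wording are hypothetical, invented for illustration; this is not a real Google Ads or LLM-vendor API. The tool's value is that the user types a loose request and the wrapper sends the model a strict, task-specific prompt:

```python
# Hypothetical sketch: embedding a general-purpose LLM behind a
# task-specific prompt template, as a specialized ads tool might.
# The template encodes the rules the raw model tends to ignore.

AD_PROMPT_TEMPLATE = (
    "You are a Google Ads copywriter.\n"
    "Write {count} responsive search ad headlines for: {product}.\n"
    "Hard rules: each headline must be 30 characters or fewer, "
    "no exclamation marks, and must include the keyword '{keyword}'."
)


def build_ads_prompt(product, keyword, count=5):
    """Turn a loose user request into the strict prompt the tool sends."""
    return AD_PROMPT_TEMPLATE.format(count=count, product=product, keyword=keyword)


prompt = build_ads_prompt("variable speed motor drives", "motor drives")
print(prompt)
```

In a real product this prompt would be sent to the LLM and the responses validated in code, which is exactly the kind of task-specific wrapping the webinar is describing.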

Well, thank you very much, everyone. I really appreciate your time on the webinar. If anyone is interested in more information about what we did, or in seeing some of the content we produced, please contact me. My email address, mike@napierb2b.com, is there on the screen, and hopefully we'll see you all for our next webinar in December. Thank you very much. Bye.

 

Author

  • Hannah Wehrly

    Hannah’s role will include supporting the team in a variety of areas including lead nurturing, email marketing and content writing. Hannah is extremely enthusiastic and is keen to expand her knowledge, whilst gaining valuable insight into the B2B Technology sector.
