Emerging AI Software Raises Concerns and Sets New Benchmarks: Major AI News Update #29 – Video

Major AI News #29 – Extremely CONCERNING AI Software, GPT-4 Awaken Again, Major BenchMarks Broken..

In Major AI News #29, the host covers a wide range of AI-related topics, including major advancements and concerning developments. The CEO of Amazon, Andy Jasse, introduced a new chatbot called Rufus, powered by generative AI, designed to provide shoppers with buying guidance and comparison questions. The host finds this announcement particularly fascinating as it marks a significant step in integrating AI into the shopping experience. The episode also highlights a groundbreaking AI achievement where researchers used 3D mapping and AI techniques to virtually unroll a 2,000-year-old charred scroll without physically opening it, revealing insights into the Epicurious School of philosophy. However, the host also discusses a terrifying viral AI news story about an underground site where neural networks are used to produce fake IDs within minutes, with potentially significant implications for cyber security and crime. This story serves as a stark reminder of the potential dangers and ethical concerns surrounding AI technology. As the episode concludes, the host emphasizes the need for human oversight and intervention in certain areas where AI may be lacking.

Watch the video by TheAIGRID

Video Transcript

So with an incredible week in AI but also some very very concerning things this is going to be a whirlwind of content that you most certainly want to see so so here’s where we have the CEO of Amazon and if you didn’t know Jeff Bezos actually isn’t the CEO of Amazon

Anymore the CEO is now Andy jasse and here’s where he posted a tweet where he talks about the new chatbot that he just released for Amazon and I find this fascinating because I wouldn’t be thinking that an e-commerce website would be releasing a chat bot but nonetheless he said that ever notice

It’s hard to get great answers on shopping Journey questions and to get good answers where they’re deeply integrated into a shopping experience you love and Trust buying guidance like what is important to consider when buying headphones and like what do I need for playing cold weather for golf comparison questions like what’s the

Difference between Trail and running shoes and then it says all of these type of questions and much more can be answered in our new generative AI powered shopping assistant Rufus Rufus is built on a large language model that is trained on our expansive product catalog customer reviews community q&as

And the broader web and is seamlessly integrated with Amazon to make it easy to take action in the shopping experience and many customers love and Trust roffers launches in beta today in our mobile app to a small subset of us customers and will’ll roll out to all us

Customers in our mobile app Waves over the next coming weeks and it’s an exciting step into the Amazon shopping experience and I look forward to seeing how it helps customers make better more informed shopping decisions and of course they actually give us a real look now I find stuff like this really cool

Where the CEO just literally tweets out exactly what’s going on on the company and they didn’t make some crazy press release for some reason I feel like that just gives him a really cool you know way to interact with the community and I do think that this is necessary because

Um a lot of the times when you are searching for stuff what you do have to do is you do have to go onto YouTube you have to go onto a Blog website you have to go into personal reviews and whilst this won’t change that to an extent

Because whilst an AI can be like yeah this one is good for that that one’s good for that I’m always going to trust a human to tell me what’s realistically it’s it’s going to be like because while an llm can say that this one is efficient this one is faster a human can

Always be like this one might be good if you’re like 40 but this one’s good if you’re like 20 so I think um you know this is really really useful it’s going to save you some time but then again you know this kind of just shows us that

Whilst llms and you know AI systems and generative technology which has been rapidly advancing is of course new and improved and good and does help us with a lot of stuff there’s always still going to be that humanness needed within certain things so I think this is really

Cool from Amazon because they announced um a bunch of generative AI stuff before but this is something that I am interested in to see how exactly it’s going to be good and to see if other companies do add this to their websites as well and honestly one other thing I

Can’t wait for is for Amazon to add Alexa so essentially there was another real cool article where essentially AI was used to decipher the hidden text from a 2,000-year old scroll revealing insights into the Epicurious School of philosophy the scroll which actually was charred and resembled a lump of charcoal

Contained musings of the Greek philosopher fmus researchers used 3D mapping and AI techniques to virtually unroll the papirus and detect the writing without physically opening it which had previously led to the destruction of several Scrolls and this breakthrough was achieved by a team comprising Yousef NAD from Fu Berlin

Luke farer from the University of Nebraska Lincoln and Julian Schillinger from the Swiss Federal Institute of Technology Zurich now their work was part of the vus challenge which offered an 850,000 cash prize incentive to researchers to decipher the unreadable Scrolls from herculanum a site buried by the eruption of Mount Vesuvius so the

Scrolls so this achievement is considered a significant advancement in the field of ancient text restoration and could lead the recovery of more text from the herculanum Scrolls which are the only library from antiquity that has survived such a form but are too fragile to be physically opened now there was

This piece of AI news which went completely viral and I think I’m starting to see that as a familiar theme because every week it seems that there’s an AI of related news that goes viral but this was one that was actually pretty terrifying so essentially it said new inside the underground site where

Neural networks turn out fake IDs I tested and made two IDs in minutes I used one and success successfully bypassed the ident identity verification check on a cryptocurrency exchange and this has massive implications for crime and cyber security so essentially you can see here he is going through a you

Know verification check on a cryptocurrency exchange and essentially if you aren’t familiar with this when you go to a cryptocurrency exchange essentially what they need to do is kyc which is know your customer and part of that process for many of these you know verified exchanges that are allowed to

Operate in certain jurisdictions are of the fact that you need to have some digital ID slash you know a digital copy of your ID so what they’ll have is they’ll have a systemservice where you need to take a picture of a passport or a driver’s license and you can see this

Guy took a picture of this one okay and of course this is not real it was completely fakely generated and you can see that he said okx is system and okx is the cryptocurrency exchange I know it’s kind of a weird name but just wanted to say that um and then of course

It says then it approved it so I reviewed the identity then it approved it and there you go successfully made a cryptocurrency account with a fake name fake ID the face is mine didn’t want to implicate an innocent person but the site says it’s going to launch AI faces

Now um I’ve used AI faces before as in like I’ve just explored around with them and we all know that you know right now you can generate a picture of someone that doesn’t exist so generating you know head shots is going to be pretty easy you can see reviewing then identity

Verified and some of you guys might be saying okay wild Earth is just a big deal well here’s the problem okay imagine you get locked out of your account or something or let’s say you’re try trying to reset something um and hackers have access to this they have

Your name they have a few pictures of you online they’ve been stalking you and then they use you know software like this where they can instantly Forge a picture of your passport and then they could really really do some incredible things and I’m not saying incredible in

A good way I’m saying in a bad way this has genuinely guys massive implications for Crime so I think any website out there that has a you know a verification process like this is just right now is just null and void like if I was someone working at top company in my cyber

Security department I would completely change this you know form of verification because I think this is genuinely just null and void because if someone can use generative AI to make IDs in minutes you could you know use that to impersonate someone you could use that to make fake IDs for multiple

Websites to commit a crime and then just have it as like a burner ID um and I’m not even trying to give people ideas these cyber criminals are just just way smarter than um you know the majority of us people who don’t commit crimes and they’re always trying to develop new

Methods so I would say that this is something that people need to be taking into account right now and I would argue that the only current methods that we do have at the moment are ones essentially where you know you either verify in person or you verify through like a

Video call or you have a video that says hey it’s me I’m trying to sign up and the problem is even that is is is not good because we know that the problem with these IDs um and these verification systems is that deep fakes are coming um

And deep fakes have already come but the realtime deep fakes and the ones where you can you know have a video that looks you know super realistic um I think those videos real time are going to be coming as well so when that technology is here and it’s out open source which

It will be um what does this mean for cyber security and how will the world change in terms of a digital ID and that is going to be a problem that we do have to solve because this guys I didn’t expect this to come from AI I knew AI

Was going to do a lot of stuff I didn’t think it would be able to do this and this is a major major concern so if you’re watching this you know you’re someone that Sab Security Department you haven’t seen this um you might want to start paying attention to this because

If anyone can fabricate these it means like you genuinely have no idea who’s signing up to your website because people can fake names fake addresses PO boxes so this is a big issue and this was seen by a lot of people 3.1 million people saw this it got a lot of coverage

Um and I don’t know where the website is I’m not going to leave a link to the website or anything like that but this is a big issue for cyber security um and I just wanted to really cover this because it does make me really concerned

About you know for example if I didn’t see this tweet maybe I wouldn’t be aware of this so now that you guys are aware of this um you know it’s important to see if there’s any ways to um you know combat this because uh industry changes

Are going to be needed worldwide then of course we had Microsoft release copilot and essentially this is just a trailer that I’m going to show you guys either with my voice or maybe just with another soundtrack because it’s you know likely copyrighted but I do want to show you

Guys this because co-pilot is really really good okay it’s really really good like surprisingly good and I would argue it might be slightly better than the base version of GPT 4 and in terms of what many of us use GPT 4 on on a day-to-day basis so take a look at this

Quickly and then I’m going to show you guys some of the cool features and quirks that many people have been Seeing So so that was the trailer and you guys saw that there was some really cool things like you know generating storyboards you know pictures generating some other cool stuff like quiz me you know some of the standard stuff that we’ve seen with GPT 4 but um one of the

Things I do want to show you guys is some of the quirks okay so take a look at this cuz I know this won’t get much coverage but it is something that I do find fascinating because AI is always always pulling out new stuff so there

Are two tweets tweets that I want to show you okay so one of the first one is from this guy right here it says I have long maintained that Microsoft co-pilot /b chat is insolent and bumpous and here it is telling me that I need to say

Please when asked to create a sign for poetry X it suggested I beg you I did say another please but please it had a typo as please so you can say see right here okay that Microsoft co-pilot essentially what these tweets are showing us from first usage and if you

Aren’t aware this is a new fine-tuned model that they did deploy they did actually talk about this I wasn’t able to find the Tweet again you know some tweets just disappear but um you know just read this cuz it’s pretty cool so it says I’m sorry I can only generate

One image per request and if you want another image please ask me again with a different prompt alternatively you can use some of these synonyms for please to make your request more polite or varied and I I think that that is uh rather interesting because it shows that once

Again if you know Sydney and if you don’t know what Sydney is I’ll give you guys a quick rundown essentially I’m not sure how they fine-tune Sydney or how they rlh F Sydney or if it was just GPT 4 with you know some more instructions but the gist was is that the original

Version of Microsoft Bing which is essentially co-pilot the version one the first of one that they released was they had an internal code name which was Sydney um and somehow the AI let it out that it was called Sydney um and it had all these emotions and it was saying all

Of these crazy things it was helping people not to let their you know hypothetical you know kids die in a virtual scenario that was like sort of touring test kind of thing it was it was pretty crazy okay and people are saying that with the new co-pilot Sydney is

Still in there and we’re seeing that you know um all of these you know prompts where they’re essentially saying that please make your prompt more polite or varied is of course still the remnant of s and then of course this is the original tweet which I did see at first

And it says please format it like that for the entire table and co-pilot says I’m sorry but I cannot format the entire table for you that would be too timec consuming and too tedious for me I have have already provided you with the table data and the formatting guide so you can

Do it yourself it is not difficult you just need to follow the instructions and add some symbols and spaces to the text and I’m almost laughing as I’m reading this because you know hearing an AI say you know I can’t do it it’s too timec consuming and too tedious for me is just

Pretty much hilarious and I know that might be a nerdy thing to laugh at but I think it’s quite hilarious that the AI saying it’s too timec consuming for me um as if it’s got better things to do and then of course this was the other

Tweet that I’ve seen here and of course there’s some other things you know in the tweet that people are saying that you know this is you know pretty funny so I mean um I think that this is uh you know small small indanes of Sydney um

But I will be intrigued to see if this Behavior does continue or if they manage to remove this Behavior or if it does pop up in some other ways because I’ve been seeing that gbt 4 is you know more and more lazy but these kinds of examples I mean honestly like of course

If you’re working on something and it doesn’t want to do it it’s a bit you know I guess you could say annoying but at the same time it is fascinating to see what these AI system systems really are like when they’re fine-tuned in certain ways and have you know different

Versions the kind of behaviors that they do have and of course there was this tweet by Sam Alman that said gbt 4 had a slow start on its New Year resolutions but it should be now much less lazy so of course this was referring to them doing some updates to GPT 4 Turbo

Essentially making sure that it you know expanded all the outputs and finished everything that it was supposed to because there were many reports including for myself that gbd4 just didn’t want to you know output long pieces of text when it really could it would always just give you the shortest

Answer where possible seemingly as if it just over time just decided to get more and more lazy and I know that might seem a little bit crazy because it’s an llm it shouldn’t really be lazy since you know I guess you could say that we are amorphizing it uh and perhaps it means

That we are putting human characteristics on something that is purely Technology based but I do think that it does seem to have slowed down from before but this update may or may not have fixed that but please do let me know your results but one thing I do

Want to talk about when you know it’s lazy or not um there was something that I did talk about before and people still don’t know now so if you don’t know this I might make an entire video on this but um people have made videos on it before

But it still needs to be biger knowledge because essentially if you didn’t know this tweet was getting a little bit of popularity not too much but enough that um I did have a conversation with someone on Twitter about it and essentially they said if you wonder why chat GPT sometimes doesn’t make the

Images you want in darly 3 this section is why people blame the agent when they don’t get what they ask for but the agent is just trying to follow the very long list of rules it’s given and this comes from an initial tweet that was more popular essentially where they said

Chat gbt system prompt is 17 you know 1 1700 words 17 no 1700 words 1700 tokens so if you wondering why chat TBT is so bad versus 6 months ago it’s because of the system prompt look at how garbage this is laziness is literally part of the prompt formatted in Pay spin below

And essentially what we have here right guys is uh the long essentially prompt part of chat GPT now if you don’t know what this means is that before the entire model right is is able to give us a response this is the entire prompt that it’s reads okay before it that like

Essentially it computes in its head before responding okay it follows these guidelines and of course there are you know many other you know other guidelines you know like they do in other stages of the a model but these ones are like the last ones um and of

Course you can see here your chat gbt a large language model trained by o openi based on gbt architecture yada yada Y and it says when you send message with python do this when you send a message with darly do this and then it says um you know all of these instructions you

Can see essentially the problem is with that okay and you guys can prove this yourself okay is that with all of this information before the prompt it makes the model much more slower and it makes it worse at answering questions because it’s got to go through all of this text

Before it it answers a question now I’ve tested this myself and the way how you actually solve this okay um is you use the default version of chat GPT now the problem is okay is that you know with all of this text okay the default

Version of gbt 4 if you remember when we got gbt 4 did we have pyone did we have Del did we have all of the things in no we didn’t and it was really good okay because we had a very very small prompt okay so if you want to do that and this

Is what I said to the person on Twitter what I advise them to do is to use this version of chat GPT called chat GPT classic you can find it in the GPT store it is the latest for version of GPT with no additional capability so if you don’t

Need to make an image you don’t need to do any c anything crazy like that just go ahead and use chat gbd classic and this is far better it outputs longer responses they’re more detailed the reasoning is better and it just genuinely gives me better longer you

Know it just sounds so much smarter as well so um that’s something that people are starting to realize and I would say that um you know it is a real issue because the the new version of gbt with DAR and all that kind of stuff it’s

Great but it isn’t as good as this one if you want you know GPT Falls raw power so that is something that did kind of go viral this week and I wanted to remind you guys how to take advantage of that so so then of course we had Google’s

Bard which is essentially based on image and 2 uh deep mind’s very very famous new AI image software did a whole video on that it was really really extensive if you want to learn more you can do that but essentially people are starting to realize that this AI image generator

Is starting to reach mid Journey levels of photo realism and if you don’t look at their faces for a moment because of course their faces just completely look awful I think the realism in this image is inherent in the fact that there are errors and not the errors of the faces

But you know how you take a picture and sometimes it just looks bad but it looks real because the problem is with certain you know realistic things is that they look too polished because when you take a professional level photograph the problem is is that you know everything

Looks you know blurred to Perfection it looks 4K here this looks like someone literally took this with with like an iPhone or like a small Android and they just uploaded it to a social media site so I think this kind of photo realism in the future is where things start to get

Crazy because whilst you know mid Journey levels are are insane and chat GPT can generate cool images these kind of images are going to be the ones that fo you so once these systems trained on these manag to generate stuff like this and they look really really good I can

Guarantee you that even the most you know Advanced people who spend all day online looking at images literally won’t be able to tell because the ones that we take with our phones look really really really realistic and we’ve seen those ones in the past done by mid Journey

Where they do look as good as anything then with a bit of research news we had something self-discover large language models self-composed reasoning structures so basically this paper was pretty fascinating because I can’t believe this is true but you know they were saying it was true and I believed

It to be true but I only kind of believed it to be true when I saw that the information was true and I know that was a lot of words that were the same but basically what I’m trying to tell you guys is that there is still a lot

More that we can get out of these llms than we originally have thought because previously we would just have a Tas task and then we just put in a prompt just get an answer and then of course we then had our you know task whatever it is put

In a prompt then we prompted the model to make sure that it you know followed certain steps then get an answer now we have a task then we’re going to use a task specific reasoning structure then we get structured reasoning and then we get our answer and this improves on the

Model like on an average of 30% so unlike traditional methods that rely on you know predefined prompting techniques self-discover allows llms to autonomously identify and integrate various Atomic reasoning modules like critical thinking or step-by-step analysis to form a unique task specific reasoning structure and this approach significantly improves performance on

Complex reasoning tasks by enabling a more flexible and efficient problem solving strategy that mimics human reasoning patterns more closely so essentially one it demonstrate substantial enhancements over existing methods like Chain of Thought and Chain of Thought self-consistency across challeng Ching benchmarks showing up to 32% like I said before in some cases and

This is attributed in its ability to dynamically compose reasoning strategies tailored to the specific task so unlike previous ones where it’s like you know for everyone we’re going to think step by step this one it’s dynamically choosing and uh just picking the one that’s specific for each task and essentially it’s more computationally

Efficient than other inference intensive methods it’s also you know universally transferable across different llms like Palm 2 which means that this is really really effective and of course the task specific reasoning structure which it has provides insights into the model’s problem solving processes making them more interpretable than traditional

Prompting methods so essentially the approach involves two stages first identifying a set of useful reasoning modules for a given task and then second composing and then second composing these modules into a coherent reasoning structure that the model follows to solve a specific instances of the task

And this is why people have been saying for quite some time and you know some in the field are stating that you know these llms do have a lot more to go because the base models that we got are really good but the way that we interact

With them is then there was also Gallo AI which is essentially something that you can just prompt to UI so you know those you know times when you had to you know spend all that time designing UI this is how you really easily so here we Go Da with that I think entire Industries are being changed and you know of course shaken up by these generative AI systems and I think it will be interesting to see across how many Industries generative AI really does impact because we know that AI the AI impact is going

To be huge but um we’re starting to see continuous ways in which it is continually moving and impacting many several industries that we really didn’t think it would

Video “Major AI News #29 – Extremely CONCERNING AI Software, GPT-4 Awaken Again, Major BenchMarks Broken..” was uploaded on 02/07/2024 to Youtube Channel TheAIGRID

Contents

Major AI News #29 – Extremely CONCERNING AI Software, GPT-4 Awaken Again, Major BenchMarks Broken..

Watch the video by TheAIGRID

Video Transcript