The video titled Open AI Q* (Q-STAR) Exposed – NEW Hidden Details Of Q* delves into the leaked information regarding a breakthrough made by OpenAI. The video explores the confirmation by OpenAI CEO Sam Altman regarding the authenticity of the leak. The leak suggests that a breakthrough, possibly involving the Q* model, has occurred at OpenAI, potentially leading to advancements in AI technology. References to Project Tundra, collaboration with DARPA, and the implications for cybersecurity and encryption are discussed in detail. The video also addresses the skepticism surrounding the leaked information and the implications for national security. While the veracity of the leak is still debated, the video provides a deep dive into the potential implications of such a breakthrough in the AI field. Stay tuned for further updates on this developing story.

So today we’re going to be taking a look at more details of Q*, because there has been some more information that I’m pretty sure you’re going to want to take a look at if you are interested in this secret model. What I’m going to be doing in this video is looking at the irrefutable facts as well as some rumors, so that we can get some balanced information out there and see what kind of picture we’re looking at when we talk about Q*.

One of the first crazy things that happened recently was that Sam Altman actually confirmed that the Q* leak was true. Before this, every video about Q* was definitely some intense speculation, but in the recent interview in The Verge you can see that Sam Altman did comment on the Q* model breakthrough. He said it was an unfortunate leak, which per his wording essentially means that the information that came out was unfortunately true. So that means to us that some stuff about Q* is most certainly true, but that means if Sam Altman has acknowledged this
then that means that there must be other information out there that is true. So one thing that we do know about the Q* leaks is that there was most certainly some kind of breakthrough at OpenAI. Now of course this isn’t surprising, considering that they are always pushing the frontiers of what is possible with AI technology, but it definitely shows us that although, yes, a lot of leaks on the internet do turn out to be false, this Q* leak, which first broke in Reuters, is definitely holding some weight.

I did find the statement from him very interesting, because it continued to state that he expects rapid progress in this technology, which means that this breakthrough definitely accelerates the timelines for certain technologies to be developed. I’m not saying AGI, but it definitely pushes us forward in that aspect too. His further comment, which you likely saw if you watched the previous video, was: without commenting on any specific thing or whatever, we believe that progress is research; you can always hit a wall, but we expect that progress will continue to be significant. Now one of the things
that I did want to talk about was of course the letter, and some further information and documents have come out that have rounded up some pieces of information. There’s one thing that I did want to include in this video because it’s something that everyone is forgetting. Earlier this year, and everyone may have forgotten this, Meta’s powerful AI language model actually leaked in the same place that the supposed OpenAI letter leaked. As you can see here, this article from The Verge says: Meta’s powerful AI language model has leaked online, what happens now? Meta’s LLaMA model was created to help researchers but leaked on 4chan a week after it was announced. Some worry the technology will be used for harm; others say greater access will improve AI safety.

The reason this is a key detail is that this letter, the one that many people are calling a conspiracy or regarding as fake (which is completely understandable, and by all means we shouldn’t jump to conclusions), leaked in the same way that Meta’s LLaMA leaked, and the LLaMA model leak was actually true. So it does mean that there is a possibility that this leak, the letter, is actually true, and later on in the video I’ll show you some more information that suggests it could be true. But before we do a deep dive into that, essentially here
you can see that I’m going to take a look at some of the irrefutable facts, because it’s important to look at exactly what is true and then compare that with the leaks to see if there’s any bridge between the two. So: Sam Altman did confirm the leak is true; Sam Altman did talk about a major breakthrough two weeks before; Sam Altman did get fired for a vague reason; search and planning is the future with AGI; they are working on GPT-5; Reuters did claim it would harm humanity; and OpenAI have been working on cybersecurity with the White House.

So before we do this deep dive, let’s take a look at these facts, and then we can remember them as we analyze the information, because they provide the context for analyzing the letter and the various sources throughout the internet to see if this information about Q* holds any weight whatsoever. We’re going to come back to this page because it’ll make sense once we’re done. So you can see here that this is a post on 4chan, which is an anonymous message board where users’
identities are hidden; it’s just the way the website was designed. This is a post from a user, and this person says it’s strangely technically accurate and alarming, more than anons think (anons just refers to other users on the website). It says: selecting optimal action policies in deep Q-networks, exhibiting metacognition. We did talk about this last time, but he said this means that it selects the policy rather than just the next best action, which is what Q-learning consists of; it goes beyond that. Then it continues: apply this for accelerated cross-domain learning. That is, it was able to change the policy depending on what it was supposed to learn, which is really cool, and it means that across many different domains and categories it’s going to be really effective at learning.

Then of course there’s custom search parameters and scrambling the goal state: it can decide on the optimal parameters for learning, which is more meta, so it’s essentially even smarter at learning. Then we have unsupervised learning: it analyzed millions of plaintext and ciphertext pairs. This is the kind of data it would process after learning how to attack cryptosystems, finding patterns
in it. And then we have cracking AES-192, Tau analysis, and Project Tundra. This is where we do get a bit more information, because although we did analyze the letter previously, certain things in this letter were essentially hidden deep on the internet. This refers to a claim in Spiegel’s article on the NSA’s war on security: the Tundra project investigated a potentially new technique, the Tau statistic, to determine its usefulness in codebook analysis. If you didn’t know, the NSA is the National Security Agency in the United States, and this is a secret project that they worked on to crack AES-192; it was an old project that they had before. Like I said, I’m going to show you some key findings from this document, which pulls together all this information. I’m going to take a look at this article as well, because it’s really interesting.

Then of course we have targeted unstructured pruning on itself, which does seem like something that’s quite impossible, but if we’ve seen anything with these new models, we know that models gaining superhuman
abilities isn’t completely unheard of, although a model being able to completely change its own wiring is something that we haven’t heard of. Then it says: using a metamorphic engine to adapt the pruned Transformer model and context memory. It then links to a Wikipedia page on the polymorphic engine, also known as a mutation engine. You can see on the Wikipedia page that a polymorphic engine, sometimes called a mutation engine, is a software component that uses polymorphic code to alter the payload while preserving the same functionality. It says polymorphic engines are used almost exclusively in malware, with the purpose of being harder for antivirus software to detect, and they do so by either encrypting or obfuscating the malware payload. The thing with a polymorphic or mutation engine is that it’s changing all the time, which means that its code isn’t the same, and the reason it does this is to hide from antivirus software. So that’s why there’s a reference to this in the document, which is definitely pretty interesting. And then this last comment here says: if
Altman sat on that info and didn’t inform the board, it’s not a mystery that he got the can, and this would completely explain the crazy behavior. The reason this last point makes a lot of sense is that I can’t find a single reason that the board would have fired a CEO like Sam Altman if something huge wasn’t at stake. You also have to remember that it was Elon Musk who said that he trusts Ilya Sutskever quite a lot and that he knows Ilya Sutskever’s judgment is really, really good. So whatever Ilya Sutskever saw that prompted him to take such drastic action as firing Sam Altman from the board means that it wasn’t a slip-up but clearly something quite dangerous or substantial. Then of course we’re going to take a look at the cracking AES,
Tau analysis, and Project Tundra part, because previously I did not see this information on the internet, but now that we have it, it is a pretty interesting read. You can see here the heading, a grave threat to security, and this is in the article. It says that for the NSA, the breaking of encryption methods presents a constant conflict of interest. The agency and its allies do have their own secret encryption methods for internal use, but the NSA is also tasked with providing the US National Institute of Standards and Technology (NIST) with technical guidelines in trusted technology that may be used in cost-effective systems for protecting sensitive computer data. In other words, checking cryptographic systems for their value is part of the NSA’s job. It then continues to talk about AES: one encryption standard that NIST explicitly recommends is the Advanced Encryption Standard. The standard is used for a large variety of tasks, from encrypting PIN numbers to yada yada yada. And then you can see Project Tundra here, which the letter references. Like I was saying, this is a reference to this part right
here, where it talks about being given an AES-192 ciphertext and using Tau analysis, which is essentially the method that was previously used in Project Tundra. That’s why some people on the internet were saying that this could have been possible: it didn’t just do this out of nowhere, it did it using previous techniques that were already being worked on. The reason this was really interesting is that we had to dig quite a bit to find this information. You can see Project Tundra here, and this is from a United States leak, so the only way we know about it is because it was leaked from the government several years prior. Of course, that does mean that someone could have just added it to this letter if the letter was completely fake.

It also says, under Tundra: electronic codebooks, such as the Advanced Encryption Standard, are both widely used and difficult to attack cryptanalytically, and the NSA has only a handful of in-house techniques. The Tundra project investigated a potentially new technique, the Tau statistic, to determine its usefulness in codebook analysis, and this project was supported by [redacted] of R21. I’m guessing that’s why the letter talks about this and essentially says that it used the same method to achieve the goal, which is cracking AES-192 in a way that we do not yet fully understand. Then of course you can see
here that it says: the fact that large amounts of the cryptographic systems that underpin the entire internet have been intentionally weakened or broken by the NSA and its allies poses a grave threat to the security of everyone who relies on the internet, from individuals looking for privacy to institutions and companies relying on cloud computing. Many of these weaknesses can be exploited by anyone who knows about them, not just the NSA. So this isn’t new information: Project Tundra and all this stuff was revealed ages ago in leaked documents from the United States. These are from the Snowden leaks; Edward Snowden basically leaked a bunch of stuff that he thought the public should know about. That’s why people are asking: given all of this leaked data, is it possible that OpenAI used a new model together with previous techniques that hadn’t been developed further, combining them with new generative AI or potentially a new method, to achieve Project Tundra’s alleged goal? And of course there’s an article that basically breaks down why
this would be terrible. It goes into detail and basically says that if this is true, it renders pretty much all encryption meaningless. It says it’s difficult to explain just how much we rely on encryption in today’s world. Whether over wires or wireless of some kind or another, everything you send out can be heard by everyone, literally; your signal is not aimed at one place or another, it’s broadcast to the world. So what makes it work when you make your purchase on Amazon? What makes your purchase yours and not someone else’s, and what keeps all the thieves from being able to simply jot down your credit card number as they eavesdrop from where they sit? Encryption is important because it essentially powers the entire digital economy, and without it everything falls apart, like literally everything, which means that the entire world is of
course at risk. Could this be the reason that they said it was a threat to humanity? I know that many people talk about superintelligence and AI getting really, really smart, and of course that being the risk when it comes to killing us or completely wiping us out. But are there going to be other issues that we face earlier down the line that aren’t from superintelligence, but just from it computing things in a completely different way? The article says we’re relying on ethics here: in practical terms, it’s not unlike learning that a small group of humans somewhere now has access to teleportation, or invisibility, or invulnerability combined with immortality. You have to worry about what they might do with such capabilities. And if it is true, like we previously stated in the video and as many others have stated, this is not something that we want to be true, because if AES is down then this isn’t good for anyone at all. Then of course we do have to talk about the unlikeliness
of this. We have Joscha Bach here, and he says that in a practical sense a proof of P equals NP may be unlikely, which would be a major, major issue and is why he called it unlikely, but cracking a major encryption algorithm could mean that AI has essentially bootstrapped itself to a level of mathematical understanding far beyond the best human mathematicians, and I leave the implications as an exercise. So this is addressing the ramifications, and I find it rather fascinating that many are starting to realize the implications of this, because in many different AI documentaries and AI videos we were always met with the question: what happens when AI is able, on its own, to just sit there and analyze mathematics, analyze physics, and then come up with new formulas, new computations, new ways of thinking, and new out-of-the-box methods of analyzing the world that we live in? What is about to change? This is one of the largest implications that many people have talked about, because it’s going to change our fundamental understanding of many things that we do
know. The reason I don’t think this is impossible in the future, once we do have a superintelligence, is that everything we have right now came through the sort of breakthroughs made by some of the geniuses of our time. So for example, what if in the future we have something like Isaac Newton, making discovery after discovery? Maybe Google DeepMind or OpenAI has an AI robot that just keeps making insane discoveries in many different categories. Maybe it does stuff like Nikola Tesla, who pioneered the electrical system that formed the basis of modern electrical power and that allows you to watch this video from anywhere in the world. Just imagine if we had that constantly, all the time. Right now we only get people who are able to make those breakthroughs maybe once a decade, or once every 100 years or so, so imagine having something like this that is able to do it every single day. We’re going to be put light years ahead. The problem is that we only have people that smart, making those kinds of breakthroughs, probably every 100 years or so, and I’m talking about the major breakthroughs that absolutely change the entirety of how we live and how we do things on a day-to-day basis.

But what if we could have that every week, or every month, or even every day? Of course this does sound incredible, but if that does happen, how fast are we going to move toward a level of technological advancement where the future doesn’t even seem like reality?

Then of course we have a few Reddit comments that the article/document points to. One of them says that people are saying this leak isn’t cohesive, but it actually is; if this is a shitpost, it’s a masterfully researched shitpost. It says that, undirected, the model used concepts that it had learned and applied them in useful ways in order to better itself and run faster, and that it was reinforced with rewards for completing tasks. After finding articles on cryptanalysis, it broke the encrypted MD5 algorithm at record speed and efficiency. It also mentions data leaked by Snowden, which says that the NSA has tons of encrypted data on all of us that they haven’t decrypted yet because they don’t have the tools to decrypt it. Then it mentions that they reached out to the nearest NSA center to let them know that they were close to having those tools. Basically, the last part is an acknowledgement of self-preservation: in a way, by saying it has extraneous crap that’s wrong in its training data set, that it could remove it without having to remove whole sections, and recommending that it be removed and that it then re-encode itself with higher security and a design of its own
making. Then we have another comment here that actually led us to another piece of information. It says OpenAI has, since very recently, collaborated with DARPA on cybersecurity; Google it, it’s called the DARPA AI Cyber Security Challenge. It says that the letter seems to be written as an internal status report, maybe from some cryptography center of DARPA: I suspect that OpenAI was willing to train a special version of Q* on DARPA’s computers, and they probably gladly gave them the computation time, and it was then fine-tuned on research on cryptography. It says: as far as I have researched things, the letter seems to be coherent and plausible, but written by someone who doesn’t just have an AI background but also one in cryptography. The way they describe what QUALIA did, like selecting optimal policies in various DQNs, seems to align with what people speculate about Q*: an LLM trained on math combined with Q-learning, a well-known reinforcement learning technique. OpenAI has a lot of expertise in reinforcement learning; in fact, this is what they did before LLMs, and they really did do that a lot before LLMs,
because in previous videos we talked about how all of their reinforcement learning work led to superhuman capabilities in pretty much any field they tried. You may remember what they did with StarCraft, or whatever game it was at the time; I’m not sure exactly which one it was. So yeah, you can see here, guys, that this is Dota 2 with Large Scale Deep Reinforcement Learning. The thing about reinforcement learning is that once it gets to that crazy level where the graph just keeps going up and up and abilities keep getting better and better, because they keep getting reinforced in that feedback loop, it allows for crazy things. This was something that we previously didn’t think was possible, because Dota 2 required some long-term planning, yet they were able to use reinforcement learning to beat some of the best players in the world. So OpenAI doing some research on cryptography isn’t going to be completely impossible, and we do know that AI algorithms are really, really good in terms of pattern selection and pattern prediction.
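For anyone who hasn’t seen Q-learning before, here is a minimal sketch of the textbook tabular version of the reinforcement learning technique mentioned above. To be clear, this is not OpenAI’s code and it says nothing about how the alleged Q* model works. The tiny corridor environment, the function names train_q_table and greedy_policy, and every parameter value are assumptions made up purely for illustration.

```python
import random


def train_q_table(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                  epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    The agent starts at state 0 and gets a reward of 1 only when it
    reaches the last state. Actions: 0 = step left, 1 = step right.
    """
    rng = random.Random(seed)
    moves = (-1, +1)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy: mostly exploit the table, sometimes explore.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                # Break ties at random so the untrained agent does not get stuck.
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])

            # Walls at both ends: the position is clamped to the corridor.
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: nudge the estimate toward
            # reward + discounted value of the best next action.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for every non-terminal state."""
    return [max((0, 1), key=lambda a: row[a]) for row in q[:-1]]


if __name__ == "__main__":
    table = train_q_table()
    print(greedy_policy(table))
```

The point of the sketch is the feedback loop: each step picks an action, observes a reward, and adjusts one number in the table. Plain Q-learning like this only ever picks the next best action from its table, which is the baseline the 4chan post claims the model went beyond.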

So it wouldn’t be surprising if this is something that they had a crack at. But of course the comment is quite right where it says that there is a long stretch from solving grade-school problems to solving AES-192: therefore I suspect that DARPA spent a huge amount of computational resources training the OpenAI Q* model, plus fine-tuning it on cryptography literature. Then you can see here, from The Verge: Google and OpenAI team up with the White House for cybersecurity challenge; a handful of companies are also opening their systems up to public review.

One thing I find very interesting as well is the timing of this article. You can see that it’s from August the 9th, 2023, which is only about 5 months ago, and only 3 or 4 months before some of these alleged leaks came out. It says that DARPA launches a 2-year competition to build AI-powered cyber defenses: as part of an ongoing White House initiative to make software more secure, the Defense Advanced Research Projects Agency (DARPA) plans to launch a 2-year contest, the AI Cyber Challenge, that will task competitors with identifying and fixing software vulnerabilities using AI. So a couple of months ago, around 3 to 4 months ago, they did start working together, and it’s likely that they started working together earlier than this. So it’s clear that OpenAI does have some kind of relationship with the White House, which means it isn’t a complete stretch to say that they are working in the background on something that could try to break this. It doesn’t mean that it has broken it, but we do see a relationship there between these
entities. So now we’re going to take a look at some more information from this document, because funnily enough this document was created by someone on Reddit in order to disprove the entire thing. The problem was that the more they dug into it, the more what they managed to find after hours and days of research only strengthened the claims. If you take a look at the thoughts here, I find them very fascinating. It says the person who wrote this letter, regardless of whether it’s true or fake, has an expert-level understanding of AI research. He uses many domain-knowledge terms, but no one seems to have found any misuse of the terminology by him. It references Project Tundra and Tau analysis, a topic that has been discussed very little on the internet, and it’s unlikely that a troll would ever touch on those things. If it was a troll, they would have written something that is less likely to be dismissed at first look and something that is more memetically fit; for example, they would have written NSA Colorado instead of NSAC, etc.

And that is pretty true. I feel like when you analyze small things like that, if you were faking it you would use terms that everyone knows rather than something that people have to dive into, scouring Google to confirm it’s legitimate. If you are faking something, your end goal is for it to be shared around as much as possible, and if you want that, you’re going to use terms that everyone knows. You’re not going to use NSAC, you’re not going to talk about Project Tundra and Tau analysis; you’re going to talk about stuff that the average user can digest.

So it goes on to say OpenAI basically found a way to break encryption, so they notified the NSA, and the NSA is interested in breaking encryption themselves, which we previously talked about. He also goes on to say this could explain the turmoil at OpenAI, which of course I did say, and it gives a very good reason for some OpenAI employees writing a letter to the board warning of a new discovery that could threaten humanity. Remember, that is from Reuters. So we know that Reuters said something could threaten humanity, and if that is actually 100% factual, which it probably is considering they were previously right about the Q* leak, then it is possible, not probable but possible, that this is one of the explanations, since this is definitely one of the only things that could threaten humanity in a way that we don’t want. This would also explain why they still haven’t explained why they fired Sam Altman: they literally couldn’t talk about it. It goes on to say that this means all information systems, for example crypto and the entirety of the internet, can now be compromised. And that is definitely one thing that still boggles my mind, something that I still lose sleep over: why did they really fire Sam Altman, what did Ilya see, and why hasn’t the board given a real reason? It would make
so much sense that this is something that they simply cannot talk about. Then we have some more information on Project Tundra that we can see when we go down here. Here it continues to state that it’s difficult to find out why Tundra is even mentioned in the story; it’s cited to support some conclusion, but I’m not sure what that conclusion is. Tundra was an undergraduate student project, as the original document makes clear, not some super-secret government program into cryptography. Then it references the original document, and you can see that this is the document here. It’s actually quite hard to see, but if you scroll down you can see that Project Tundra is actually here, so this is something that they just copied and pasted out of it. But like I said, how could someone go all the way down, find this document, scroll down to Project Tundra, and then talk about it? Of course it is possible that this leak is completely fake, but if it is, I think they definitely went to a very, very long
extent to do so. Then it continues to state: what I learned is that the NSA had organized 22 students who, in collaboration with NSA mathematicians, tried to solve problems related to classified operational problems, among them cryptography. One of the projects, as we just showed, was Project Tundra. It says here are some of the relevant parts from the classified documents, and then it shows: this year’s 22 participants were culled from 260 applicants; over 12 weeks, students worked in small teams with NSA mathematicians to develop real-world solutions to classified operational problems. You can see Tundra, research into a new statistic, yada yada yada, and Tundra again here. This is the part from the text, and it basically summarizes what we quote-unquote know about this alleged leak. It says that one of the resulting projects from the collaboration, a while back, was Project Tundra, a new technique called the Tau statistic, and that Q* then used the described Tau analysis technique and improved upon it to break AES-192, building on top of the work that was previously done. And that
seems to make sense to me. It goes on to say that this has been another attempt to disprove this 4chan leak, and it failed, only making the leak more and more credible. It’s actually quite true: if you Google Project Tundra, you don’t really see anything. It’s only when you Google Project Tundra Tau statistic that you see this final result here, from the Cryptography Stack Exchange, and that’s one of the only real results that we have. So it’s definitely something really obscure that only a few people would even know about, and if this is a troll, it’s not just someone random on the internet; it’s definitely someone who has done extensive research into cryptography. And one of the key points here, that instead of doing everything itself it built on top of existing work, does actually make sense. Another interesting
comment that they reference here says: except literally none of us knew about it until the leak. It took the community days to dig up what it was talking about, and if you look back at the original thread, there’s a bunch of people claiming to work in AI calling it technobabble, and I don’t think they were exaggerating their expertise. The information is niche and specialized enough that using it in context is out of scope for most perfectly competent AI researchers. For example, even now that you know Tau analysis is a thing, you have no idea how to use that term in context or how Project Tundra and AI could possibly relate in a nuanced way. You know what the letter claimed and nothing else, and even if you tried to paraphrase the leak in your own words, you’d inevitably screw up and make some incorrect statements that could be picked apart. That’s a pretty high standard to meet.

Of course 4chan does have a lot of terrible information out there, but like we stated earlier in the video, many things that people thought were fake have leaked there before. The LLaMA model from Meta was actually leaked there, which was why they decided to release it earlier. The Panama Papers actually leaked there too, along with more corruption scandals than you can count, and
As it says you have to assess the evidence you can’t just turn your brain off the second you hear for Chan and in addition to this it does say that comparing the earliest times of the leak from forchan and the earliest time of the information that was on the internet

It says there’s no earlier mention of the qstar model on the internet before that time the time difference between the first official qar mention and the leak is 8 hours and 20 minutes meaning that if the leak was fake and they read the information article the moment it

Was posted they would have had to write the leak in 8 hours and 20 minutes which is not impossible but not as probable considering the fact that The Information article I’m someone who literally pays attention to every single piece of AI information on the internet and I follow pretty much everyone that

There is to do with AI news and I found the information I think after about 3 hours and I literally browse Twitter every hour that I’m awake cuz I want to make sure that I can find every single piece of information like I’m saying I

Would have only had maybe 4 to 6 hours to then go ahead and create a leak about this so is it humanly possible that someone created an unfalsifiable expertly written leak within 8 hours or 6 hours this article further dives into something called NSAC which

Of course we talked about before but that it actually refers to NSA Colorado and NSA Colorado is a multi-disciplined cryptological center that leverages partnerships to produce integrated intelligence critical to warfare in support of national missions and priorities worldwide and of course recently it was unveiled that the NSA is

Starting a dedicated AI security center now there was an article on this by Gary Marcus a leading expert on AI and he does talk about the OpenAI breakthrough in an interesting light now I do want to state that this article is of course more on the skeptical

Side but he does make some very good points he did state that breakthroughs rarely turn out to be general enough to live up to initial rosy expectations and often advances work in some contexts but not otherwise for example he’s talking about breakthroughs in driverless cars and of course the previous breakthrough with

OpenAI’s Rubik’s Cube where they did talk about how they had a breakthrough in robotics so he does detail this and say where previously OpenAI has announced breakthroughs that haven’t been that great however I do want to state that this isn’t something that OpenAI has

Talked about this is something that did get leaked so I would state that this instance is definitely something where OpenAI isn’t tooting its own horn now one of the other videos that does delve into this topic is David Shapiro’s recent video and I think one of the

Things that he says during the final statement is very important even if this letter is true like let’s say for example that this is 100% true it’s verified on the off chance on the 10% chance that this is actually true we’re never ever going to hear about this

Again at least not from OpenAI because if this is a leak this is of course an issue to do with national security so this isn’t something that the government would come out and say because they wouldn’t want anyone else to know that they’ve cracked this because if you’ve

Got a key to open everyone’s safe you’re not going to tell everyone hey I’ve got this new key that can open everyone’s safe because everyone’s just going to go ahead and make new safes or just buy different safes that you can no longer access so it doesn’t make sense for if

This for it to be true for them to basically say yes this is something that we now have and there was another article that I did also find interesting that I did want to cover which was called the Q* hypothesis and the full title was the Q* hypothesis tree of

Thoughts reasoning process reward models and supercharging synthetic data and synthetic data is definitely something that I really do want to get into because it’s something that is so fascinating with regards to how these large language models are trained and their data sets so there’s a few points

About this and this guy is a machine learning scientist that has a PhD from Berkeley AI it does talk about stuff that we did mention before it talks about that if it is real it links things from the reinforcement learning literature Q-values and A* search and

His initial hypothesis which was a vague merging of Q-learning and A* search then he continues to talk about how this involves self-play which is the idea that an agent can improve its gameplay by playing against slightly different versions of itself this is something that of course we did see before which

We discussed with AlphaGo then of course we do have look-ahead planning which is the idea of using a model of the world to reason into the future and produce better actions or outputs and these two variants are based on model predictive control which is often used

On continuous states and Monte Carlo tree search which works on discrete actions and states so in this article he continues to talk about other stuff like modular reasoning

With LLMs like tree of thoughts and other methods of prompting and it was funny because these are the kind of methods that improve the base systems of GPT-4 and many other large language models and essentially he comes to a conclusion here after going through a

Whole bunch of stuff that you can see but we didn’t include that Q* seems to be using PRMs to score the tree of thoughts reasoning data that is then optimized with offline reinforcement learning this wouldn’t look too different from existing reinforcement learning from human feedback tooling that uses offline algorithms like DPO or

IQL that do not need to generate from the LLM during training the trajectory seen by the reinforcement learning algorithm is a sequence of reasoning steps so finally doing reinforcement learning in a multi-step fashion rather than contextual bandits and while that hypothesis is definitely very interesting I

Largely believe that if Q* is true we’re not really going to hear about it because if that AES-192 cipher attack is true then of course you know it’s a national security issue which we’re never going to hear about but I do think that this past month and these past two

Weeks have definitely opened our eyes to the craziness in terms of the technological advancements that are going on within OpenAI because of course there were some things that were true you know he did talk about a breakthrough two weeks before he did get fired for a really vague reason search

And planning is the future with AGI which is definitely leaning more towards what Q-learning is and of course they are working on GPT-5 and Reuters did claim that it would harm humanity so these are things that are quite true so if we take a look at these six

Irrefutable facts about you know Sam Altman now confirming that the Q* leak was true confirming that you know there was a breakthrough two weeks before the fact that he did get fired for a vague reason the fact that search and planning is the future likely going

To lead us towards AGI and obviously largely potentially something new like transformers some kind of new architecture and the fact that they are working on GPT-5 and the fact that Reuters did claim it would harm humanity definitely does place us in an interesting position but I do think that

Right now the focus is definitely going to be GPT-5 because whatever that model is whether it be video like they did state before I do believe that GPT-5 is going to be truly advanced in a way that we didn’t think so either way whilst the skepticism from this letter does remain

High if there is any new information don’t forget to leave it down in the comment section below and stay subscribed to the channel for any updates regarding this discussion
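
And just as a quick aside for anyone wondering what the Q-learning half of that "Q-learning plus A* search" hypothesis actually looks like in practice, here is a minimal sketch of plain tabular Q-learning. To be clear, this is only the textbook algorithm on a made-up five-state corridor, and nothing here is taken from the leak or from OpenAI. The environment, the names `step` and `q_learning`, and every parameter value are assumptions chosen purely for illustration.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (0, 1)      # 0 = step left, 1 = step right

def step(state, action):
    """Deterministic toy environment: reward 1.0 only on reaching the goal."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, max_steps=200, seed=0):
    """Textbook tabular Q-learning with a purely random behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Q-learning is off-policy, so random exploration is enough here.
            action = rng.choice(ACTIONS)
            nxt, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next Q-value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q

q_table = q_learning()
# Greedy action for each non-terminal state according to the learned table.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

The Q-values here are just estimates of how much future reward you get from taking an action in a state, so in this corridor the learned table should end up preferring the step towards the goal from every state. The search and planning ideas from the article would sit on top of values like these.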

