How does AI work?

Artificial Intelligence has been one of the most talked-about topics since the launch of ChatGPT, and it is likely to shape the future of mankind, so understanding how AI works is becoming more important for all of us.

In this article we will learn, in general terms, how artificial intelligence works.

Meaning of Generative Artificial Intelligence

Before understanding how AI works, we first need to understand its history and the meaning of the term “Generative Artificial Intelligence”. “Generative” means creating something new, such as text (e.g. ChatGPT, Gemini, Bard), audio (e.g. Suno AI), or an image (e.g. Midjourney). “Artificial intelligence” means a machine that can do work on its own. Taken together, the term means a machine that can generate new data (text, images, audio, etc.) on its own.

History of AI

The idea of a machine generating something new on its own is not entirely new. We have been using it for a while. For example, when we type into Google Search, Google tries to auto-complete our query. In this case, Google is generating new text based on data from its users. This shows that generative AI is not a new concept.

Figure: Google predicting the next word in our search

However, after the launch of OpenAI’s ChatGPT, generative AI became so popular that almost everyone has heard of it.

How does AI work?

For this article we will use ChatGPT as an example and look at how it actually works.

Firstly, GPT stands for Generative Pre-trained Transformer. Here “Generative” means generating new content, “Pre-trained” means the model has already been trained on some body of data, and “Transformer” is the key word. In AI, a transformer is a type of neural network, and there are many variants of transformers, just as there are many types of neural networks.

Figure: The full form and meaning of GPT

For this article we will focus on a type of transformer that takes some initial text and simply predicts the next word. Most of you readers are probably going to ask, ‘Rahul, how the hell will we create an AI like ChatGPT with this model??’ Just wait a bit, let me explain.

Let us assume that we have just created a model that takes some initial words and predicts the next word. For instance, we give it the text ‘Rahul is a’, and our model predicts a likely next word, which may be ‘good’, ‘bad’, ‘boring’, etc. Let us say our model predicted ‘good’. Now we have a new phrase, “Rahul is a good”. I pass this new phrase back to our model as its input, and it again predicts the next word. This process continues until we get a whole sentence.

Figure: A transformer predicting new words in a sequence

In this way we get a model which can generate a desirable new sentence from some initial set of words, which we call a prompt.
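The predict-and-feed-back loop can be sketched in a few lines of Python. This is only a toy: the `NEXT_WORD` table and the function names are made up for illustration and stand in for a trained model, which would compute its prediction instead of looking it up.

```python
# Hypothetical lookup table standing in for a trained next-word model.
NEXT_WORD = {
    "Rahul is a": "good",
    "Rahul is a good": "writer",
    "Rahul is a good writer": ".",
}


def predict_next_word(text):
    """Return the next word for `text`, or None if the toy table has no entry."""
    return NEXT_WORD.get(text)


def generate(prompt, max_words=10):
    """Append the predicted word, then feed the longer text back in."""
    text = prompt
    for _ in range(max_words):
        word = predict_next_word(text)
        if word is None or word == ".":
            break  # stop at the end of the sentence or when nothing is predicted
        text = text + " " + word
    return text


print(generate("Rahul is a"))  # prints: Rahul is a good writer
```

Each pass through the loop is one prediction, and the output of one pass becomes the input of the next.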

Now let us understand how the model predicts the next word. First of all, our initial input gets broken into small pieces called tokens. Then each token gets converted into a vector, which is simply a list of numbers. These numbers are arranged in such a way that they somehow convey the meaning of that token or piece of the sentence.
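As a rough sketch of these two steps, the snippet below splits a sentence into tokens and looks up one vector per token. The vocabulary, the numbers in `EMBEDDINGS`, and the whitespace tokenizer are all assumptions made for illustration; real models learn their vectors during training and usually split text into sub-word pieces.

```python
# Hypothetical vocabulary: each known token gets a row index.
VOCAB = {"rahul": 0, "is": 1, "a": 2, "good": 3}

# Hypothetical embedding table: one 3-number vector per token.
EMBEDDINGS = [
    [0.9, 0.1, 0.0],  # rahul
    [0.0, 0.8, 0.2],  # is
    [0.1, 0.1, 0.7],  # a
    [0.5, 0.5, 0.5],  # good
]


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def embed(tokens):
    """Look up the vector for each token."""
    return [EMBEDDINGS[VOCAB[token]] for token in tokens]


tokens = tokenize("Rahul is a")
vectors = embed(tokens)
print(tokens)
print(vectors)
```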

How does the Transformer work within AI

Now the sequence of these vectors is passed to an attention block. What does this attention block do? Inside this block, all the vectors communicate with each other and exchange information. Wait, wait… let me explain with a better example. One sentence is “She shed a tear while watching the emotional movie” and another is “Be careful not to tear the paper”. The token “tear” is spelled the same in both sentences, so it starts out with the same vector, but its meaning is different in each. As I told you earlier, the vector representation should reflect the meaning of the token, and we cannot tell that meaning just by looking at the token itself. For this, the token has to consult the nearby tokens, and once its meaning is found, its vector has to be changed accordingly. This is what the attention block does: each token communicates with the tokens around it and updates its own vector, so that our machine can understand the meaning of each token in the context of the sentence.
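To make the idea concrete, here is a heavily simplified sketch of attention in plain Python. It is an assumption-laden toy: real transformers first pass each vector through learned query, key, and value projections, use many attention heads, and GPT-style models also stop a token from looking at later tokens. All of that is left out, and the two-number vectors are invented. What remains is the core idea: every vector becomes a weighted mix of the vectors in its sentence.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product: a simple measure of how related two vectors are."""
    return sum(x * y for x, y in zip(a, b))


def self_attention(vectors):
    """Replace each vector with a weighted average of all vectors."""
    dim = len(vectors[0])
    updated = []
    for query in vectors:
        scores = [dot(query, key) / math.sqrt(dim) for key in vectors]
        weights = softmax(scores)
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        updated.append(mixed)
    return updated


# The same starting vector for "tear", placed in two different contexts.
tear = [1.0, 1.0]
sad_context = [[2.0, 0.0], tear]    # hypothetical vectors for "shed", "tear"
paper_context = [tear, [0.0, 2.0]]  # hypothetical vectors for "tear", "paper"

tear_in_sad = self_attention(sad_context)[1]
tear_in_paper = self_attention(paper_context)[0]
print(tear_in_sad, tear_in_paper)
```

The token “tear” goes in with the same vector both times but comes out different, because its neighbours are different.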

Now our machine has updated each token according to the sentence. After that, all these vectors go to the second layer, which is called the multi-layer perceptron block. In this block the vectors do not communicate with each other. Understanding this block is not that easy, and we will look at how it works in depth later. For now, you can think of it this way: in this block, many questions are asked of each vector, and the value of the vector changes on the basis of the answers.
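One loose way to picture this block in code is a tiny two-layer network applied to every vector separately. The weights below are invented for illustration, ReLU is just one common choice of activation, and real models use far larger layers plus details omitted here. Each row of `W1` can be read as one “question” asked of the vector.

```python
def relu(x):
    """Keep positive values, replace negative values with zero."""
    return max(0.0, x)


def mlp(vector, w1, b1, w2, b2):
    """Two-layer perceptron applied to a single vector."""
    hidden = [
        relu(sum(w * x for w, x in zip(row, vector)) + b)
        for row, b in zip(w1, b1)
    ]
    return [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(w2, b2)]


def mlp_block(vectors, w1, b1, w2, b2):
    """Apply the same MLP to each vector independently of the others."""
    return [mlp(v, w1, b1, w2, b2) for v in vectors]


# Hypothetical weights: 2 inputs -> 3 hidden "questions" -> 2 outputs.
W1 = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
B1 = [0.0, 0.0, 0.0]
W2 = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
B2 = [0.0, 0.0]

print(mlp_block([[1.0, 2.0], [3.0, 1.0]], W1, B1, W2, B2))
# prints: [[1.0, 2.0], [5.0, 1.0]]
```

Because the loop in `mlp_block` handles one vector at a time, changing one token's vector never affects the output for another token, which is the contrast with the attention block.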

After this, our vectors go through the attention block and the multi-layer perceptron block again and again. After this process has been repeated many times, the meanings of all our vectors have changed in such a way that the last vector contains what is needed to predict the next word. From this last vector we compute a score for every word that could come next, then perform a particular operation on the scores (commonly a softmax) to create a probability distribution, from which we can predict the next word of our passage.
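The final step might look roughly like this. The three-word vocabulary, the `UNEMBEDDING` weights, and the last vector are made-up values, and softmax is assumed as the operation that turns scores into probabilities. Picking the single most likely word is also only one option; real systems often sample from the distribution instead.

```python
import math

# Hypothetical vocabulary of candidate next words.
VOCAB = ["good", "bad", "boring"]

# Hypothetical weights: one row per vocabulary word.
UNEMBEDDING = [[2.0, 0.0], [1.0, 0.0], [0.0, 1.0]]


def softmax(scores):
    """Turn arbitrary scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_word_distribution(last_vector):
    """Score every vocabulary word against the last vector, then normalize."""
    scores = [sum(w * x for w, x in zip(row, last_vector)) for row in UNEMBEDDING]
    return dict(zip(VOCAB, softmax(scores)))


def most_likely(distribution):
    """Return the word with the highest probability."""
    return max(distribution, key=distribution.get)


distribution = next_word_distribution([1.0, 0.0])
print(distribution)
print(most_likely(distribution))  # prints: good
```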

Now, as I said earlier, if we can somehow create a model that predicts the next word, then we can create a complete chatbot. So now we have such a model.

If you have used the early GPT models that came before ChatGPT, you will know that they did exactly this: you gave some initial text or a story, and the model completed it on that basis.

An easy way to turn it into a chatbot is to give our model some text in advance, which we call the System Prompt. Whenever the user asks a question, we put the System Prompt and the User Prompt together into our transformer and predict the words that follow. For this you have to train your AI model a little more, but this is the general idea.
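A minimal sketch of combining the two prompts is shown below. The wording of `SYSTEM_PROMPT` and the `User:` / `Assistant:` layout are assumptions chosen for illustration, not the format any particular product uses. The combined text is what a next-word model would be asked to continue.

```python
# Hypothetical system prompt, written once by whoever builds the chatbot.
SYSTEM_PROMPT = "You are a helpful assistant. Answer the user's question."


def build_model_input(system_prompt, user_prompt):
    """Join the system prompt and the user's question into one text to continue."""
    return f"{system_prompt}\nUser: {user_prompt}\nAssistant:"


model_input = build_model_input(SYSTEM_PROMPT, "What is AI?")
print(model_input)
```

The text ends at `Assistant:`, so the most natural continuation for the model to predict is the assistant's answer.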

Rahul Barakoti

Rahul is a tech enthusiast with a passion for unraveling the complexities of our digital age. Armed with a degree in Computer Science, he has worked in various tech roles, gaining valuable insights into the industry. Now, as a writer, Rahul shares his expertise through articles and blog posts, demystifying complex concepts and sparking conversations about the future of technology. With a keen eye for detail and a knack for storytelling, he empowers readers to navigate the ever-evolving tech landscape with confidence and curiosity.
