Reported by QbitAI | Official account: QbitAI
Two months ago, CLIP, the AI behind that "design master", just had its "brain" pried open by OpenAI.

Unexpectedly, this powerful AI turns out to "think" in a way remarkably similar to humans.

For example, whether you hear the words "fried chicken" or see actual fried chicken, you may start to drool, because you have a group of "fried chicken neurons" that respond specifically to fried chicken.

CLIP is similar. Whether it reads the words "Spider-Man" or sees a picture of Spider-Man, a particular region of CLIP starts to respond, and even the region that normally responds to red and blue becomes "restless".

In other words, OpenAI found that CLIP has a "Spider-Man neuron".

In brain science this is nothing new: as early as 15 years ago, scientists studying the human brain discovered that a single face corresponds to a group of neurons. But for AI it is a big step forward. In the past, text-to-image and image-to-text were handled by two separate systems that worked in different ways. CLIP, by contrast, works much more like the human brain: CV and NLP are not only connected technically, they also share the same "way of thinking" inside the model, each with its own specialized processing regions.

Seeing how similar the two are, one netizen commented: this is terrifying; it means general artificial intelligence (AGI) is coming, faster than anyone thought.

Moreover, OpenAI was surprised to find that CLIP's responses to images resemble those of intracranial neurons recorded in epilepsy patients, including neurons that respond to emotions. Perhaps AI could one day help treat neurological diseases.

The "brain" of AI really is much like a human brain

First, a quick recap: what exactly is CLIP?

Not long ago, OpenAI released DALL·E, built on GPT-3, which can generate images accurately from a text description. DALL·E's understanding and fusion of natural language and images reached an unprecedented level; as soon as it was published, it drew praise from Andrew Ng, the father of Keras, and other big names in the field.

CLIP is a core component of DALL·E. Put simply, CLIP is a re-ranking model: it scores all of DALL·E's generated results and surfaces the good ones (a small sketch of this re-ranking idea follows below).

CLIP can play "referee" only because it has the ability to "fuse" the meanings of words and pictures.
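To make the "referee" role concrete, here is a minimal sketch of CLIP-style re-ranking, assuming the publicly released openai/clip-vit-base-patch32 checkpoint loaded through the Hugging Face transformers library. The prompt and the gen_*.png candidate files are purely illustrative stand-ins, not DALL·E's actual outputs.

```python
# Minimal CLIP re-ranking sketch: score candidate images against a text prompt
# and keep the best match. Checkpoint, prompt and file names are illustrative.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

prompt = "an armchair in the shape of an avocado"        # the text description
candidates = ["gen_0.png", "gen_1.png", "gen_2.png"]     # hypothetical generated images
images = [Image.open(path) for path in candidates]

inputs = processor(text=[prompt], images=images, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_text has shape (1, num_images): similarity of the prompt to each candidate.
scores = outputs.logits_per_text[0]
best = scores.argmax().item()
print(f"best candidate: {candidates[best]} (score {scores[best]:.2f})")
```

The same scoring loop can be reused to keep the top-k candidates instead of a single winner; only the final selection line changes.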
However, it was not clear where this ability to integrate text and images comes from.

OpenAI therefore dug into the principles and structure of the CLIP network and found its "multimodal neurons", which work in a way similar to the human brain: they can respond to the same meaning whether it appears as text or as an image.

A "modality" here refers to one of the different forms in which a thing or a process can appear. An image, for instance, usually comes with a label and a textual description, and each of these is one element of a complete understanding of the thing.

For example, whether you read the name in Chinese characters, "蜘蛛侠", or as the English word "Spiderman", you think of a superhero in a red-and-blue tight suit. And once you are familiar with the concept, even a black-and-white line drawing is immediately recognized as "Spider-Man".

The multimodal neurons in CLIP are no less capable. OpenAI found a number of neurons that are each responsible for one specific thing, including 18 animal neurons and 19 celebrity neurons. There are even neurons specialized in understanding emotions.
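As an illustration of what "responding to the same meaning in text and image" looks like in practice, here is an assumption-laden sketch, not OpenAI's published procedure: compare the vision encoder's internal activations on a photo of the character and on an image that merely contains the written name, and look for units that fire strongly for both. The checkpoint, the probed layer and the photo file name are assumptions made for the sketch.

```python
# Hedged sketch: hunt for a "Spider-Man"-like unit by finding vision-encoder
# channels that activate both for a photo and for the typed name of the concept.
import torch
from PIL import Image, ImageDraw
from transformers import CLIPVisionModel, CLIPImageProcessor

model = CLIPVisionModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-base-patch32")

photo = Image.open("spiderman_photo.png")             # hypothetical photo of the character
typographic = Image.new("RGB", (224, 224), "white")   # an image containing only the text
ImageDraw.Draw(typographic).text((20, 100), "Spider-Man", fill="black")

inputs = processor(images=[photo, typographic], return_tensors="pt")
with torch.no_grad():
    # hidden_states[-2]: activations of a late layer, shape (2, patches, hidden_dim)
    hidden = model(**inputs, output_hidden_states=True).hidden_states[-2]

acts = hidden.mean(dim=1)               # mean activation per hidden unit, per input
both = torch.minimum(acts[0], acts[1])  # units that are high for BOTH inputs
print("candidate multimodal units:", both.topk(5).indices.tolist())
```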
In fact, human beings are themselves the product of multimodal learning: we can see objects, hear sounds, feel textures, smell and taste.

To get AI out of its old, mechanical "artificial stupidity" mode of working, one approach is to let it understand multiple modalities at the same time, just as humans do. That is why some researchers believe multimodal learning is the real direction for artificial intelligence.

In practice, this is usually implemented by combining the weighted outputs of several sub-networks, so that every input modality can contribute to the learned prediction. Depending on the task, different weights are attached to the sub-networks before the combined output is predicted (a minimal sketch of this kind of weighted "late fusion" follows below).
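Here is a minimal PyTorch sketch of that idea, assuming pre-extracted image and text feature vectors; the dimensions, sub-network shapes and learnable fusion weights are illustrative choices, not a reference implementation.

```python
# Late-fusion sketch: one sub-network per modality, outputs combined with
# learnable weights so each modality contributes to the final prediction.
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    def __init__(self, image_dim=512, text_dim=300, hidden=128, num_classes=10):
        super().__init__()
        # One sub-network per modality.
        self.image_net = nn.Sequential(
            nn.Linear(image_dim, hidden), nn.ReLU(), nn.Linear(hidden, num_classes))
        self.text_net = nn.Sequential(
            nn.Linear(text_dim, hidden), nn.ReLU(), nn.Linear(hidden, num_classes))
        # Learnable fusion weights, normalised so both modalities contribute.
        self.fusion_logits = nn.Parameter(torch.zeros(2))

    def forward(self, image_feat, text_feat):
        w = torch.softmax(self.fusion_logits, dim=0)
        return w[0] * self.image_net(image_feat) + w[1] * self.text_net(text_feat)

# Usage with random features standing in for real image/text encodings.
model = LateFusionClassifier()
logits = model(torch.randn(4, 512), torch.randn(4, 300))
print(logits.shape)  # torch.Size([4, 10])
```

The fusion weights here are trained jointly with the sub-networks; for a different task one could fix them, or condition them on the input, without changing the overall structure.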