
Seeing Her Rebuild the Tower of Babel

Gender bias and gender inequality in academia have a long history. Although academia is a gathering place for humanity's most cutting-edge ideas and technologies, women within it often face all kinds of unnecessary obstacles. In recent years the academic community has actively promoted equal rights and tried to eliminate these negative influences, but the road to gender equality remains a long one.

Even so, despite the many obstacles, many outstanding female scientists have made indelible contributions to the advancement of science and technology, becoming a "she" force that cannot be ignored. To tell their stories, we have launched the "See Her" series.

Once upon a time, the people of the earth spoke the same language with the same accent. One day, they decided to unite and build a tower that reached up to heaven. Unexpectedly, this alarmed God, who made human beings speak different languages so that they could no longer understand one another. The plan failed, and humanity has been scattered ever since.


The Tower of Babel | Museum Boijmans Van Beuningen

This is the story of the Tower of Babel in the Old Testament, the Bible's explanation for how so many different languages and peoples came to exist in our world. Yet from making fire by drilling wood to the invention of writing, from the advent of printing to the steam-engine revolution, from the world's first telephone to today's Internet of Everything, the history of human development reads like an epic of "defying heaven", and the driving force of science, technology, and innovation in it is beyond question. Humanity has come a long way: if we were to build another tower today, we would not have to lay bricks with our bare hands, nor be satisfied with simple machinery. Rebuilding the Tower of Babel in this era might well look like people working alongside a crew of artificial-intelligence robots.


God & iPad | La Biennale di Venezia

If the story above tells us anything, it is that communication, understanding, and collaboration matter enormously, and that empathy, one of the highest human faculties, is what builds on them and truly unites us, spontaneously, to fight for a common goal. Numerous studies suggest that women tend to have an advantage over men in empathy, and some outstanding female scientists have combined this advantage with the rigorous logic and imagination of science to become a vital force in our "tower building".

Does the machine understand what we're talking about?

To get robots to help us build the tower, we must first get them to understand what we mean. Accustomed to a simple "Hey Siri", we may not realize how complex the artificial intelligence, and the natural language processing (NLP) at its core, behind it really is, and even that is far from enough. In fact, as early as 2001, natural language processing appeared on MIT Technology Review's list of the world's top ten breakthrough technologies. But the real qualitative leap came in 2013-2014, when deep learning (itself selected as a breakthrough technology in 2013) rose to prominence and was applied to NLP, a choice that looks extremely visionary and far-sighted in hindsight.


MIT Technology Review, January 2001 | MIT Technology Review

Back in the present, as the saying goes, human joys and sorrows are not interlinked. Yang Diyi, an assistant professor in Georgia Tech's College of Computing, hopes to improve interaction between humans and machines, and among humans themselves, by developing more advanced NLP technology.

Diyi Yang currently leads the Social and Language Technologies Lab at the Georgia Institute of Technology, which combines NLP, machine learning, and the social sciences to study how humans use language in social contexts. Her work is a novel fusion of artificial-intelligence technology and social-science theory.

Early in her research career, she completed a paper titled "Who Did What: Editor Role Identification in Wikipedia" under the guidance of her advisors Robert Kraut (a pioneer of human-computer interaction at Carnegie Mellon University) and Eduard Hovy (an authority in NLP). By analyzing editing activity on the English Wikipedia to identify the roles editors play and studying how each role affects article quality, the paper helps researchers and community administrators build healthier, more prosperous communities.


Yang Diyi | Image source: Yang Diyi

In 2016, she partnered with the American Cancer Society, combining NLP and recommendation systems to analyze communication between cancer patients and doctors. Cancer patients are under great pressure when they communicate: the messages they write tend to be long, while what they actually want to convey may boil down to only a few points. Working from real data provided by the society, Yang Diyi and her team used a hierarchical attention network to classify the large volume of conversations on the society's online platform, used algorithms to highlight key content such as symptoms and needs, and then built a recommendation system to match patients seeking different kinds of help with different doctors, improving the system in both efficiency and humanistic care.

She said:

"The model must not only be able to query and match information, but also convey emotional support in an encouraging way."


Yang Diyi's paper on hierarchical attention networks | Image source: Yang Diyi
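To make the approach concrete, here is a minimal sketch in PyTorch of a hierarchical attention network for classifying messages, in the spirit of the hierarchical attention networks paper shown above. The class names, hidden sizes, the three hypothetical message categories, and the random toy batch are illustrative assumptions, not the team's actual system.

# A minimal sketch (illustrative assumptions, not the authors' code) of a
# hierarchical attention network (HAN) for classifying patient messages.
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    """Soft attention over a sequence of hidden states."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, hidden_dim)
        self.context = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, h):                                # h: (batch, seq, hidden)
        u = torch.tanh(self.proj(h))                     # (batch, seq, hidden)
        weights = torch.softmax(self.context(u), dim=1)  # (batch, seq, 1)
        return (weights * h).sum(dim=1)                  # (batch, hidden)

class HAN(nn.Module):
    def __init__(self, vocab_size, num_classes, emb_dim=100, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.word_gru = nn.GRU(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.word_attn = AttentionPool(2 * hidden)
        self.sent_gru = nn.GRU(2 * hidden, hidden, batch_first=True, bidirectional=True)
        self.sent_attn = AttentionPool(2 * hidden)
        self.classify = nn.Linear(2 * hidden, num_classes)

    def forward(self, docs):                             # docs: (batch, n_sents, n_words) word ids
        b, s, w = docs.shape
        words = self.embed(docs.view(b * s, w))          # (b*s, w, emb)
        word_h, _ = self.word_gru(words)                 # (b*s, w, 2*hidden)
        sent_vecs = self.word_attn(word_h).view(b, s, -1)  # (b, s, 2*hidden)
        sent_h, _ = self.sent_gru(sent_vecs)             # (b, s, 2*hidden)
        doc_vec = self.sent_attn(sent_h)                 # (b, 2*hidden)
        return self.classify(doc_vec)                    # (b, num_classes)

# Toy usage: 2 messages, 3 sentences each, 5 tokens per sentence, classified
# into e.g. {symptom report, emotional support request, logistics} (assumed labels).
model = HAN(vocab_size=5000, num_classes=3)
fake_batch = torch.randint(1, 5000, (2, 3, 5))
print(model(fake_batch).shape)  # torch.Size([2, 3])

The attention weights at both the word and sentence level are what let such a model highlight the salient parts of a long message, which is the property the paragraph above relies on.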

In Yang Diyi's view, language is not only grammar, syntax, and discourse; the expression and transmission of language also have a goal, namely the intention the speaker wants to achieve. Social NLP should therefore understand language at a deeper level: who is speaking, to whom, what information they want to convey, what their purpose is, and so on. She has always adhered to people-centered research and strives to build socially aware language technologies, so that NLP models can go beyond fixed data sets or corpora to derive social knowledge and common sense, reason over massive amounts of unstructured user-generated data, and reach the next stage of natural language understanding.

For all of this, Yang Diyi was selected for MIT Technology Review's 2021 "Innovators Under 35" China list in the category of "Humanistic Caregiver", a well-deserved honor. Her work undoubtedly allows machines to better understand our expressions in social contexts, and even to achieve "empathy" to some extent.

Speaking about the small proportion of women among scientific researchers, Yang Diyi said that one important reason is that the language we are used to contains many highly destructive expressions, such as "girls get good grades when they are young, but their math, physics, and chemistry will fall behind when they grow up". Behind such phrases lies a cultural devaluation of and constraint on women's abilities, which often takes generations of effort to eliminate. As a firm advocate of "technology for good", she hopes to use NLP technology to counteract the negative effects of such discrimination, prejudice, and stereotypes, and is currently studying posts on social media such as Twitter to address social issues like hate speech, covering aspects of both race and gender.


Yang Diyi's paper on racial discrimination during the pandemic | Image source: Yang Diyi

Do we really understand machines?

We build the Tower of Babel in collaboration with machines, and collaboration and communication rest on mutual understanding. As machines come to "understand" us better and better, do we, in turn, really understand machines?

At first glance this may seem counterintuitive; you might say that machines are designed and built by us, so the answer is of course yes. In fact... not necessarily, and the reason starts with artificial intelligence and machine learning. Essentially, wherever a mechanism accomplishes a function through feedback, there is artificial intelligence. Its prototypes were very simple: an early flush toilet, for example, completes the flush automatically and almost without error once you press the button. As our needs become more complex, the demands on artificial intelligence rise, and so machine learning was introduced. Suppose we want to recognize whether the object in a picture is a cat. Under the traditional machine learning approach, we would have to decompose images of cats piece by piece, identifying and labeling features such as cat ears, cat eyes, and cat paws; the workload would be enormous and plainly impractical. Hence deep learning based on convolutional neural networks, and later reinforcement learning, came into being: the features above are extracted automatically, and we only need to feed massive amounts of data (pictures of cats) into the model for training. As long as the data is plentiful enough, accuracy will be high; in other words, the machine becomes "smart".

Cat recognition is one of the earliest success stories of deep neural networks | Sohu Technology
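As a purely illustrative sketch of the idea above, the PyTorch snippet below trains a tiny convolutional network to separate "cat" from "not cat". The architecture, the 64x64 image size, and the random stand-in batch are assumptions for demonstration; a real classifier would of course be trained on large labeled image datasets.

# A minimal sketch of the "feed labeled cat pictures to a convolutional
# network" idea. Everything here is a toy assumption, not a production model.
import torch
import torch.nn as nn

class TinyCatNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, 2)  # 2 classes: cat / not-cat

    def forward(self, x):                  # x: (batch, 3, 64, 64) RGB images
        h = self.features(x)
        return self.classifier(h.flatten(1))

# Instead of hand-coding "cat ears, cat eyes, cat paws", we show the network
# many labeled images and let gradient descent discover the features.
model = TinyCatNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

images = torch.randn(8, 3, 64, 64)         # stand-in for a batch of photos
labels = torch.randint(0, 2, (8,))         # 1 = cat, 0 = not cat (toy labels)
for _ in range(3):                         # a few toy training steps
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
print(loss.item())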

Isn't that amazing? But things are not so simple. Put simply, a convolutional neural network is a complex structure designed to mimic the cognitive abilities of the human brain; at heart it is a trial-and-error mechanism that sharpens the accuracy of its decisions by acting and then receiving positive or negative feedback. However, this "action" and "feedback" form an end-to-end mechanism whose decision-making process, weighting, and influencing factors are unknown: this is the so-called "black box" of artificial neural networks, and it can lead to absurd consequences. For example, a self-driving car maker found during testing that its car was drifting to the left more and more noticeably while driving, with no obvious cause, and the developers could not explain the behavior. After months of painstaking debugging, the system architects finally found the root of the problem: the color of the sky. Because part of the training had taken place in the desert under a particular hue of sky, the neural network had quietly established a correlation between turning left and the lighting conditions. In another case, an image-classification neural network became remarkably good at identifying horses. Its designers were very proud, until they discovered the key to its efficiency: since pictures of horses are often copyrighted, the network was classifying these animals by searching for the "©" symbol. The "creativity" of such networks is beyond doubt, but it is only a matter of time before something goes wrong.

"Artificial retardation" Infoworld

We created artificial intelligence, yet we evidently do not understand how the reinforcement learning and convolutional neural networks behind it work. So how do we "open the black box" to find and avoid potential problems? Wang Mengdi, a tenured professor in the Department of Operations Research and Financial Engineering and the Department of Computer Science at Princeton University, is conducting exactly this kind of "unboxing" research, trying to uncover the concise rules behind reinforcement learning.

Wang Mengdi explained: "The core idea of cybernetics is that for a known system, whether mechanical or electrical, we can describe it completely with differential equations; at that point we can design a feedback mechanism to achieve our goals. This is cybernetics, the prehistory of artificial intelligence." Reinforcement learning likewise manipulates a system dynamically and continuously based on its state. The difference is that, for a reinforcement learning algorithm, the system to be controlled is a black-box function with no complete mathematical description, making it difficult to solve directly for the optimal strategy. During her PhD at MIT, Wang Mengdi chose a direction leaning toward mathematics and theory, systems and information theory, and starting from the classical ideas of cybernetics she combined them with the latest, most cutting-edge reinforcement learning, using her strengths in mathematics and statistics to tackle the unexplainability and poor reproducibility of the reinforcement learning "black box".
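The loop below is a minimal, purely illustrative sketch of that reinforcement-learning setting, not Wang Mengdi's methods: a tabular Q-learning agent that can only query a black-box environment for states and rewards and keeps adjusting its policy from that feedback. The tiny "corridor" environment and all hyperparameters are assumptions.

# A minimal Q-learning sketch: the agent never sees the environment's
# equations; it only acts, observes feedback, and updates its value table.
import random

N_STATES = 5            # states 0..4; reaching state 4 gives a reward
ACTIONS = [-1, +1]      # step left or step right

def step(state, action):
    """Black-box environment: we can query it, but have no equations for it."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    done = nxt == N_STATES - 1
    return nxt, reward, done

# Tabular Q-learning: learn action values purely from trial and feedback.
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.1

for episode in range(200):
    s = 0
    for _ in range(20):
        # epsilon-greedy: mostly exploit what we know, sometimes explore
        a = random.choice(ACTIONS) if random.random() < epsilon \
            else max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r, done = step(s, a)
        best_next = max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2
        if done:
            break

# After training, the greedy policy should always step right toward the goal.
print([max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)])

The learned table Q is the only record of what the agent "knows", and for deep reinforcement learning that table becomes an opaque neural network, which is precisely the black box discussed above.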


Wang Mengdi | Image source: Wang Mengdi

"Reinforcement learning is the future of artificial intelligence, which should be combined with cybernetics and statistical ideas to explore the dynamic process of a complex system with big data methods." This aspect is blank under the framework of reinforcement learning, and my job is to establish this framework. "

In 2016, Google DeepMind's AlphaGo defeated Lee Sedol, one of the top human Go players, and reinforcement learning algorithms entered the public eye for the first time. "Why do we care about games?" said Wang Mengdi, who joined DeepMind as a senior research scientist during her academic sabbatical. "In developing their own intelligence, human children learn how to make decisions through play. Artificial intelligence is at the same stage, and we will soon see it not only playing games but solving much harder problems." In high-risk areas such as biomedicine and finance, however, data is limited and the tolerance for error is extremely low; an ordinary reinforcement-learning agent cannot be allowed unlimited trial and error as it is in a game, and the "black box" nature of reinforcement learning makes it hard to control, which is also the difficulty of sim2real. Wang's work aims at "interpretable, transparent AI": it not only detects and eliminates bias, improves the accuracy and performance of models, and reduces the amount of labeled data required to train a network, but also makes it possible to apply AI in high-risk fields.

"Black Box" decrypts Alice Yang

For these outstanding contributions, Wang Mengdi was selected as a "Pioneer" on MIT Technology Review's 2018 "Innovators Under 35" China list.

At present, Wang Mengdi's research focuses on data dimensionality reduction and offline reinforcement learning, with particular attention to "efficient" exploration: collecting data at the lowest cost while retaining the most information. Her work has greatly advanced "black box transparency"; in other words, we can finally know what the AI robot is "thinking".

How do we work with machines?

Once we and the robots understand each other, the next question is how to talk and cooperate. Building the Tower of Babel is literally an act of "ascending to heaven", and we must each play to our strengths to make it possible. Fortunately, another outstanding female scientist has laid the foundation of the "tower" for us. "Machines and humans have very different capabilities," says Danqi Chen, an assistant professor of computer science at Princeton University. "We humans are good at logical reasoning and at discerning hints and subtleties in language, while machines are very good at processing massive amounts of data at scale." As one of the first researchers to apply deep learning to natural language processing (NLP), she works across the two main categories of NLP: understanding the structure of language itself and its concrete applications. She has produced important results on key problems such as syntactic parsing, knowledge graphs, information extraction, dialogue, and question-answering systems, helping machines acquire knowledge and answer questions better.

Chen Danqi | Image source: Chen Danqi

Chen Danqi has been interested in the humanities since childhood while also excelling at mathematics, and her bond with machines was formed as early as high school. In programming competitions she worked out a family of divide-and-conquer algorithms that was later widely adopted and influential, and which practitioners named "CDQ divide and conquer" after her initials. In 2012, after graduating from Tsinghua's Yao Class and heading to Stanford University, she realized that NLP sits precisely at the intersection of the humanities and mathematics, the best possible fit for her, almost a calling. Chen Danqi studied under Christopher Manning, an authority in the field of NLP, and the algorithm she developed with him later gave rise to the famous Google SyntaxNet, which has been called "the world's most accurate natural language parser".
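For readers curious about the pattern, here is a minimal sketch of the divide-and-conquer idea behind "CDQ divide and conquer", applied to the classic problem of counting inversions: split the sequence by position, solve each half recursively, then account for the left half's contribution to the right half while merging. It is an illustrative instance of the pattern under those assumptions, not her original contest solution.

# Counting inversions (pairs i < j with a[i] > a[j]) with the
# divide-and-conquer pattern: solve halves, then add cross-half contributions.
def count_inversions(a):
    def solve(lo, hi):                     # work on a[lo:hi], return inversion count
        if hi - lo <= 1:
            return 0
        mid = (lo + hi) // 2
        count = solve(lo, mid) + solve(mid, hi)
        merged, i, j = [], lo, mid
        while i < mid and j < hi:
            if a[i] <= a[j]:
                merged.append(a[i]); i += 1
            else:
                count += mid - i           # a[i..mid) all exceed a[j]: cross pairs
                merged.append(a[j]); j += 1
        merged.extend(a[i:mid]); merged.extend(a[j:hi])
        a[lo:hi] = merged                  # keep the span sorted for the next level
        return count

    return solve(0, len(a))

print(count_inversions([3, 1, 4, 1, 5]))   # 3 inversions: (3,1), (3,1), (4,1)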

Later, during her internship at FAIR, the AI research institute of Facebook (now Meta), she led the open-domain question answering project DrQA and published the paper "Reading Wikipedia to Answer Open-Domain Questions", which explains how the system answers factoid questions by retrieving from and reading Wikipedia at scale. The project shows how machines' reading and question-answering abilities can be pushed forward with the help of large-scale open external knowledge bases, and hints at the possibility of asking a machine any question and having it find the relevant information in a sea of data and organize it into answers, or even solutions, to help us make decisions.

DrQA | Meta (Facebook)
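The toy sketch below illustrates the retrieve-then-read pattern that DrQA popularized: first retrieve candidate documents, then let a "reader" extract the answer. The three-sentence corpus, the scikit-learn TF-IDF retriever, and the naive overlap-based reader are stand-in assumptions for DrQA's Wikipedia-scale retriever and neural reading-comprehension model.

# A toy retrieve-then-read pipeline (illustrative assumptions only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "The Tower of Babel story appears in the Book of Genesis.",
    "Deep learning was applied to natural language processing around 2013.",
    "DrQA reads Wikipedia to answer open-domain factoid questions.",
]

def retrieve(question, k=1):
    """Rank documents by TF-IDF cosine similarity to the question."""
    vec = TfidfVectorizer().fit(corpus + [question])
    doc_mat, q_vec = vec.transform(corpus), vec.transform([question])
    scores = cosine_similarity(q_vec, doc_mat)[0]
    return [corpus[i] for i in scores.argsort()[::-1][:k]]

def read(question, documents):
    """Stand-in reader: return the retrieved sentence with most word overlap."""
    q_words = set(question.lower().split())
    return max(documents, key=lambda d: len(q_words & set(d.lower().split())))

question = "What does DrQA read to answer questions?"
print(read(question, retrieve(question, k=2)))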

There are many similar studies, including her doctoral dissertation, "Neural Reading Comprehension and Beyond". The 156-page thesis, focused on machine reading comprehension, quickly became one of Stanford's most popular PhD dissertations of the past decade, and her advisor Christopher Manning praised it without reservation: "Her simple, clean, highly successful models attracted everyone's attention... Her thesis focused on neural-network reading comprehension and question answering, emerging technologies that are leading to better ways of accessing information." Conciseness and practicality have been the keywords running through her research.

"I'm excited about the most fundamental, simple yet practical methods. I care a lot about how to build a practical NLP system and always enjoy the process. I don't want my research to stop at a beautiful concept, but to be practically put into practical application. ”

For these contributions, Chen Danqi was selected as a "Pioneer" on MIT Technology Review's 2019 "Innovators Under 35" China list. Today she leads her own NLP group at Princeton, tackling core problems across more areas of NLP. Perhaps the most ambitious goal is to push machines' advantage in large-scale data processing further, so that through NLP they can acquire and understand all of humanity's existing knowledge on the Internet, "think" like humans, reason logically over this vast sea of knowledge, and make judgments and decisions with little or no supervision. Chen Danqi calls this "deep understanding".

Rebuild the Tower of Babel

Communication, understanding, and empathy are the cornerstones of building the "Tower of Babel" in any era. Whether between people, or between people and machines, these outstanding female scientists insist on starting from people themselves and have made indelible contributions to connecting the world. Perhaps one day we really will "reach the heavens", and what we arrive at may not be the so-called "Promised Land" but the unity of all humankind.

The Babel fish, which translates everything | The Hitchhiker's Guide to the Galaxy

The development of the world needs science, and women are an indispensable backbone of that development.

Since 1999, MIT Technology Review has selected "Innovators Under 35" (TR35) from around the world every year, one of the most authoritative award programs for young talent in science and technology. In 2017 the TR35 China selection was officially launched, and it has now been held for five years, with outstanding young female scientists selected every year.

【Registration in progress】

2022 "Under 35 Years Old Scientific and Technological Innovation 35 People" China's registration is in full swing! Young scholars, researchers, inventors, and technology entrepreneurs under the age of 35 in China (including Chinese currently overseas) are welcome to apply for the election, and also solicit nominations from all walks of life to jointly find the 35 people who are most likely to change the world.

【Inquiry E-mail】


- End -
