The Military-Industrial Scientific Research System of the Academic Master

Chapter 758 Manifold Learning

The question Yao Mengna had raised was not hard for Chang Haonan to understand.

It was just hard to solve.

To be honest, it touched on a whole series of fields: text mining, data visualization, information retrieval, data mining, machine learning, and even artificial intelligence.

If fully automated production were achieved as Yao Mengna envisioned, it would amount to Industry 4.0.

In 1999, that was obviously not very realistic.

But the fact that the whole package could not be realized did not mean that no part of it could serve as a breakthrough point.

Data mining and information retrieval, for example, were very popular research directions around the turn of the millennium.

Their core purpose was to extract valuable knowledge from massive databases and large volumes of complex information, and so make better use of that information.

In fact, before Chang Haonan was reborn, the field of aircraft design and manufacturing had already begun to apply this technology, and he himself had been exposed to a good deal of it.

But as an ordinary technician with an engineering background, he had not had much of a theoretical foundation.

And the system first required him to put together a complete and feasible idea.

As a result, his head was full of terminology, but he did not know which term was the key to breaking the deadlock.

In effect, he was now facing the very dilemma of being unable to extract valuable information from a large amount of complex information.

"Information..."

Chang Haonan pulled over a sheet of paper and wrote a single word in the middle of it.

In an idealized model, it would be best if a single datum could accurately and uniquely describe one meaning.

That is, one-dimensional data.

The word problems in elementary and middle school are generally like this.

Most problems in everyday life are of this kind as well.

In slightly more complicated situations, a set of data is often needed to fully describe one meaning.

But at the same time, that set of data can often describe more than just that one meaning.

To describe mathematically the phenomenon of one (or more) sets of data corresponding to multiple meanings, the data has to be expanded along different dimensions.

That is the direction from mathematical theory to reality.

Conversely, the information collected in reality is, in most cases, already expanded high-dimensional data.

And if a computer was to process such high-dimensional data...

Chang Haonan thought for a while, then wrote down three basic conditions on the paper:

1. Compress the original high-dimensional data to reduce its dimension, thereby saving storage space and lowering computational complexity.

2. Eliminate, or at least reduce, the noise hidden in the original high-dimensional data.

3. Extract high-quality data features to improve the effectiveness of subsequent data representation and classification tasks.

He went through these three items in his mind and tried to get the system to give a result.

No response.

Obviously, this did not yet count as a "complete and feasible" idea.

...

Before he knew it, Chang Haonan had been sitting at his desk until it was almost lunchtime.

He still had not come up with a good idea.

It was a rumble from his stomach that finally pulled him out of his thoughts.

He really was a little hungry.

Yao Mengna looked at the single word and three sentences on the paper, realized that Chang Haonan probably had not gotten anywhere, and simply stood up and said:

"Why don't we go eat first?"

"Okay."

Chang Haonan was not the kind of person to fixate stubbornly on a dead end.

Besides, mathematics was not something he could crack by sheer brooding.

Without inspiration, nothing else would help.

Better to relax and clear his head.

Fifteen minutes later, the three of them (Zhu Yadan included) were sitting around a round table on the second floor of the cafeteria.

This was a small made-to-order kitchen. Prices were a little higher than in the main cafeteria downstairs, and it meant climbing an extra flight of stairs, so not many people came here to eat.

The small supermarket next to it, however, had a steady stream of people coming and going.

A steaming bowl of mutton noodle soup sat in front of Chang Haonan, but he was in no hurry to pick up his chopsticks. Instead, he stared at the people going up and down the stairs not far away.

In the 1990s, instant noodles were still a very popular convenience food.

When Chang Haonan was an undergraduate, most students were poor, and few had the spare money for them.

But by 1999, it was not uncommon for college students to keep a few packs, or even a whole box, in their dormitories.

"You say..."

Chang Haonan suddenly said:

"How do instant noodle companies make sure they never leave out a seasoning packet or put in an extra one?"

Yao Mengna, who had been eating with her head down, was stunned for a moment, then realized that Chang Haonan was still thinking about the question she had raised earlier.

Putting seasoning packets into instant noodles and riveting an airplane were actually similar as mathematical models.

And instant noodle manufacturers were obviously unlikely to have very high-end equipment and technology.

"Maybe... by weighing?"

Yao Mengna guessed:

"The seasoning packets account for about 10% of the weight of a whole pack. If one is missing or there is one too many, it should be easy to detect."

"Well... but the weight of the noodle block itself varies, and there are several types of seasoning packet. Weighing can only show that the total is right. It cannot guarantee that the right packets are in there..."

Chang Haonan shook his head, rejecting the idea.

Zhu Yadan, sitting beside them, looked at Chang Haonan on her left and Yao Mengna on her right. She really had no idea why the two of them had suddenly started discussing this.

"Um..."

Although she felt she was out of her depth in front of two PhDs, in the end she could not help speaking up:

"Before the packaging step, wouldn't it be enough to have someone stand by the assembly line and watch?"

Yao Mengna pressed a hand to her forehead:

"What we're trying to figure out is exactly how to get the same result without that person."

"This..."

Zhu Yadan immediately ducked her head:

"I was just saying it casually... But sometimes the human brain may still be irreplaceable..."

The table was quiet again, with only occasional faint chewing sounds.

But Chang Haonan still didn't move his chopsticks.

"You're right."

A few minutes later, when Zhu Yadan had almost finished the fried noodles on the plate in front of her, Chang Haonan suddenly said:

"The human brain can analyze high-dimensional data in some way to gain perception of the outside world."

"?"

Zhu Yadan looked up, her head full of question marks, but seeing that Chang Haonan was deep in thought, she knew better than to interrupt.

"In other words, high-dimensional information from the outside world must conceal a nonlinear manifold structure that lives in a low-dimensional space..."

Nearly 70 years earlier, the American statistician Harold Hotelling had proposed principal component analysis as a method for reducing the dimension of high-dimensional data.

His premise was that the larger the variance of a component, the more information it carries, and the smaller the variance, the less. Principal components with large variance and high information content are therefore constructed as linear combinations of the original components, and the data dimension is then reduced by means of a singular value decomposition of the matrix.

However, principal component analysis only amounts to finding the best linear mapping in the sense of minimum projection distance, and few problems in reality are simply linear.

Still, the idea could serve as a reference.
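
As a rough sketch of the linear reduction that principal component analysis performs, the Python snippet below centers a data matrix, takes its singular value decomposition, and keeps the directions of largest variance. The function name `pca_reduce`, the toy data, and the use of NumPy are illustrative assumptions and not anything taken from Chang Haonan's notes.

```python
import numpy as np

def pca_reduce(X, d):
    """Project n samples of dimension D onto the d directions of largest variance."""
    mean = X.mean(axis=0)
    Xc = X - mean                                  # center every feature
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:d].T                                   # D x d projection matrix
    Y = Xc @ W                                     # n x d low-dimensional coordinates
    X_hat = Y @ W.T + mean                         # linear reconstruction in D dimensions
    explained = (S[:d] ** 2).sum() / (S ** 2).sum()  # share of variance kept
    return Y, X_hat, explained

# Hypothetical toy data: 200 points close to a straight line in 3-D, plus small noise.
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
X = t @ np.array([[2.0, 1.0, -1.0]]) + 0.01 * rng.normal(size=(200, 3))

Y, X_hat, explained = pca_reduce(X, d=1)
print(Y.shape, round(float(explained), 4))
```

Because the toy points lie almost on a line, one component recovers them well. Data bent along a curve would not reconstruct this cleanly from a linear projection, which is the limitation noted above.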

Chang Haonan put down the mutton noodle soup, of which he had eaten only a mouthful, stood up, and walked quickly out of the cafeteria.

Zhu Yadan, who was responsible for his security, hurried to catch up.

Yao Mengna was a little slower to react. Just as she was about to get up, she realized that they had not paid yet, so she had no choice but to take out her wallet and walk over to the cashier.

Back in the office, Chang Haonan found the sheet of paper again.

He wrote a few more lines below the three basic conditions.

Given a set of high-dimensional data $X = \{x_1, x_2, \dots, x_n\} \subset \mathbb{R}^D$, where $n$ is the number of data samples and $D$ is the dimension of the high-dimensional data.

Assume that the data samples in $X$ come from, or approximately come from, the data $Y = \{y_1, y_2, \dots, y_n\} \subset \mathbb{R}^d$ in a low-dimensional embedding space.

Find a mapping $f$ from the high-dimensional observation space to the low-dimensional embedding space such that $y_i = f(x_i)$, together with a one-to-one reconstruction mapping $f^{-1}$ such that $x_i = f^{-1}(y_i)$.

At this point, a satisfied smile appeared on Chang Haonan's face.

Although he still had not arrived at a complete idea, he had at least turned the three abstract basic conditions into a concrete mathematical problem.

In theoretical research, stating the question clearly is almost half the road to success.

With that thought, he returned to the top of the paper and wrote three more words.

Manifold learning method.
