Bayesian Network Modeling

I am feeling sick. Fever. Cough. Stuffy nose. And it’s wintertime. Do I have the flu? Likely. Plus I have muscle pain. More likely.

Bayesian networks are great for these types of inferences. We have variables, some whose values have been fixed. We are interested in the probabilities of some free variables given these fixed values.

In our example, we want the probability that we have the flu, given some symptoms we have observed, and the season we are in.

So far it looks like reasoning with conditional probabilities. Is there more to it? Yes. A lot more. Let’s scale up this example and it will become clear.

Towards A Large-scale Bayes Network

Imagine that our network models every possible symptom, every possible disease, outcomes of every possible medical test, and every possible external factor that might potentially affect the probability of some disease. External factors break down into behavioral ones (smoking, being a couch potato, eating too much), physiological ones (weight, gender, age), and others. For good measure, let’s also throw in treatments. And side-effects.

By now there is enough useful medical knowledge to capture tens of thousands of variables (at the very least) and their interactions. For any set of symptoms, together with the values of some of the behavioral, physiological, and other external factors, we could estimate the probabilities of various diseases. And more. For a given disease, we could ask it to give us the most likely symptoms. And way more. Such as: I have a cough and high fever, but flu has been ruled out; what other diseases are likely? For a given diagnosis, and our particular symptoms, and possibly additional factors such as our gender and age, we could ask it to recommend treatments.

Now we are getting somewhere. How does all this magic work? This is what we will explore here.

Connectivity

First question, where does the network come in? In modeling the interactions among the tens of thousands of variables.

Modeling all possible interactions among that-many variables is nearly impossible. It is the network that gives us a mechanism to cut through this complexity. By letting us specify which interactions to model. The aim is to seek a model that is rich enough. But not overly complex.

Speaking of interactions, how do we decide which ones to model? Typically via domain knowledge. In our case, leveraging the collective knowledge of the medical field acquired over millennia of clinical practice and research.

What would our Bayes net look like? Structurally, a giant directed graph with nodes for the various symptoms, diseases, medical tests, behavioral factors, physiological factors, and treatment options. With suitably chosen (or inferred) arcs to model significant interactions among them. Such as among specific symptoms and specific diseases.

Connectivity Refined

A Bayes network is structurally a directed graph, an acyclic one at that. Directed means that edges have a direction to them, which is why they are called arcs. Acyclic means there are no directed cycles. Here is an example of a directed cycle: A → B → C → A.

Apart from the acyclicity constraint, the modeler has full control over what nodes to connect with arcs and how to orient them. That said, in complex real-world use cases such as the one we are discussing here (medical diagnosis) there is an appealing guiding principle.

Choose arcs to model direct causes. Orient them in the direction of causality.

So if A is a direct cause of B, we would add the arc A → B. Such a network is called a causal Bayes network.

A causal network’s structure is only as accurate as its variables and the fidelity of the causal relationships. For instance, the truth might be that A causes B and B causes C. But we might not even know of B’s existence. So the best we would be able to do is to model this via the arc A → C.

Causal Modeling

Okay, so let’s think causally in the medical setting. This is what we come up with.

``
Variable Type A       causes    Variable Type B  Example
disease               causes    symptom          flu causes you to cough
behavior              causes    disease          smoking causes lung cancer
physiological factor  causes    disease          aging "causes" various diseases
treatment             "causes"  disease          chemotherapy reduces cancer
treatment             causes    side-effect      chemotherapy causes hair-loss
``

Before closing this section, let’s note that we shouldn’t worry too much about getting a few causal arcs wrong. (Of course, we prefer not to.) The consequences are not severe. In fact, we’ll likely have quite a few non-causal arcs in the network anyhow. To model correlations whose links to causation are unclear or non-existent. Indeed, the network can’t even distinguish between causal and non-causal arcs. Not in our use case.

Take this example. Say A and B are strongly correlated. Say you thought A causes B, so modeled this with the arc A → B. But you were wrong. Adding this arc is still a good thing, as it models the correlation. The next section discusses non-causal arcs in more detail.

Non-causal Arcs

Causality is a compelling guiding principle in the network’s design. However, it is not sufficient. That is, adding non-causal arcs can improve the model further.

Consider correlations among variables. Such as among a set of symptoms or a set of diseases. Causal relationships within the set may not be known or even exist. We do want to model the correlations though. So we should add suitable “non-causal” arcs.

Here is a simple example. Say there is strong belief or evidence that dry cough and irritated throat are correlated. Say these are the only two variables in the network. Connecting them with an arc in either direction will capture this correlation. Leaving the arc out will treat them as independent. We don’t want that.

The Network’s Master Equation

At some juncture, just like a picture can reveal a vista, so can math. We are at that point. So here goes.

Formally, a Bayes Network is a directed acyclic graph on n nodes. The nodes, call them X1, X2, …, Xn, model random variables. The arcs model interactions among them.

More precisely, the structure of the network factors the joint distribution over the n variables as

P(X1, X2, …, Xn) = product_i P(Xi|parents(Xi))


There is a lot to unpack here. Let’s start with: parents(Xi) is the set of nodes with arcs coming into Xi. Huh?

Let’s ease into it with simple examples. All have the same 5 nodes A, B, C, D, E.

Our first network will have no arcs. So none of the nodes will have any parents either. So

P(A,B,C,D,E) = P(A)P(B)P(C)P(D)P(E)


Our second network will be a Markov chain. Structurally, the graph is a single path A → B → C → D → E. Node A does not have any parents. Node B’s parent is A. Node C’s parent is B. Etc. So

P(A,B,C,D,E) = P(A)P(B|A)P(C|B)P(D|C)P(E|D)


Our third network is the naive Bayes classifier in which E serves as the class variable and A, B, C, and D as the predictor variables. Its graphical structure is

E → A, E → B, E → C, E → D


E has no parents. Each of A, B, C, and D has one parent: E. Accordingly


P(A,B,C,D,E) = P(A|E)P(B|E)P(C|E)P(D|E)P(E)


Readers familiar with naive Bayes classifiers will recognize the form on the right-hand side of this equation. Think of A, B, C, D as the predictors, E as the class variable.
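To make the factorization concrete, here is a minimal sketch in code. The graph shape matches the naive Bayes example above, but the variable names and probability numbers are made up for illustration; each node stores only its own conditional table.

```python
def joint_prob(assignment, parents, cpt):
    """P(X1,...,Xn) = product over nodes of P(node | parents(node))."""
    prob = 1.0
    for node, value in assignment.items():
        parent_vals = tuple(assignment[par] for par in parents[node])
        prob *= cpt[node][parent_vals][value]
    return prob

# Naive-Bayes-shaped graph E -> A, E -> B, with made-up binary tables.
parents = {"E": (), "A": ("E",), "B": ("E",)}
cpt = {
    "E": {(): {0: 0.7, 1: 0.3}},
    "A": {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.2, 1: 0.8}},
    "B": {(0,): {0: 0.6, 1: 0.4}, (1,): {0: 0.5, 1: 0.5}},
}

# P(A=1, B=0, E=1) = P(A=1|E=1) * P(B=0|E=1) * P(E=1) = 0.8 * 0.5 * 0.3
print(joint_prob({"E": 1, "A": 1, "B": 0}, parents, cpt))
```

The point of the structure is visible here: no node needs the full joint table, only a table conditioned on its own parents.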

Now we are ready for a clinical example.

Clinical Network Example: Flu and its Symptoms

Consider the network whose variables are flu, fever, cough, stuffy nose, and season. For simplicity suppose the first four are boolean (yes/no) and the fifth categorical (spring, summer, fall, winter).

Causal modeling would yield the following arcs:

``flu → fever, flu → cough, flu → stuffy nose``

To these let’s add the arc flu → season. This is not a causal arc, i.e., we could have flipped its direction. But we won’t. So that its direction is aligned with the direction of the causal arcs emanating from flu. This will be convenient for the diagnosis covered in the next section.

Interestingly, it’s not a coincidence that this network’s structure is that of the naive Bayes classifier.

Diagnosis: From Symptoms To Flu

We want the probability that we have the flu, given that we have a fever, cough, and stuffy nose, and it is wintertime. Let’s formally express this as

``P(flu = yes | fever = yes, cough = yes, stuffy nose = yes, season = winter)``

or more concisely (and a bit more generally) as

``P(flu|fever,cough,stuffy nose, season)``

To infer this, we just apply Bayes’ rule:

``
numerator(x) = P(fever|flu=x) * P(cough|flu=x) * P(stuffy nose|flu=x) *
               P(season|flu=x) * P(flu=x)

P(flu=yes | fever, cough, stuffy nose, season) =
  numerator(yes) / (numerator(yes) + numerator(no))
``

This is why this network is called a Bayesian network. The inference from symptoms to a disease involves Bayesian reasoning.
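A minimal sketch of this computation, with entirely made-up (hypothetical, not medical) numbers for the conditional tables:

```python
# Hypothetical numbers, not medical data. Each table is indexed by the
# value of flu ("yes"/"no").
p_flu    = {"yes": 0.05, "no": 0.95}   # P(flu = x), prior
p_fever  = {"yes": 0.90, "no": 0.05}   # P(fever = yes | flu = x)
p_cough  = {"yes": 0.80, "no": 0.10}   # P(cough = yes | flu = x)
p_stuffy = {"yes": 0.70, "no": 0.20}   # P(stuffy nose = yes | flu = x)
p_winter = {"yes": 0.50, "no": 0.25}   # P(season = winter | flu = x)

def numerator(x):
    """Unnormalized posterior for flu = x given the observed evidence."""
    return p_fever[x] * p_cough[x] * p_stuffy[x] * p_winter[x] * p_flu[x]

posterior = numerator("yes") / (numerator("yes") + numerator("no"))
print(f"P(flu | fever, cough, stuffy nose, winter) = {posterior:.3f}")
```

With these made-up numbers the evidence overwhelms the small prior, and the posterior for flu comes out very high.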

The “Beyond Flu” Network


We already have a prescription, so let’s execute. First, start adding nodes for additional diseases and symptoms. Second, add nodes for behaviors, physiological factors, medical tests, etc. Third, start adding more causality arcs, following the guidance given earlier. Such as

``
smoking → lung cancer
aging → disease-1, aging → disease-2, …, aging → disease-k
chemotherapy → cancer, chemotherapy → hair-loss
``

Next, start adding suitable non-causal arcs. To capture correlations among symptoms, correlations among diseases, etc.

The macrostructure of the “backbone” of such a network is below.

``
behaviors, physiological factors ⇒ diseases
treatments                       ⇒ diseases
diseases                         ⇒ symptoms
treatments                       ⇒ side-effects
tests ?
``

The terms in plural denote sets of nodes of certain types. Such as diseases. X ⇒ Y denotes a set of arcs from X to Y. This level does not reveal the heads and tails of specific arcs.

We have already discussed why the arc sets are oriented the way they are. The reason we have chosen behaviors and physiological factors to jointly influence diseases is that these two types of factors interact. For instance, the adverse effect of certain bad behavior choices on certain diseases is often higher in older people than in younger people.

The macro-parents of diseases could in fact be more elaborate. Such as

``behaviors, physiological factors, treatments ⇒ diseases``

This would model the joint interaction of all three types of factors (behaviors, physiological factors, and treatments) on diseases. That said, such a macro-level interaction would in general produce quite a complex network. So to convey the essence of the backbone, we’ll stick to our earlier macro-structure. Still, exceptions, i.e. specific triplets of (behavior, physiological factor, treatment) that influence a particular disease, can always be added in. The macro-structure is just a big-picture view, not an enforceable schema. The schema is only at the fine level, specified by the network’s arcs.

Notice we have a set of nodes, tests, which is dangling. We’ll let you ponder how this set should be connected to the rest of the network. Should we have tests → diseases, or diseases → tests, or some other arrangement?

Training the “Beyond Flu” Network

Training means estimating the various probability distributions P(Xi|parents(Xi)) of the model from data, belief, or a combination.

Training Symptom Distributions

Let’s start with learning the probability distribution of any one symptom conditioned on its parents. Let’s make a simplifying assumption that a symptom’s parents can only be diseases. For instance, parents of the symptom cough would include flu and bronchitis.

Given a symptom S and its parents pa(S), the conditional probability table capturing P(S|pa(S)) has a size exponential in the number n of diseases in pa(S). This is because in principle any subset of the n diseases in pa(S) can occur. (By “occur” we mean diagnosed in a particular visit.) There are 2^n such subsets. This can be quite large when n is large.

Three factors will collectively mitigate this issue. One is that most symptoms will not have a huge number of parents, i.e. a huge number of diseases that can cause them.

The second is that in any one instance, the diagnosed diseases will be a sparse subset of the parents. A diagnosis instance corresponds to taking a snapshot of the state of the diseases of a particular person displaying the symptom. Of all the potential diseases the symptom can appear in, a single person will almost certainly be diagnosed with at most a few, if more than one at all. This sparsity will greatly help the training. Simply put, sparsity implies “no significant higher-order interactions”. A numeric example below will illustrate this phenomenon.

The third factor is that we have some control over what we deem to include in the set of parents pa(S) of a given symptom S. If a symptom’s parent set gets especially large, we can prune away diseases that are less correlated with the symptom.

Discovering A Symptom’s Parents From Data

Which diseases should we set as the parents of a given symptom S? Previously we suggested, as a general guideline, using domain knowledge for this. In our particular case, there is a better way. Patient records will reveal which symptoms correlate with which diseases. So this aspect of the structure can also be fruitfully learned from data. The patient records capture within them the collective wisdom of lots of experts making diagnoses in varying scenarios.

The benefits of learning a symptom’s parents from the data are huge. It spares the network designer from having to acquire the domain knowledge to do this, whether via discussions with domain experts, extended readings, or some more elaborate mechanism. Even if this work were distributed over a large team of modelers and domain experts, such manual design is laborious and error-prone. There are too many symptoms and too many diseases.

That said, domain knowledge can still help fill in the gaps for situations that may not be covered by patient records, or to surface inconsistencies between belief and data. Simply put, domain-knowledge + data-driven learning is generally better than either alone.

We’ll discuss patient visit records in detail in the next section, as we will anyhow need them for learning the parameters of the network, such as the probabilities in P(S|pa(S)). Regardless of how we have arrived at the structure of pa(S).

Patient Visit Records

We’ll assume every interaction with a medical expert generates a new record, capturing the symptoms observed and the diseases diagnosed. If multiple diseases were diagnosed, the record also captures which of the observed symptoms were implicated in which disease, as deemed by the medical expert. The diagnosis may be as certain or as speculative as the expert sees fit. All we care about is that it was done by a professional.

Let’s see an example patient visit record. Made up. Not medical advice!

``
(symptoms = high fever, cough, sore throat, lump in throat; disease = flu)
(symptoms = lump in throat, chest pain; disease = gerd)
``

During this visit, two diseases were diagnosed: flu and GERD. The health expert implicated lump in throat in both.

From such a record we can derive symptom-centered representations, one for each observed symptom. Such a representation lists the diagnosed diseases implicated for that symptom during the visit. These diseases will also be referred to as the symptom’s parents in that visit record.

In our above example, lump in throat’s parents in the record are flu and GERD.

Symptom-centered representations lend themselves to learning symptom distributions.

Discovering A Symptom’s Parents

From the collection of symptom-centered representations derived from all the patient visit records we have access to, we can easily determine the symptom’s parents. These are all the diseases implicated in this data. The parents of lump in throat would be flu and GERD if all we had was the single patient visit record above to learn from.

A huge and diverse set of patient visit records may yield, for some symptoms, huge sets of parents. As mentioned earlier, we can prune such large sets by dropping parents that are less correlated with the symptom.
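The pruning step can be sketched as ranking candidate parents by their correlation with the symptom across visit records and keeping only the top few. The function below is a minimal illustration; the disease names and the toy 0/1 visit columns are made up.

```python
def corr(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = (sum((x - mx) ** 2 for x in xs) / n) ** 0.5
    sy = (sum((y - my) ** 2 for y in ys) / n) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def prune_parents(symptom, disease_cols, max_parents=2):
    """Keep only the diseases most correlated with the symptom."""
    scores = {d: abs(corr(symptom, col)) for d, col in disease_cols.items()}
    return sorted(scores, key=scores.get, reverse=True)[:max_parents]

# Made-up toy data: one 0/1 entry per patient visit record.
lump_in_throat = [1, 1, 1, 0, 0, 0]
diseases = {
    "flu":     [1, 1, 0, 0, 0, 0],
    "gerd":    [1, 0, 1, 0, 0, 0],
    "sunburn": [1, 0, 1, 0, 1, 0],   # weakly correlated; gets pruned
}
print(prune_parents(lump_in_throat, diseases))
```

In a real system one would use a larger record set and possibly a better association measure, but the shape of the computation is the same.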

Training Symptom Distributions From Patient Visit Records

We want to learn, for each symptom, its distribution conditioned on its parents. We have a symptom-centered data set available for this learning. (This was derived from patient visit records as described earlier.)

Consider any one instance in this data set. It lists a symptom, together with the diseases implicated with it during a patient visit. What it does not list is the diseases among the symptom’s parents that were not implicated. As we will see below, we need this information as well. Fortunately, we can deduce these diseases by subtracting the implicated diseases from the symptom’s parents.

Let’s see an example. Say cough’s parents are flu, pneumonia, and asthma. (In a real network this list would include a lot more diseases.) Say cough’s only implicated parent in a particular patient record is flu. From this, we can deduce that in this instance cough is not caused by pneumonia or asthma. While this deduction is not 100% certain in any single instance, repeated occurrences of this same deduction do give a good estimate of the associated conditional probabilities.

From these two pieces of information (which diseases among a symptom’s parents are implicated in a particular patient record and which are not) we will derive a training vector of the following form.

``
cough  flu  pneumonia  asthma
  1     1      0         0
``

This is easy to read. It says that, in this patient record, cough is present, and of cough’s parents, flu is diagnosed, pneumonia is not diagnosed, and asthma is not diagnosed.

Now consider a patient record whose observed list of symptoms does not include cough. We derive values for cough’s parents in this record in the same way, depending on whether each disease in this set of parents is diagnosed in that record or not.

Here is an example. Say a patient record resulted in the diagnosis

``(symptoms = shortness of breath, chest pain, wheezing; diseases = asthma)``

From this, we may derive the record

``
cough  flu  pneumonia  asthma
  0     0      0         1
``

Armed with a rich enough collection of such records, which of course will keep growing as people will keep getting sick in the foreseeable future, we can learn P(cough|parents(cough)). More broadly, the distribution for any symptom conditioned on its parents.
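The whole pipeline, from symptom-centered records to an estimate of P(cough|parents), can be sketched as below. The parent list and the visit data are made up for illustration.

```python
from collections import defaultdict

PARENTS = ["flu", "pneumonia", "asthma"]   # cough's parents in the network

# Made-up symptom-centered data: (cough observed?, diseases diagnosed).
visits = [
    (1, {"flu"}),
    (1, {"flu"}),
    (0, {"flu"}),
    (1, {"asthma"}),
    (0, {"asthma"}),
    (0, set()),
]

def to_vector(cough, diagnosed):
    """One training vector: cough's value plus a 0/1 per parent disease."""
    return (cough,) + tuple(int(d in diagnosed) for d in PARENTS)

# Estimate P(cough = 1 | parent configuration) by simple counting.
counts = defaultdict(lambda: [0, 0])       # config -> [#cough=0, #cough=1]
for cough, diagnosed in visits:
    vec = to_vector(cough, diagnosed)
    counts[vec[1:]][cough] += 1

n0, n1 = counts[(1, 0, 0)]                  # flu diagnosed, others not
print(f"P(cough | flu, no pneumonia, no asthma) ~= {n1 / (n0 + n1):.2f}")
```

Note how the absent parents get explicit zeros in the vector, exactly the deduction discussed above; with real data one would also smooth the counts for rare configurations.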

Are such training instances, looked at individually, perfect? No. The absence of a disease in a diagnosis does not mean with certainty that it is not present, now or soon. The same applies to a symptom. That said, over a larger number of training instances in diverse-enough settings, such noise should get drowned out by the signal. For example, if only 30% of the records in which flu is diagnosed also reveal cough as an observed symptom, we can infer with high confidence that flu produces cough as an observed symptom no more than half the time.

Training The Influence Of Behaviors And Physiological Factors On Diseases

Here we refine the macro-structure

``behaviors, physiological factors ⇒ diseases``

We’ll assume the needed information may also be derived from patient records.

We seek to estimate, for every disease D, the parameters of D’s distribution conditioned on its parents. The parents of D are suitable subsets of the behaviors and physiological factors. Which behaviors and which physiological factors? These could be set via domain knowledge as a lot is known about which behaviors affect which diseases. (Adversely or beneficially.) Similarly for physiological factors. Alternatively or in addition, a disease’s parents could also be inferred from data.

Let’s illustrate such training from data. Consider the following patient record

``smoker, 50 years old, male, diagnosed: lung cancer``

First, from a collection of such records we can infer lung cancer’s parents, i.e. the behaviors and physiological factors that influence its diagnosis. As with symptom distributions, we need two more types of information to estimate the distribution of lung cancer given its parents.

1. In a particular diagnosis of lung cancer, which of the parents were missing?


2. How to estimate the probability that one does not have lung cancer in the presence of some of its parents?


For 1, as in the symptoms case, the missing parents are the full set of parents minus those in this patient record. For 2, again as in the symptoms case, we derive these from patient records in which some of lung cancer’s parents occur whereas the patient is diagnosed as being free of lung cancer. An example is a smoker who does not have lung cancer. How do we decide which factors matter? Try domain knowledge.

Training The Influence Of Treatments On Diseases

We have a problem here. Our macro-structure schema had

``
behaviors, physiological factors ⇒ diseases
treatments                       ⇒ diseases
``

That is, any single disease D would have two sets of parents, one involving certain combinations of behaviors and physiological factors, and the other involving treatments. We could, of course, combine these two sets of parents into one. Doing this widely has the issues discussed earlier. That said, specific triplets of behavior, physiological factor, and treatment in the context of specific diseases may be worth including. (As was discussed earlier.)

To summarize we wouldn’t want to collapse

``
behaviors, physiological factors ⇒ diseases
treatments                       ⇒ diseases
``

into

``behaviors, physiological factors, treatments ⇒ diseases``

as a general rule.

Keeping Two Sets Of Parents Separate

So how do we keep the two sets of parents separate for a given disease D? One way is to introduce an additional variable for D (we’ll call it DI) as below.

``
behaviors, physiological factors ⇒ DI
treatments, DI                   ⇒ D
``

We can think of DI as modeling disease onset and D as modeling the disease’s next state, following one or more treatments. That said, this scheme is incapable of modeling the dynamic evolution of a disease in response to treatments. This would require D to be a parent of DI, which would violate the acyclicity constraint on a Bayes network.

Let’s see this in a specific example.

``
diet, age, gender          → heart disease-I
heart disease-I, treatment → heart disease
``

Treatments And Side-Effects

Let’s start simple. We have a node for every side-effect. We have a node for every treatment. A side-effect’s parents are all treatments that have that side-effect.

Let’s see an example.

``chemotherapy, bone marrow transplantation, … → fatigue``

What is the value of including such arcs in our network? One is that it lets us seek treatments that are both effective for a particular disease and have relatively mild side-effects.

Inferences In This Scaled Network

Let’s start by repeating our network’s macro-structure here. This helps to see what types of inferences the network lends itself to.

``
behaviors, physiological factors ⇒ diseases
treatments                       ⇒ diseases
diseases                         ⇒ symptoms
treatments                       ⇒ side-effects
tests ?
``

Now onto specific inferences. Each is followed by an explanation of how it can be made to work. In this explanation, we focus on whether and how the various probabilities involved can be computed from data or domain knowledge. The aim is to provide insights into how the structure of the network simplifies various calculations.

In practice, one may be using an inference algorithm as a black-box, which will do whatever it does behind the scenes.

What is the likelihood of getting lung cancer if I smoke, am a female, and am 75 years old?

We seek P(lung cancer | smokes, female, 75 years old).

The good news is that all the observations this inference is conditioned on are lung cancer’s parents.

The bad news is that lung cancer may have additional parents. These need to be marginalized out. Marginalization involves averaging over the various values these additional parents can take, weighted by their probabilities. As the number of such values is exponential in the number of additional parents, marginalization is a slow process. Sophisticated algorithms do exist to speed it up. Their discussion is beyond the scope of this post.

Frequently used restrictions of node distributions can be cached at the node. Think of this as attaching, to a node S, not only P(S|parents(S)) but also P(S|subset(parents(S))) for suitable subsets of parents(S). Such cached distributions may then be used as appropriate, reducing the need for on-the-fly marginalization.
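Marginalizing out the unobserved parents can be sketched as below. The parent names, the conditional table, and the priors are all made up, and for simplicity the summed-out parents are assumed independent with known priors; real inference engines handle this more carefully.

```python
from itertools import product

def marginalize(cpt, parent_names, priors, keep):
    """Collapse P(X=1 | all parents) down to P(X=1 | kept parents) by
    averaging out the remaining parents, weighted by their (assumed
    independent) prior probabilities."""
    drop = [p for p in parent_names if p not in keep]
    cached = {}
    for kept_vals in product([0, 1], repeat=len(keep)):
        total = 0.0
        for drop_vals in product([0, 1], repeat=len(drop)):
            assign = {**dict(zip(keep, kept_vals)),
                      **dict(zip(drop, drop_vals))}
            weight = 1.0
            for p, v in zip(drop, drop_vals):
                weight *= priors[p] if v == 1 else 1.0 - priors[p]
            key = tuple(assign[p] for p in parent_names)
            total += weight * cpt[key]
        cached[kept_vals] = total
    return cached

# Made-up table: P(lung cancer = 1 | smokes, other risk factor).
cpt = {(0, 0): 0.01, (0, 1): 0.05, (1, 0): 0.10, (1, 1): 0.30}
cached = marginalize(cpt, ["smokes", "other"], {"other": 0.1}, ["smokes"])
# cached[(1,)] is P(lung cancer = 1 | smokes) with "other" summed out.
```

Caching `cached` at the node is exactly the trick described above: the expensive sum over the dropped parents runs once, not on every query.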

I smoke, am a female, and am 75 years old. And I have a persistent cough. What is the likelihood I have lung cancer?

We seek P(lung cancer | smokes, female, 75 years old, persistent cough). By Bayes rule,

``
P(lung cancer | smokes, female, 75 years old, persistent cough) =
  P(smokes, female, 75 years old, persistent cough | lung cancer) * P(lung cancer) /
  P(smokes, female, 75 years old, persistent cough)
``

(We’ll explain the bold-face font later.)


Next, we leverage an important property.

A node is conditionally independent of its non-descendants given its parents.

As this is the first time we are seeing this property in this post, let’s delve into it a bit. Consider the network A → B → C. (A Markov chain.) Applying the aforementioned conditional independence property, we get that C is independent of A given B. That is, P(C|B, A) equals P(C|B). Or in other words, once we have observed B, the value of A provides no additional information towards predicting the value of C.

Applying this conditional independence property to our situation gives

``
P(smokes, female, 75 years old, persistent cough | lung cancer) =
  P(smokes, female, 75 years old | lung cancer) * P(persistent cough | lung cancer)
``

Okay, let’s now collect together all the terms in bold. These are what remain to be estimated. We have copied them below.

``
P(lung cancer)
P(smokes, female, 75 years old, persistent cough)
P(smokes, female, 75 years old | lung cancer)
P(persistent cough | lung cancer)
``

P(lung cancer) is easy to estimate from a sufficiently rich set of patient records. Some usable estimates may already exist in the public domain.

P(persistent cough|lung cancer) can also be estimated from patient records as the fraction of records diagnosed with lung cancer that have persistent cough as an observed symptom.


To estimate P(smokes, female, 75 years old, persistent cough), we’ll invoke the independence assumption. This leaves us with P(smokes), P(age), P(persistent cough), and P(female). The first three are easy to estimate from data combined with knowledge. The last one we can just set to 0.5.

As a slight digression, strictly speaking, the variables mentioned in the previous paragraph are not all entirely independent. For instance, women live longer than men so age and gender are at least mildly dependent.

Finally, we are left with P(smokes, female, 75 years old|lung cancer). Conditioning (smokes, female, 75 years old) on lung cancer makes the former three conditionally dependent. So we should avoid invoking independence if we can. If we can’t, well, it’s not the end of the world. The resulting inference is still meaningfully interpretable. Specifically, it operates as a naive Bayes classifier which predicts lung cancer from smokes, female, age, and persistent cough, treated as conditionally independent given the outcome.
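The naive Bayes fallback just described can be sketched in a few lines. All the numbers below are made up for illustration, not medical estimates.

```python
# All numbers are made up for illustration, not medical estimates.
p_cancer = {"yes": 0.001, "no": 0.999}

# P(observation | lung cancer = x), treating the observations as
# conditionally independent given the disease (the naive Bayes fallback).
likelihood = {
    "smokes":           {"yes": 0.60, "no": 0.20},
    "female":           {"yes": 0.40, "no": 0.50},
    "75 years old":     {"yes": 0.30, "no": 0.10},
    "persistent cough": {"yes": 0.70, "no": 0.05},
}

def score(x):
    """Unnormalized posterior for lung cancer = x given all observations."""
    s = p_cancer[x]
    for table in likelihood.values():
        s *= table[x]
    return s

posterior = score("yes") / (score("yes") + score("no"))
print(f"P(lung cancer | evidence) ~= {posterior:.3f}")
```

Notice that the normalizing denominator cancels out, which is why the awkward joint P(smokes, female, 75 years old, persistent cough) never needs to be estimated directly in this fallback.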

Macro Lesson

The macro lesson from the above example is that when seeking to diagnose a disease from some observed physiological factors and some observed symptoms, the physiological factors can be reasonably assumed to be independent of the symptoms given the disease. Sure older people may be more likely to exhibit certain symptoms than younger ones. However, when we additionally condition on a disease that could explain the symptom, the added influence of being old is small in comparison.

What cancer treatments have minimal side-effects?

Let’s express this in terms of a hybrid of logic and probabilities. We seek treatments T such that P(cancer|T) is high and for every side-effect SE, P(SE|T) is low. The key observation here is that in both probabilities, the variable being conditioned on is among the parents of the variable whose probability distribution we seek to compute. (In the previous sentence, if the word “variable” is causing confusion, replace it by “event”.) Thus we can leverage the network’s structure to compute what we want efficiently.
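A toy version of this hybrid query is below. The treatment names, effectiveness numbers, and side-effect probabilities are all hypothetical, standing in for quantities the network would supply.

```python
# Made-up numbers standing in for network outputs: P(disease improves | T)
# and P(SE | T) for each side-effect SE. Treatment names are hypothetical.
effectiveness = {"chemo": 0.55, "immunotherapy": 0.45, "surgery": 0.60}
side_effects = {
    "chemo":         {"fatigue": 0.80, "hair loss": 0.70},
    "immunotherapy": {"fatigue": 0.30, "hair loss": 0.05},
    "surgery":       {"fatigue": 0.50, "hair loss": 0.00},
}

def acceptable(t, min_effect=0.4, max_side_effect=0.6):
    """Effective enough for the disease, and no side-effect too likely."""
    return effectiveness[t] >= min_effect and all(
        p <= max_side_effect for p in side_effects[t].values()
    )

candidates = [t for t in effectiveness if acceptable(t)]
print(candidates)
```

The probabilistic part (looking up each P(…|T)) is cheap precisely because T is a parent of both the disease node and the side-effect nodes; the logical part is just threshold filtering on top.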

https://www.sciencedirect.com/science/article/pii/S1532046418302041

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5519723/ In this article, disease, and symptom mentions are also extracted from unstructured text such as Nurse notes. Named entity recognition (NER) techniques are useful for this purpose. (In this case, the named entities are diseases and symptoms.) Check out https://towardsdatascience.com/named-entity-recognition-in-nlp-be09139fa7b8 for more on NER.


http://www.cs.cmu.edu/~guestrin/Class/10701-S05/slides/bns-inference.pdf Insightful example here

``
flu, allergy → sinus
sinus → headache
sinus → nose
``

Read this as “flu or allergy cause sinus, sinus causes a headache, and sinus can hamper the proper functioning of your nose”.

Original author: weixin_26713521
Original source: https://blog.csdn.net/weixin_26713521/article/details/108194467
This article was reposted from the web to share knowledge. In case of infringement, please contact the blogger for removal.