
Article

Neural symbolic reasoning with knowledge graphs: Knowledge extraction, relational reasoning, and inconsistency checking

Huajun Chen, Shumin Deng, Wen Zhang, Zezhong Xu, Juan Li, Evgeny Kharlamov

College of Computer Science, Zhejiang University, Hangzhou 310027, China; Hangzhou Innovation Center, Zhejiang University, Hangzhou, China; AZFT Joint Lab for Knowledge Engine, Hangzhou 311121, China; Bosch Center for Artificial Intelligence, Robert Bosch GmbH, Germany; SIRIUS Research Centre, University of Oslo, Oslo N-0316, Norway

ARTICLE INFO

ABSTRACT

Knowledge graphs (KGs) express relationships between entity pairs, and many real-life problems can be formulated as knowledge graph reasoning (KGR). Conventional approaches to KGR have achieved promising performance but still have limitations. Most methods focus only on one phase of the KG lifecycle, such as KG completion or refinement, while ignoring reasoning over other stages, such as KG extraction. On the other hand, traditional KGR methods, broadly categorized as symbolic and neural, are unable to balance scalability and interpretability. In this study, we take a comprehensive view of KGR with regard to the whole KG lifecycle, including KG extraction, completion, and refinement, which correspond to three subtasks: knowledge extraction, relational reasoning, and inconsistency checking. In addition, we propose the implementation of KGR using a novel neural symbolic framework, with regard to both scalability and interpretability. Experimental results demonstrate that our proposed methods outperform traditional neural symbolic models.

Article history: Received 25 April 2021; Received in revised form 12 August 2021; Accepted 20 August 2021; Available online 9 September 2021

Keywords: Knowledge graph; Neural symbolic reasoning; Inconsistency checking; Knowledge extraction; Relational reasoning

1. Introduction

Knowledge graphs (KGs) are collections of facts that express relationships between entity pairs or assign types to entities, in the form of (subject, predicate, object) or (h, r, t). Many large KGs, including Freebase [1], Wikidata [2], AliMeKG [3], and OpenKG, have been built in recent years and have been used for a broad range of important applications, such as question answering [4], knowledge-based language models [5], and recommender systems [6]. Reasoning with KGs, or KG reasoning (KGR), revolves around the lifecycle of a KG, including KG extraction, KG completion, and KG refinement. KG extraction corresponds to knowledge extraction, which aims to establish structured KGs from unstructured corpora [7]. KG completion corresponds to relational reasoning, which focuses on inferring new knowledge based on existing knowledge to complete KGs [8]. KG refinement corresponds to inconsistency checking, which aims to detect noise in KGs and clean up KGs. We consider that the three stages of KGR follow the same general neural symbolic integration framework, as shown in Fig. 1: symbolic facts and rules/axioms can be embedded into neural space, and the computation in neural space can help reasoning tasks in KGs, either by inferring new triples or by detecting noise, implicitly or explicitly.

Conventional approaches to KGR can be broadly classified into two main categories: symbolic methods based on logical inference and neural methods based on vector space representations. Symbolic methods reason over KGs using logical rules or ontologies; they are generally precise and interpretable, thereby providing valuable insight into inference results. For example, traditional reasoners such as Pellet [9] and HermiT reason over KGs based on a predefined ontology, NELL [10] uses a collection of restricted Horn clause rules to infer new belief triples, and AMIE [11] considers the problem of learning models composed of a collection of first-order logical rules. Although symbolic methods perform well in terms of precision and interpretability, they require manually defining the reasoning logic, thus lacking scalability. Neural methods learn latent representations of KG entities and relations in continuous vector space, called embeddings. Thus, neural methods that perform embedding-based reasoning are more powerful when there are a large number of relations or triples. Neural methods such as TransE [8], HolE [12], and ComplEx [13] can preserve the information and semantics in KGs, including the existence of triples and the similarities between entities. However, although neural methods are good at scalability, they lack interpretability, and moreover their inference results cannot be explained.

Fig. 1. Overview of the general neural symbolic integration framework with three substructures for the three subtasks in the KG lifecycle. Symbolic KG and Symbolic Rules/Axioms respectively store symbolic facts and rules/axioms in Symbolic Space, and Neural KG and Neural Rules/Axioms respectively embed these symbols in Neural Space. KG Extraction via NSKE (§2), KG Completion via TripletGNN (§3), and KG Refinement via K-NAN (§4) respectively integrate some of these components.

To overcome the drawbacks of symbolic and neural methods, we propose neural symbolic methods for KGR, which integrate the advantages of logical inference and neural embedding, focusing on both interpretability and scalability. Previous neural symbolic models such as NeuralLP [14] learn numerical rules under the framework of differentiable rule learning; models such as IterE [15] simultaneously learn rules and embeddings based on iterative learning; and models such as JOIE [16] jointly learn KG embeddings of instances and ontological concepts. However, most of them only focus on one phase in the lifecycle of KGs, such as KG completion or refinement, while ignoring other important KGR subtasks such as KG extraction. In this study, we propose to perform neural symbolic reasoning with KGs over the entire lifecycle of KGs, which corresponds to three subtasks: (1) neural symbolic reasoning for knowledge extraction, (2) neural symbolic relational reasoning, and (3) neural symbolic inconsistency checking. Our contributions can be summarized as follows:

• We take a more comprehensive perspective of neural symbolic reasoning with KGs and consider the entire lifecycle of KGs, including KG extraction, KG completion, and KG refinement.
• We propose a novel neural symbolic framework with three substructures for KG reasoning, which integrates the interpretability of symbolic methods and the scalability of neural methods for three subtasks: knowledge extraction (§2), relational reasoning (§3), and inconsistency checking (§4).
• We explore datasets focused on neural symbolic reasoning with KGs, and the experimental results on all subtasks demonstrate that our proposed neural symbolic framework achieves better performance than the baselines.

2. Neural symbolic reasoning for knowledge extraction

In this section, we introduce neural symbolic reasoning in KG extraction, corresponding to the substructure for knowledge extraction in our proposed neural symbolic framework, as shown in Fig. 1: the Symbolic KG and Symbolic Rules/Axioms are embedded into neural space, and the deduced facts are added back to the Symbolic KG.

2.1. Task formulation

2.2. Model and method


Knowledge Extraction. In this study, we consider the knowledge extraction (KE) tasks of relation extraction (RE) and event extraction (EE). Suppose that we have a relation class set R = {r_i | i ∈ [1, N]}, an event class set E = {e_i | i ∈ [1, N]}, and a corpus T = {X_j | j ∈ [1, M]} that contains M instances. Each instance X in T is denoted as a token sequence X = (x_l | l ∈ [1, L]) with a maximum of L tokens. Our goal is to predict the relation and event labels for each instance in the corresponding corpus. Traditional approaches to KE are mostly based on neural networks [17-24] and ignore the correlation knowledge of relation and event classes.

Neural Symbolic Reasoning for Knowledge Extraction. Neural symbolic reasoning utilizes correlation knowledge between the relation classes in R or the event classes in E for KE. We utilize semantic connections among relations for neural symbolic reasoning in RE, including implicit semantic connections with KG embeddings, e.g., the relation place_lived is more relevant to nationality than to profession, and explicit semantic connections with rule learning, e.g., a rule such as located_in(x, y) ∧ next_to_body_of_water(x, z) ⟹ basin_country_of(z, y) (Eq. (2)). We also utilize multi-faceted event correlations for neural symbolic reasoning in EE, including event temporal and causal relations.

In our proposed neural symbolic framework, a general substructure called NSKE is devised to implement neural symbolic reasoning for knowledge extraction, including three modules: (1) feature representation, (2) rule learning, and (3) correlation inference. Fig. 2 shows the key concepts of the three modules.

Feature representation aims at capturing the syntactic features of each instance, encoding the input tokens of X to obtain its contextual representation in feature space. Rule learning aims to map the class sets R and E into a semantic space

or a product of two matrices. The normalized truth value F of a grounding g can be calculated by:

F_p(g) = 1 − (‖M_l − M_r‖_F − F_min) / (F_max − F_min)    (3)

where ‖·‖_F denotes the Frobenius norm, the subscript p denotes one of the three object properties, and F_max and F_min are the maximum and minimum Frobenius norm scores, respectively. F ∈ [0, 1] is the truth value for grounding g, and a higher F indicates that g is more likely to be valid.
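As a concrete illustration, the normalized truth value above can be sketched in a few lines (a minimal sketch, not the authors' code; `f_min` and `f_max` stand for the minimum and maximum Frobenius scores observed over all groundings):

```python
import numpy as np

def truth_value(m_left, m_right, f_min, f_max):
    # Frobenius distance between the matrices on the left and right sides
    # of the rule; a smaller distance means the grounding is more valid.
    score = np.linalg.norm(m_left - m_right, ord="fro")
    # min-max normalize into [0, 1] and invert so that 1 means "valid"
    return 1.0 - (score - f_min) / (f_max - f_min)

M = np.eye(3)
print(truth_value(M, M, 0.0, 10.0))  # identical matrices -> 1.0
```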

Fig. 2. Overview of our proposed NSKE.

and establish the correlations among the classes. Correlation inference seeks to infer new class correlations based on existing relation correlations C^r or event correlations C^e.

3. Neural symbolic relational reasoning

In this section, we introduce neural symbolic reasoning in KG completion, corresponding to the substructure for relational reasoning in our proposed neural symbolic framework. As seen in Fig. 1, the Symbolic KG and Symbolic Rules/Axioms are embedded into the Neural KG and Neural Rules/Axioms, over which relational reasoning infers new facts.

2.2.1. Feature representation

Given a token sequence X = (x_1, …, x_L), we use an instance encoder to obtain the contextual representation of X. Note that the instance encoder is pluggable; for example, it can be a pre-trained BERT [25], where the token embedding of [CLS] is treated as the contextual representation of X. Furthermore, it can also be replaced by other models such as PCNN [26] and JRNN [27]. In addition, the entity pair for RE and the trigger for EE will be marked.
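To illustrate the pluggable-encoder interface described above, here is a minimal sketch in which a toy bag-of-embeddings encoder stands in for BERT/PCNN/JRNN (all names and the encoder itself are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

class ToyEncoder:
    """Stand-in for a pluggable instance encoder. Any encoder exposing
    encode(tokens) -> vector could be swapped in (e.g. a BERT [CLS]
    embedding); here we use the mean of fixed random token embeddings."""
    def __init__(self, vocab, dim=8, seed=0):
        rng = np.random.default_rng(seed)
        self.emb = {tok: rng.normal(size=dim) for tok in vocab}
        self.dim = dim

    def encode(self, tokens):
        # contextual representation of the whole instance
        return np.mean([self.emb[t] for t in tokens], axis=0)

enc = ToyEncoder(vocab=["Bill", "born", "in", "Seattle"])
x = enc.encode(["Bill", "born", "in", "Seattle"])
print(x.shape)  # (8,)
```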

3.1. Problem formulation

2.2.2. Rule learning

Relational Reasoning. As one of the most important tasks in KGs, its goal is to infer potential relations between entities or missing entities. Addressing this issue has been an important topic, and many approaches have been proposed. Symbolic methods provide a transparent reasoning process over KGs but lack the ability to generalize to unseen examples. Recently, several studies have viewed this as a link prediction problem and attempted to solve it with network embedding approaches, although these approaches are difficult to explain.

We then project X, R, and E into the semantic space. The semantic representation of X is regarded as the embedding of a possible corresponding class, denoted by P_X = S(X). The semantic projector S(·) is pluggable; e.g., it can be DeViSE [28] with a linear function or ConSE [29] with a convex combination.

The class embedding P for R and E is calculated using a rule-guided class encoder. In KGs, logic rules show connections between relations or events. They take the form body ⟹ head, where head is a binary atom and body is a conjunction of binary and unary atoms, such as the rule spouse(x, y) ∧ father(y, z) ⟹ mother(x, z). For RE, we adopt typical rule mining methods such as AMIE [11] to generate rules from structural KGs. For EE, we adopt event-event relation recognition models such as TCR [30] to detect event correlation rules. The class embedding P_i is denoted by:

Neural Symbolic Relational Reasoning. We focus on combining the two reasoning methods based on symbols and neural networks. Given a KG G = {E, P, F, A}, with E, P, F, and A as the sets of entities, properties, facts, and axioms, Table 1 shows the axioms and their semantics used. In this study, we aim to devise a neural axiomatic reasoning framework that not only learns embeddings from facts but also utilizes ontological axioms in a unified and model-agnostic manner, without an explicit deductive reasoning process or ad-hoc model design for different axioms.

P_i = ∑_{k=1}^{K} conf(Rule_k) · E(Rule_k)    (1)

where Rule_k is the k-th rule among the top-K rules with the highest confidence scores, conf(Rule_k) represents the confidence of Rule_k, and E(Rule_k) is the embedding of Rule_k, which can be obtained via KG embedding models such as DistMult [31].
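A minimal sketch of this rule-guided class encoding, assuming the class embedding is a confidence-weighted sum over the top-K rules (in the paper, the rule embeddings would come from a KG embedding model such as DistMult; the toy vectors below are illustrative):

```python
import numpy as np

def class_embedding(rules, k=2):
    # keep the top-K rules by confidence and sum conf * embedding
    top = sorted(rules, key=lambda t: t[0], reverse=True)[:k]
    return sum(conf * emb for conf, emb in top)

# (confidence, rule embedding) pairs; the lowest-confidence rule is dropped
rules = [(0.9, np.array([1.0, 0.0])),
         (0.5, np.array([0.0, 1.0])),
         (0.1, np.array([1.0, 1.0]))]
print(class_embedding(rules))  # [0.9 0.5]
```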

3.2. Model and method

There are two principal challenges in learning embeddings with mixed utilization of facts and axioms. The first challenge is axiom injection: as axioms and facts are in different forms, it is difficult to inject axiomatic logic into embeddings learned from facts, which are typically represented as directed labeled entity graphs. The second challenge is model-agnosticism for axiom utilization: as the utilization of axioms varies, we would need to design ad-hoc models for different axioms, which is tedious because there are many types of axioms.

2.2.3. Correlation inference

Given the correlation rules among relation or event types, we perform inference based on rule groundings to infer new correlation triples, which can be generalized in the following form:

(h′, r′, t′) ⟸ (h_1, r_1, t_1) ∧ ⋯ ∧ (h_n, r_n, t_n)    (2)

where the right-side triples (h_k, r_k, t_k) ∈ C with k ∈ [1, n] already exist in the KG, and (h′, r′, t′) ∉ G is a newly inferred triple to be added.
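The correlation-inference step can be sketched as simple rule application over a set of triples. This toy version handles only a two-atom chain rule and is an illustrative assumption about the general form above, not the paper's implementation:

```python
def apply_rule(kg, body, head):
    """If both body patterns are grounded in the KG with a shared
    middle variable, emit the head triple (only if it is new)."""
    inferred = set()
    for (h, r, t) in kg:
        if r == body[0]:
            for (h2, r2, t2) in kg:
                if r2 == body[1] and h2 == t:
                    inferred.add((h, head, t2))
    return inferred - kg

# hypothetical facts: spouse(x, y) and father(y, z) imply mother(x, z)
kg = {("alice", "spouse", "bob"), ("bob", "father", "carol")}
print(apply_rule(kg, ("spouse", "father"), "mother"))
# {('alice', 'mother', 'carol')}
```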

In our proposed neural symbolic framework, a substructure called TripletGNN was devised to implement neural symbolic relational reasoning. First, we propose a triplet graph that represents a KG with facts and axioms as an undirected unlabeled graph, in contrast with the conventional directed labeled entity graph, together with a novel way to store triplet graphs instead of using adjacency matrices, to better adapt to the inductive graph prediction task for KG completion. Second, to tackle the second challenge, we propose a method based on a graph neural network [32], enhanced with a novel node feature interaction module in which an interaction matrix is calculated to distinguish the reason for linkage.

To compute the truth value of a grounding, we select three object properties (OP) of relations defined in OWL2: subOP, inverseOP, and transitiveOP, and then learn the matrices of relations under the linear map assumption [15].

Assuming that M_l and M_r denote the relations on the left and right sides of Eq. (2), respectively, they are matrices either from a single matrix

Table 1
Six OWL2 axioms, their semantics in terms of first-order logic, and the corresponding triples.

OPDomain(p, c) | ∀x, y: OPDomain(p, c) and p(x, y) implies x ∈ c | (p, domain, c)
OPRange(p, c) | ∀x, y: OPRange(p, c) and p(x, y) implies y ∈ c | (p, range, c)
subOPOf(p_i, p_j) | ∀x, y: subOPOf(p_i, p_j) and p_i(x, y) implies p_j(x, y) | (p_i, subPropertyOf, p_j)
EquivalentOP(p_i, p_j) | ∀x, y: EquivalentOP(p_i, p_j) and p_i(x, y) implies p_j(x, y), and vice versa | (p_i, equivalentPropertyOf, p_j)
InverseOP(p_i, p_j) | ∀x, y: InverseOP(p_i, p_j) and p_i(x, y) implies p_j(y, x) | (p_i, inverseOf, p_j)
SymmetricOP(p) | ∀x, y: SymmetricOP(p) and p(x, y) implies p(y, x) | (p, symmetric, p)

To utilize the interaction matrix, we learn a transformation tensor T ∈ R^{18×18}. Because the shape of the interaction matrix is 3 × 6 (the three feature embeddings of the target triplet interacted with three feature embeddings and three structure embeddings of the neighbor), after being flattened and before transformation, the shape of the interaction matrix is 1 × 18. To keep the shape of the interaction matrix unchanged, so that it can be reshaped back to 3 × 6 after being transformed, the shape of the transformation tensor T has to be 18 × 18. We then transform I as I′ = reshape_{3×6}(flatten_{1×18}(I) × T). In the transformed interaction matrix I′, each cell indicates the importance of the corresponding vector from the neighbor triplet to the corresponding element of the target triplet. We then generate the structure embedding of τ by aggregating the neighbor feature vectors weighted by I′.
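The flatten, transform, and reshape step described above can be sketched as follows (a minimal illustration; in practice T would be learned, whereas the identity below is chosen only so the result is easy to check):

```python
import numpy as np

def transform_interaction(I, T):
    """Flatten the 3x6 interaction matrix to 1x18, multiply by the
    learned 18x18 tensor T, and reshape back to 3x6, preserving the
    interaction-matrix shape."""
    assert I.shape == (3, 6) and T.shape == (18, 18)
    return (I.reshape(1, 18) @ T).reshape(3, 6)

I = np.arange(18, dtype=float).reshape(3, 6)
T = np.eye(18)  # identity transform leaves I unchanged
I_prime = transform_interaction(I, T)
print(np.allclose(I_prime, I))  # True
```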

3.2.1. Triplet graph

Construction. First, we add each fact f ∈ F and axiom α ∈ A as a node into a triplet graph, with their elements (e.g., head, relation, tail) as node features. Then we add edges based on the element-sharing principle. Specifically, for a triplet pair (τ_i, τ_j) with i, j ∈ [0, n−1] and i ≠ j, if e is an element of both τ_i and τ_j, we add an undirected edge between them.

The intuition is that two triplets are related if they have common elements that they both depict. In the context of KGs, unlike the line graph constructed from the entity graph, the triplet graph takes axioms into consideration, which cannot be represented in an entity graph.
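The element-sharing construction can be sketched in a few lines (a toy illustration with hypothetical facts, not the paper's code):

```python
from itertools import combinations

def build_triplet_graph(triplets):
    """Each fact/axiom triplet becomes a node; an undirected edge links
    two triplets that share at least one element."""
    edges = set()
    for (i, a), (j, b) in combinations(enumerate(triplets), 2):
        if set(a) & set(b):
            edges.add((i, j))
    return edges

facts = [("paris", "capital_of", "france"),
         ("france", "located_in", "europe"),
         ("tokyo", "capital_of", "japan")]
print(build_triplet_graph(facts))
# (0,1) share "france"; (0,2) share "capital_of"; (1,2) share nothing
```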

Storage. We apply an alternative method to automatically and dynamically build a local triplet graph for a given triplet. We assign a unique id to each triplet and element, including entities, properties, and axiom types, in the ranges [0, n−1] and [0, m−1], where n and m are the numbers of triplets and elements, respectively. We store two key pieces of information for a triplet graph:

Score and Loss Function. After obtaining t^s_n for each sampled neighbor n ∈ N_k(τ), the final representation for τ is

(4)

(7)

where t2e is the triplet-element mapping matrix and e2t is the element-triplet mapping matrix, e2t ∈ R^{m×b}, with each cell being the id of a triplet containing e; b is a hyperparameter.

With t, we apply the score function and loss function from KG embedding methods to calculate the score of τ, which indicates its truth value. The score functions are as follows:

Walking. Walks on the triplet graph based on a target triplet τ = (e_x, e_y, e_z), where x, y, and z are the ids of the first, second, and third elements respectively, can be obtained as follows:

f(h, r, t) = Re(h^⊤ W_r t̄)    (8)

where W_r is a diagonal matrix with the values of r on the diagonal, and t̄ is the conjugate of t. During training, we apply a cross-entropy loss function with logistic activation.

N(τ) = e2t_x ∘ e2t_y ∘ e2t_z    (5)

where e2t_x is the x-th row of the e2t matrix and x ∘ y denotes the concatenation of x and y; thus N(τ) ∈ R^{3b}. Hence, a triplet graph of a target fact can be built on the fly given its elements.
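A minimal sketch of this on-the-fly neighbor lookup, assuming e2t maps each element id to the (at most b) ids of triplets containing that element (a dict stands in for the matrix here):

```python
def neighbors(e2t, triplet_elements, b=2):
    """Concatenate, for each of the three element ids of the target
    triplet, the ids of up to b triplets containing that element."""
    out = []
    for e in triplet_elements:
        out.extend(e2t.get(e, [])[:b])
    return out

# element id -> ids of triplets containing it (the e2t mapping)
e2t = {0: [0, 1], 1: [0], 2: [1, 2]}
print(neighbors(e2t, (0, 1, 2)))  # [0, 1, 0, 1, 2]
```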

4. Neural symbolic inconsistency checking

3.2.2. TripletGNN

In this section, we introduce neural symbolic reasoning in KG refinement, corresponding to the substructure for inconsistency checking in our proposed neural symbolic framework, as shown in Fig. 1: the Symbolic KG and Symbolic Rules/Axioms are embedded into the Neural KG and Neural Rules/Axioms for inconsistency checking.

We propose a graph neural network-based reasoning framework to utilize and learn from axioms and facts. Each triplet node has a node feature embedding and a node structure embedding, which are used to predict a target triplet. For each element e in the triplet graph, we learn an embedding e ∈ R^d. For each triplet τ = (e_x, e_y, e_z), we concatenate e_x, e_y, and e_z in order to form the feature embedding t^f ∈ R^{3d} and learn a node structure embedding t^s ∈ R^f for τ. Given a target triplet τ = (e_x, e_y, e_z), we first obtain its neighbor triplets N(τ) based on the walking function defined in Eq. (5). A sample of k of them is then denoted as N_k(τ), where |N_k(τ)| = k. The reason we choose k neighbors rather than all of the neighbor nodes is that there might be too many neighbors for a specific τ in some special cases, and by sampling we can reduce the training complexity and storage space.

4.1. Task formulation

Inconsistency Checking. It has been proposed to detect inconsistent knowledge, and the goal is to detect incorrect triples in KGs as well as the specific inconsistencies in these triples. Previous studies mainly focused on statistics-based methods [33,34] that exploit statistical distributions, or logic-based methods [35-37] where ontologies are utilized.

Triplet Feature Interaction Module. To distinguish the connection types between two triplets, we consider feature interaction. An overview of the interaction module is shown in Fig. 3, where for each sampled neighbor triplet n ∈ N_k(τ), we compute an interaction matrix I based on vector cosine similarity:

Neural Symbolic Inconsistency Checking. In this study, we use neural axiom networks and KG embedding methods for inconsistency checking. Given a noisy KG G = {(h, r, t)} ∪ {(h′, r′, t′)}, where the triples (h, r, t) ∈ G are correct and (h′, r′, t′) is an incorrect triple, we design a framework that can model both the structural information of triples and the axiom information implied in triples. The axioms considered include domain, range, disjoint, irreflexive, and asymmetric. Our aim is to detect erroneous triples that would otherwise be added to G.

I_{ij} = cos(t^f_i, [n^f ∘ n^s]_j),  I ∈ R^{3×6}    (6)

where t^f is the node feature embedding of τ, and n^f and n^s are the node feature and structure embeddings of the neighbor n.
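A minimal sketch of this interaction matrix, assuming each of the three target feature vectors is compared by cosine similarity against the neighbor's three feature and three structure vectors (dimensions here are illustrative):

```python
import numpy as np

def interaction_matrix(t_feat, n_feat, n_struct):
    """Cosine similarity between the target triplet's three feature
    vectors and the neighbor's six vectors (3 feature + 3 structure),
    giving a 3x6 interaction matrix."""
    cols = np.vstack([n_feat, n_struct])  # 6 x d
    I = np.zeros((3, 6))
    for i, u in enumerate(t_feat):
        for j, v in enumerate(cols):
            I[i, j] = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    return I

rng = np.random.default_rng(0)
t_feat = rng.normal(size=(3, 4))
I = interaction_matrix(t_feat, rng.normal(size=(3, 4)), rng.normal(size=(3, 4)))
print(I.shape)  # (3, 6)
```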

4.2. Model and method

4.2.1. Knowledge graph embedding

4.2.2. Neural axiom network

Fig. 4. Framework of our proposed substructure of K-NAN.

In our proposed neural symbolic framework, a substructure called K-NAN was devised to implement neural symbolic inconsistency checking. K-NAN consists of a KG embedding module and a neural axiom network with five inconsistency axiom modules, as shown in Fig. 4. For a triple (h, r, t), we use h_c, t_c, h_s, and t_s to respectively represent the type (c) and semantic (s) embeddings of h and t. r_s, r_d, and r_g respectively denote the semantic embedding of r and the subject and object type embeddings expected by r.

We select TransE [8] for KG embedding, as it is the simplest translation-based model, treating relations as translation operations between subject and object entities. The fitness of a triple is calculated as follows:

f(h, r, t) = ‖h + r − t‖_{L1/L2}

where ‖·‖_{L1} and ‖·‖_{L2} denote the L1 and L2 norms, respectively. A smaller value indicates higher fitness for a triple.
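The TransE fitness can be sketched directly (toy two-dimensional embeddings for illustration):

```python
import numpy as np

def transe_fitness(h, r, t, norm=1):
    """TransE fitness ||h + r - t||: a smaller value means the relation
    better 'translates' the head to the tail."""
    return np.linalg.norm(h + r - t, ord=norm)

h = np.array([0.0, 1.0])
r = np.array([1.0, 0.0])
t = np.array([1.0, 1.0])
print(transe_fitness(h, r, t))  # 0.0 -> perfect fit
```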

It is intuitive that a correct triple is also a logically consistent triple, namely, the triple satisfies the domain, range, disjoint, irreflexive, and asymmetric axioms.

Domain axiom consistency measures the compatibility between h_c and r_d. To obtain a more accurate representation of r_d, we consider the relations of the subject entity, R(h) = {r_i | (h, r_i, e) ∈ G}, where e denotes any entity. We introduce an attention mechanism and generate r̂_d based on the relations in R(h). For each relation r_i ∈ R(h), an attention weight a_i is calculated, and softmax is applied over the weights:

where r_i denotes the i-th relation of h. Then r̂_d and the likelihood P_d of (h, r) satisfying the domain axiom are defined as:
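A plausible sketch of this attention-weighted aggregation, assuming dot-product scoring between r_d and each relation embedding of the subject entity (the scoring function is an assumption; the paper's exact form may differ):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attended_domain_embedding(r_d, relations):
    """Score each relation of the subject entity against the expected
    domain embedding r_d, softmax the scores, and return the
    attention-weighted combination."""
    scores = np.array([r_d @ r_i for r_i in relations])
    weights = softmax(scores)
    return weights @ np.vstack(relations)

r_d = np.array([1.0, 0.0])
rels = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
r_hat = attended_domain_embedding(r_d, rels)
print(r_hat.shape)  # (2,)
```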

Fig. 3. Triplet feature interaction module.

Range axiom consistency is assessed on the object entity in a similar way, using t_c and r_g. As in the domain axiom consistency, we devise an attention mechanism. We denote the relations connected to t as R(t) = {r_i | (e, r_i, t) ∈ G} and generate r̂_g based on all relations in R(t). The attention weight a_k and its softmax-normalized value q_k are calculated by:

where k denotes the k-th relation among the relations connected to the object entity. The generated embedding r̂_g and the compatibility probability P_g for satisfaction of the range axiom are defined as:

Disjoint axiom consistency uses the closeness of the semantic embeddings of two relations with the same subject and object entities. For each triple (h, r, t), we adopt the idea from TransE, which holds that t − h ≈ r. Therefore, (t_s − h_s) is regarded as the general representation of the relation with h and t as the subject and object entities. The disjoint axiom can thus be simplified to calculating the compatibility score of (t_s − h_s) and r_s, which is defined as:

Irreflexive axiom consistency focuses on whether r is irreflexive and whether h = t. A triple (h, r, t) is irreflexive-inconsistent if and only if r is irreflexive and h = t. We first determine whether h = t; when h = t, we then consider the irreflexive property of the relation. To determine whether r is irreflexive, we apply a feed-forward layer; besides, the type constraint is also considered. The probability that a triple satisfies the irreflexive axiom is defined as:

(9)

where W and b are parameters.

Asymmetric axiom consistency focuses on whether r is asymmetric, namely whether (h, r, t) and (t, r, h) both exist. If r is asymmetric and (h, r, t) and (t, r, h) appear simultaneously, an asymmetric inconsistency arises. We apply a feed-forward layer to judge the asymmetric property of the relation, similar to that defined for the irreflexive axiom. Meanwhile, we use TransE to check whether (t, r, h) exists. The asymmetric axiom network can be defined as follows:

where σ denotes the sigmoid function.

(10)

4.2.3. Fusion

The overall score for each triple is defined as:

where E_kge is the score of the KG embedding module, and the remaining terms are derived from the probabilities of the five axiom modules.
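A minimal sketch of one plausible fusion rule, combining the KG-embedding fitness with penalties from the five axiom probabilities (the combination shown is an illustrative assumption, not the paper's exact formula):

```python
def overall_score(e_kge, axiom_probs):
    """Combine the KG-embedding fitness with the average penalty
    (1 - P) from the five axiom-consistency probabilities: domain,
    range, disjoint, irreflexive, asymmetric. Higher score = more
    likely erroneous."""
    penalties = [1.0 - p for p in axiom_probs]
    return e_kge + sum(penalties) / len(penalties)

# perfectly consistent triple: no axiom penalty is added
print(overall_score(0.2, [1.0, 1.0, 1.0, 1.0, 1.0]))  # 0.2
```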

