[Paper Close Reading] Semi-Supervised Classification with Graph Convolutional Networks

news 2025/7/3 19:30:01

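Paper: [1609.02907] Semi-Supervised Classification with Graph Convolutional Networks (arxiv.org)

Code: GitHub - tkipf/gcn: Implementation of Graph Convolutional Networks in TensorFlow

The English here was typed entirely by hand! It is a summarizing and paraphrasing of the original paper. Spelling and grammar mistakes are hard to avoid completely, so corrections in the comments are welcome! This post leans toward personal notes, so read with caution!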

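1. TL;DR

1.1. Takeaways

(1) Right from the opening I could not tell what the paper was saying; the exposition does not feel very clear?

(2) The mathematical derivation is very clear

1.2. Paper framework diagram

2. Section-by-section reading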

2.1. Abstract

①Their convolutional architecture is motivated by a localized first-order approximation of spectral graph convolutions

②The hidden layers learn representations that encode both node features and local graph structure

2.2. Introduction

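①Classical graph-based semi-supervised learning smooths label information over the graph by adding an explicit Laplacian regularization term to the loss function. The authors argue that its underlying assumption (connected nodes are likely to share the same label) can restrict modeling capacity:

$\mathcal{L}=\mathcal{L}_0+\lambda\mathcal{L}_{\text{reg}},\quad\text{with}\quad\mathcal{L}_{\text{reg}}=\sum_{i,j}A_{ij}\|f(X_i)-f(X_j)\|^2=f(X)^\top\Delta f(X)$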

where $\mathcal{L}_0$ represents the supervised loss on the labeled part of the graph,

$f(\cdot)$ is a differentiable function, e.g. a neural network,

$\lambda$ denotes a weighting factor,

$X$ denotes the matrix of node feature vectors $X_i$,

$\Delta=D-A$ represents the unnormalized graph Laplacian,

$A$ is the adjacency matrix,

$D$ is the degree matrix, $D_{ii}=\sum_{j}A_{ij}$.

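②The proposed model $f(X,A)$ is trained with the supervised loss $\mathcal{L}_0$ on the labeled nodes only; because $f$ is conditioned on the adjacency matrix $A$, it can still learn representations for both labeled and unlabeled nodes

③In the paper's experiments, GCN outperforms related methods in both classification accuracy and efficiency (measured in wall-clock time)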
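As a quick numerical check of the Laplacian regularizer in ①, here is a minimal NumPy sketch on a made-up 4-node path graph. The adjacency matrix `A`, the outputs `F`, and both function names are illustrative assumptions, not data or code from the paper. Note that summing over ordered pairs $(i,j)$ counts every undirected edge twice, so the pairwise sum equals twice the quadratic form, and for vector-valued outputs the quadratic form is read as a trace. The constant factor can be absorbed into $\lambda$.

```python
import numpy as np

def laplacian_reg_pairwise(A, F):
    """Sum over ordered pairs (i, j) of A_ij * ||F_i - F_j||^2."""
    diff = F[:, None, :] - F[None, :, :]           # shape (N, N, d)
    return float(np.sum(A * np.sum(diff ** 2, axis=-1)))

def laplacian_reg_quadratic(A, F):
    """tr(F^T (D - A) F) with the unnormalized Laplacian D - A."""
    D = np.diag(A.sum(axis=1))
    return float(np.trace(F.T @ (D - A) @ F))

# Hypothetical toy graph: undirected path 0-1-2-3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Hypothetical model outputs f(X), one 2-d row per node
F = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0],
              [0.0, 2.0]])

pairwise = laplacian_reg_pairwise(A, F)
quadratic = laplacian_reg_quadratic(A, F)
print(pairwise, quadratic)
```

The regularizer is small when neighbouring nodes have similar outputs, which is exactly the smoothness assumption that the authors want to avoid hard-coding.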

2.3. Fast approximate convolutions on graphs

①Layer-wise propagation rule of a multi-layer GCN (undirected graph):

$H^{(l+1)}=\sigma\Big(\tilde{D}^{-\frac{1}{2}}\tilde{A}\tilde{D}^{-\frac{1}{2}}H^{(l)}W^{(l)}\Big)$

where $\tilde{A}=A+I_{N}$ denotes the adjacency matrix with added self-connections,

$I_{N}$ denotes the identity matrix,

$\tilde{D}$ denotes the degree matrix of $\tilde{A}$, i.e. $\tilde{D}_{ii}=\sum_{j}\tilde{A}_{ij}$,

$W^{(l)}$ represents the trainable weight matrix of the $l$ -th layer,

$H^{(l)}$ denotes the activation matrix of the $l$ -th layer, with $H^{(0)}=X$,

$\sigma(\cdot)$ represents an activation function, e.g. ReLU
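The propagation rule above can be sketched in a few lines of dense NumPy. This is only a minimal illustration under stated assumptions: the helper names `normalize_adjacency` and `gcn_layer`, the toy graph, and the random weights are all hypothetical, and the official tkipf/gcn repository uses TensorFlow with sparse matrices instead.

```python
import numpy as np

def normalize_adjacency(A):
    """Return D~^{-1/2} (A + I_N) D~^{-1/2} for a dense adjacency matrix A."""
    A_tilde = A + np.eye(A.shape[0])             # add self-connections
    d_tilde = A_tilde.sum(axis=1)                # degrees of A~ (always >= 1)
    d_inv_sqrt = 1.0 / np.sqrt(d_tilde)
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_hat, H, W, activation=lambda z: np.maximum(z, 0.0)):
    """One step H^{(l+1)} = sigma(A_hat H^{(l)} W^{(l)}); ReLU by default."""
    return activation(A_hat @ H @ W)

# Hypothetical toy input: path graph 0-1-2-3, 3 input features, 2 hidden units
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H0 = rng.normal(size=(4, 3))                     # H^{(0)} = X
W0 = rng.normal(size=(3, 2))                     # random stand-in for trained weights

A_hat = normalize_adjacency(A)
H1 = gcn_layer(A_hat, H0, W0)
print(H1.shape)
```

Since `A_hat` depends only on the graph, it can be computed once and reused by every layer.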

2.3.1. Spectral graph convolutions

①Spectral convolution on a graph of a signal $x\in\mathbb{R}^{N}$ (one scalar per node) with a filter $g_{\theta}$ :

$g_{\theta }\star x=Ug_{\theta }U^{T}x$

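where the filter $g_{\theta }=\mathrm{diag}(\theta)$ is parameterized by $\theta\in\mathbb{R}^{N}$ in the Fourier domain,

$U$ is the matrix of eigenvectors of the normalized graph Laplacian $L=I_{N}-D^{-\frac12}AD^{-\frac12}=U\Lambda U^{T}$ , so that $U^{T}x$ is the graph Fourier transform of $x$ ,

$\Lambda$ denotes the diagonal matrix of eigenvalues of $L$ ; $g_{\theta}$ can be understood as a function of these eigenvalues, i.e. $g_{\theta}(\Lambda)$ .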

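②However, multiplying with the eigenvector matrix $U$ costs $O(N^{2})$ , and computing the eigendecomposition of $L$ in the first place can be prohibitively expensive for large graphs. Ergo, $g_{\theta}(\Lambda)$ is approximated by a truncated expansion in Chebyshev polynomials $T_{k}(x)$ up to $K$ -th order: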

$g_{\theta^{\prime}}(\Lambda)\approx\sum_{k=0}^K\theta_k^{\prime}T_k(\tilde{\Lambda})$

where $\tilde{\Lambda}=\frac{2}{\lambda_{\max}}\Lambda-I_{N}$ is the rescaled eigenvalue matrix and $\lambda_{\max}$ is the largest eigenvalue of $L$ ,

${\theta }'\in\mathbb{R}^{K+1}$ denotes the vector of Chebyshev coefficients,

the Chebyshev polynomials are defined recursively as $T_{k}(x)=2xT_{k-1}(x)-T_{k-2}(x)$ with base cases $T_{0}(x)=1$ and $T_{1}(x)=x$

③Substituting this back into the definition of the convolution gives:

$g_{\theta'}\star x\approx\sum_{k=0}^{K}\theta'_kT_k(\tilde{L})x$

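where $\tilde{L}=\frac{2}{\lambda_{\max}}L-I_{N}$ ; the expression follows from $(U\Lambda U^\top)^k=U\Lambda^kU^\top$ .

④The expression is $K$ -localized: as a $K$ -th order polynomial in the Laplacian, it only depends on nodes at most $K$ steps away from the central node. Through this approximation, the time complexity is reduced from $O(N^{2})$ to $O(|\mathcal{E}|)$ , i.e. linear in the number of edges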
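To make steps ① to ④ concrete, the following NumPy sketch applies the same Chebyshev filter in two ways: with the recurrence on $\tilde{L}$ (matrix-vector products only) and in the eigenbasis via $U g(\tilde{\Lambda}) U^{T}x$ . The function names, the 6-node path graph, and the coefficients `theta` are assumptions chosen for illustration. Both routes evaluate the same polynomial, so the sketch checks the identity $(U\Lambda U^\top)^k=U\Lambda^kU^\top$ and the $K$ -localization, not the approximation quality for an arbitrary filter.

```python
import numpy as np

def normalized_laplacian(A):
    """L = I_N - D^{-1/2} A D^{-1/2}; assumes the graph has no isolated nodes."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return np.eye(A.shape[0]) - A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def chebyshev_filter(L, x, theta, lam_max):
    """Apply sum_k theta_k T_k(L~) x using the Chebyshev recurrence."""
    L_tilde = (2.0 / lam_max) * L - np.eye(L.shape[0])
    t_prev = x                                   # T_0(L~) x = x
    out = theta[0] * t_prev
    if len(theta) > 1:
        t_curr = L_tilde @ x                     # T_1(L~) x = L~ x
        out = out + theta[1] * t_curr
        for k in range(2, len(theta)):
            t_next = 2.0 * (L_tilde @ t_curr) - t_prev
            out = out + theta[k] * t_next
            t_prev, t_curr = t_curr, t_next
    return out

def spectral_filter(L, x, theta, lam_max):
    """Same filter in the eigenbasis: U g(Lambda~) U^T x (needs eigendecomposition)."""
    lam, U = np.linalg.eigh(L)
    lam_tilde = (2.0 / lam_max) * lam - 1.0
    g = np.polynomial.chebyshev.chebval(lam_tilde, theta)
    return U @ (g * (U.T @ x))

# Hypothetical toy setup: undirected path on 6 nodes, impulse at node 0, K = 2
A = np.diag(np.ones(5), 1) + np.diag(np.ones(5), -1)
L = normalized_laplacian(A)
lam_max = np.linalg.eigvalsh(L).max()
theta = np.array([0.5, -0.3, 0.2])
x = np.zeros(6)
x[0] = 1.0

y_cheb = chebyshev_filter(L, x, theta, lam_max)
y_spec = spectral_filter(L, x, theta, lam_max)
print(np.round(y_cheb, 4))
```

With an impulse at node 0 and $K=2$ , only nodes 0, 1 and 2 receive a non-zero response. The dense matrices here are only for readability; the linear cost in the number of edges relies on a sparse $\tilde{L}$ .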

2.3.2. Layer-wise linear model

①The authors stack multiple convolutional layers of the form in 2.3.1. ③, each followed by a point-wise non-linearity, and limit every layer to $K=1$ with the approximation $\lambda _{\max}\approx 2$

②With these choices, 2.3.1. ③ simplifies to:

$g_{\theta'}\star x\approx\theta'_0x+\theta'_1\left(L-I_N\right)x=\theta'_0x-\theta'_1D^{-\frac{1}{2}}AD^{-\frac{1}{2}}x$

where $\theta'_0$ and $\theta'_1$ are two free parameters shared over the whole graph

③Nevertheless, more parameters increase the risk of overfitting. To constrain the number of parameters, the authors reduce the expression to:

$g_\theta\star x\approx\theta\left(I_N+D^{-\frac{1}{2}}AD^{-\frac{1}{2}}\right)x$

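where they define a single parameter $\theta=\theta_0^{\prime}=-\theta_1^{\prime}$ ,

the eigenvalues of $I_{N}+D^{-\frac{1}{2}}AD^{-\frac{1}{2}}$ lie in $[0,2]$ .

Repeatedly applying this operator in a deep network may therefore cause exploding/vanishing gradients or numerical instabilities.

④To alleviate this, they apply the renormalization trick $I_{N}+D^{-\frac{1}{2}}AD^{-\frac{1}{2}}\rightarrow\tilde{D}^{-\frac{1}{2}}\tilde{A}\tilde{D}^{-\frac{1}{2}}$ , with $\tilde{A}=A+I_{N}$ and $\tilde{D}_{ii}=\sum_{j}\tilde{A}_{ij}$

⑤Generalizing to a signal $X\in\mathbb{R}^{N\times C}$ , the convolved signal matrix $Z\in\mathbb{R}^{N\times F}$ is: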

$Z=\tilde{D}^{-\frac12}\tilde{A}\tilde{D}^{-\frac12}X\Theta$

where $C$ denotes the number of input channels, namely the feature dimensionality of each node,

$F$ denotes the number of filters or feature maps,

$\Theta\in\mathbb{R}^{C\times F}$ represents the matrix of filter parameters
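The effect of the renormalization trick in ③ to ⑤ can be inspected numerically. The sketch below is a small illustration under assumptions (the helper `sym_norm`, the toy path graph, and the random `X` and `Theta` are all hypothetical): it compares the spectrum of $I_{N}+D^{-\frac{1}{2}}AD^{-\frac{1}{2}}$ with that of $\tilde{D}^{-\frac{1}{2}}\tilde{A}\tilde{D}^{-\frac{1}{2}}$ and then forms $Z$ .

```python
import numpy as np

def sym_norm(M):
    """D^{-1/2} M D^{-1/2}, where D holds the row sums of M on its diagonal."""
    d_inv_sqrt = 1.0 / np.sqrt(M.sum(axis=1))
    return M * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# Hypothetical toy graph: undirected path on 5 nodes
A = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
N = A.shape[0]

first_order = np.eye(N) + sym_norm(A)            # I_N + D^{-1/2} A D^{-1/2}
renormalized = sym_norm(A + np.eye(N))           # D~^{-1/2} A~ D~^{-1/2}

eig_first = np.linalg.eigvalsh(first_order)      # sorted ascending
eig_renorm = np.linalg.eigvalsh(renormalized)
print("first-order  :", np.round(eig_first, 3))
print("renormalized :", np.round(eig_renorm, 3))

# Convolved signal Z = D~^{-1/2} A~ D~^{-1/2} X Theta
rng = np.random.default_rng(0)
C, F = 3, 2                                      # input channels, number of filters
X = rng.normal(size=(N, C))
Theta = rng.normal(size=(C, F))                  # random stand-in for learned filters
Z = renormalized @ X @ Theta
print(Z.shape)
```

In this toy example the largest eigenvalue drops from 2 to 1 after renormalization, so repeated application no longer amplifies the signal.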

2.4. Semi-supervised node classification

2.4.1. Example

2.4.2. Implementation

2.5. Related work

2.5.1. Graph-based semi-supervised learning

2.5.2. Neural networks on graphs

2.6. Experiments

2.6.1. Datasets

2.6.2. Experimental set-up

2.6.3. Baselines

2.7. Results

2.7.1. Semi-supervised node classification

2.7.2. Evaluation of propagation model

2.7.3. Training time per epoch

2.8. Discussion

2.8.1. Semi-supervised model

2.8.2. Limitations and future work

2.9. Conclusion

3. Supplementary knowledge

4. Reference List

Kipf, T. & Welling, M. (2017) 'Semi-Supervised Classification with Graph Convolutional Networks', ICLR 2017, doi: https://doi.org/10.48550/arXiv.1609.02907
