A Graph Attribute Aggregation Method based on Feature Engineering

  • Original Contribution
Journal of The Institution of Engineers (India): Series B

Abstract

In social network analysis and knowledge graphs, semi-supervised learning algorithms based on graph convolutional networks (GCNs) are widely used. Most of these algorithms improve the structure of the neural network or the sampling method used at each layer, but pay little attention to data pre-processing. In the input data, words of differing quality are unevenly distributed, which may obscure useful data and highlight irrelevant data. To test this hypothesis, the paper proposes a feature matrix compression algorithm (FMC algorithm) for the data pre-processing stage of GCN-based algorithms. The algorithm analyzes the word columns of the input matrix (the features of the graph), arranges them by word frequency, and then merges the words with lower frequency, so as to emphasize the role of these words in the graph and reduce the data scale. The experiments use four mainstream datasets in the field and several representative algorithms of different types. The results show that the FMC algorithm achieves better performance.
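The abstract gives only the outline of the pre-processing step (sort word columns by frequency, then merge the low-frequency ones), so the following is a minimal sketch of that idea rather than the authors' implementation. The function name `compress_features`, the `min_freq` threshold, and the choice to sum all rare columns into a single aggregated column are assumptions made for illustration; the paper may group or weight the merged words differently.

```python
import numpy as np


def compress_features(X, min_freq):
    """Sketch of frequency-based column merging (assumed merge rule).

    X        : (num_nodes, num_words) bag-of-words feature matrix.
    min_freq : hypothetical threshold; words occurring fewer times than
               this across all nodes are treated as low-frequency.
    """
    X = np.asarray(X, dtype=float)

    # Frequency of each word = total occurrences over all nodes.
    freq = X.sum(axis=0)

    # Arrange word columns from most to least frequent.
    order = np.argsort(-freq, kind="stable")
    X_sorted = X[:, order]
    freq_sorted = freq[order]

    # Split into frequent columns (kept as-is) and rare columns (merged).
    keep = freq_sorted >= min_freq
    kept = X_sorted[:, keep]
    rare = X_sorted[:, ~keep]

    if rare.shape[1] == 0:
        return kept  # nothing to merge

    # Assumption: rare words are merged by summing them into one column.
    merged = rare.sum(axis=1, keepdims=True)
    return np.hstack([kept, merged])


# Toy example: 4 nodes, 4 words with frequencies 3, 2, 1, 1.
X_toy = np.array([
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
])
X_small = compress_features(X_toy, min_freq=2)
print(X_small.shape)
```

With `min_freq=2`, the two words that each occur once are folded into a single column, so the toy matrix shrinks from four feature columns to three while every node keeps a signal for its rare words.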

Acknowledgements

The authors wish to thank the anonymous reviewers for their detailed feedback and suggestions for improving this work.

Funding

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61272209, 61872164), in part by the Program of Science and Technology Development Plan of Jilin Province of China under Grant 20190302032GX, and in part by the Fundamental Research Funds for the Central Universities (Jilin University).

Author information

Corresponding author

Correspondence to Ming-Hui Sun.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, H., Dong, LY., Ma, XT. et al. A Graph Attribute Aggregation Method based on Feature Engineering. J. Inst. Eng. India Ser. B 103, 711–719 (2022). https://doi.org/10.1007/s40031-021-00698-z

  • DOI: https://doi.org/10.1007/s40031-021-00698-z
