A Graph Database-Based Approach to Analyze Network Log Files

Diederichsen, Lars; Choo, Kim-Kwang Raymond; Le-Khac, Nhien-An

doi:10.1007/978-3-030-36938-5_4

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 11928)

Included in the following conference series:

International Conference on Network and System Security

Abstract

Network log files from different sources often need to be analyzed together in order to assess the severity of a cyber threat more accurately. With command-line tools, for example, each log file can be reviewed only in isolation. While a log management system allows searching across different log files, the relationship(s) between different network activities may not be easy to establish from the analysis of these different log files. Relational databases can establish these relationships, for example through complex database queries involving multiple join operations to link the tables. In recent years, there has been a trend of using graph databases to manage data for semantic queries (e.g. importing a fixed amount of log data for subsequent analysis). Hence, in this paper, we propose a new approach to analyzing network log files using a graph database. Specifically, we posit the importance of constantly monitoring log files for new entries, processing and analyzing those entries immediately, and importing the results into the graph database. To facilitate the evaluation of our proposed approach, we use the Zeek network security monitor system to produce log files from monitored network traffic in real time. We then explain how graph databases can be used to analyze network log files in near-real time within a network security monitoring environment. Findings from our research demonstrate the utility of graph data in analyzing log data.
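The monitoring step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `read_new_lines` and `poll_once` are hypothetical, the polling approach is an assumption, and log rotation is not handled. The sketch only shows how entries appended to a log file since the last poll could be picked up and handed to a processing callback.

```python
import os
import tempfile


def read_new_lines(path, offset):
    """Return (complete new lines, new offset) for data appended after `offset`."""
    with open(path, "rb") as fh:
        fh.seek(offset)
        data = fh.read()
    # Consume only up to the last newline, so that a half-written entry
    # is left in place for the next poll.
    end = data.rfind(b"\n")
    if end == -1:
        return [], offset
    complete = data[: end + 1]
    lines = complete.decode("utf-8", errors="replace").splitlines()
    return lines, offset + len(complete)


def poll_once(path, offset, handle_entry):
    """Pass each new log entry to `handle_entry` and return the new offset."""
    lines, offset = read_new_lines(path, offset)
    for line in lines:
        # Header and footer lines in Zeek's tab-separated logs start with '#'.
        if line and not line.startswith("#"):
            handle_entry(line)
    return offset


if __name__ == "__main__":
    # Demonstration on a temporary file with made-up entries.
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    entries = []
    with open(path, "ab") as fh:
        fh.write(b"#fields\tts\tuid\n1.0\tC1\n")
    offset = poll_once(path, 0, entries.append)
    with open(path, "ab") as fh:
        fh.write(b"2.0\tC2\n")
    offset = poll_once(path, offset, entries.append)
    print(entries)
    os.remove(path)
```

In a monitoring loop, `poll_once` would be called repeatedly with the offset returned by the previous call, and `handle_entry` would parse the entry and import the result into the graph database.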

References

Bejtlich, R.: The practice of network security monitoring: understanding incident detection and response. No Starch Press (2013)
MIT Lincoln Laboratory. DARPA Intrusion Detection Evaluation. http://www.ll.mit.edu/ideval/data/1999data.html. Accessed 4 June 2017
National CyberWatch Center. MACCDC - Home of National CyberWatch Mid Atlantic CCDC (2017). https://www.maccdc.org. Accessed 27 July 2017
Neise, P.: Intrusion Detection Through Relationship Analysis. SANS Institute InfoSec Reading Room (2016). https://www.sans.org/reading-room/whitepapers/detection/intrusion-detection-relationship-analysis-37352. Accessed 18 March 2017
Neo4j. Neo4j, the world’s leading graph database. https://neo4j.com/. Accessed 21 Aug 2017
Netresec. PCAP files from the US National CyberWatch Mid-Atlantic Collegiate Cyber Defense Competition (MACCDC) (2017). https://www.netresec.com/?page=MACCDC. Accessed 20 Apr 2017
Paxson, V.: Bro: a system for detecting network intruders in real-time. Comput. Netw. 31(23), 2435–2463 (1999)
Py2neo. The py2neo v3 Handbook. http://py2neo.org/v3/. Accessed 11 Mar 2017
Robinson, I., Webber, J., Eifrem, E.: Graph Databases - New Opportunities for Connected Data, 2nd edn. O’Reilly Media Inc., Sebastopol (2015)
Sanders, C., Smith, J.: Applied Network Security Monitoring: Collection, Detection, and Analysis. Elsevier (2013)
Roesch, M.: Snort: lightweight intrusion detection for networks. In: LISA, vol. 99, no. 1, pp. 229–238, November 1999
Snort - Network Intrusion Detection & Prevention System. http://www.snort.org/. Accessed 21 Aug 2017
Suricata. Suricata - Open Source IDS/IPS/NSM engine. https://suricata-ids.org/. Accessed 21 Aug 2017
Zeek.org. The Zeek Network Security Monitor. https://www.bro.org. Accessed 15 Jan 2019
Schindler, T.: Anomaly detection in log data using graph databases and machine learning to defend advanced persistent threats. In: Gesellschaft für Informatik e.V. (Hrsg.) Informatik 2017. Lecture Notes in Informatics (LNI). Gesellschaft für Informatik, Bonn (2017)
Uetz, R., Benthin, L., Hemminghaus, C., Krebs, S., Yilmaz, T.: BREACH: a framework for the simulation of cyber attacks on company’s networks. In: Digital Forensics Research Conference Europe (Poster) (2017)
Djanali, S., et al.: Coro: graph-based automatic intrusion detection system signature generator for evoting protection. J. Theor. Appl. Inf. Technol. 81(3), 535–546 (2015)

Author information

Authors and Affiliations

German Federal Police, Potsdam, Germany
Lars Diederichsen
University of Texas at San Antonio, San Antonio, USA
Kim-Kwang Raymond Choo
School of Computer Science, University College Dublin, Dublin, Ireland
Nhien-An Le-Khac

Corresponding author

Correspondence to Kim-Kwang Raymond Choo.

Editor information

Editors and Affiliations

Monash University, Clayton, VIC, Australia
Joseph K. Liu
Fujian Normal University, Fuzhou, China
Xinyi Huang

Appendix

1.1 A. conn.log [5]

(1) ts: Timestamp that represents the time when the first packet of the connection occurred,
(2) uid: Unique Identifier (UID) for the connection,
(3) id.orig_h: IP address of source host,
(4) id.orig_p: Source port,
(5) id.resp_h: IP address of destination host,
(6) id.resp_p: Destination port,
(7) proto: Transport layer protocol,
(8) service: Identification of an application protocol,
(9) duration: Duration of the connection in seconds,
(10) orig_bytes: Number of payload bytes the originator sent,
(11) resp_bytes: Number of payload bytes the responder sent,
(12) conn_state: Summary of the connection state,
(13) local_orig: Connection is originated locally,
(14) local_resp: Connection is responded locally,
(15) missed_bytes: Indicates the number of missed bytes and represents packet loss,
(16) history: State history of connections,
(17) orig_pkts: Number of packets that the originator sent,
(18) orig_ip_bytes: Number of IP level bytes that the originator sent (this is taken from the IP total length header field),
(19) resp_pkts: Number of packets that the responder sent,
(20) resp_ip_bytes: Number of IP level bytes that the responder sent,
(21) tunnel_parents: If this connection was over a recognized tunnel, this indicates the UID values for any encapsulating parent connections used over the lifetime of this inner connection.
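To illustrate how a conn.log entry with the fields above might be turned into graph data, the following Python sketch parses one tab-separated entry and prepares a parameterized Cypher statement. It is a sketch under stated assumptions, not the data model used in the paper: the `Host` label, the `CONNECTED_TO` relationship type, and the function names are hypothetical, and the column order is assumed to match the list above (a real Zeek log states its column order in the `#fields` header line).

```python
# Column order assumed to follow the appendix list; real logs declare it
# in their '#fields' header line.
CONN_FIELDS = [
    "ts", "uid", "id.orig_h", "id.orig_p", "id.resp_h", "id.resp_p",
    "proto", "service", "duration", "orig_bytes", "resp_bytes",
    "conn_state", "local_orig", "local_resp", "missed_bytes", "history",
    "orig_pkts", "orig_ip_bytes", "resp_pkts", "resp_ip_bytes",
    "tunnel_parents",
]

# Hypothetical graph model: one node per host, one relationship per connection.
CONN_QUERY = (
    "MERGE (src:Host {ip: $orig_h}) "
    "MERGE (dst:Host {ip: $resp_h}) "
    "CREATE (src)-[:CONNECTED_TO {uid: $uid, ts: $ts, proto: $proto, "
    "resp_p: $resp_p, conn_state: $conn_state}]->(dst)"
)


def parse_conn_line(line, fields=CONN_FIELDS):
    """Split one tab-separated entry into a dict; '-' marks an unset value."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(fields):
        raise ValueError(f"expected {len(fields)} fields, got {len(values)}")
    return {name: (None if value == "-" else value)
            for name, value in zip(fields, values)}


def conn_to_cypher(record):
    """Return (query, parameters) describing the connection as a graph edge."""
    params = {
        "orig_h": record["id.orig_h"],
        "resp_h": record["id.resp_h"],
        "uid": record["uid"],
        "ts": float(record["ts"]),
        "proto": record["proto"],
        "resp_p": int(record["id.resp_p"]),
        "conn_state": record["conn_state"],
    }
    return CONN_QUERY, params
```

The returned query and parameters would then be passed to whichever Neo4j client is in use. Using `MERGE` for hosts avoids duplicate nodes for the same IP address, while `CREATE` records every connection as its own relationship.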

1.2 B. dns.log [5]

(1) ts: Timestamp that represents the earliest time at which the DNS protocol message over the associated connection is observed,
(2) uid: UID of the connection over which DNS messages are being transferred,
(3) id.orig_h: IP address of source host,
(4) id.orig_p: Source port,
(5) id.resp_h: IP address of destination host,
(6) id.resp_p: Destination port,
(7) proto: Transport layer protocol,
(8) trans_id: A 16-bit identifier that is assigned by the program that generated the DNS query and that is also used in responses to match up replies to outstanding queries,
(9) rtt: Round trip time for the query response, indicating the delay between the moment that the request was seen and the moment the answer started,
(10) query: Domain name that is the subject of the DNS query,
(11) qclass: QCLASS value specifying the class of the query,
(12) qclass_name: Descriptive name for the class of the query,
(13) qtype: QTYPE value specifying the type of the query,
(14) qtype_name: Descriptive name for the type of the query,
(15) rcode: Response Code value in DNS response messages,
(16) rcode_name: Descriptive name for the response code value,
(17) AA: Authoritative Answer bit for response messages,
(18) TC: Truncation bit that specifies whether the message was truncated,
(19) RD: Recursion Desired bit that indicates in a request message whether the client wants recursive service for this query,
(20) RA: Recursion Available bit that indicates in a response message that the server supports recursive queries,
(21) Z: A reserved field that is usually “0” in queries and responses,
(22) answers: Set of resolved IP addresses and domains in the query answer,
(23) TTLs: Caching intervals of the associated resources described in the query answer,
(24) rejected: Rejected bit that indicates whether the server rejected the DNS query.
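Because dns.log carries the same uid as the connection it belongs to, DNS entries can be related to conn.log entries without inspecting packet contents. The Python sketch below shows one possible way to do this in memory. The function name `dns_edges` and the triple layout are hypothetical, the records are assumed to be dicts keyed by the field names listed above, and this is not presented as the linking method used in the paper.

```python
def dns_edges(conn_records, dns_records):
    """Relate DNS queries to their connections via the shared uid.

    Returns (source IP, queried domain, resolver IP) triples for every
    DNS entry whose uid also appears in the connection records.
    """
    conns = {record["uid"]: record for record in conn_records}
    edges = []
    for dns in dns_records:
        conn = conns.get(dns["uid"])
        # Skip entries without a matching connection or without a query name.
        if conn is None or dns.get("query") is None:
            continue
        edges.append((conn["id.orig_h"], dns["query"], conn["id.resp_h"]))
    return edges
```

In a graph database the same relationship would typically be stored once at import time, so that later queries follow an edge instead of repeating this lookup.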

1.3 C. http.log [5]

(1) ts: Timestamp for when the request happened,
(2) uid: UID for the connection,
(3) id.orig_h: IP address of source host,
(4) id.orig_p: Source port,
(5) id.resp_h: IP address of destination host,
(6) id.resp_p: Destination port,
(7) trans_depth: Number representing the pipelined depth into the connection of this request/response transaction,
(8) method: Method used in the HTTP request (e.g. GET, POST),
(9) host: HTTP Host header value,
(10) uri: URI used in the request,
(11) referrer: HTTP “referer” header,
(12) version: Version portion of the HTTP request,
(13) user_agent: HTTP User-Agent header value,
(14) request_body_len: Uncompressed content size of the data transferred from the client in bytes,
(15) response_body_len: Uncompressed content size of the data transferred from the server in bytes,
(16) status_code: HTTP status code returned by the server,
(17) status_msg: Human-readable HTTP status message,
(18) info_code: Reply code returned by the server,
(19) info_msg: Human-readable reply message,
(20) tags: Set of indicators of various attributes discovered and related to a particular request/response pair,
(21) username: HTTP Basic Authentication user name (if found),
(22) password: HTTP Basic Authentication password (if found),
(23) proxied: All of the headers that may indicate if the request was proxied,
(24) orig_fuids: List of unique file IDs (FUIDs) in the request,
(25) orig_filenames: List of filenames in the request,
(26) orig_mime_types: MIME types for request objects,
(27) resp_fuids: List of FUIDs in the response,
(28) resp_filenames: List of filenames in the response,
(29) resp_mime_types: MIME types for response objects.
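Several http.log fields (tags, proxied, and the fuid, filename, and MIME type lists) hold more than one value. The Python sketch below shows how such a field could be decoded, assuming Zeek's default tab-separated output, in which set elements are separated by `,`, an unset field is written as `-`, and an empty set is written as `(empty)`. The helper names are hypothetical, and logs written with other settings (for example JSON output) would need different handling.

```python
def parse_set_field(value, separator=",", unset="-", empty="(empty)"):
    """Decode a multi-valued Zeek TSV field into a list (or None if unset)."""
    if value == unset:
        return None
    if value == empty:
        return []
    return value.split(separator)


def response_files(record):
    """Pair each response file ID with its MIME type, where both are present."""
    fuids = parse_set_field(record["resp_fuids"]) or []
    mime_types = parse_set_field(record["resp_mime_types"]) or []
    # zip stops at the shorter list, so unmatched entries are dropped.
    return list(zip(fuids, mime_types))
```

Each resulting pair could become its own file node linked to the HTTP transaction, which keeps multi-valued fields queryable instead of storing them as one opaque string.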

Cite this paper

Diederichsen, L., Choo, KK.R., Le-Khac, NA. (2019). A Graph Database-Based Approach to Analyze Network Log Files. In: Liu, J., Huang, X. (eds) Network and System Security. NSS 2019. Lecture Notes in Computer Science, vol 11928. Springer, Cham. https://doi.org/10.1007/978-3-030-36938-5_4

DOI: https://doi.org/10.1007/978-3-030-36938-5_4
Published: 10 December 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36937-8
Online ISBN: 978-3-030-36938-5
eBook Packages: Computer Science, Computer Science (R0)
