
Chapter 3: Logstash Data Analysis

Logstash collects, processes, and outputs logs through a pipeline, much like the *NIX pipeline xxx | ccc | ddd, where the output of xxx feeds ccc, whose output in turn feeds ddd.
A Logstash pipeline consists of three stages:
input --> filter (optional) --> output


Each stage is implemented by plugins, such as file, elasticsearch, and redis.
Each stage can also use several plugins at once; for example, output can write to Elasticsearch and to stdout (the console) at the same time.

Logstash supports multiple inputs and multiple outputs, as the sketch below illustrates.
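As a minimal hedged sketch of this (the port and index name here are illustrative, not from the original setup), a single pipeline can read from stdin and a TCP port while writing to both stdout and Elasticsearch:

input {
  stdin {}
  tcp {
    port => 9999                       # illustrative port
  }
}
output {
  stdout {}
  elasticsearch {
    hosts => ["192.168.19.101:9200"]
    index => "demo-%{+YYYY.MM.dd}"     # illustrative index name
  }
}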

ELFK architecture diagram: (figure not reproduced here)

1. Basic Logstash Deployment

  1. Install the software
[root@host3 ~]# yum install logstash --enablerepo=es -y    # a repo that is only needed occasionally can stay disabled and be enabled on demand
[root@host3 ~]# ln -sv /usr/share/logstash/bin/logstash /usr/local/bin/    # symlink so the command can be run directly
"/usr/local/bin/logstash" -> "/usr/share/logstash/bin/logstash"
  2. Create the first configuration file
[root@host3 ~]# vim 01-stdin-stdout.conf
input {
  stdin {}
}
output {
  stdout {}
}
  3. Test the configuration file (-t validates the configuration and exits)
[root@host3 ~]# logstash -tf 01-stdin-stdout.conf
Using bundled JDK: /usr/share/logstash/jdk
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[INFO ] 2022-09-15 21:49:37.109 [main] runner - Starting Logstash {"logstash.version"=>"7.17.6", "jruby.version"=>"jruby 9.2.20.1 (2.5.8) 2021-11-30 2a2962fbd1 OpenJDK 64-Bit Server VM 11.0.16+8 on 11.0.16+8 +indy +jit [linux-x86_64]"}
[INFO ] 2022-09-15 21:49:37.115 [main] runner - JVM bootstrap flags: [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djdk.io.File.enableADS=true, -Djruby.compile.invokedynamic=true, -Djruby.jit.threshold=0, -Djruby.regexp.interruptible=true, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true]
[INFO ] 2022-09-15 21:49:37.160 [main] settings - Creating directory {:setting=>"path.queue", :path=>"/usr/share/logstash/data/queue"}
[INFO ] 2022-09-15 21:49:37.174 [main] settings - Creating directory {:setting=>"path.dead_letter_queue", :path=>"/usr/share/logstash/data/dead_letter_queue"}
[WARN ] 2022-09-15 21:49:37.687 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[INFO ] 2022-09-15 21:49:38.843 [LogStash::Runner] Reflections - Reflections took 114 ms to scan 1 urls, producing 119 keys and 419 values 
[WARN ] 2022-09-15 21:49:39.658 [LogStash::Runner] line - Relying on default value of `pipeline.ecs_compatibility`, which may change in a future major release of Logstash. To avoid unexpected changes when upgrading Logstash, please explicitly declare your desired ECS Compatibility mode.
[WARN ] 2022-09-15 21:49:39.703 [LogStash::Runner] stdin - Relying on default value of `pipeline.ecs_compatibility`, which may change in a future major release of Logstash. To avoid unexpected changes when upgrading Logstash, please explicitly declare your desired ECS Compatibility mode.
Configuration OK
[INFO ] 2022-09-15 21:49:39.917 [LogStash::Runner] runner - Using config.test_and_exit mode. Config Validation Result: OK. Exiting Logstash
  4. Start in the foreground. This is typically used in lab environments; in production, you would normally adjust the configuration and manage the service with systemctl.
[root@host3 ~]# logstash -f 01-stdin-stdout.conf
Using bundled JDK: /usr/share/logstash/jdk
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[INFO ] 2022-09-15 21:50:25.095 [main] runner - Starting Logstash {"logstash.version"=>"7.17.6", "jruby.version"=>"jruby 9.2.20.1 (2.5.8) 2021-11-30 2a2962fbd1 OpenJDK 64-Bit Server VM 11.0.16+8 on 11.0.16+8 +indy +jit [linux-x86_64]"}
[INFO ] 2022-09-15 21:50:25.103 [main] runner - JVM bootstrap flags: [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djdk.io.File.enableADS=true, -Djruby.compile.invokedynamic=true, -Djruby.jit.threshold=0, -Djruby.regexp.interruptible=true, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true]
[WARN ] 2022-09-15 21:50:25.523 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[INFO ] 2022-09-15 21:50:25.555 [LogStash::Runner] agent - No persistent UUID file found. Generating new UUID {:uuid=>"3fc04af1-7665-466e-839f-1eb42348aeb0", :path=>"/usr/share/logstash/data/uuid"}
[INFO ] 2022-09-15 21:50:27.119 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
[INFO ] 2022-09-15 21:50:28.262 [Converge PipelineAction::Create<main>] Reflections - Reflections took 110 ms to scan 1 urls, producing 119 keys and 419 values 
[WARN ] 2022-09-15 21:50:29.084 [Converge PipelineAction::Create<main>] line - Relying on default value of `pipeline.ecs_compatibility`, which may change in a future major release of Logstash. To avoid unexpected changes when upgrading Logstash, please explicitly declare your desired ECS Compatibility mode.
[WARN ] 2022-09-15 21:50:29.119 [Converge PipelineAction::Create<main>] stdin - Relying on default value of `pipeline.ecs_compatibility`, which may change in a future major release of Logstash. To avoid unexpected changes when upgrading Logstash, please explicitly declare your desired ECS Compatibility mode.
[INFO ] 2022-09-15 21:50:29.571 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>250, "pipeline.sources"=>["/root/01-stdin-stdout.conf"], :thread=>"#<Thread:0x32e464e6 run>"}
[INFO ] 2022-09-15 21:50:30.906 [[main]-pipeline-manager] javapipeline - Pipeline Java execution initialization time {"seconds"=>1.33}
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.jrubystdinchannel.StdinChannelLibrary$Reader (file:/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/jruby-stdin-channel-0.2.0-java/lib/jruby_stdin_channel/jruby_stdin_channel.jar) to field java.io.FilterInputStream.in
WARNING: Please consider reporting this to the maintainers of com.jrubystdinchannel.StdinChannelLibrary$Reader
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
[INFO ] 2022-09-15 21:50:31.128 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
The stdin plugin is now waiting for input:
[INFO ] 2022-09-15 21:50:31.270 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
abc
{
       "message" => "abc",
      "@version" => "1",
          "host" => "host3.test.com",
    "@timestamp" => 2022-09-15T13:52:02.984Z
}
bbb
{
       "message" => "bbb",
      "@version" => "1",
          "host" => "host3.test.com",
    "@timestamp" => 2022-09-15T13:52:06.177Z
}

2. Input Types

In the example above the input type was stdin, i.e. manual typing. In production, logs are never produced by hand, so stdin is normally used only to verify that the environment is working. Several common input types are described below.

2.1 file

input {
  file {
    path => ["/tmp/test/*.txt"]
    # Read files from the beginning (the default is the end). This only takes
    # effect when there is no existing read record for the file: a new file
    # created while the service was stopped is read in full after startup,
    # but files that already have a record are not re-read.
    start_position => "beginning"
  }
}

The file read offsets are kept in /usr/share/logstash/data/plugins/inputs/file/.sincedb_3cd99a80ca58225ec14dc0ac340abb80:

[root@host3 ~]# cat /usr/share/logstash/data/plugins/inputs/file/.sincedb_3cd99a80ca58225ec14dc0ac340abb80
5874000 0 64768 4 1663254379.147252 /tmp/test/1.txt
(The columns are, roughly: inode, major/minor device numbers, byte offset read so far, timestamp of last activity, and the file path.)

2.2 tcp

Like Filebeat, Logstash can listen on a TCP port to receive logs, and it can listen on several ports at once.

This approach is often used for machines where an agent cannot be installed.

HTTP can also be used; its configuration is similar to TCP (a hedged sketch follows the TCP example below).

[root@host3 ~]# vim 03-tcp-stdout.conf
input {
  tcp {
    port => 9999
  }
}
output {
  stdout {}
}
[root@host2 ~]# telnet 192.168.19.103 9999
Trying 192.168.19.103...
Connected to 192.168.19.103.
Escape character is '^]'.
123456
test
hello
{"message" => "123456\r","@version" => "1","@timestamp" => 2022-09-15T15:30:23.123Z,"host" => "host2","port" => 51958
}
{"message" => "test\r","@version" => "1","@timestamp" => 2022-09-15T15:30:24.494Z,"host" => "host2","port" => 51958
}
{"message" => "hello\r","@version" => "1","@timestamp" => 2022-09-15T15:30:26.336Z,"host" => "host2","port" => 51958
}
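A minimal sketch of the HTTP variant, assuming the bundled logstash-input-http plugin (the port is arbitrary):

input {
  http {
    port => 8080    # arbitrary listening port; POST log lines to it, e.g. with curl
  }
}
output {
  stdout {}
}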

2.3 redis

Logstash can pull data directly from a Redis database. Three Redis data types are supported:

  1. list — maps to the Redis command blpop: take the first element from the left of the list, blocking if the list is empty;
  2. channel — maps to subscribe: receive the latest data published to a Redis channel;
  3. pattern_channel — maps to psubscribe: match multiple channels by pattern and receive their latest data.

Differences between these data types:

  1. channel vs. pattern_channel: pattern_channel can match many channels with one pattern, while channel subscribes to a single channel (see the sketch after this list);
  2. list vs. the two channel types: a message published to a channel is received by every subscribed Logstash instance, whereas items in a list are not duplicated — multiple Logstash instances consuming one list share the items between them.
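A minimal hedged sketch of a channel subscription (the channel name is a placeholder; only data_type and key differ from the list configuration shown below):

input {
  redis {
    data_type => "channel"              # subscribe to a single channel
    # data_type => "pattern_channel"    # psubscribe; key may then contain globs such as "test-*"
    host => "192.168.19.101"
    port => 6379
    password => "bruce"
    key => "test-channel"               # placeholder channel name
  }
}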

The input configuration:

input {
  redis {
    data_type => "list"         # data type
    db => 5                     # database number, default 0
    host => "192.168.19.101"    # Redis server IP, default localhost
    port => 6379
    password => "bruce"
    key => "test-list"
  }
}

Push data into Redis:

[root@host1 ~]# redis-cli -h host1 -a bruce
host1:6379> select 5
OK
host1:6379[5]> lpush test-list bruce
(integer) 1
host1:6379[5]> lrange test-list 0 -1
(empty list or set)
host1:6379[5]> lpush test-list hello
(integer) 1
host1:6379[5]> lrange test-list 0 -1		# once Logstash has fetched the data, the list is emptied
(empty list or set)
host1:6379[5]> lpush test-list '{"requestTime":"[12/Sep/2022:23:30:56 +0800]","clientIP":"192.168.19.1","threadID":"http-bio-8080-exec-7","protocol":"HTTP/1.1","requestMethod":"GET / HTTP/1.1","requestStatus":"404","sendBytes":"-","queryString":"","responseTime":"0ms","partner":"-","agentVersion":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36"}'

Logstash receives the data:

{"message" => "bruce","@timestamp" => 2022-09-16T08:17:38.213Z,"@version" => "1","tags" => [[0] "_jsonparsefailure"]
}
# 非json格式数据会报错,但是能接收
[ERROR] 2022-09-16 16:18:21.688 [[main]<redis] json - JSON parse error, original data now in message field {:message=>"Unrecognized token 'hello': was expecting ('true', 'false' or 'null')\n at [Source: (String)\"hello\"; line: 1, column: 11]", :exception=>LogStash::Json::ParserError, :data=>"hello"}
{"message" => "hello","@timestamp" => 2022-09-16T08:18:21.689Z,"@version" => "1","tags" => [[0] "_jsonparsefailure"]
}
# json格式的数据过来,Logstash可以自动解析
{"clientIP" => "192.168.19.1","requestTime" => "[12/Sep/2022:23:30:56 +0800]","queryString" => "","@version" => "1","agentVersion" => "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36","partner" => "-","@timestamp" => 2022-09-16T08:23:10.320Z,"protocol" => "HTTP/1.1","requestStatus" => "404","threadID" => "http-bio-8080-exec-7","requestMethod" => "GET / HTTP/1.1","sendBytes" => "-","responseTime" => "0ms"
}
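The automatic JSON parsing comes from the input's codec: as far as I can tell, the redis input defaults to the json codec, which is why the plain strings above get the _jsonparsefailure tag. A hedged sketch of switching it off:

input {
  redis {
    data_type => "list"
    host => "192.168.19.101"
    port => 6379
    password => "bruce"
    key => "test-list"
    codec => "plain"    # keep raw strings in the message field instead of parsing JSON
  }
}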

2.4 beats

Filebeat was configured earlier to ship its logs to Logstash; on the Logstash side, all that is needed is to receive them.

Filebeat configuration:

filebeat.inputs:
- type: log
  paths:
    - /tmp/1.txt
output.logstash:
  hosts: ["192.168.19.103:5044"]

Logstash configuration:

input {
  beats {
    port => 5044
  }
}

After appending 111 to /tmp/1.txt on host2, Logstash outputs:

{"message" => "111","tags" => [[0] "beats_input_codec_plain_applied"],"agent" => {"id" => "76b7876b-051a-4df8-8b13-bd013ac5ec59","version" => "7.17.4","hostname" => "host2.test.com","type" => "filebeat","name" => "host2.test.com","ephemeral_id" => "437ac89f-7dc3-4898-a457-b2452ac4223b"},"input" => {"type" => "log"},"host" => {"name" => "host2.test.com"},"log" => {"offset" => 0,"file" => {"path" => "/tmp/1.txt"}},"@version" => "1","ecs" => {"version" => "1.12.0"},"@timestamp" => 2022-09-16T08:53:20.975Z
}

3. Output Types

3.1 redis

Redis can also be an output type; the configuration mirrors the input.

output {
  redis {
    data_type => "list"
    db => 6
    host => "192.168.19.101"
    port => 6379
    password => "bruce"
    key => "test-list"
  }
}

Check the Redis database:

[root@host1 ~]# redis-cli -h host1 -a bruce
host1:6379> select 6
OK
host1:6379[6]> lrange test-list 0 -1
1) "{\"message\":\"1111\",\"@version\":\"1\",\"@timestamp\":\"2022-09-16T09:12:29.890Z\",\"host\":\"host3.test.com\"}"

3.2 file

The file output writes events to local disk.

output {
  file {
    path => "/tmp/test-file.log"
  }
}
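As a hedged aside, the path setting accepts the same sprintf date references used in index names, so output files can be split by day (the path here is illustrative):

output {
  file {
    path => "/tmp/test-file-%{+YYYY-MM-dd}.log"    # one file per day
  }
}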

3.3 elasticsearch

output {
  elasticsearch {
    hosts => ["192.168.19.101:9200","192.168.19.102:9200","192.168.19.103:9200"]
    index => "centos-logstash-elasticsearh-%{+YYYY.MM.dd}"
  }
}
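If the cluster has security enabled, credentials are needed as well; a minimal hedged sketch (the user and password are placeholders, not from this setup):

output {
  elasticsearch {
    hosts => ["192.168.19.101:9200"]
    index => "centos-logstash-elasticsearh-%{+YYYY.MM.dd}"
    user => "elastic"          # placeholder credentials
    password => "changeme"
  }
}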

4. filter

filter is an optional plugin stage; after an event is received, it can parse and reformat the log before it is output.

4.1 grok

grok parses arbitrary text into structured data. It is well suited to syslog, Apache, and other web-server logs.

① Simple example


input {file {path => ["/var/log/nginx/access.log*"]start_position => "beginning"}
}filter {grok {match => {"message" => "%{COMBINEDAPACHELOG}"# "message" => "%{HTTPD_COMMONLOG}"		# 新版本Logstash可能会用这个变量}}
}output {stdout {}elasticsearch {hosts => ["192.168.19.101:9200","192.168.19.102:9200","192.168.19.103:9200"]index => "nginx-logs-es-%{+YYYY.MM.dd}"}
}

The parsed result:

{"request" => "/","bytes" => "4833","@version" => "1","auth" => "-","agent" => "\"curl/7.29.0\"","path" => "/var/log/nginx/access.log-20220913","ident" => "-","verb" => "GET","message" => "192.168.19.102 - - [12/Sep/2022:21:48:29 +0800] \"GET / HTTP/1.1\" 200 4833 \"-\" \"curl/7.29.0\" \"-\"","httpversion" => "1.1","host" => "host3.test.com","@timestamp" => 2022-09-16T14:27:43.208Z,"response" => "200","timestamp" => "12/Sep/2022:21:48:29 +0800","referrer" => "\"-\"","clientip" => "192.168.19.102"
}


② Predefined patterns


grok matches with regular expressions; its syntax is %{SYNTAX:SEMANTIC}

  • SYNTAX is the name of the pattern that will match your text; these are built in, and about 120 patterns are officially supported.
  • SEMANTIC is the identifier you assign to the matched text — the name you choose to give it.

Example:

  1. Source log line
55.3.244.1 GET /index.html 15824 0.043
  2. The matching pattern should be
    %{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}
  3. Configuration file
input {
  stdin {}
}
filter {
  grok {
    match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
  }
}
output {
  stdout {}
}
  4. The matched result
55.3.244.1 GET /index.html 15824 0.043
{
       "message" => "55.3.244.1 GET /index.html 15824 0.043",
      "@version" => "1",
    "@timestamp" => 2022-09-16T14:46:46.426Z,
        "method" => "GET",
       "request" => "/index.html",
         "bytes" => "15824",
      "duration" => "0.043",
          "host" => "host3.test.com",
        "client" => "55.3.244.1"
}

The predefined patterns for various services are listed in the official repository:
https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns

③ Custom patterns


When the predefined patterns do not fit, grok also supports custom regular expressions for matching log data.

  1. Create a directory for custom patterns and write the expression into it
[root@host3 ~]# mkdir patterns
[root@host3 ~]# echo "POSTFIX_QUEUEID [0-9A-F]{10,11}" >> ./patterns/1
  2. Modify the configuration file
input {
  stdin {}
}
filter {
  grok {
    patterns_dir => ["/root/patterns"]    # where the custom patterns live
    # The expression mixes predefined and custom patterns; characters outside
    # the braces (such as the colon) are literals and must match exactly.
    match => { "message" => "%{SYSLOGBASE} %{POSTFIX_QUEUEID:queue_id}: %{GREEDYDATA:syslog_message}" }
  }
}
output {
  stdout {}
}
  3. Run and test
...
The stdin plugin is now waiting for input:
[INFO ] 2022-09-16 23:22:04.511 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
Jan  1 06:25:43 mailserver14 postfix/cleanup[21403]: BEF25A72965: message-id=<20130101142543.5828399CCAF@mailserver14.example.com>
{"message" => "Jan  1 06:25:43 mailserver14 postfix/cleanup[21403]: BEF25A72965: message-id=<20130101142543.5828399CCAF@mailserver14.example.com>","host" => "host3.test.com","timestamp" => "Jan  1 06:25:43","queue_id" => "BEF25A72965",			# 自定义表达式匹配的字段"logsource" => "mailserver14","@timestamp" => 2022-09-16T15:22:19.516Z,"program" => "postfix/cleanup","pid" => "21403","@version" => "1","syslog_message" => "message-id=<20130101142543.5828399CCAF@mailserver14.example.com>"
}

4.2 Common options

As the name suggests, these options can be used in every filter plugin.

  • remove_field
filter {
  grok {
    remove_field => ["@version","tag","agent"]
  }
}
  • add_field
filter {
  grok {
    add_field => { "new_tag" => "hello world %{+YYYY.MM.dd}" }    # add_field takes a hash; %{+YYYY.MM.dd} is the sprintf date reference
  }
}

4.3 date

The data carries two timestamps, timestamp and @timestamp — the time the log was produced and the time it was collected — and the two may disagree.
The date plugin converts the time string in a log record and uses it to correct the @timestamp field. It supports the following named time formats, plus custom Joda-style patterns such as the one used below:

  • ISO8601
  • UNIX
  • UNIX_MS
  • TAI64N
input {file {path => "/var/log/nginx/access.log*"start_position => "beginning"}
}filter {grok {match => { "message" => "%{HTTPD_COMMONLOG}" }remove_field => ["message","ident","auth","@version","path"]}date {match => [ "timestamp","dd/MMM/yyyy:HH:mm:ss Z" ]		# timestamp必须是现有的字段,这里只是对这个字段的时间进行校正,且需要和timestamp字段的原数据格式一致,否则会报解析错误# timestamp原来的数据格式为"17/Sep/2022:18:42:26 +0800",因此时区改成ZZZ就会一直报错,因为ZZZ代表Asia/Shanghai这种格式,Z代表+0800timezone => "Asia/Shanghai"}}output {stdout {}
}

The output format:

{"timestamp" => "17/Sep/2022:18:42:26 +0800", #和@timestamp有8小时的时间差,可到Elasticsearch中查看,如果也有时间差,可以在date中修改timezone"response" => "200","httpversion" => "1.1","clientip" => "192.168.19.102","verb" => "GET","host" => "host3.test.com","request" => "/","@timestamp" => 2022-09-17T10:42:26.000Z,"bytes" => "4833"
}

Use target to store the parsed time in a field of your choosing; if unset, it defaults to @timestamp. That field can then be used when creating the index pattern in Kibana.

date {
  match => [ "timestamp","dd/MMM/yyyy:HH:mm:ss Z" ]
  timezone => "Asia/Shanghai"
  target => "logtime"
}
# The result:
{
      "timestamp" => "17/Sep/2022:21:15:30 +0800",
       "response" => "200",
        "logtime" => 2022-09-17T13:15:30.000Z,      # when the log was produced
    "httpversion" => "1.1",
       "clientip" => "192.168.19.102",
           "verb" => "GET",
           "host" => "host3.test.com",
        "request" => "/",
     "@timestamp" => 2022-09-17T13:15:31.357Z,      # when the log was collected; note the slight lag behind the production time
          "bytes" => "4833"
}

4.4 geoip

Resolves the geographic location of a client IP. The plugin depends on the bundled GeoLite2 City database, so the information is not always accurate; you can also download a MaxMind-format database and use that instead — the official site has a guide for custom databases (a hedged sketch of the database option follows the results below).

input {file {path => "/var/log/nginx/access.log*"start_position => "beginning"}
}filter {grok {match => { "message" => "%{HTTPD_COMMONLOG}" }remove_field => ["message","ident","auth","@version","path"]}geoip {source => "clientip" 			# IP地址的源参考clientip字段# fields => ["country_name" ,"timezone", "city_name"]		# 可以选择显示的字段}}output {stdout {}
}

The results — note that private addresses cannot be resolved:

{"timestamp" => "17/Sep/2022:21:15:30 +0800","response" => "200","geoip" => {},"httpversion" => "1.1","clientip" => "192.168.19.102","verb" => "GET","host" => "host3.test.com","tags" => [[0] "_geoip_lookup_failure"				# 私网地址],"request" => "/","@timestamp" => 2022-09-17T13:30:05.178Z,"bytes" => "4833"
}
{"timestamp" => "17/Sep/2022:21:15:30 +0800","response" => "200","geoip" => {					# 解析的结果放在geoip中"country_code2" => "CM","country_code3" => "CM","country_name" => "Cameroon","ip" => "154.72.162.134","timezone" => "Africa/Douala","location" => {"lon" => 12.5,"lat" => 6.0},"continent_code" => "AF","latitude" => 6.0,"longitude" => 12.5},"httpversion" => "1.1","clientip" => "154.72.162.134","verb" => "GET","host" => "host3.test.com","request" => "/","@timestamp" => 2022-09-17T13:30:05.178Z,"bytes" => "4833"
}
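A minimal hedged sketch of pointing geoip at a self-supplied MaxMind database (the path is a placeholder):

filter {
  geoip {
    source => "clientip"
    database => "/etc/logstash/GeoLite2-City.mmdb"    # placeholder path to a MaxMind .mmdb file
  }
}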

4.5 useragent

Parses browser information — provided the events contain a user-agent field.

input {file {path => "/var/log/nginx/access.log*"start_position => "beginning"}
}filter {grok {match => { "message" => "%{HTTPD_COMBINEDLOG}" }		# HTTPD_COMBINEDLOG可以解析浏览器remove_field => ["message","ident","auth","@version","path"]}useragent {source => "agent"							# 指定浏览器信息在哪个字段中,这个字段必须要存在target => "agent_test"						# 为了方便查看,将所有解析后的信息放到这个字段里面去}
}output {stdout {}
}

The results:

{"timestamp" => "17/Sep/2022:23:42:31 +0800","response" => "404","geoip" => {},"httpversion" => "1.1","clientip" => "192.168.19.103","verb" => "GET","agent" => "\"Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0\"","host" => "host3.test.com","request" => "/favicon.ico","referrer" => "\"-\"","@timestamp" => 2022-09-17T15:42:31.927Z,"bytes" => "3650","agent_test" => {"major" => "60","name" => "Firefox","os" => "Linux","os_full" => "Linux","os_name" => "Linux","version" => "60.0","minor" => "0","device" => "Other"}
}
{
{
..."agent_test" => {"major" => "60","name" => "Firefox","os" => "Linux","os_full" => "Linux","os_name" => "Linux","version" => "60.0","minor" => "0","device" => "Other"}
}
{
..."agent_test" => {"os_minor" => "0","os_full" => "iOS 16.0","version" => "16.0","os_major" => "16","device" => "iPhone","major" => "16","name" => "Mobile Safari","os" => "iOS","os_version" => "16.0","os_name" => "iOS","minor" => "0"}
}
{
..."agent_test" => {"patch" => "3987","os_full" => "Android 10","version" => "80.0.3987.162","os_major" => "10","device" => "Samsung SM-G981B","major" => "80","name" => "Chrome Mobile","os" => "Android","os_version" => "10","os_name" => "Android","minor" => "0"}
}

4.6 mutate

  1. Split a designated field
input {
  stdin {}
}
filter {
  mutate {
    split => {
      "message" => " "    # split the message on spaces
    }
    remove_field => ["@version","host"]
    add_field => {
      "tag" => "This a test field from Bruce"
    }
  }
}
output {
  stdout {}
}
111 222 333
{
           "tag" => "This a test field from Bruce",
       "message" => [
        [0] "111",
        [1] "222",
        [2] "333"
    ],
    "@timestamp" => 2022-09-18T08:07:36.373Z
}
  2. Extract the split values
input {
  stdin {}
}
filter {
  mutate {
    split => {
      "message" => " "    # split the message on spaces
    }
    remove_field => ["@version","host"]
    add_field => {
      "tag" => "This a test field from Bruce"
    }
  }
  mutate {
    add_field => {
      "name" => "%{[message][0]}"
      "age" => "%{[message][1]}"
      "sex" => "%{[message][2]}"
    }
  }
}
output {
  stdout {}
}
bruce 37 male
{
       "message" => [
        [0] "bruce",
        [1] "37",
        [2] "male"
    ],
           "age" => "37",
    "@timestamp" => 2022-09-18T08:14:31.230Z,
           "sex" => "male",
           "tag" => "This a test field from Bruce",
          "name" => "bruce"
}
  3. convert: cast a field's value to a different type, for example string to integer. If the field is an array, all members are converted; if it is a hash, nothing is done.
filter {
  mutate {
    convert => {
      "age" => "integer"    # cast age to an integer
    }
  }
}
bruce 20 male
{
       "message" => [
        [0] "bruce",
        [1] "20",
        [2] "male"
    ],
           "sex" => "male",
          "name" => "bruce",
           "age" => 20,                  # no quotes: the value is now numeric
    "@timestamp" => 2022-09-18T08:51:07.633Z,
           "tag" => "This a test field from Bruce"
}
  4. strip: remove leading and trailing whitespace from fields
filter {
  mutate {
    strip => ["name","sex"]
  }
}
  5. rename: rename a field
filter {
  mutate {
    rename => { "sex" => "agenda" }
  }
}
  6. replace: replace a field's content
filter {
  mutate {
    replace => { "tag" => "This is test message" }    # overwrites the tag field's content
  }
}
  7. update: same usage as replace; the difference is that update modifies the field only if it exists, and does nothing if it does not (a small hedged sketch follows).
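A minimal sketch of the contrast, reusing the tag field from the examples above:

filter {
  mutate {
    update => { "tag" => "updated only when tag already exists" }
    # replace would create the field if it were missing; update silently skips it
  }
}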

  8. uppercase/lowercase: convert a field's content to upper/lower case; capitalize: capitalize the first letter.
filter {
  mutate {
    uppercase => ["tag"]
    capitalize => ["name"]
  }
}

5. Advanced Features

5.1 Conditionals

After tagging events in input, you can branch with conditionals in output and filter to handle them differently (a filter-side sketch follows the output example below).

input {
  beats {
    port => 8888
    type => "nginx-beats"
  }
  tcp {
    port => 9999
    type => "tomcat-tcp"
  }
}
output {
  if [type] == "nginx-beats" {
    elasticsearch {
      hosts => ["192.168.19.101:9200","192.168.19.102:9200","192.168.19.103:9200"]
      index => "nginx-beats-elasticsearh-%{+YYYY.MM.dd}"
    }
  } else {
    elasticsearch {
      hosts => ["192.168.19.101:9200","192.168.19.102:9200","192.168.19.103:9200"]
      index => "tomcat-tcp-elasticsearh-%{+YYYY.MM.dd}"
    }
  }
}
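The same branching works inside filter; a minimal hedged sketch that runs grok only on the Nginx events:

filter {
  if [type] == "nginx-beats" {
    grok {
      match => { "message" => "%{HTTPD_COMBINEDLOG}" }
    }
  }
}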

5.2 Running Multiple Instances

Logstash can run multiple instances, but starting a second one directly will fail; it needs its own path.data directory to start cleanly.

[root@host3 ~]# logstash -f 01-stdin-stdout.conf --path.data /tmp/logstash
