一尘不染

How should I use sql_last_value in logstash?

elasticsearch

I'm not quite clear on what sql_last_value does when I write my statement like this:

statement => "SELECT * from mytable where id > :sql_last_value"

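I have a rough idea of why it is used: instead of going through the whole database table to update fields, it only picks up the records that were newly added. Please correct me if I'm wrong.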

So what I'm trying to do is create the index using logstash like this:

input {
    jdbc {
        jdbc_connection_string => "jdbc:mysql://hostmachine:3306/db" 
        jdbc_user => "root"
        jdbc_password => "root"
        jdbc_validate_connection => true
        jdbc_driver_library => "/path/mysql_jar/mysql-connector-java-5.1.39-bin.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        schedule => "* * * * *"
        statement => "SELECT * from mytable where id > :sql_last_value"
        use_column_value => true
        tracking_column => id
        jdbc_paging_enabled => "true"
        jdbc_page_size => "50000"
    }
}

output {
    elasticsearch {
        #protocol => http
        index => "myindex"
        document_type => "message_logs"
        document_id => "%{id}"
        action => index
        hosts => ["http://myhostmachine:9402"]
    }
}

Once I run this, no documents get uploaded to the index at all. Where am I going wrong?

Any help is appreciated.

2020-06-22

1 answer

一尘不染

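If your table has a timestamp column (e.g. last_updated), it is preferable to use it instead of the ID column. That way, when a record is updated and you modify that timestamp as well, the jdbc input plugin will pick the record up again. With the ID column this does not happen: the ID does not change on update, so updated records are never picked up.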

input {
    jdbc {
        jdbc_connection_string => "jdbc:mysql://hostmachine:3306/db" 
        jdbc_user => "root"
        jdbc_password => "root"
        jdbc_validate_connection => true
        jdbc_driver_library => "/path/mysql_jar/mysql-connector-java-5.1.39-bin.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        jdbc_paging_enabled => "true"
        jdbc_page_size => "50000"
        schedule => "* * * * *"
        statement => "SELECT * from mytable where last_updated > :sql_last_value"
    }
}
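
A note on the configuration above, offered as a hedged sketch rather than a definitive setup: when use_column_value is not set, :sql_last_value holds the time the query was last executed, not a value read from the table. If you would rather track the last_updated column itself, something along these lines may work. It assumes last_updated is a DATETIME/TIMESTAMP column that changes on every insert and update, and that your version of the jdbc input plugin supports tracking_column_type. Paging is left out to keep the sketch short.

```
input {
    jdbc {
        jdbc_connection_string => "jdbc:mysql://hostmachine:3306/db"
        jdbc_user => "root"
        jdbc_password => "root"
        jdbc_driver_library => "/path/mysql_jar/mysql-connector-java-5.1.39-bin.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        schedule => "* * * * *"
        # Assumption: last_updated is modified whenever a row is inserted or updated.
        # Ordering by the tracking column keeps the stored value at the newest row.
        statement => "SELECT * FROM mytable WHERE last_updated > :sql_last_value ORDER BY last_updated ASC"
        # Store the last seen column value instead of the last run time.
        use_column_value => true
        tracking_column => "last_updated"
        tracking_column_type => "timestamp"
    }
}
```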

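If you still decide to use the ID column, delete the $HOME/.logstash_jdbc_last_run file and try again. That file is where the plugin stores :sql_last_value between runs, so a stale value left in it can make the query skip rows that already exist.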
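
For the ID-based approach, a minimal sketch of the input might look like the following. It is an illustration under assumptions, not a verified fix: the last_run_metadata_path value is a hypothetical location chosen for this example, and clean_run is shown commented out as an alternative to deleting the state file by hand.

```
input {
    jdbc {
        jdbc_connection_string => "jdbc:mysql://hostmachine:3306/db"
        jdbc_user => "root"
        jdbc_password => "root"
        jdbc_driver_library => "/path/mysql_jar/mysql-connector-java-5.1.39-bin.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        schedule => "* * * * *"
        # Only rows with an id greater than the stored value are fetched.
        statement => "SELECT * FROM mytable WHERE id > :sql_last_value ORDER BY id ASC"
        use_column_value => true
        tracking_column => "id"
        tracking_column_type => "numeric"
        # Hypothetical path: keeps this pipeline's state in its own file.
        last_run_metadata_path => "/path/to/.logstash_jdbc_last_run_mytable"
        # Uncomment for a single run to ignore the stored value and start over.
        # clean_run => true
    }
}
```

With a numeric tracking column, :sql_last_value starts at 0 when no state has been stored, so the first run should select every row with a positive id.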

2020-06-22