linux-打印文件1与文件2的差异,而不从文件2中删除任何内容
内容导读
互联网集市收集整理的这篇技术教程文章主要介绍了linux-打印文件1与文件2的差异,而不从文件2中删除任何内容,小编现在分享给大家,供广大互联网技能从业者学习和参考。文章包含4416字,纯文字阅读大概需要7分钟。
内容图文
我正在创建一个脚本,用于从a.csv日志文件中针对预定义的黑色IP列表搜索IP.
它首先导入日志文件,然后从中解析IP,然后针对预定义的黑色IP列表搜索解析的IP,最后需要询问用户(如果找到任何结果)将结果保存到导入的原始日志文件中.
文件1是代码中IP-output.csv的示例.
文件2是代码中$filename的示例(原始导入的.csv).
文件1:
107.147.166.60 ,SUSPICIOUS IP
107.147.167.26 ,SUSPICIOUS IP
108.48.185.186 ,SUSPICIOUS IP
108.51.114.130 ,SUSPICIOUS IP
142.255.102.68 ,SUSPICIOUS IP
档案2:
outlook.office365.com ,174.203.0.118 ,UserLoginFailed
outlook.office365.com ,107.147.166.60 ,UserLoginFailed
outlook.office365.com ,107.147.167.26 ,UserLoginFailed
outlook.office365.com ,174.205.17.24 ,UserLoginFailed
outlook.office365.com ,108.48.185.186 ,UserLoginFailed
outlook.office365.com ,174.226.15.21 ,UserLoginFailed
outlook.office365.com ,108.51.114.130 ,UserLoginFailed
outlook.office365.com ,67.180.23.93 ,UserLoginFailed
outlook.office365.com ,142.255.102.68 ,UserLoginFailed
outlook.office365.com ,164.106.75.235 ,UserLoginFailed
我想将文件2更改为:
outlook.office365.com ,174.203.0.118 ,UserLoginFailed
outlook.office365.com ,107.147.166.60 ,UserLoginFailed ,SUSPICIOUS IP
outlook.office365.com ,107.147.167.26 ,UserLoginFailed ,SUSPICIOUS IP
outlook.office365.com ,174.205.17.24 ,UserLoginFailed
outlook.office365.com ,108.48.185.186 ,UserLoginFailed ,SUSPICIOUS IP
outlook.office365.com ,174.226.15.21 ,UserLoginFailed
outlook.office365.com ,108.51.114.130 ,UserLoginFailed ,SUSPICIOUS IP
outlook.office365.com ,67.180.23.93 ,UserLoginFailed
outlook.office365.com ,142.255.102.68 ,UserLoginFailed ,SUSPICIOUS IP
outlook.office365.com ,164.106.75.235 ,UserLoginFailed
这是我创建的脚本:
#!/bin/bash
#
# IP Blacklist Checker
#Import .csv (File within working directory)
echo "Please import a .csv log file to parse/search the IP(s) and UserAgents: "
read filename
#Parsing IPs from .csv log file
echo "Parsing IP(s) from imported log file..."
grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' $filename | sort | uniq > IP-list.txt
echo 'Done'
awk 'END {print NR,"IP(s) Found in imported log file"}' IP-list.txt
echo 'IPs found in imported log file:'
cat IP-list.txt
#searches parsed ip's against blacked ip lists
echo 'Searching parsed IP(s) from pre-defined Blacked IP List Databases...'
fgrep -w -f "IP-list.txt" "IPlist.txt" > IP-output.txt
awk 'END {print NR,"IP(s) Found Blacked IP List Databases"}' IP-output.txt
echo 'Suspicious IPs found in Blacked IP List Databases:'
cat IP-output.txt
while true; do
read -p "Do you want to add results to log file?" yn
case $yn in
[Yy]* ) grep -Ff IP-output.txt $filename | sed 's/$/ ,SUSPICIOUS IP/' > IP-output.csv && awk 'FNR==NR {m[$1]=$0; next} {for (i in m) {match($0,i); val=substr($0, RSTART, RLENGTH); if (val) {sub(val, m[i]); print; next}};} 1' IP-output.csv $filename > $filename; break;;
[Nn]* ) break;;
* ) echo "Please answer yes or no.";;
esac
done
echo "Finished searching parsed IP(s) from pre-defined Blacked IP List Databases."
rm IP-list.txt IP-output.csv IP-output.txt
我要导入的日志文件真的很长,只有15到20列,并且IPlist.txt(涂黑的IP)中包含超过15000个IP.将结果保存到相同的日志文件后,.csv文件将为空,如果我将其保存为其他名称,则所有列均乱序,并且IP列旁边会出现“ SUSPICIOUS IP”列,而是需要它位于最后一列(行的末尾).
我还不知道如何仅在发现任何内容后才提示保存文件,如果不仅提示什么也没有提示!
我得到的结果:
outlook.office365.com ,174.203.0.118 ,UserLoginFailed
outlook.office365.com ,107.147.166.60 ,SUSPICIOUS IP ,UserLoginFailed
outlook.office365.com ,107.147.167.26 ,SUSPICIOUS IP ,UserLoginFailed
outlook.office365.com ,174.205.17.24 ,UserLoginFailed
outlook.office365.com ,108.48.185.186 ,SUSPICIOUS IP ,UserLoginFailed
outlook.office365.com ,174.226.15.21 ,UserLoginFailed
outlook.office365.com ,108.51.114.130 ,SUSPICIOUS IP ,UserLoginFailed
outlook.office365.com ,67.180.23.93 ,UserLoginFailed
outlook.office365.com ,142.255.102.68 ,SUSPICIOUS IP ,UserLoginFailed
outlook.office365.com ,164.106.75.235 ,UserLoginFailed
解决方法:
您的意思是这样的:
awk 'FNR==NR { m[$1]=$0; next; } { for (i in m) { idx = index($0, i); if (idx > 0) { print substr($0, 1, idx-1) m[i]; next; } } } 1' file1.txt file2.txt > newfile2.txt
它基本上按顺序处理file1.txt和file2.txt.对于第一个文件中的所有行,FNR == NR为true,其中映射m用替换模式构建(第一个空间映射到整行之前的所有内容).对于第二个文件,将检查每行中以m为单位的匹配项.如果存在匹配项(使用index()),则脚本会在匹配项之前打印所有内容,然后打印m中的值.哦,最后1将打印file2中不匹配的行.
内容总结
以上是互联网集市为您收集整理的linux-打印文件1与文件2的差异,而不从文件2中删除任何内容全部内容,希望文章能够帮你解决linux-打印文件1与文件2的差异,而不从文件2中删除任何内容所遇到的程序开发问题。 如果觉得互联网集市技术教程内容还不错,欢迎将互联网集市网站推荐给程序员好友。
内容备注
版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 gblab@vip.qq.com 举报,一经查实,本站将立刻删除。
内容手机端
扫描二维码推送至手机访问。