grep a large list against a large file

一尘不染


linux

I am currently trying to grep a large list of IDs (~5,000) against an even larger CSV file (3,000,000 lines).

I want every CSV line that contains an ID from the ID file.

My naive approach was:

cat the_ids.txt | while read line
do
  cat huge.csv | grep $line >> output_file
done

But this takes forever!

Is there a more efficient way to solve this problem?


2020-06-03

1 answer

一尘不染

Try

grep -f the_ids.txt huge.csv

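Also, since your patterns appear to be fixed strings, supplying the -F option may speed up grep. With -f, grep reads all patterns from the_ids.txt and scans huge.csv once, instead of once per ID as in the loop.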

   -F, --fixed-strings
          Interpret PATTERN as a  list  of  fixed  strings,  separated  by
          newlines,  any  of  which is to be matched.  (-F is specified by
          POSIX.)
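
Combining the two options might look like the sketch below. The sample rows and the temporary directory are made up for illustration and only stand in for the real the_ids.txt and huge.csv. The -w variant is an optional extra, assuming a grep that supports it (GNU and BSD grep do; it is not part of POSIX), for cases where a short ID should not match inside a longer one.

```shell
# Hypothetical sample data standing in for the_ids.txt and huge.csv
workdir=$(mktemp -d)
printf '%s\n' 1001 1003 > "$workdir/the_ids.txt"
printf '%s\n' '1001,alice' '1002,bob' '1003,carol' '11001,dave' > "$workdir/huge.csv"

# -F: treat each line of the ID file as a fixed string, not a regex
# -f: read all patterns from the file, so huge.csv is scanned only once
grep -F -f "$workdir/the_ids.txt" "$workdir/huge.csv" > "$workdir/output_file"

# -w: additionally require whole-word matches, so ID 1001 does not
# match inside 11001 (assumption: your grep supports -w)
grep -F -w -f "$workdir/the_ids.txt" "$workdir/huge.csv" > "$workdir/output_words"
```

Without -w, the row for 11001 is also selected because it contains the substring 1001; whether that matters depends on how the IDs appear in the CSV.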

2020-06-03