I want to write my scraped items to a CSV file:
for rss in rsslinks:
    item = AppleItem()
    item['reference_link'] = response.url
    base_url = get_base_url(response)
    item['rss_link'] = urljoin_rfc(base_url, rss)
    #item['rss_link'] = rss
    items.append(item)
    #items.append("\n")

f = open(filename, 'a+')  # filename is apple.com.csv
for item in items:
    f.write("%s\n" % item)
My output looks like this:
{'reference_link': 'http://www.apple.com/' 'rss_link': 'http://www.apple.com/rss '
{'reference_link': 'http://www.apple.com/rss/' 'rss_link': 'http://ax.itunes.apple.com/WebObjects/MZStore.woa/wpa/MRSS/newreleases/limit=10/rss.xml'}
{'reference_link': 'http://www.apple.com/rss/' 'rss_link': 'http://ax.itunes.apple.com/WebObjects/MZStore.woa/wpa/MRSS/newreleases/limit=25/rss.xml'}
What I want is this format:
reference_link          rss_link
http://www.apple.com/   http://www.apple.com/rss/
Just run the crawl with -o and the csv format, for example:
scrapy crawl <spider name> -o file.csv -t csv
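If you would rather write the file yourself than use the feed export, a minimal sketch with the standard library's csv.DictWriter could look like the following. It assumes Python 3 and that each item is dict-like (as Scrapy items are); the helper name write_items_csv and the sample item are hypothetical, used only for illustration.

```python
import csv


def write_items_csv(items, filename, fieldnames=("reference_link", "rss_link")):
    """Write dict-like items to a CSV file, one row per item, with a header row."""
    # newline="" lets the csv module control line endings itself.
    with open(filename, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        for item in items:
            # dict(item) turns a dict-like item into a plain dict.
            writer.writerow(dict(item))


# Hypothetical sample data shaped like the items in the question.
items = [
    {
        "reference_link": "http://www.apple.com/",
        "rss_link": "http://www.apple.com/rss/",
    },
]
write_items_csv(items, "apple.com.csv")
```

The code in the question writes "%s\n" % item, which puts the dict's repr on each line; a csv writer emits the field values instead, with the column names as the first row.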