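I have been looking for ways to improve the SEO of AngularJS apps hosted on a CDN such as Amazon S3 (i.e. simple storage with no backend). Most of the solutions out there (PhantomJS, prerender.io, seo.js, etc.) rely on a backend to recognise the ?_escaped_fragment_ URL that the crawler generates and then fetch the relevant page from elsewhere. Even grunt-html-snapshot eventually needs you to do this, even though you generate the snapshot pages ahead of time.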
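To make the ?_escaped_fragment_ convention concrete, here is a minimal sketch of how a crawler following that scheme rewrites a hashbang URL before requesting it. The function name `toEscapedFragmentUrl` is hypothetical, and the sketch leaves out the percent-escaping of special characters, so treat it as an illustration rather than a full implementation.

```javascript
// Hypothetical helper: rewrite a "#!" URL the way a crawler using the
// _escaped_fragment_ scheme would. Percent-escaping of special characters
// in the fragment is deliberately omitted in this sketch.
function toEscapedFragmentUrl(url) {
  const index = url.indexOf('#!');
  if (index === -1) {
    return url; // no hashbang, nothing to rewrite
  }
  const base = url.slice(0, index);
  const fragment = url.slice(index + 2);
  // Append to an existing query string if there is one.
  const separator = base.includes('?') ? '&' : '?';
  return base + separator + '_escaped_fragment_=' + fragment;
}

console.log(toEscapedFragmentUrl('http://yourdomain.com/#!/page1'));
```

A backend-based solution has to spot this query parameter and serve a snapshot; plain storage such as S3 cannot do that, which is why the approach here prerenders real files at real paths instead.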
grunt seo
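URLs end with a trailing slash, like this: http://yourdomain.com/page1/
Personally, I also make sure that http://yourdomain.com/page1 (without the slash) reaches its destination, but that is off topic here. I also make sure that every language has a different state and a different URL.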
// Grunt plugins you will need:
grunt.loadNpmTasks('grunt-prerender');
grunt.loadNpmTasks('grunt-replace');
grunt.loadNpmTasks('grunt-wait');
grunt.loadNpmTasks('grunt-aws-s3');
// The 'concurrent', 'connect', 'http' and 'sitemap' tasks used below come from
// other plugins, which are assumed to be loaded elsewhere in the Gruntfile.

// The grunt tasks, in the right order
grunt.registerTask('seo', 'First launch server, then prerender and replace', function (target) {
  grunt.task.run([
    'concurrent:seo' // Step 1: in parallel, launch the server, then perform the so-called seotasks
  ]);
});

grunt.registerTask('seotasks', [
  'http',      // This is an API call to get all pages on my website. Skipping this step in this tutorial.
  'wait',      // Wait 1.5 sec to make sure that the server is launched
  'prerender', // Step 2: create a snapshot of your website
  'replace',   // Step 3: clean the mess
  'sitemap',   // Create a sitemap of your production environment
  'aws_s3:dev' // Step 4: upload
]);
// grunt config
concurrent: {
  seo: [
    'connect:dist:keepalive', // Launching a server and keeping it alive
    'seotasks'                // Now that we have a running server we can launch the SEO tasks
  ]
}
// grunt config
prerender: {
  options: {
    // Points to the URL of the server you just launched.
    // You can also make it point to your production website.
    sitePath: 'http://localhost:9001',
    // As you can see, the source URLs allow for multiple languages, provided you have
    // different states for different languages (see note below for that).
    // This list can be dynamically updated, which in my case is done in the callback of the http task.
    urls: ['/', '/projects/', '/portal/', '/en/', '/projects/en/', '/portal/en/', '/fr/', '/projects/fr/', '/portal/fr/'],
    hashed: true,
    dest: 'dist/SEO/', // where your static html files will be stored
    timeout: 5000,
    interval: 5000, // taking a snapshot of what the page looks like after 5 seconds
    phantomScript: 'basic',
    limit: 7 // number of pages processed simultaneously
  }
}
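Since the urls list can be generated rather than typed by hand, here is a minimal sketch of one way to build it from a list of pages and a list of languages. The helper name `buildPrerenderUrls` and its arguments are hypothetical; the only thing taken from the config above is the URL shape (trailing slash, language code as the last path segment, default language without a code).

```javascript
// Hypothetical helper: build the prerender URL list for every page in every language.
// An empty string stands for the home page (in `pages`) or the default language.
function buildPrerenderUrls(pages, languages) {
  const urls = [];
  for (const lang of ['', ...languages]) {
    for (const page of pages) {
      // Drop empty segments, then wrap the path in slashes.
      const path = [page, lang].filter(Boolean).join('/');
      urls.push(path ? '/' + path + '/' : '/');
    }
  }
  return urls;
}

console.log(buildPrerenderUrls(['', 'projects', 'portal'], ['en', 'fr']));
```

With those inputs the result is the same nine URLs listed in the config, in the same order.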
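If you open the prerendered files, they will work for crawlers but not for humans. For a human using Chrome, your directives will load twice. You therefore need to redirect intelligent browsers to your home page before Angular gets activated (i.e. right after the head tag).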
// Add the script tag to redirect if we're not a search bot
replace: {
  dist: {
    options: {
      patterns: [
        {
          match: '<head>',
          // Redirect to a clean page if not a bot (to your index.html at the root, basically).
          // Note: your hashbang (#) will still work.
          replacement: '<head><script>if(!/bot|googlebot|crawler|spider|robot|crawling/i.test(navigator.userAgent)) { document.location = "/#" + window.location.pathname; }</script>'
        }
      ],
      usePrefix: false
    },
    files: [
      {expand: true, flatten: false, src: ['dist/SEO/*/**/*.html'], dest: ''}
    ]
  }
}
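The decision made by that injected script can be pulled out into plain functions, which makes it easy to check against real user-agent strings. This is a minimal sketch: the regular expression is the one from the replace task, while the names `isBot` and `redirectTarget` are hypothetical.

```javascript
// Same pattern as the script injected by the replace task.
const BOT_PATTERN = /bot|googlebot|crawler|spider|robot|crawling/i;

// True when the user agent looks like a search bot.
function isBot(userAgent) {
  return BOT_PATTERN.test(userAgent);
}

// Where a visitor to a prerendered page should be sent:
// null for bots (they stay on the static snapshot),
// "/#" + path for humans (they get the real Angular app).
function redirectTarget(userAgent, pathname) {
  return isBot(userAgent) ? null : '/#' + pathname;
}

const chrome = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36';
const googlebot = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';

console.log(redirectTarget(chrome, '/projects/en/'));    // "/#/projects/en/"
console.log(redirectTarget(googlebot, '/projects/en/')); // null
```

User-agent sniffing of this kind only catches crawlers whose user agent matches the pattern, so the list of keywords is worth reviewing for the bots you care about.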
<div ui-view autoscroll="true" id="ui-view"></div>

<!-- This script is needed to clear ui-view BEFORE angular starts, to remove the static html
     that has been generated for search engines who cannot read angular -->
<script>
  if (!/bot|googlebot|crawler|spider|robot|crawling/i.test(navigator.userAgent)) {
    document.getElementById('ui-view').innerHTML = "";
  }
</script>
aws_s3: {
  options: {
    accessKeyId: "<%= aws.accessKeyId %>", // Use the variables
    secretAccessKey: "<%= aws.secret %>",  // You can also use env variables
    region: 'eu-west-1',
    uploadConcurrency: 5 // 5 simultaneous uploads
  },
  dev: {
    options: {
      bucket: 'xxxxxxxx'
    },
    files: [
      {expand: true, cwd: 'dist/', src: ['**'], exclude: 'SEO/**', dest: '', differential: true},
      {expand: true, cwd: 'dist/SEO/', src: ['**'], dest: '', differential: true}
    ]
  }
}
That's it, you have your solution! Both humans and bots will be able to read your web app.