![enter image description here](https://cms.rootstack.com/sites/default/files/blog/img/js-blog_seo.png)
Search engine crawlers are designed to crawl the HTML content of web pages. Websites have evolved, however, and many of them now generate their content with JavaScript (JS). These sites are affected because crawlers or bots do not know how to handle that content correctly.
Fortunately, there are tools that help us solve this problem, such as Prerender.io.
Prerender.io is a middleware that is installed on your server and will check each request to see if it is a request from a crawler. If it is a request from a crawler, the middleware will send a request to Prerender.io to return the static HTML of that page. If not, the request will continue on its normal server paths. The crawler never knows that you are using Prerender.io since the response always passes through your server.
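The check described above can be sketched in a few lines. This is only an illustration under assumptions: the function name `shouldPrerender` and the short crawler list are hypothetical, and the real middleware uses a much longer list and more rules.

```javascript
// Hypothetical sketch of the decision the middleware makes for each request.
// The crawler names are a small sample; a real middleware ships a longer list.
var CRAWLER_USER_AGENTS = ['googlebot', 'bingbot', 'twitterbot', 'facebookexternalhit'];

// Returns true when the request looks like it comes from a crawler.
// `userAgent` is the User-Agent header, `url` is the requested path and query.
function shouldPrerender(userAgent, url) {
  // Crawlers that follow the AJAX crawling scheme add this query parameter.
  if (url.indexOf('_escaped_fragment_') !== -1) {
    return true;
  }
  var ua = (userAgent || '').toLowerCase();
  return CRAWLER_USER_AGENTS.some(function (name) {
    return ua.indexOf(name) !== -1;
  });
}

console.log(shouldPrerender('Twitterbot', '/'));                      // true
console.log(shouldPrerender('Mozilla/5.0 (X11; Linux x86_64)', '/')); // false
console.log(shouldPrerender('curl/7.0', '/?_escaped_fragment_='));    // true
```

When the function returns `true`, the middleware would fetch the static HTML from the Prerender service; otherwise the request continues through the normal routes.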
We have prepared a guide that can help you solve your SEO problems with JS applications.
*For practical purposes we will use a basic application of Angular. This application is not intended to be perfect nor does it follow any style guide, it is only meant for demonstrating how Prerender works.*
### Create a basic app in Angular
First we create the application that will make the calls to the Prerender server. We will use Express to create it.
If we do not have it installed, execute the following commands:
[prism:bash]
npm install -g express
npm install -g express-generator
[/prism:bash]
We move to the folder where we will create the app
[prism:bash]cd /var/opt/[/prism:bash]
We create the app
[prism:bash]
express testapp
cd testapp/
npm install
[/prism:bash]
Edit the file __views/layout.jade__
[prism:jade]
doctype html
html
  head
    title= title
    meta(name="fragment" content="!")
    link(rel='stylesheet', href='/stylesheets/style.css')
    script(src='//ajax.googleapis.com/ajax/libs/angularjs/1.2.6/angular.min.js')
    script(src='javascripts/app.js')
  body(ng-app="PrerenderApp")
    block content
[/prism:jade]
Create the file __public/javascripts/app.js__ with the following code:
[prism:javascript]
var app = angular.module("PrerenderApp", []);

app.controller("ExampleController", function($scope) {
  $scope.message = "Hello World";
});
[/prism:javascript]
Edit the file __views/index.jade__
[prism:jade]
extends layout

block content
  div(ng-controller="ExampleController")
    h1= title
    p Welcome to #{title}
    p {{message}}
[/prism:jade]
We change the port from ```3000``` to ```8080``` by modifying the file __bin/www__, since port 3000 is used by the Prerender service later in this guide:
[prism:javascript]
var port = normalizePort(process.env.PORT || '8080');
[/prism:javascript]
Now, if we start the Express server
[prism:bash]DEBUG=testapp:* npm start[/prism:bash]
We should see something like this:
[prism:bash]
> testapp@0.0.0 start /var/opt/testapp
> node ./bin/www
testapp:server Listening on port 8080 +0ms
[/prism:bash]
### Test the App
At this point we make a call with curl to see how crawlers would see our app. For this demonstration we will use the user agent of the Twitter crawler:
[prism:bash]curl -A "Twitterbot" "http://localhost:8080"[/prism:bash]
[prism:markup]
htmlExpress
Express
Welcome to Express
{{message}}
GET / 200 71.506 ms - 429
[/prism:markup]
If we open our app in the browser at [http://localhost:8080](http://localhost:8080), we should see:
![browser](http://i68.tinypic.com/2ytqiig.png)
As we can observe, the variable __message__ has been replaced by __Hello World__.
In the curl output, however, the variable __message__ appears without being replaced. Why does this happen?
Crawlers do not process JS as browsers do. When they receive the HTTP 200 response, they assume the page is already fully loaded, so they do not wait for the JS to finish building the app. In this case the crawler does not wait for the controller to replace the variable __message__ with __Hello World__.
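The difference between the two views can be illustrated with a small sketch. The `interpolate` function below is a hypothetical, heavily simplified stand-in for the work Angular does in the browser; it is not Angular's implementation.

```javascript
// Simplified stand-in for what Angular does in the browser: replace each
// {{name}} placeholder with the matching value from the scope.
// This is NOT Angular's implementation, only an illustration.
function interpolate(html, scope) {
  return html.replace(/\{\{\s*(\w+)\s*\}\}/g, function (match, name) {
    // Leave unknown placeholders untouched.
    return Object.prototype.hasOwnProperty.call(scope, name) ? String(scope[name]) : match;
  });
}

// What the server sends, and what a crawler that does not run JS reads.
var rawHtml = '<p>{{message}}</p>';

// What a browser shows once the controller has set $scope.message.
var renderedHtml = interpolate(rawHtml, { message: 'Hello World' });

console.log(rawHtml);      // <p>{{message}}</p>
console.log(renderedHtml); // <p>Hello World</p>
```

A crawler that skips the JS step only ever sees the first string, which is why the placeholder shows up in the curl output.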
## Install and Configure Prerender
### Install PM2
PM2 is a production process manager for Node.js applications with a built-in load balancer. It lets you keep applications alive forever, reload them with no downtime, and facilitate common system administrator tasks.
[prism:bash]npm install pm2 -g[/prism:bash]
### Install Prerender Middleware
[prism:bash]
git clone https://github.com/prerender/prerender.git
cd prerender
npm install
[/prism:bash]
By default the Prerender server does not have any kind of cache. If we start the Prerender server and request a page, it generates the HTML and serves it to us; if we ask for the same page again, it generates it again. First, let's verify that our Prerender service is working properly.
We execute the following command:
[prism:bash]pm2 start server.js[/prism:bash]
[prism:bash]
[PM2] Starting /server.js in fork_mode (1 instance)
[PM2] Done.
┌──────────┬────┬──────┬───────┬────────┬─────────┬────────┬─────┬───────────┬──────────┐
│ App name │ id │ mode │ pid │ status │ restart │ uptime │ cpu │ mem │ watching │
├──────────┼────┼──────┼───────┼────────┼─────────┼────────┼─────┼───────────┼──────────┤
│ server │ 0 │ fork │ 30710 │ online │ 0 │ 0s │ 3% │ 22.4 MB │ disabled │
└──────────┴────┴──────┴───────┴────────┴─────────┴────────┴─────┴───────────┴──────────┘
[/prism:bash]
At this moment we proceed to test the Prerender service
[prism:bash]curl -A "Twitterbot" "http://localhost:3000/http://localhost:8080"[/prism:bash]
And it should return
[prism:markup]
Express
Express
Welcome to Express
Hello World
[/prism:markup]
As we can see, Angular has already processed the DOM, and the variable __message__ was replaced by __Hello World__.
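The request to the Prerender service in the curl call follows a simple pattern: the service URL, a slash, and then the full URL of the page to render. A minimal sketch of that pattern, using a hypothetical helper named `buildPrerenderUrl`:

```javascript
// Hypothetical helper: builds the URL to request from a Prerender service.
// The pattern mirrors the curl call: <service URL>/<full page URL>.
function buildPrerenderUrl(serviceUrl, pageUrl) {
  // Drop trailing slashes so the result has exactly one separator.
  var base = serviceUrl.replace(/\/+$/, '');
  return base + '/' + pageUrl;
}

console.log(buildPrerenderUrl('http://localhost:3000', 'http://localhost:8080'));
// http://localhost:3000/http://localhost:8080
```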
At this point we configure the server of our app to return the HTML generated by Prerender when necessary. This step varies depending on which server we are configuring. In our case it is ExpressJS, but the same can be done with Apache, Nginx or Heroku.
We install the Prerender middleware for Node:
[prism:bash]
cd /var/opt/testapp
npm install prerender-node --save
[/prism:bash]
Edit the file __app.js__ and add the following line after the __app.set__ lines:
[prism:javascript] app.use(require('prerender-node').set('prerenderServiceUrl', 'http://localhost:3000'));[/prism:javascript]
That is all the configuration the Express server needs.
[prism:http]http://localhost:3000[/prism:http] is the URL of our Prerender service.
Restart the server of our app.
We can use pm2 for this too:
[prism:bash]pm2 start bin/www[/prism:bash]
[prism:bash]
┌──────────┬────┬──────┬───────┬────────┬─────────┬────────┬─────┬───────────┬──────────┐
│ App name │ id │ mode │ pid │ status │ restart │ uptime │ cpu │ mem │ watching │
├──────────┼────┼──────┼───────┼────────┼─────────┼────────┼─────┼───────────┼──────────┤
│ server │ 3 │ fork │ 9891 │ online │ 0 │ 8m │ 0% │ 35.1 MB │ disabled │
│ www │ 4 │ fork │ 10361 │ online │ 0 │ 0s │ 0% │ 17.0 MB │ disabled │
└──────────┴────┴──────┴───────┴────────┴─────────┴────────┴─────┴───────────┴──────────┘
[/prism:bash]
Now that we have the Prerender service and the Express web server running, we can verify that everything is working properly:
[prism:bash]curl -A "Twitterbot" "http://localhost:8080"[/prism:bash]
or
[prism:bash]curl "http://localhost:8080/?_escaped_fragment_="[/prism:bash]
[prism:markup]
htmlExpress
Express
Welcome to Express
Hello World
[/prism:markup]
That is the result we were waiting for: Prerender served us the HTML already generated.
For more information on configuring a particular server, visit the following [link](https://prerender.io/documentation/install-middleware)
Prerender is now configured, and our app checks the *user-agent* or the *_escaped_fragment_* query string to fetch the HTML from the Prerender service. However, we do not have any kind of cache configured so far. This means that every time a request is made to the Prerender service, it has to generate all the HTML over and over again.
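To see what a cache buys us, here is a minimal in-memory sketch. It is an illustration only: `createCachedRenderer` and the `render` callback are hypothetical names, and the fake render function stands in for the expensive page generation done by the Prerender service.

```javascript
// Hypothetical sketch of a render cache. `render` stands in for the expensive
// step where the Prerender service loads the page and runs its JS.
function createCachedRenderer(render) {
  var cache = new Map(); // URL -> generated HTML

  return function (url) {
    if (cache.has(url)) {
      return cache.get(url); // cache hit: no rendering needed
    }
    var html = render(url);  // cache miss: generate the HTML once
    cache.set(url, html);
    return html;
  };
}

// Count how often the expensive step actually runs.
var renderCount = 0;
var cachedRender = createCachedRenderer(function (url) {
  renderCount += 1;
  return '<p>Hello World from ' + url + '</p>';
});

cachedRender('http://localhost:8080');
cachedRender('http://localhost:8080');
console.log(renderCount); // 1
```

The second call returns the stored HTML without rendering again. A shared store such as Redis plays the same role as the `Map` here, with the advantage that it survives restarts of the service.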
## Install Redis
On Debian/Ubuntu Linux, run this [script](https://gist.github.com/rogerleite/5927948#file-redis-install-sh)
### Run the redis-server
[prism:bash]start redis-server[/prism:bash]
We go to the Prerender folder
[prism:bash]cd /var/opt/prerender[/prism:bash]
We install the Redis cache plugin for Prerender:
[prism:bash]npm install prerender-redis-cache --save[/prism:bash]
And we add this line to the file __server.js__
[prism:javascript]server.use(require('prerender-redis-cache'));[/prism:javascript]
By default the plugin connects to Redis on localhost, on the default port (6379), without any authentication. You can override these settings by setting one of the following environment variables: __REDISTOGO_URL, REDISCLOUD_URL, REDISGREEN_URL or REDIS_URL__, using the format *redis://user:password@host:port/databaseNumber*
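The URL format can be broken down with Node's built-in `URL` class. The helper below is a hypothetical sketch and not the plugin's code; in particular, the lookup order of the environment variables simply follows the order in which they are listed here and is an assumption.

```javascript
// Hypothetical helper: picks the first Redis URL found in the environment and
// splits it into its parts. The lookup order follows the order in which the
// variables are listed in the text, which is an assumption about precedence.
function resolveRedisConfig(env) {
  var redisUrl = env.REDISTOGO_URL || env.REDISCLOUD_URL ||
                 env.REDISGREEN_URL || env.REDIS_URL ||
                 'redis://localhost:6379';
  var parsed = new URL(redisUrl);
  return {
    host: parsed.hostname,
    port: Number(parsed.port) || 6379,
    user: parsed.username || null,
    password: parsed.password || null,
    // The path is "/<databaseNumber>"; fall back to database 0.
    database: Number(parsed.pathname.slice(1)) || 0
  };
}

// No variables set: the defaults described in the text.
console.log(resolveRedisConfig({}));

// A made-up URL in the documented format.
console.log(resolveRedisConfig({ REDIS_URL: 'redis://user:secret@redis.example.com:6380/2' }));
```

In a real process you would pass `process.env` instead of the literal objects used here.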
Restart the Prerender service and test
[prism:bash]curl -A "Twitterbot" "http://localhost:8080"[/prism:bash]
The first request will take as long as it always has (for this example, about 1 second or a little more is normal), but subsequent requests for the same URL will return almost immediately.
Our Prerender service is now running with a cache.
### Example with Apache
To use Prerender with Apache, we must make sure the following modules are enabled:
+ mod_rewrite
+ mod_proxy
+ proxy_html
+ proxy_http
__Apache__ virtual host for __Angular__ in __html5__ mode:
[prism:bash]
ServerAdmin webmaster@localhost
ServerName servername.local
ServerAlias subdomain.domain.local
DocumentRoot "/dir/to/site/root"
ProxyRequests On
ProxyPreserveHost On
Require all granted
RewriteEngine on
AllowOverride All
Options Indexes MultiViews FollowSymLinks
Require all granted
# If requested resource exists as a file or directory
# (REQUEST_FILENAME is only relative in virtualhost context, so not usable)
# RewriteCond %{REQUEST_FILENAME} -f [OR]
# RewriteCond %{REQUEST_FILENAME} -d
# Go to it as is
# RewriteRule ^ - [L]
# If non existent
# Accept everything on index.html
# RewriteRule ^ /index.html
# If non existent
# If path ends with / and is not just a single /, redirect to without the trailing /
RewriteCond %{REQUEST_URI} !^/$
RewriteCond %{REQUEST_URI} ^(.*)/$
RewriteRule ^ %1 [R,QSA,L]
# Handle Prerender.io
RewriteCond %{HTTP_USER_AGENT} Googlebot|bingbot|Googlebot-Mobile|Baiduspider|Yahoo|YahooSeeker|DoCoMo|Twitterbot|TweetmemeBot|Twikle|Netseer|Daumoa|SeznamBot|Ezooms|MSNBot|Exabot|MJ12bot|sogou\sspider|YandexBot|bitlybot|ia_archiver|proximic|spbot|ChangeDetection|NaverBot|MetaJobBot|magpie-crawler|Genieo\sWeb\sfilter|Qualidator.com\sBot|Woko|Vagabondo|360Spider|ExB\sLanguage\sCrawler|AddThis.com|aiHitBot|Spinn3r|BingPreview|GrapeshotCrawler|CareerBot|ZumBot|ShopWiki|bixocrawler|uMBot|sistrix|linkdexbot|AhrefsBot|archive.org_bot|SeoCheckBot|TurnitinBot|VoilaBot|SearchmetricsBot|Butterfly|Yahoo!|Plukkie|yacybot|trendictionbot|UASlinkChecker|Blekkobot|Wotbox|YioopBot|meanpathbot|TinEye|LuminateBot|FyberSpider|Infohelfer|linkdex.com|Curious\sGeorge|Fetch-Guess|ichiro|MojeekBot|SBSearch|WebThumbnail|socialbm_bot|SemrushBot|Vedma|alexa\ssite\saudit|SEOkicks-Robot|Browsershots|BLEXBot|woriobot|AMZNKAssocBot|Speedy|oBot|HostTracker|OpenWebSpider|WBSearchBot|FacebookExternalHit [NC,OR]
RewriteCond %{QUERY_STRING} _escaped_fragment_
# Proxy the request
RewriteRule ^(?!.*?(\.js|\.css|\.xml|\.less|\.png|\.jpg|\.jpeg|\.gif|\.pdf|\.doc|\.txt|\.ico|\.rss|\.zip|\.mp3|\.rar|\.exe|\.wmv|\.doc|\.avi|\.ppt|\.mpg|\.mpeg|\.tif|\.wav|\.mov|\.psd|\.ai|\.xls|\.mp4|\.m4a|\.swf|\.dat|\.dmg|\.iso|\.flv|\.m4v|\.torrent|\.ttf|\.woff))(.*) http://localhost:3000/http://%{HTTP_HOST}/$2 [P,L]
# If requested resource exists as a file or directory
# (REQUEST_FILENAME is only relative in virtualhost context, so not usable)
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
# Go to it as is
RewriteRule ^ - [L]
# If non existent
# Accept everything on index.html
RewriteRule ^ /index.html
ErrorLog "/var/log/apache2/domain.local.error.log"
[/prism:bash]
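The order of the decisions encoded in the rewrite rules can be easier to follow as code. The sketch below is a hypothetical JavaScript rendering of that logic, not something Apache runs: the extension and crawler lists are shortened samples, and `fileExists` stands in for the `-f` and `-d` checks.

```javascript
// Hypothetical sketch of the decisions encoded in the rewrite rules.
// The lists are shortened samples of the ones in the virtual host.
var STATIC_EXTENSIONS = ['.js', '.css', '.png', '.jpg', '.gif', '.ico'];
var CRAWLERS = /Googlebot|bingbot|Twitterbot|FacebookExternalHit/i;

function route(path, query, userAgent, fileExists) {
  // 1. Redirect "/about/" to "/about" (but leave "/" alone).
  if (path !== '/' && path.slice(-1) === '/') {
    return { action: 'redirect', target: path.slice(0, -1) };
  }
  // 2. Crawler or _escaped_fragment_ request for a non-static path:
  //    proxy it to the Prerender service.
  var isCrawler = CRAWLERS.test(userAgent) || query.indexOf('_escaped_fragment_') !== -1;
  var looksStatic = STATIC_EXTENSIONS.some(function (ext) {
    return path.indexOf(ext) !== -1;
  });
  if (isCrawler && !looksStatic) {
    return { action: 'proxy', target: 'prerender' };
  }
  // 3. Existing files and directories are served as they are.
  if (fileExists(path)) {
    return { action: 'serve', target: path };
  }
  // 4. Everything else falls back to index.html so Angular can route it.
  return { action: 'serve', target: '/index.html' };
}

// Pretend only /app.js exists on disk.
var exists = function (p) { return p === '/app.js'; };

console.log(route('/about/', '', 'Mozilla/5.0', exists)); // redirect to /about
console.log(route('/about', '', 'Twitterbot', exists));   // proxy to prerender
console.log(route('/app.js', '', 'Twitterbot', exists));  // serve /app.js
console.log(route('/about', '', 'Mozilla/5.0', exists));  // serve /index.html
```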
## Things to keep in mind
* Prerender uses PhantomJS as its engine, so pages that rely on __ES6/ES7__ JS features will fail to render. Using __BabelJS__ to convert your code to __ES5__ is therefore recommended.
* If you are on __Linux__, make sure you have set your __locale__, since this can also affect PhantomJS. [See other related issues](https://github.com/ariya/phantomjs/issues/13433)
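As a small illustration of the first point, here is the same function in ES6 syntax (shown as a comment) and as a hand-written ES5 equivalent. The ES5 version is written by hand for this example; the actual output of BabelJS may look different.

```javascript
// ES6 version, which relies on an arrow function and a template literal:
//   const greet = (name) => `Hello ${name}`;

// Hand-written ES5 equivalent; real Babel output may look different.
var greet = function (name) {
  return 'Hello ' + name;
};

console.log(greet('World')); // Hello World
```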