Robots Dynamic File in EPiServer

In this blog post, we will create a robots.txt file that is generated dynamically from the content of a field on the start page. To serve the file, we will implement an HTTP handler that matches requests for robots.txt.


So without further ado, we will first add the property to the start page model. This page inherits from a base page (SitePage), which in turn inherits from the PageData class.


    [SiteContentType(GUID = "98448299-51bd-45a9-8d68-7dcdfee0a030")]
    [ImageUrl(Global.Thumbnails.GenericPage)]
    public class StartPage : SitePage
    {
        [CultureSpecific]
        [Display(Name = "Robots", Order = 1, GroupName = Global.GroupNames.Metadata)]
        [UIHint(UIHint.Textarea)]
        public virtual string RobotsContent { get; set; }
    } 
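For instance, an editor could enter additional crawler rules in the RobotsContent field directly in edit mode. The rules below are only illustrative; any valid robots.txt directives will work:

```
Disallow: /search
Disallow: /util
Sitemap: https://www.site.com/sitemap.xml
```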

Site page base class


    [SiteContentType(GUID = "98448299-51bd-45a9-8d68-7dcdfee0a031", AvailableInEditMode = false)]
    public class SitePage : PageData
    {
        // Nothing here
    }

Now that we have the RobotsContent field on the start page, we will create the HTTP handler class, which finds the start page of the current site and reads the content of the new field.


    /// <summary>
    /// The robots handler.
    /// </summary>
    public class RobotsHandler : IHttpHandler
    {
        /// <summary>
        /// The rep.
        /// </summary>
        private readonly IContentLoader rep;

        /// <summary>
        /// The site rep.
        /// </summary>
        private readonly ISiteDefinitionRepository siteRep;

        /// <summary>
        /// Initializes a new instance of the <see cref="RobotsHandler"/> class.
        /// </summary>
        public RobotsHandler()
        {
            this.rep = ServiceLocator.Current.GetInstance<IContentLoader>();
            this.siteRep = ServiceLocator.Current.GetInstance<ISiteDefinitionRepository>();
        }

        /// <summary>
        /// Gets a value indicating whether another request can use this handler instance.
        /// </summary>
        public bool IsReusable => false;

        /// <summary>
        /// Processes the request for robots.txt.
        /// </summary>
        /// <param name="context">The HTTP context.</param>
        public void ProcessRequest(HttpContext context)
        {
            var domainUri = context.Request.Url;
            var currentSite = this.siteRep.List().FirstOrDefault(x => this.verifyInHost(x.Hosts, domainUri.Authority));

            var robotsTxtContent = @"User-agent: *"
                                   + Environment.NewLine +
                                   "Disallow: /episerver";

            if (currentSite != null)
            {
                var startPage = this.rep.Get<PageData>(currentSite.StartPage);

                var robotsProperty = startPage.GetType().GetProperty("RobotsContent");
                var robotsFromField = robotsProperty?.GetValue(startPage) as string;

                // Generate robots.txt file
                if (!string.IsNullOrEmpty(robotsFromField))
                {
                    robotsTxtContent += Environment.NewLine + robotsFromField;
                }
            }

            // Set the response code, content type and appropriate robots file here
            // also think about handling caching, sending error codes etc.
            context.Response.StatusCode = 200;
            context.Response.ContentType = "text/plain";

            // Return the robots content
            context.Response.Write(robotsTxtContent);
            context.Response.End();
        }

        private bool verifyInHost(IList<HostDefinition> hosts, string authority)
        {
            foreach (var host in hosts)
            {
                if (host.Authority.Hostname == authority)
                {
                    return true;
                }
            }

            return false;
        }
    }
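The comment inside ProcessRequest mentions caching. As a minimal sketch, you could set standard output-cache headers on the response before writing it, so crawlers do not trigger a content load on every request (the one-hour duration is an arbitrary choice for illustration):

```csharp
// Hypothetical addition to ProcessRequest, before Response.Write:
// cache the generated robots.txt for an hour on clients and proxies.
context.Response.Cache.SetCacheability(HttpCacheability.Public);
context.Response.Cache.SetExpires(DateTime.UtcNow.AddHours(1));
context.Response.Cache.SetMaxAge(TimeSpan.FromHours(1));
```

Note that with client-side caching, editors may not see changes to the RobotsContent field reflected until the cache expires.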

If you are using Web API in your EPiServer project, it may be necessary to add an ignore route in the Global.asax file so that routing does not intercept requests before they reach the handler.


        protected override void RegisterRoutes(RouteCollection routes)
        {
            base.RegisterRoutes(routes);
            routes.Ignore("robots.txt");

            routes.MapRoute(
               "Default",
               "{controller}/{action}",
               new { action = "index" }
           );
        }

Finally, we will add the new handler to the web.config file in the system.webServer section.


  <system.webServer>   
    <handlers>
      <add name="Robots" verb="*" path="robots.txt"
           type="ExampleProject.Handlers.RobotsHandler, ExampleProject, Version=1.0.0.0, Culture=neutral"
           preCondition="managedHandler"/>
    </handlers>
  </system.webServer>   

Change the type attribute to the corresponding class and namespace in your project. And that is all: the next time a user requests http://www.site.com/robots.txt, the handler will execute and generate a response using the content of the field on the start page. I hope it will help someone, and as always, keep learning!
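To see how the pieces fit together: the handler always emits the hard-coded default rules and then appends the field value on a new line. With a field value of, say, `Disallow: /search`, the response body would be:

```
User-agent: *
Disallow: /episerver
Disallow: /search
```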
