"robots.txt" on Webserver

Discussion about security topics in WinCC OA!

name024
Posts: 1
Joined: Fri Oct 25, 2019 12:43 pm

"robots.txt" on Webserver

Post by name024 »

I tried adding a "robots.txt" file to the project directory so that the Webserver would publish it. However, the Webserver does not seem to serve files from the project's root folder at the Webserver's root path.
Does anyone know where I have to place "robots.txt" so that it is published at "https://localhost/robots.txt"?

Reason: This file is read by web crawlers (such as the one Google uses) and discourages them from including the (WinCC OA) system in search results if the system is mistakenly reachable from the Internet.

Further information:
https://wiki.selfhtml.org/wiki/Grundlagen/Robots.txt

Some other open-source IoT software already implements this. For example (search for robots.txt):
https://github.com/jens-maus/RaspberryM ... %A4nkungen

Example content of the file:
<<<--------------------------------------------------------->>>
# Do not allow any search bot to crawl anything (if it honors this file)
User-agent: *
Disallow: /
<<<--------------------------------------------------------->>>
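
To double-check how a well-behaved crawler would read the example above, here is a minimal sketch in Python using the standard library's `urllib.robotparser`. The helper name `is_blocked` and the bot names are made up for illustration; nothing here is WinCC OA specific.

```python
from urllib.robotparser import RobotFileParser

# Same rules as the example file above.
ROBOTS_TXT = """\
# Do not allow any search bot to crawl anything (if it honors this file)
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if a compliant crawler may NOT fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Hypothetical bot names and paths, only to exercise the rules.
    for agent in ("Googlebot", "SomeOtherBot"):
        print(agent, "blocked:", is_blocked(ROBOTS_TXT, agent, "/"))
```

This only shows what compliant crawlers will do. A crawler that chooses to ignore robots.txt is not stopped by it.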


PS: This should be an easy feature to implement in future WinCC OA versions/patches, and it would increase security (or prevent bad publicity :D )

kilianvp
Posts: 329
Joined: Fri Jan 16, 2015 10:29 am

Re: "robots.txt" on Webserver

Post by kilianvp »

You can modify webclient_http.ctl and add this:

Code:

    httpConnect("getRobots", "/robots.txt", "text/plain");
Place this line above the existing httpConnect("getIndex", "/download"); call.

Then add the getRobots function:

Code:

dyn_string getRobots(dyn_string names, dyn_string values, string user, string ip,
                    dyn_string headerNames, dyn_string headerValues)
{
  return makeDynString("User-agent: *\nDisallow: /", "Status: 200 OK");
}

Note, however, that search engines like Shodan will ignore robots.txt.

gschijndel
Posts: 225
Joined: Tue Jan 15, 2019 3:12 pm

Re: "robots.txt" on Webserver

Post by gschijndel »

Instead of returning a fixed value, you could also create a generic function like this one that returns the actual contents of the file
(I have no idea why ETM has not already done this for the pem files):

Code:

dyn_string readFile(const dyn_string &names, const dyn_string &values, const string &user, const string &ip,
                    const dyn_string &headerNames, const dyn_string &headerValues, int connIdx)
{
  string uri = httpGetURI(connIdx);

  return getFile(names, values, user, ip, headerNames, headerValues, baseName(uri), dirName(strltrim(uri, "/")));
}
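
To make the URI handling in readFile easier to follow, here is a rough Python analogy of how the request URI is split into a file name and a directory. It uses `posixpath.basename` and `posixpath.dirname` as stand-ins for the CTRL functions baseName and dirName; that the CTRL functions return exactly the same strings is an assumption, and `split_uri` and the example paths are hypothetical.

```python
import posixpath


def split_uri(uri: str) -> tuple[str, str]:
    """Split a request URI into (file name, relative directory).

    Mirrors the idea of baseName(uri) and dirName(strltrim(uri, "/"))
    from the CTRL snippet; the exact CTRL return values may differ.
    """
    relative = uri.lstrip("/")  # drop the leading slash(es)
    return posixpath.basename(relative), posixpath.dirname(relative)


if __name__ == "__main__":
    # Hypothetical URIs, only for illustration.
    for uri in ("/robots.txt", "/data/certs/root.pem"):
        name, directory = split_uri(uri)
        print(f"{uri!r} -> file {name!r}, directory {directory!r}")
```

With this split, a request for "/robots.txt" maps to the file name "robots.txt" with an empty directory part, which is the case the original question is about.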
