Web based family history software

This Help forum is for issues related to the latest release (1.7.7). For issues related to the beta or GitHub versions, please use their own Help forums.
Before asking for help, please read "How to request help" by clicking on that tab above.

[SOLVED] Robots nofollow

7 years 6 months ago #1 by Luenissla
Robots nofollow was created by Luenissla
Hello everyone,

how can I change the parameter '<meta name="robots" content="index,follow">' to '<meta name="robots" content="index,nofollow">', ideally separately for each file?

Best regards / Viele Grüße
Hans-Joachim (Lünenschloß)

7 years 6 months ago #2 by bertkoor
Replied by bertkoor on topic Robots nofollow
May I ask on which specific pages you want to set this? If you don't allow a robot to follow any further links, it cannot discover your whole tree of persons and families. In practice, the most interesting contacts I have had with people who found my site through a search engine came through far-away corners of my tree that I never thought might interest anyone else.

If you just want search engines not to visit specific record types such as Sources, Media or Notes, it is better to add those pages to your robots.txt.
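As a rough illustration of the robots.txt approach, a file along these lines would ask well-behaved robots to stay away from source, media and note pages. It is only a sketch under assumptions: it supposes webtrees 1.7 is installed in the web root and that those record pages are served by scripts named source.php, mediaviewer.php and note.php, so check the paths against your own installation.

```
# Assumed paths: webtrees installed at the site root (adjust if it lives in a subfolder)
User-agent: *
Disallow: /source.php
Disallow: /mediaviewer.php
Disallow: /note.php
```

Rules like these are only a request. A robot that ignores robots.txt can still fetch the pages.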

Anyway, by default all page controllers (inheriting from PageController) are initialized with 'noindex,nofollow'. See app\Controller\PageController.php:
Code:
/** @var string Most pages are not intended for robots */
private $meta_robots = 'noindex,nofollow';

But it also has a function setMetaRobots(). This function is called from two places.
First one is obviously Index.php:
Code:
$controller
    // ...
    ->setMetaRobots('index,follow')

Change that to "nofollow" and you have locked (well-behaved) robots out of your whole site. I don't think that is what you want.

Second place where setMetaRobots is called is in app\Controller\GedcomRecordController.php:
Code:
// We want robots to index this page
$this->setMetaRobots('index,follow');

So effectively only the pages that represent a GEDCOM record (person, family, media, note, source, repository) have the "follow" attribute; all the other pages have "nofollow".
What you could do is override the constructor of the specific page controller (an example can be found in app\Controller\IndividualController.php) and add:
Code:
$this->setMetaRobots('index,nofollow');
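To make the override concrete, here is a minimal, self-contained sketch. It does not use the real webtrees classes: PageController and GedcomRecordController below are cut-down stand-ins that only reproduce the robots handling quoted above, while MyRecordController and robotsTag() are hypothetical names invented for this illustration. The real constructors also take arguments, which are left out here.

```php
<?php
// Stand-in for webtrees' PageController, reduced to the robots handling
// quoted above. The real class does much more.
class PageController
{
    /** @var string Most pages are not intended for robots */
    private $meta_robots = 'noindex,nofollow';

    public function setMetaRobots($meta_robots)
    {
        $this->meta_robots = $meta_robots;

        return $this; // allows chained calls
    }

    // Hypothetical helper for this sketch: renders the resulting meta tag.
    public function robotsTag()
    {
        return '<meta name="robots" content="' . $this->meta_robots . '">';
    }
}

// Stand-in for GedcomRecordController: record pages default to index,follow.
class GedcomRecordController extends PageController
{
    public function __construct()
    {
        // We want robots to index this page
        $this->setMetaRobots('index,follow');
    }
}

// Hypothetical page controller with the overridden constructor.
class MyRecordController extends GedcomRecordController
{
    public function __construct()
    {
        parent::__construct();
        // Index this page, but do not follow its links.
        $this->setMetaRobots('index,nofollow');
    }
}

$page = new MyRecordController();
echo $page->robotsTag(), PHP_EOL;
// prints: <meta name="robots" content="index,nofollow">
```

The call to setMetaRobots() comes after parent::__construct() because the parent sets 'index,follow' and the last call wins.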

stamboom.BertKoor.nl runs on webtrees v1.7.13

7 years 6 months ago #3 by Luenissla
Replied by Luenissla on topic Robots nofollow
Hi Bert,
thanks for the detailed description. Data protection rules in Germany are strict. The Bergischer Datenpool contains not only my own data, so I have to consider the wishes of the other participants. Therefore, it is sufficient if only the first page of each file can be indexed. ;-)
I'll try your examples.

Best regards / Viele Grüße
Hans-Joachim (Lünenschloß)

7 years 6 months ago #4 by Luenissla
Replied by Luenissla on topic [SOLVED] Robots nofollow
Problem solved.

Best regards / Viele Grüße
Hans-Joachim (Lünenschloß)

7 years 6 months ago - 7 years 6 months ago #5 by bertkoor
Replied by bertkoor on topic [SOLVED] Robots nofollow

Luenissla wrote: Problem solved.

Sorry to say, but I doubt it really is, because I think changing the robots metadata does little for the problem you are trying to solve. That is why we need to understand the root cause: why do you want to change this?

You say the reason is data and privacy protection. In that case I really doubt this change will do any good, because a robot is essentially no different from a human visitor. So look first at what normal visitors can see on your site. If that is within the law, you do not need to do more. Put another way: if Google cannot reach a certain private page but a normal visitor can, then you are still breaking that law.

The starting point is the privacy rules you can set up per tree in the control panel. I assume Dutch privacy laws are similar to the German ones: I may not publish information about living people without their explicit agreement. This is the default behaviour in webtrees, and you can tweak it to taste. The privacy settings are quite flexible. If you have specific demands, we can look at them. But changing the robots metadata is not the way.

Last edit: 7 years 6 months ago by bertkoor.

7 years 6 months ago - 7 years 6 months ago #6 by Luenissla
Replied by Luenissla on topic [SOLVED] Robots nofollow
Hello Bert,

bertkoor wrote: Put another way: if Google cannot reach a certain private page but a normal visitor can, then you are still breaking that law.


I cannot see which law I would be breaking if I do not give Google the same access as a normal human visitor. Could you please explain this?

My problem is that the trees I have in the Datenpool are not only my own trees but also those of other genealogists.

In Germany, genealogists are very sensitive about their data. If I want them to contribute their data to the data pool, I need to protect it better than the privacy settings perhaps provide.

So it is enough for me if a search robot like Google can read and index the first page of a tree, and that is it: no following of links, and no indexing of any media object.
And this will break no law. ;-)

Best regards / Viele Grüße
Hans-Joachim (Lünenschloß)
Last edit: 7 years 6 months ago by Luenissla. Reason: mistake

7 years 6 months ago #7 by fisharebest
Replied by fisharebest on topic [SOLVED] Robots nofollow
> if I do not give Google the same access as a normal human visitor

It is not possible to allow a human visitor to see a page and hide it from a search engine.

It is not possible for a website to know who (or what) is requesting a page.

If a human can see a page, a search engine can see it.

Some search engines will obey a robots.txt file, and not show some pages in their indexes.

Other search engines ignore robots.txt.

Greg Roach - greg@subaqua.co.uk - @fisharebest@phpc.social - fisharebest.webtrees.net

7 years 6 months ago #8 by Luenissla
Replied by Luenissla on topic [SOLVED] Robots nofollow

fisharebest wrote: > if I do not give Google the same access as a normal human visitor

It is not possible to allow a human visitor to see a page and hide it from a search engine.


This is correct for "normal" sites, but webtrees uses visitor rules.

fisharebest wrote: It is not possible for a website to know who (or what) is requesting a page.

This is also correct if the visitor does not provide any information.

fisharebest wrote: If a human can see a page, a search engine can see it.

I think that is why webtrees uses the "Website access rules"?

fisharebest wrote: Some search engines will obey a robots.txt file, and not show some pages in their indexes.
Other search engines ignore robots.txt.

That is unfortunately the case!

Best regards / Viele Grüße
Hans-Joachim (Lünenschloß)

7 years 6 months ago #9 by eh215
Replied by eh215 on topic [SOLVED] Robots nofollow
Hans -

By design, webtrees tries to make appropriate information available to visitors (and by extension, web indexing). There have been a number of threads about creating totally private trees, which would keep your information totally private, but at a cost.

I assume you've investigated the various privacy options on the Control Panel --> Family Trees --> Manage Family Trees but depending on the number and nature of the facts you are looking to keep private, had you considered the alternatives of:
  • Adding a "show to members" restriction on the individual (name) or facts/events you wish to hide (might be a manageable effort if there is a small number of them), or
  • Adding a "show to members" restriction to the Source or Repository that is part of the facts/events you wish to hide (would work if your site shows sources or repositories to visitors), or
  • Creating a Shared Note with a "show to members" restriction and adding that note to the facts or events or individuals you wish to hide (this has the advantage of more readily indexing the "hidden" information for you so you more readily know what has been made more private)

Eric
