[SOLVED] Robots nofollow
- Luenissla
- Topic Author
- Junior Member
Hello,
how can I change the tag '<meta name="robots" content="index,follow">' to '<meta name="robots" content="index,nofollow">', ideally separately for each file?
Best regards / Viele Grüße
Hans-Joachim (Lünenschloß)
- bertkoor
- Platinum Member
- Greetings from Utrecht, Holland
If you just want search engines to stay away from specific record types such as Sources, Media or Notes, it is better to add those pages to your robots.txt.
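As a rough illustration of the robots.txt suggestion, the sketch below uses Python's standard urllib.robotparser to check how a well-behaved crawler would read a set of Disallow rules. The script names (source.php, mediaviewer.php, note.php), the host example.org and the bot name "AnyBot" are assumptions made up for the example; check which URLs your own installation actually serves before copying any rule.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. The script names are assumptions for
# illustration and must be checked against your own site's URLs.
ROBOTS_TXT = """\
User-agent: *
Disallow: /source.php
Disallow: /mediaviewer.php
Disallow: /note.php
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, url: str, agent: str = "AnyBot") -> bool:
    """Return True if a crawler that honours robots.txt may fetch the URL."""
    return parser.can_fetch(agent, url)


parser = build_parser(ROBOTS_TXT)
BASE = "https://example.org"  # placeholder host

for path in ("/source.php?sid=S1", "/note.php?nid=N1", "/individual.php?pid=I1"):
    print(path, "allowed" if is_allowed(parser, BASE + path) else "blocked")
```

The rules match by URL prefix, so one Disallow line covers every record served by that script. They only affect crawlers that choose to honour robots.txt.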
Anyway, by default all page controllers (inheriting from PageController) are initialized. See app\Controller\PageController.php:
But it also has a function setMetaRobots(). This function is called from two places.
First one is obviously Index.php:
Change that to "nofollow" and you have locked (well-behaved) robots out of your whole site. I think that's not what you want.
Second place where setMetaRobots is called is in app\Controller\GedcomRecordController.php:
So effectively only the pages that represent a GEDCOM record (person, family, media, note, source, repository) have the "follow" attribute, all the other pages have "nofollow".
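To make the effect of "follow" versus "nofollow" concrete, here is a minimal sketch of how a well-behaved crawler might read the robots meta tag. It is an illustration only, not webtrees code; the class and function names and the sample HTML are made up for the example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma separated; whitespace and case do not matter.
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html: str) -> set:
    """Return the set of robots directives declared in the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def may_index(directives: set) -> bool:
    """The page itself may appear in the index."""
    return "noindex" not in directives and "none" not in directives


def may_follow(directives: set) -> bool:
    """Links on the page may be followed to discover further pages."""
    return "nofollow" not in directives and "none" not in directives


page = '<html><head><meta name="robots" content="index,nofollow"></head></html>'
found = robots_directives(page)
print(may_index(found), may_follow(found))
```

With "index,nofollow" the page itself may be listed, but its links are not used to discover further pages. Like robots.txt, the tag is only a request that a crawler is free to ignore.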
What you could do is override the constructor of the specific page controller (an example can be found in app\Controller\IndividualController.php) and add:
stamboom.BertKoor.nl runs on webtrees v1.7.13
- Luenissla
- Topic Author
- Junior Member
Hi Bert,
thanks for the detailed description. Data protection rules in Germany are strict. The Bergischen Datenpool holds not only my own data, so I have to respect the wishes of the other participants. It is therefore sufficient if only the first page of each file can be indexed.
I'll try your examples.
Best regards / Viele Grüße
Hans-Joachim (Lünenschloß)
- Luenissla
- Topic Author
- Junior Member
Best regards / Viele Grüße
Hans-Joachim (Lünenschloß)
- bertkoor
- Platinum Member
- Greetings from Utrecht, Holland
Luenissla wrote: Problem solved.
Sorry to say, I doubt it really is, because I think changing the robots metadata does little for the problem you are trying to solve. That is why I / we need to understand the root cause of why you want to change this.
You say the reason is data and privacy protection. Then I really doubt this change will do any good.
Because essentially a robot is no different from a human visitor. So look first at what normal visitors can see on your site. If this is within the law, you do not need to do more. Put another way: if Google cannot reach a certain private page but a normal visitor can, then you are still breaking that law.
The first starting point is the privacy rules you can set up in the control panel per tree. I assume Dutch privacy laws are similar to German ones. I may not publish information about living people without their explicit agreement. This is the default behaviour for webtrees, and you can tweak it to taste. The privacy settings are quite flexible. If you have specific demands, we can look at them. But changing the robots metadata is not the way.
- Luenissla
- Topic Author
- Junior Member
bertkoor wrote: Put it another way: if google can not reach a certain private page but a normal visitor can, then you are still breaking that law.
I cannot see which law I would break by not giving Google the same access as a normal human visitor. Could you please explain this?
My problem is that the trees I have in the Datenpool are not only my own trees but also those of other genealogists.
In Germany, genealogists are very sensitive about their data. If I want them to contribute their data to the data pool, I need to protect it better than the privacy settings may provide.
So it is enough for me if a search robot like Google can read, and index, the first page of a tree and that is it, with no follow. And no indexing of any media object.
And this breaks no law.
Best regards / Viele Grüße
Hans-Joachim (Lünenschloß)
- fisharebest
- Administrator
It is not possible to allow a human visitor to see a page and hide it from a search engine.
It is not possible for a website to know who (or what) is requesting a page.
If a human can see a page, a search engine can see it.
Some search engines will obey a robots.txt file, and not show some pages in their indexes.
Other search engines ignore robots.txt.
Greg Roach - greg@subaqua.co.uk - @fisharebest@phpc.social - fisharebest.webtrees.net
- Luenissla
- Topic Author
- Junior Member
fisharebest wrote: > if I do not allow google not the same as a normal human visitor
It is not possible to allow a human visitor to see a page and hide it from a search engine.
This is correct for "normal" sites, but webtrees uses visitor rules.
This is also correct if the visitor does not bring any information.
fisharebest wrote: It is not possible for a website to know who (or what) is requesting a page.
I think that is why webtrees uses the "Website access rules"?
fisharebest wrote: If a human can see a page, a search engine can see it.
That is unfortunately the case!
fisharebest wrote: Some search engines will obey a robots.txt file, and not show some pages in their indexes.
Other search engines ignore robots.txt.
Best regards / Viele Grüße
Hans-Joachim (Lünenschloß)
- eh215
- Senior Member
By design, webtrees tries to make appropriate information available to visitors (and by extension, web indexing). There have been a number of threads about creating totally private trees, which would keep your information totally private, but at a cost.
I assume you've investigated the various privacy options under Control Panel --> Family Trees --> Manage Family Trees. Depending on the number and nature of the facts you want to keep private, have you considered these alternatives:
- Adding a "show to members" restriction on the individual (name) or facts/events you wish to hide (might be a manageable effort if there is a small number of them), or
- Adding a "show to members" restriction to the Source or Repository that is part of the facts/events you wish to hide (would work if your site shows sources or repositories to visitors), or
- Creating a Shared Note with a "show to members" restriction and adding that note to the facts, events or individuals you wish to hide (this has the advantage of indexing the "hidden" information for you, so you can more readily see what has been made more private)
Eric