Web based family history software

Question: Noindex is blocking Google crawlers. How to allow indexing?

10 months 4 days ago #1 by MartinM
I have finally cleaned up my database, and would like Google to index my site.

My Google Search Console shows that over 33,000 pages are not indexed because they are excluded by a 'noindex' tag. Can someone tell me how to remove this tag?

Thank you

10 months 4 days ago #2 by fisharebest
I looked at a few pages on your site, and I see the correct index tags. e.g.
Code:
<meta name="robots" content="index,follow">

Note that webtrees sets noindex tags on pages that should not be indexed, such as the calendar, charts and reports.
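To check which robots directive a given page actually serves, the HTML can be parsed for its robots meta tag. The following is a minimal sketch using only the Python standard library; the function names are made up for illustration, and it only looks at the meta tag, not at any `X-Robots-Tag` HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # Split e.g. "index,follow" into ["index", "follow"].
            parts = attr.get("content", "").split(",")
            self.directives += [p.strip().lower() for p in parts if p.strip()]


def robots_directives(html):
    """Return the list of robots directives found in the HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def is_indexable(html):
    """Treat a page as indexable unless it declares noindex (or none)."""
    directives = robots_directives(html)
    return "noindex" not in directives and "none" not in directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="index,follow"></head></html>'
    print(robots_directives(page), is_indexable(page))
```

Feeding it the saved HTML of an individual page and of a chart page should show the difference described above.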

Can you give a specific example from the google console?

Greg Roach - greg@subaqua.co.uk - @fisharebest@phpc.social - fisharebest.webtrees.net

10 months 4 days ago #3 by MartinM
 

[PDF attachment: examples from Google Search Console]

10 months 3 days ago #4 by fisharebest
These examples are all for the charts on your site - and I would expect "noindex" for these.

The charts don't work with search-engines.

Firstly, the charts have forms with options (number of generations, etc.) and search engines can't fill these in.

Secondly, the charts/reports/etc. are expensive to generate, and largely repeat information from the individual/family/etc. pages. Repeated content will lower your google-ranking.

AFAICT, your site is working as designed.
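To see whether the excluded URLs in a Search Console export really are charts, reports and calendars, they can be grouped by a keyword in the path. This is only a rough sketch: the keyword list and the example URLs are assumptions made for illustration, not the actual webtrees URL scheme, so adjust them to what the export contains.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical keywords for page types where noindex would be expected.
EXPECTED_NOINDEX_KEYWORDS = ("chart", "report", "calendar")


def classify(url):
    """Return 'expected' if the path looks like a chart/report/calendar page."""
    path = urlparse(url).path.lower()
    if any(keyword in path for keyword in EXPECTED_NOINDEX_KEYWORDS):
        return "expected"
    return "check"


def summarise(urls):
    """Count how many excluded URLs fall into each class."""
    return Counter(classify(url) for url in urls)


if __name__ == "__main__":
    # Made-up URLs standing in for rows of a Search Console export.
    excluded = [
        "https://example.com/tree/demo/pedigree-chart/I1",
        "https://example.com/tree/demo/calendar/month",
        "https://example.com/tree/demo/individual/I1",
    ]
    print(summarise(excluded))
```

Anything that lands in the "check" group is worth looking at by hand; the rest matches the behaviour described above.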


10 months 3 days ago #5 by MartinM
Thank you for that explanation. I’m still pretty sure something isn’t quite right since I have over 25K individuals in that database, and google is indexing only 700 pages.

10 months 3 days ago #6 by photon flip
For some reason I can't read your PDF to see whether it only covers the noindex numbers, but I think the more revealing numbers in Google Search Console are those for "Crawled - currently not indexed" and "Discovered - currently not indexed". Google's explanation for these is less than helpful:
Crawled - currently not indexed: The page was crawled by Google but not indexed. It may or may not be indexed in the future; no need to resubmit this URL for crawling.
Discovered - currently not indexed: The page was found by Google, but not crawled yet. Typically, Google wanted to crawl the URL but this was expected to overload the site; therefore Google rescheduled the crawl. This is why the last crawl date is empty on the report.

I've attached a screenshot of my site's indexing. I have about 1800 individuals in my tree. The numbers haven't changed much over the years, which makes me doubt Google's explanations. But the large number of noindex pages is understandable given @fisharebest's information.

  

3 months 3 weeks ago #7 by Floki
Hello everyone,

I'm in a similar situation to @photon flip.

After setting up my webtrees behind an Nginx proxy, including
set_real_ip_from 10.0.0.0/8;
real_ip_header X-Forwarded-For;

and
trusted_headers="x-forwarded-for"

my site stammbaum.sttzr.selfhost.bz/ was also found via Google!
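For anyone unsure what the two Nginx directives above do: the client address is only replaced by the `X-Forwarded-For` value when the request arrives from a trusted address. The sketch below mimics that in Python, assuming the non-recursive behaviour (the last address in the header is used); the function name and the sample addresses are made up for illustration.

```python
import ipaddress

# Mirrors "set_real_ip_from 10.0.0.0/8;" from the configuration above.
TRUSTED_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]


def client_ip(remote_addr, forwarded_for, trusted=TRUSTED_NETWORKS):
    """Return the address to treat as the client.

    The X-Forwarded-For header is only honoured when the direct peer
    (remote_addr) belongs to a trusted network; otherwise it is ignored.
    """
    peer = ipaddress.ip_address(remote_addr)
    if not any(peer in network for network in trusted):
        return remote_addr
    hops = [h.strip() for h in (forwarded_for or "").split(",") if h.strip()]
    # Non-recursive lookup: take the last address listed in the header.
    return hops[-1] if hops else remote_addr


if __name__ == "__main__":
    print(client_ip("10.0.0.5", "203.0.113.7"))      # request came via the proxy
    print(client_ip("198.51.100.9", "203.0.113.7"))  # untrusted peer, header ignored
```

If the proxy's own address is not inside the trusted range, every visitor (including crawlers) appears to come from the proxy.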

But I haven't been found in Google for about 2 months! I can be found via Bing! According to Google Search Console, 5600 pages are crawled but not indexed and 1500 pages are discovered but not indexed (see the picture in the attachment).

The sitemap and robots.txt should be OK?
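One way to rule out robots.txt is to test a few URLs against it with Python's standard `urllib.robotparser`. This is a minimal sketch: the robots.txt content and the URLs below are made-up examples, not the file that webtrees generates, so paste in the real file from the site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /app/
"""


def allowed(robots_txt, url, agent="Googlebot"):
    """Return True if `agent` may fetch `url` according to `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/tree/demo/individual/I1",
        "https://example.com/app/config",
    ):
        print(url, allowed(SAMPLE_ROBOTS_TXT, url))
```

If the pages that matter come back as allowed, robots.txt is unlikely to be the reason they are not indexed.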

I urgently need a tip and support!


Thank you in advance
Attachment: Screenshot2024-12-29141736.png
Last edit: 3 months 3 weeks ago by Floki.

2 months 3 weeks ago #8 by Floki
At the moment I haven't managed to get my page displayed again in Google search, or crawled by Google.
Now I think it's even worse: my main page

is reported with error: redirection error.

According to Google, the sitemaps are fine.

My website is behind an Nginx proxy and running on a Docker instance.

Before I updated my Docker instance to webtrees 2.2, my pages were listed on Google - now they are not.

Does anyone here have an idea? I urgently need help.

My site: stammbaum.sttzr.selfhost.bz/

Thank you
2 months 3 weeks ago #9 by fisharebest
Your site can contain many trees.

One of these trees will be the default. (If you only have one, then it will be the default).

The homepage of the "site" (URL "/") will redirect to the homepage of the default tree (URL "/tree/stuetzer").

This is normal. This is not an error.

The screenshot looks OK to me.
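To tell a normal single redirect like this apart from a loop or an over-long chain, the hops can be traced one by one. The sketch below is illustrative only: `trace_redirects`, `table_fetch` and the example URLs are made up, and the `fetch` callable stands in for a real HTTP request that returns the status code and the `Location` header without following it.

```python
from urllib.parse import urljoin

REDIRECT_CODES = (301, 302, 303, 307, 308)


def trace_redirects(start, fetch, max_hops=10):
    """Follow redirects from `start` and return (chain, verdict).

    `fetch(url)` must return a (status, location) tuple.
    """
    chain = [start]
    url = start
    for _ in range(max_hops):
        status, location = fetch(url)
        if status not in REDIRECT_CODES:
            return chain, "ok"
        if not location:
            return chain, "empty redirect target"
        url = urljoin(url, location)  # resolve relative Location values
        if url in chain:
            return chain + [url], "loop"
        chain.append(url)
    return chain, "too many hops"


def table_fetch(table):
    """Build a fake fetch from a dict; unknown URLs answer 200."""
    return lambda url: table.get(url, (200, None))


if __name__ == "__main__":
    # One redirect from the site root to a (hypothetical) default tree.
    normal = table_fetch({"https://example.com/": (302, "/tree/demo")})
    print(trace_redirects("https://example.com/", normal))

    # An http/https ping-pong, as can happen behind a misconfigured proxy.
    loop = table_fetch({
        "http://example.com/": (301, "https://example.com/"),
        "https://example.com/": (301, "http://example.com/"),
    })
    print(trace_redirects("http://example.com/", loop))
```

A single hop ending in a 200 is the normal case described above; a loop or a long chain would be worth investigating in the proxy configuration.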


2 months 3 weeks ago #10 by Franz Frese

Floki wrote:
At the moment I haven't managed to get my page displayed again in Google search, or crawled by Google.
Now I think it's even worse: my main page
...
Does anyone here have an idea? I urgently need help.

My site: stammbaum.sttzr.selfhost.bz/
...

Most of your pages don't contain any useful content without logging in. If I were Google, I would ignore them too.

2 months 3 weeks ago #11 by Floki
First of all, thank you for taking the time for me!

I've now turned off data protection. Apart from that, doesn't my website look like any other webtrees site?

So you don't think that it could be due to the settings for the Nginx proxy and Apache web server?

(Any incorrect or missing rewrite settings after the Docker update to 2.2?)
stammbaum.sttzr.selfhost.bz/
