TOPIC:

Privacy Issues 8 years 7 months ago #1

  • StuartG
  • Topic Author
  • Junior Member
  • Posts: 150
Firstly, I would like to apologise for the length of this post. I have tried to be as succinct as possible, whilst at the same time making it understandable for all users and not just the development team. I would like discussion on this topic to be as wide as possible.

I am very concerned about the ease with which living people (private records) can be made viewable to casual visitors.

I raised a bug report at No. 1168445

Also a similar bug report was raised at No. 920393

Both of these bug reports have been categorised as "Invalid" by the development team.

For a little more on this subject see webtrees wiki at Data entry and privacy

As can be seen, the development team consider having private records becoming public as not a bug, but simply a consequence of incorrect data entry. This is unreasonable as it breaks the prime functionality of privacy for living people.

The 'display' of individuals calls on a function in the code called "is dead". I have had the functionality of this code explained to me and there is probably a valid and perhaps historical reason for it to have been coded the way it is. Fisharebest has stated - "The privacy calculations do NOT look for reasons why someone is alive. They look for reasons why someone is dead."

As I understand it, "is dead" carries out a series of tests. Fisharebest has stated "The privacy calculations just look for *any* dates. They don't care if they belong to a birth, death, probate, etc." He continues that if "is dead" finds a date for any event for that person's spouse, children, grandchildren, parents or grandparents that is 45 years beyond the parameter set by the site administrator for "Age at which to assume a person is dead" (DeathDate), then "is dead" overrides the DeathDate setting and considers the person to be dead and therefore he/she is publicly displayed, despite the fact they are clearly alive and should not be shown.

It is my view that when there is no 1 DEAT tag set, "is dead" should next look for a birth date and if it is within the (DeathDate) parameter set, it should stop there, not carry out any further tests and not display the individual. Currently, "is dead" would continue on with its series of tests. To me this is unacceptable.

For privacy purposes, the "is dead" tests should concentrate on hiding people rather than on showing them, which is the reverse of the current implementation. One consequence of this is that many people may not be displayed if they have no 1 DEAT event set. For example, if a child is born to parents who lived and died, say, 200 years ago, and no birth or death details are known and/or added to the database, then "is dead" should determine that this child is alive and should not be displayed. It should not look at the data of other family members in an effort to determine that the child is dead and therefore should be displayed.

As a result of this suggested change, consideration needs to be given to moving most of the "is dead" functionality out of its current location, and I suggest it be moved into the Gedcheck function. As Fisharebest has stated, "performance is critical here", so moving a substantial amount of this code out into the Gedcheck function should improve general performance.

Gedcheck, using a portion of the "is dead" function code, should highlight and display in the error log those individuals whom "is dead" has determined are likely to be dead and should no longer be private, but who do not have a 1 DEAT tag set and are therefore hidden. Administrators could then take the appropriate action by adding the 1 DEAT Y tag, if that was appropriate, to make them public.
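
As a rough illustration of the kind of report described above, here is a minimal Python sketch (webtrees itself is written in PHP, and this is not its code). The function name `missing_death_candidates` and the dictionary of raw GEDCOM records keyed by ID are hypothetical, and for brevity the sketch only looks at each individual's own dates, not those of relatives.

```python
import re

# A level-1 DEAT tag anywhere in the record.
DEAT_TAG = re.compile(r"^1 DEAT\b", re.MULTILINE)
# The first four-digit year on any level-2 DATE line.
DATE_YEAR = re.compile(r"^2 DATE .*?\b(\d{4})\b", re.MULTILINE)


def missing_death_candidates(records, current_year, max_alive_age):
    """Return IDs of records that have no 1 DEAT tag but whose earliest
    recorded date is more than max_alive_age years old."""
    candidates = []
    for xref, record in records.items():
        if DEAT_TAG.search(record):
            continue  # death already recorded, nothing to report
        years = [int(year) for year in DATE_YEAR.findall(record)]
        if years and current_year - min(years) > max_alive_age:
            candidates.append(xref)
    return candidates
```

An administrator would review each reported ID and decide whether adding 1 DEAT Y is appropriate; the sketch itself changes nothing.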

It appears that the Batch update function to 'Add missing death records' uses the "is dead" function. But as Batch update only displays the Gedcom data for each individual, it is not readily apparent whether the person should have the 1 DEAT tag added. The individual record needs to be viewed to check what action is required. Improving this functionality is a separate issue and need not be discussed here.
Stuart
webtrees 2.0.16
⚶ Vesta Modules
PHP 7.3.7
Mysqli

Privacy Issues 8 years 7 months ago #2

  • fisharebest
  • Administrator
  • Posts: 15064
Firstly, I just want to be clear that marking the bug as "Invalid" means simply that the code is working as designed - not that your comments are invalid.

If I understand you correctly, the issue here is simply one of conflicting data. For example, a person born in 1950 and their parent born in 1750.

StuartG wrote:

considers the person to be dead and therefore he/she is publicly displayed, despite the fact they are clearly alive and should not be shown.


Clearly alive obviously depends on which of the two dates is in error. If the parent's birth date is correct, they are clearly dead.

For background, here are a few notes about the design of this function.

We just look for any dates in the record. e.g. we look for all occurrences of "2 DATE xxxxx" within the GEDCOM record. We don't try to match these with specific events. This would require a lot more memory, plus we'd also need to attach meaning to different events. For example, some can occur a long time after death, such as PROB and CHAN.

The only inference we can make is that the person was born *before* all the dates found in their record.

This is why we look for the oldest possible dates, and use these to infer that an individual is definitely dead, rather than possibly alive.
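
The date scan described above can be sketched roughly as follows. This is a minimal Python illustration, not the actual webtrees code (which is PHP); the name `earliest_year` and the sample record are hypothetical. It collects the year from every "2 DATE" line without caring which event the date belongs to, and returns the earliest one.

```python
import re

# Any level-2 DATE line; captures the date text that follows the tag.
DATE_LINE = re.compile(r"^2 DATE (.+)$", re.MULTILINE)
# A four-digit year anywhere in a date value such as "ABT 12 MAR 1850".
YEAR = re.compile(r"\b(\d{4})\b")


def earliest_year(gedcom_record):
    """Return the earliest year found on any "2 DATE" line, or None."""
    years = []
    for date_text in DATE_LINE.findall(gedcom_record):
        match = YEAR.search(date_text)
        if match:
            years.append(int(match.group(1)))
    return min(years) if years else None


# Hypothetical record: the dates belong to BIRT, PROB and CHAN,
# but the scan never looks at the level-1 tags.
SAMPLE_RECORD = """0 @I1@ INDI
1 NAME John /Smith/
1 BIRT
2 DATE 12 MAR 1850
1 PROB
2 DATE ABT 1921
1 CHAN
2 DATE 01 JAN 2013
"""
```

Because the person was born no later than the earliest date in the record, an earliest year that is older than the maximum-age setting is enough to treat the individual as dead.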

Provided your data is correct, this algorithm is fast, efficient, and accurate.

But I agree that when you have incorrect/conflicting data, then this can wrongly consider someone to be dead.

StuartG wrote:

should next look for a birth date


The difficulty is that to look for a birth date means that we'd need to break the record down into facts (which would require significantly more memory) before we could search for BIRT records (and presumably BAPM/CHR if no BIRT is present).

It would be a hefty increase in memory usage - and many webtrees sites are already bumping up against their hosting provider's limits. We need to look at ways to reduce the memory footprint, not increase it.

It is obviously a trade-off, but based on previous changes (where refactoring code has increased memory usage), such a change would be very unpopular.

This is why I suggested an alternative solution - adding a check for such inconsistencies. Running it over the full site is impractical as larger sites are unable to load all individuals into memory at one time. Maybe we could add it to the individual's page. e.g. a warning that dates for this individual are incompatible with dates for their grandchild/parent.

Or, if there are any other suggestions....
Greg Roach - fisharebest.webtrees.net

Privacy Issues 8 years 7 months ago #3

  • ToyGuy
  • Moderator
  • Live like it's Christmas every day - Santa Stephen
  • Posts: 4926
I agree we should err on the side of privacy protection, maybe leaving someone presumed alive if immediate family dates do not indicate a discernible date to use for birth/death. However, I don't want any popups or idiot checks to interfere with the data entry process. We already add warnings about obviously bad dates, but maybe we should add a flag on every INDI page where the IS DEAD calculation was used to assume an INDI is dead.

Unfortunately, such a flag will not be seen in many cases of data entry, since any user might be modifying or adding dates to the parents or grandparents of an individual, make an innocent mistake in data entry, and so cause an entire line of descendants to change their IS DEAD status. The review changes procedure would not likely expose this error even if the admin was carefully reviewing the new data.
Santa Stephen the Fabled Santa
Latest webtrees at MyArnolds.com
Hosted by webtreesonline.com , a division of GeneHosts LLC
MacOS 10.6.8, Apache 2.2+, PHP 5.4.16, MySQL 5.5.28

Privacy Issues 8 years 7 months ago #4

  • norwegian_sardines
  • Platinum Member
  • Posts: 2208
As part of the validation/acceptance of data that an administrator should be doing, a "date reality check" for the accepted person could be made available to the admin. This would allow the process of accepting a new/updated individual or family to take a look at closely associated individuals for valid date ranges. I would expect this to be rather fast for each individual or family added, but it could also be a checkbox selected when the accept is made.

I don't like the idea of looking at BAPM (vs birth date) as the sole "is alive" indicator. People do, on occasion, get baptised just before death or late in life and this could set the calculation off the deep end.
Ken

Privacy Issues 8 years 7 months ago #5

  • fisharebest
  • Administrator
  • Posts: 15064
AFAICT, there are just 4 cases where a data-entry error can cause a living person to be wrongly identified as dead.

Assuming the default setting of "max-alive-age" = 120 years, these are:

1) a parent with any event more than 120+45 years ago (before 1848)

2) a grandparent with any event more than 120+90 years ago (before 1803)

3) a spouse with any event more than 120+40 years ago (before 1853)

4) a family with any event (e.g. marriage) more than 120-10 years ago (before 1903)
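
The arithmetic behind those cut-off years can be checked with a short Python sketch. The offsets are the ones quoted in the list; the names `OFFSETS` and `cutoff_years` are hypothetical, and the current year of 2013 is an assumption inferred from the quoted years (for example 2013 - 165 = 1848), not something stated in the post.

```python
MAX_ALIVE_AGE = 120  # default "max-alive-age" quoted in the post

# Extra years added to (or subtracted from) max-alive-age per relation,
# as listed in the post.
OFFSETS = {
    "parent": 45,
    "grandparent": 90,
    "spouse": 40,
    "family": -10,
}


def cutoff_years(current_year, max_alive_age=MAX_ALIVE_AGE):
    """Return, per relation, the year before which any event on that
    relation's record would mark the individual as dead."""
    return {
        relation: current_year - (max_alive_age + offset)
        for relation, offset in OFFSETS.items()
    }
```

With a current year of 2013 this gives 1848, 1803, 1853 and 1903, matching the four cases listed.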
Greg Roach - fisharebest.webtrees.net

Privacy Issues 8 years 7 months ago #6

  • StuartG
  • Topic Author
  • Junior Member
  • Posts: 150
Thanks to all for your contributions.

Fisharebest has stated:

The difficulty is that to look for a birth date means that we'd need to break the record down into facts (which would require significantly more memory) before we could search for BIRT records (and presumably BAPM/CHR if no BIRT is present).


I certainly agree that memory is an issue. I have 256M of memory and I still run out. Therefore let me throw a radical thought into this discussion.

What about if "is dead" simply looks for the 1 DEAT tag and if it is not found then that person is considered alive and not displayed. This would substantially reduce the system resources required.

Along with this, what about considering removing the ability to set "Age at which to assume a person is dead" so that no assumptions are made. Responsibility for entering the 1 DEAT tag devolves on Users/Admins.

Then, as I proposed in my earlier posting, all of the other "is dead" and death assumption functionality be moved to the Gedcheck function to allow Admins to review the data.
Stuart

Privacy Issues 8 years 7 months ago #7

  • norwegian_sardines
  • Platinum Member
  • Posts: 2208
So you are saying that if an individual does not have a birth and death record they would be considered alive even if their parents and other close family members were born and died 200 years ago. This would then IIRC set the entire family alive (or at least any media and sources) so they would not be seen as well.
Ken

Privacy Issues 8 years 7 months ago #8

  • StuartG
  • Topic Author
  • Junior Member
  • Posts: 150
Yes, that would be one of the outcomes of what I have put out for discussion. I am essentially saying that adding a 1 DEAT tag is the responsibility of the User/Admin to overcome the outcome you raise and webtrees should not make any assumptions. webtrees could assist Users/Admins in the decision making process by moving the existing "is dead" functionality to Gedcheck.
Stuart

Privacy Issues 8 years 7 months ago #9

  • ToyGuy
  • Moderator
  • Live like it's Christmas every day - Santa Stephen
  • Posts: 4926

StuartG wrote:

What about considering removing the ability to set "Age at which to assume a person is dead" so that no assumptions are made. Responsibility for entering the 1 DEAT tag devolves on Users/Admins.

Sorry Stuart, but this and the related settings would create a situation that voids one of the biggest features webtrees offers - international death settings depending upon local laws. Unless I misunderstand, I could not support such a concept.

StuartG wrote:

Then, as I proposed in my earlier posting, all of the other "is dead" and death assumption functionality be moved to the Gedcheck function to allow Admins to review the data.

Not trying to be argumentative, but first - there is no Gedcheck function any longer. There does exist a "Check for errors" administrative function. It is important to use current terminology so we have everyone on the same page. Additionally, we already know that users don't use most of the tools and actions already available to them. Getting them to enter a correct 1 DEAT entry is no better than the quality of any other entry - generally not good. Also, I already review nearly 100 additions or modifications each day. I'm not sure I have the time or the knowledge to dig deeper into new data to know quickly whether someone should be labeled with a 1 DEAT. Using the review changes function does not easily allow for reviewing all the related records that might provide a clue.

We may be discussing a major change in code that would take valuable time away from completing the relationship function tool, and I'm not sure, given the plans for webtrees v1 whether we have enough time to mess with an issue created by data entry and not an actual bug.
Santa Stephen the Fabled Santa

Privacy Issues 8 years 7 months ago #10

  • StuartG
  • Topic Author
  • Junior Member
  • Posts: 150
ToyGuy wrote:

one of the biggest features webtrees offers - international death settings depending upon local laws


I'm not aware that webtrees has this feature. Is this a new feature in SVN? I do not think that removing the setting "Age at which to assume a person is dead" would lead to contravention of any local laws. In fact I think the reverse is true.

ToyGuy wrote:

There does exist a "Check for errors" administrative function. It is important to use current terminology


Sorry about the misuse of terminology.

ToyGuy wrote:

Getting them to enter a correct 1 DEAT entry is no better than the quality of any other entry - generally not good.


This is exactly my point. What I am discussing does not do anything to prevent or catch incorrect data entry. What it does do is to ensure that any incorrect data for one person does not impact on the display of another living person. The existing code not only assists in the incorrect display it does nothing to prevent it. It actually creates a virtual death event affecting a person that isn't being edited and who may not be immediately visible. Privacy is absolutely important and the webtrees code should not allow the display of living people in any circumstance.

Whether or not Users/Admins have the time or the inclination to check all of the entries that are being made in their tree is irrelevant. In these circumstances, all that can happen if the change I am discussing is implemented, is that some people, sources, media etc will also not be visible. Certainly no living people will be visible.

ToyGuy wrote:

We may be discussing a major change in code


I am advised that the change I am discussing could be as simple as deleting all of the relevant code in the "is dead" function after the death event check. Certainly no major change or re-write.

There would be more work to implement the discussed changes in the "Check for errors" function but these would not be required immediately. Without implementing them, the non-display of people, media, sources etc., would serve to highlight the fact that a 1 DEAT tag for some person is required to display these. I would not think that the transfer of the main "is dead" functionality to the "Check for errors" function would be overly complicated or time consuming. But I could well be wrong on this. But this change is not immediately important.

What is so absolutely important is that webtrees should not assist in displaying living people by creating virtual deaths based upon what could be false data. The rest is secondary.
Stuart

Privacy Issues 8 years 7 months ago #11

Interesting discussion.

The appropriateness of an algorithm that attributes the highest possible importance to a date solely on the basis that it is the smallest date seems rather questionable, especially when the date does not belong to the record whose state you are trying to determine, but to a related record.

I remain in favor of an algorithm, but not one that simply promotes the oldest available date to most significant date.

Privacy Issues 8 years 7 months ago #12

  • Jackie
  • Platinum Member
  • Posts: 4767
Hi,

I am not sure I understand all consequences of Stuart's proposition. Could someone tell me if I will lose the option : « Extend privacy to dead people » ?

Privacy Issues 8 years 7 months ago #13

  • bertkoor
  • Platinum Member
  • Greetings from Utrecht, Holland
  • Posts: 2503
Hi Jackie,

Jackie wrote:

I am not sure I understand all consequences of Stuart's proposition. Could someone tell me if I will lose the option : « Extend privacy to dead people » ?

Stuart's proposition is a crude simplification of the "isDead" algorithm, back to a very naive implementation. Not a step forward if you ask me. As I understand it, with his proposition it should still be possible to extend privacy to dead people born in the last xx years or died in the last xx years.

Stuart,

StuartG wrote:

I raised a bug report at No. 1168445

I also opened a topic about it quite recently. The conclusion I drew was that under normal circumstances the privacy calculation works quite well. In your case the cause turned out to be an error in the entered data, combined with the fact that the calculation doesn't work as straightforwardly as you'd naively think, based on the parameters you have (Age at which to assume a person is dead). But now we have a couple of topics discussing it, so future users faced with similar queries can find the background info if they want.

To spot data entry errors like this there's the Statistics page which can give clues about that, especially age of youngest/oldest father/mother. I try to keep an eye on those like once a month.

And I found another method using the Advanced Search: search for Date of Birth 1923 +/- 10 years, then click Death>100; repeat for 1943, 1963, 1983 and 2003. That will show people assumed to be dead (or recorded to be dead) without having an exact date of death recorded.

StuartG wrote:

What about if "is dead" simply looks for the 1 DEAT tag and if it is not found then that person is considered alive and not displayed.
Along with this, what about considering removing the ability to set "Age at which to assume a person is dead" so that no assumptions are made. Responsibility for entering the 1 DEAT tag devolves on Users/Admins.

You can't be serious. You cannot require the entry of "1 DEAT Y" for every individual. It is in no way mandatory, and personally I dislike recording the DEAT fact if I have no direct evidence of it. Also consider GEDCOM files imported from other systems where such requirements don't exist, they should function with privacy in place right after the import, and with no adding of supposedly missing data.
I think it is inevitable that assumptions are being made. Period.

ToyGuy wrote:

international death settings depending upon local laws

StuartG wrote:

I'm not aware that webtrees has this feature. Is this a new feature in SVN?

I think Stephen was referring to the current config options, being not only the age assumed dead, but also the options for extending privacy for persons born in the last xx years or died in the last xx years. This allows for extending the privacy to match local laws to some extent.

StuartG wrote:

Privacy is absolutely important and the webtrees code should not allow the display of living people in any circumstance.

Yes, privacy is important, but no: there are circumstances thinkable where (assumed to be) living individuals should be shown. And you have all the tools available to let webtrees behave the way you'd like, and you have the tools to find obvious errors in your data. Whatever way you twist it, it remains the responsibility of the administrator. Software can aid, but there are limits.

But let's not make the problem bigger than it really is. You were the victim of erroneous data, which you corrected, plus unfamiliarity with the algorithm, which is now explained, so now all should be fine. Had you entered a birth date of 1850 instead of 1950 on the living individual, then (s)he'd become erroneously public as well. Errors have consequences, sometimes stretching further than you'd think.

Again, look at your statistics page and you'll probably find more data entry flaws.
stamboom.BertKoor.nl runs on webtrees v1.7.13

Privacy Issues 8 years 7 months ago #14

  • StuartG
  • Topic Author
  • Junior Member
  • Posts: 150
Jackie wrote:

Could someone tell me if I will lose the option : « Extend privacy to dead people » ?


The changes I am discussing should have no consequences on any other feature of webtrees.

bertkoor wrote:

personally I dislike recording the DEAT fact if I have no direct evidence of it.


Yet you accept that webtrees should do this automatically for you without your knowledge! If you are not willing to make this entry yourself, why do you accept webtrees doing it for you unknowingly?

bertkoor wrote:

Yes, privacy is important, but no: there are circumstances thinkable where (assumed to be) living individuals should be shown.


User/Admins can make assumptions, display living people or whatever and reap the rewards or suffer the consequences. What I am discussing is that webtrees should not assume that people are dead and therefore visible and do this automatically without your knowledge.

bertkoor wrote:

Errors have consequences, sometimes stretching further than you'd think.


I agree, and webtrees takes one error in the data and potentially creates an untold number of other errors, without warning and without your knowledge, affecting individuals who are, or may be, correctly recorded. What is the purpose of webtrees doing this, other than to potentially make living people, whose data may well be correct, visible to all?

bertkoor wrote:

so now all should be fine


No. The "is dead" function is currently designed to make assumptions that it should not make. Users/Admins can make these assumptions if they wish, but webtrees should not make them, especially when the result of these assumptions is to display living people whose record is otherwise correct.
Stuart

Privacy Issues 8 years 7 months ago #15

  • Jackie
  • Platinum Member
  • Posts: 4767
Thank you Bert and Stuart for your replies. You both reassured me. :-)

Privacy Issues 8 years 7 months ago #16

  • StuartG
  • Topic Author
  • Junior Member
  • Posts: 150
No trouble Jackie. The help you have given to others is very much appreciated.

I appreciate the contributions of everyone and I ask that more people express their views. It is my wish that a wide ranging discussion will lead to a consensus view on what should or should not be done.

Reflecting more on what bertkoor wrote:

You can't be serious. You cannot require the entry of "1 DEAT Y" for every individual. It is in no way mandatory, and personally I dislike recording the DEAT fact if I have no direct evidence of it. Also consider GEDCOM files imported from other systems where such requirements don't exist, they should function with privacy in place right after the import, and with no adding of supposedly missing data.
I think it is inevitable that assumptions are being made. Period.


I am serious. My discussion has not suggested that the changes would require anybody to do anything, let alone a requirement to add a 1 DEAT Y entry. However, one of the consequences of what I am suggesting is that if there is no such entry then that person will not be public or visible. The only people who should be making a decision on whether the person is visible (dead) or not is the User/Admin. The User/Admin should not abrogate their responsibility to webtrees and require webtrees to make the assumption that the person is dead and therefore should be visible. Leaving webtrees to make this assumption compounds one error into an untold number of further errors and allows individuals who should be private to be public.

Perhaps bertkoor may have meant - will the changes you are discussing require me to add a 1 DEAT Y entry for everyone I want to be visible? The answer is - yes.

I am discussing that adding the 1 DEAT Y tag is the responsibility of the User/Admin and it should not be left to webtrees to assume that the person is dead, resulting in webtrees adding, in effect, the 1 DEAT Y tag for you. This then leads on to what also may be behind bertkoor's contribution to this discussion. If a User/Admin has many individuals who have been made private by the changes being discussed, how do we make it easier for them to add the necessary 1 DEAT Y tag? The answer to this is addressed in a post of mine above. That is, the functionality of the "is dead" function be moved to the "Check for errors" function. It is here that webtrees should display in the error log those individuals who it assumes are dead and who do not have the 1 DEAT Y tag. It will then be up to the User/Admin to make the decision as to whether to enter the DEAT tag or not. It should not be left up to webtrees to make this decision with the resultant potential of creating an untold number of other errors that are not easily seen.

And for the last sentence in the quote above. The only people who should be making assumptions and taking responsibility for such a decision is the User/Admin. It should not be webtrees!!
Stuart

Privacy Issues 8 years 7 months ago #17

  • StuartG
  • Topic Author
  • Junior Member
  • Posts: 150
Just a little bit more. As I mentioned in an earlier post here, a portion of the "is dead" function appears to be used in Admin > Batch update > Add missing death records.

It seems that webtrees displays those individuals who it considers are missing the 1 DEAT tag. So assistance is already available if my suggested changes are implemented.
Stuart

Privacy Issues 8 years 7 months ago #18

Hi Stuart,
I get what you are saying, and usually I would take the stance of being over-cautious with data publication rather than let sensitive data be publicly displayed in error. However, in this case I am quite happy with the tools currently available to me in webtrees and would be very reluctant to start hiding individuals who do not have a 1 DEAT entry, as I have many INDIs for whom I have no record of death and possibly never will have. I have made many connections with distant relatives through them googling a family name and finding my tree. If these INDIs (who are most certainly deceased) were hidden I would lose one of the aspects of webtrees which I love.
Becky

Server: PHP 7.1.32, MSQL 5.0.12-dev webtrees 1.7.14

Privacy Issues 8 years 7 months ago #19

  • bertkoor
  • Platinum Member
  • Greetings from Utrecht, Holland
  • Posts: 2503

StuartG wrote:

What I am discussing is that webtrees should not assume that people are dead and therefore visible and do this automatically without your knowledge.

In that I think you are wrong and, as we say in Dutch, throwing away the baby with the bathwater.

StuartG wrote:

Perhaps bertkoor may have meant - will the changes you are discussing require me to add a 1 DEAT Y entry for everyone I want to be visible? The answer is - yes.

Thanks for the straight answer, because that indeed is what I meant.

Imagine that, as you propose, a "1 DEAT" tag is required to make an individual public. I can guarantee you that the webtrees forum and buglist will be flooded with users complaining about suddenly private individuals. The last upgrade, which moved the media folder, has taught us that you can make sticky subjects about it, show flashing warnings etc., and people will still not read release notes, installation instructions or attached readme files. They (with me sometimes included) will just install and expect that everything in principle still works as before.

You'll also have loads of people that want to try out webtrees, import their GEDCOM data into it, only to find out that the people who are public in almost every genealogical package available have now become private. That really is a big deal and very frustrating to users having to go and find out why; this you should not underestimate.

You also have to consider that for a large portion of the individuals of a tree you won't ever be able to find (and thus record) the date of death. Yet it is very fair to assume a certain age at which someone is most certainly and beyond doubt dead. I think this principle should not need any discussion at all. And there is a group of users (me included) that find it principally wrong to record an unknown event (manually or automatically) only to satisfy system requirements. Note that recording an event is not the same as deducing a likelihood that the event happened.

And no, I have no problems with webtrees then deducing that this event did happen. Just like I'd rather not record a "1 BIRT" event (of which we also know it did happen) if all I have is a christening or a marriage, or even worse: a person only named in the testament of their father in the year 1700. If there is minimal data, then you should just record that minimal data and nothing more than that. This is a principle I strongly endorse.

Maybe you should also look at it from an architectural point of view.
At the base we have the recorded data. This should imho be as clean as possible, not requiring the entry of speculative or redundant data.

On the Person object holding this data there should be two separate functions, whose results are based on the available data:
1) isDead. The naive implementation just looks for the events DEAT, BURI and CREM.
2) isPrivate. The naive implementation just looks for a "1 RESN" tag.
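
A minimal sketch of these two naive functions, written in Python for illustration only (webtrees itself is PHP, and this is not its actual code). The helper level1_tags and the plain list-of-lines record format are assumptions made for the example. Treating "1 RESN none" as "not private" is also an assumption, based on the RESN values mentioned in this post.

```python
from typing import List

# Level-1 tags that directly indicate a death (per the list above).
DEATH_TAGS = {"DEAT", "BURI", "CREM"}


def level1_tags(record_lines: List[str]) -> List[List[str]]:
    """Return the level-1 lines of a GEDCOM record, split as [level, tag, value?]."""
    parts = [line.split(maxsplit=2) for line in record_lines]
    return [p for p in parts if len(p) >= 2 and p[0] == "1"]


def is_dead(record_lines: List[str]) -> bool:
    """Naive isDead: only looks for a recorded DEAT, BURI or CREM event."""
    return any(p[1] in DEATH_TAGS for p in level1_tags(record_lines))


def is_private(record_lines: List[str]) -> bool:
    """Naive isPrivate: only looks for a 1 RESN tag (assumption: 'none' means public)."""
    for p in level1_tags(record_lines):
        if p[1] == "RESN":
            value = p[2].strip().lower() if len(p) > 2 else ""
            return value != "none"
    return False


# Hypothetical example records, invented for illustration.
living = ["0 @I1@ INDI", "1 NAME Jan /Jansen/", "1 BIRT", "2 DATE 1 JAN 1980", "1 RESN private"]
deceased = ["0 @I2@ INDI", "1 NAME Piet /Jansen/", "1 DEAT Y", "1 RESN none"]
```

Neither function looks at the other's tags: a record can be dead and private, or alive and public, depending only on what was recorded.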

It is important to acknowledge these are separate functions. In such a well-designed system you should be pleading for "1 RESN private" to be added to every living individual, and maybe even "1 RESN none" to every deceased one, because your point of view is that the editors & administrator are the only ones responsible for entering data such that the system behaves as you would expect it to behave. Ironically it was a data entry failure which spawned this whole discussion you started, which only shows that users cannot be trusted to always enter correct data. Shit happens, and you have to accept that to some extent (and you have acknowledged that you do accept that.)

So I'd rather have a system that aids the user and makes assumptions based on available data than a system that burdens the user by requiring certain arbitrary tags to be set on an arbitrary subset of the tree data. More on that later...



Now the current situation is that we have an isDead algorithm looking for clues that a person might be dead, and if none are found then it is assumed the person is still alive. I would endorse a change where, if a date of birth is known and is less than maxAliveYears ago (i.e. indicating the person is alive), the algorithm stops looking for indications and does not analyse the data of the relatives. This should prevent the majority of false positives reported by the algorithm, is still close to what an admin assumes is happening when reading the privacy control variable labels, and is also still quite close to the current implementation.

To summarize what isDead should do imho:
1) look for DEAT/BURI/CREM tags. If present, then the person is definitely dead so stop.
2) look for the BIRT tag (not BAPM) with a date. If calculated age from that is less than maxAliveYears, then the person is possibly alive, so stop.
3) look for the earliest event in the person itself, followed by (in no particular order) related families, siblings, partners, parents, grandparents, children, grandchildren. Calculate an estimated age from that date (aided with some offsets like 45 for a parent) and if one of them is more than maxAliveYears, then the person is assumed to be dead so stop.

Note that steps 1 & 2 can be performed while performing step 3 on the person itself (requires iteration through the tags anyway) because this is performed before iterating through the relatives.
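
The three steps above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the webtrees implementation. The Person dataclass, its field names and the per-relative offset values are hypothetical. The sketch also reads step 3 as "an estimated age at or above maxAliveYears means assumed dead".

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Person:
    """Hypothetical, simplified individual record (field names are assumptions)."""
    has_death_event: bool = False              # a DEAT, BURI or CREM tag is present
    birth_year: Optional[int] = None           # year of a dated BIRT (not BAPM), if any
    event_years: List[int] = field(default_factory=list)  # years of other dated events
    # (relative, offset) pairs: offset is added to the age estimated from the
    # relative's earliest date, e.g. -45 for a parent, +45 for a child (assumed values).
    relatives: List[Tuple["Person", int]] = field(default_factory=list)


def earliest_year(p: Person) -> Optional[int]:
    """Earliest known year on a record, or None if the record has no dates."""
    years = list(p.event_years)
    if p.birth_year is not None:
        years.append(p.birth_year)
    return min(years) if years else None


def is_dead(person: Person, current_year: int, max_alive_years: int) -> bool:
    # Step 1: an explicit death-type event settles it.
    if person.has_death_event:
        return True
    # Step 2: a known, recent birth date means "possibly alive"; relatives are ignored.
    if person.birth_year is not None and current_year - person.birth_year < max_alive_years:
        return False
    # Step 3: estimate an age from the person's own dates, then from relatives' dates.
    estimates = []
    own = earliest_year(person)
    if own is not None:
        estimates.append(current_year - own)
    for relative, offset in person.relatives:
        rel = earliest_year(relative)
        if rel is not None:
            estimates.append(current_year - rel + offset)
    # No clue at all means the person is assumed to be alive.
    return any(age >= max_alive_years for age in estimates)
```

The point of step 2 is visible in the ordering: a person with a recorded birth in 1980 stays "alive" even if a relative carries a mistyped date from 1850, because the relatives are never consulted.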

If a User/Admin has many individuals who have been made private by the changes being discussed, how do we make it easier for them to add the necessary 1 DEAT Y tag? The answer to this is addressed in a post of mine above. That is, the functionality of the "is dead" function be moved to the "Check for errors" function. It is here that webtrees should display in the error log, those individuals who it assumes are dead and who do not have the 1 DEAT Y tag. It will then be up to the User/Admin to make the decision as to whether to enter the DEAT tag or not.

As you already found out, this is the "add missing DEAT tag" batch function. The "check for errors" function only checks the structure of the tags, and not the data in them.
stamboom.BertKoor.nl runs on webtrees v1.7.13


Privacy Issues 8 years 7 months ago #20

  • StuartG
  • StuartG's Avatar Topic Author
  • Offline
  • Junior Member
  • Posts: 150
Thanks to taalia81 and bertkoor for your considered postings. They are appreciated. For other readers, please post your thoughts, particularly if you have a strong and considered opinion one way or the other. A simple one-sentence post expressing your views will also be appreciated. When sufficient time has elapsed, I will try to round out the discussion by summarising all the points raised.

In the meantime, can someone please advise whether the "is dead" function as used in Admin > Batch update > Add missing death records is the same as that used in the normal process by which webtrees chooses which individuals are to be displayed? How confident can a User/Admin be in using this Batch update function? If we are confident with the way webtrees ("is dead") carries out its functions in normal use (as I am), then we should be equally confident about the results it displays in the Batch update function. From my brief testing of the process I am confident that it is the same.
Stuart
webtrees 2.0.16
⚶ Vesta Modules
PHP 7.3.7
Mysqli

