Web based family history software

Question GEDCOM export with empty records

1 month 3 weeks ago #1 by Peter_S
GEDCOM export with empty records was created by Peter_S
When I perform a GEDCOM export, "empty" records (SUBM, OBJE) may be generated depending on the privacy settings. 

This contradicts the GEDCOM grammar rules: "All GEDCOM lines have either a value or a pointer unless the line contains subordinate GEDCOM lines."
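
The quoted rule can be checked mechanically on an exported file. The following is a minimal Python sketch, not part of webtrees. The function name find_empty_lines and the simplified line pattern are assumptions made for illustration. It treats the TRLR line as exempt and would also flag other value-less lines (for example an empty CONT), so it is a rough check only.

```python
import re

# Simplified GEDCOM line pattern: level, optional @XREF@, tag, optional value.
# A pointer such as "1 OBJE @M1@" ends up in the value group.
LINE_RE = re.compile(r"^(\d+)(?: (@[^@ ]+@))? ([A-Za-z0-9_]+)(?: (.*))?$")


def find_empty_lines(gedcom_text):
    """Return (line_number, text) for lines that have no value, no pointer
    and no subordinate lines."""
    parsed = []
    for number, raw in enumerate(gedcom_text.splitlines(), start=1):
        match = LINE_RE.match(raw.strip())
        if match:
            level, _xref, tag, value = match.groups()
            parsed.append((number, int(level), tag, value or "", raw.strip()))

    empty = []
    for index, (number, level, tag, value, text) in enumerate(parsed):
        # A line has subordinate lines if the next line has a higher level.
        has_children = index + 1 < len(parsed) and parsed[index + 1][1] > level
        if not value and not has_children and tag != "TRLR":
            empty.append((number, text))
    return empty
```

Run against an export made with restricted privacy settings, this would list header-only records such as "0 @M1@ OBJE".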

A complete submitter record must always be generated, even with restricted privacy settings. SUBM is mandatory.
Empty OBJE records and the pointers to them should be removed for the privacy setting "Visitor".

Peter

webtrees 2.1.20, vesta modules, chart modules of magicsunday, extended family and imprint of hartenthaler
PHP 8.2.4, MariaDB 10.3.38
Webhosting: genonline.de

1 month 3 weeks ago #2 by fisharebest
Replied by fisharebest on topic GEDCOM export with empty records
AFAICT, this was done for performance in PhpGedView and the logic has not been updated.

It is mentioned in this issue.

github.com/fisharebest/webtrees/issues/4883

Greg Roach - greg@subaqua.co.uk - @fisharebest@phpc.social - fisharebest.webtrees.net

1 month 3 weeks ago #3 by Peter_S
Replied by Peter_S on topic GEDCOM export with empty records
The result of the discussion in issue 4883 was to create a default submitter record for newly created trees in order to enable a standard-compliant GEDCOM export. That is fine with me, but the creation of empty records (e.g. OBJE) still conflicts with the GEDCOM standard, both GEDCOM 5.5.1 and GEDCOM 7.
Removing records during export also requires removing the pointers to them. In PhpGedView, this was probably the performance reason for not doing so. But the GEDCOM export in webtrees should not violate the standard.

Peter

1 month 2 weeks ago #4 by Jefferson49
Replied by Jefferson49 on topic GEDCOM export with empty records
For my own exports, I ran into the same situation with empty records. Besides the mentioned issue #4883 about SUBM records, I also submitted an earlier ticket, #3817, which described the overall situation of empty records.

During the work on my custom module DownloadGedcomWithURL, I dug deeper into the code, but the conclusion of my investigation is that the current concept does not offer simple solutions.

In a simplified summary, the GedcomExportService of webtrees creates the export as follows:
  1. Create a list of all records (INDI, FAM, SOUR, ...)
  2. Ask each of the single records to reduce itself according to privacy settings
  3. Export the reduced objects
If step 2 results in an empty record, there is no mechanism to avoid exporting it. At first glance, the export of an empty record could easily be omitted in step 3. However, this would put the integrity of all the references between the records at serious risk, which is worse than exporting empty records.
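
To make the three steps concrete, here is a minimal Python sketch. It is not the actual GedcomExportService code (which is PHP); the function names reduce_for_privacy and export, and the rule "a hidden record keeps only its header line", are assumptions for illustration.

```python
def reduce_for_privacy(record, hidden_xrefs):
    """Step 2 (hypothetical rule): a hidden record keeps only its header line."""
    lines = record.split("\n")
    xref = lines[0].split(" ")[1]  # header looks like "0 @M1@ OBJE"
    return lines[0] if xref in hidden_xrefs else record


def export(records, hidden_xrefs):
    """Sketch of the export flow; 'records' is the list from step 1."""
    # Step 2: each record reduces itself, without knowledge of the others.
    reduced = [reduce_for_privacy(record, hidden_xrefs) for record in records]
    # Step 3: everything is written out, including records that became empty.
    return "\n".join(reduced)
```

Because each record is reduced in isolation, the individual that points to a hidden media object keeps its pointer, and the media object itself is exported as a bare header line.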

In order to improve the situation, we would need some "post-processing" after step 2, which eliminates all references to empty records.

During step 2, it would not be difficult to create a list of cross-reference identifiers (@X1234@) that point to empty records. The main challenge is to find an algorithm to remove the cross-reference identifiers from the other objects without losing too much data. In order to fit into the overall export concept described above, the algorithm should operate on single GEDCOM records (not the complete GEDCOM file).
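
One naive candidate for such a per-record algorithm is sketched below in Python. It is an illustration only, not a proposal for the webtrees code base; the function name strip_pointers is hypothetical, and the set of empty cross-reference identifiers is assumed to have been collected during step 2.

```python
def strip_pointers(record, empty_xrefs):
    """Remove every line whose value is a pointer to an empty record,
    together with that line's subordinate lines."""
    kept = []
    skip_below = None  # level of the pointer line currently being removed
    for line in record.split("\n"):
        parts = line.split(" ", 2)
        level = int(parts[0])
        if skip_below is not None:
            if level > skip_below:
                continue  # subordinate line of a removed pointer
            skip_below = None
        value = parts[2] if len(parts) > 2 else ""
        # Level 0 is the record header, which is never a pointer line.
        if level > 0 and value in empty_xrefs:
            skip_below = level
            continue
        kept.append(line)
    return "\n".join(kept)
```

This works on a single record, but it also shows the data-loss problem: everything subordinate to a removed pointer (for example a PAGE citation below a SOUR pointer) disappears with it, and a parent line that is left without value and without subordinate lines would need a further clean-up pass.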

I would offer a beer as a prize for someone with a good idea.

1 month 2 weeks ago #5 by Jefferson49
Replied by Jefferson49 on topic GEDCOM export with empty records
For my own exports, I set up "post-processing" with external tools. I am using the tools from Gedcom Service Programs, which also include a feature to remove empty objects.

1 month 2 weeks ago #6 by ric2015
Replied by ric2015 on topic GEDCOM export with empty records

Jefferson49 wrote: "During step 2, it would not be difficult to create a list of cross-reference identifiers (@X1234@), which point to empty records. The main challenge is to find an algorithm to remove the cross-reference identifiers from the other objects without losing too much data."

The GEDCOM 7 spec suggests using the newly introduced void pointers for this case, so this may be something to consider once GEDCOM 7 is fully supported.

Richard

webtrees 2.1.17 at cissee.de/webtrees2
Vesta custom modules (Classic Look & Feel, Gov4webtrees, Shared Places, Extended Relationships) available at cissee.de

1 month 2 weeks ago - 1 month 2 weeks ago #7 by Jefferson49
Replied by Jefferson49 on topic GEDCOM export with empty records

ric2015 wrote: "The GEDCOM 7 spec suggests using the newly introduced void pointers for this case, so this may be something to consider once GEDCOM 7 is fully supported."

Thank you for this interesting hint. After checking the GEDCOM 7 specification, I was surprised that it even gives clear guidance on how to handle the removal of data. Chapter 1.6, "Removing data", says:

There may be situations where data needs to be removed from a dataset, such as when a user requests its deletion or marks it as confidential and not for export.

In general, removed data should result in removed structures.

Pointers to a removed structure should be replaced with voidPtrs.

If removal of a structure makes the superstructure invalid because the superstructure required the substructure, the structure should instead be retained and have its payload changed to a voidPtr if a pointer, or to a data type-appropriate empty value if a non-pointer.

If removing a structure leaves its superstructure with no payload and no substructures, the superstructure should also be removed.


Therefore, it seems to be clearly defined in GEDCOM 7. However, we probably need to learn how well it works with real Gedcom files.
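
As a rough illustration of the first rule only (pointers to removed records become void pointers), here is a minimal Python sketch. The function name void_pointers is hypothetical, and the sketch does not cover the rules about required substructures or about removing superstructures that are left empty.

```python
def void_pointers(record, removed_xrefs):
    """Replace pointers to removed records with the GEDCOM 7 void pointer,
    keeping the pointing line and its subordinate lines in place."""
    out = []
    for line in record.split("\n"):
        parts = line.split(" ", 2)
        # Level 0 is the record header; only lower lines can carry pointers.
        if int(parts[0]) > 0 and len(parts) > 2 and parts[2] in removed_xrefs:
            parts[2] = "@VOID@"
        out.append(" ".join(parts))
    return "\n".join(out)
```

Unlike removing the line, this keeps the structure of the pointing record intact, so the transformation can be applied to each record on its own.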
Last edit: 1 month 2 weeks ago by Jefferson49. Reason: voidPtrs wrongly shown as "code"

1 month 2 weeks ago - 1 month 2 weeks ago #8 by Jefferson49
Replied by Jefferson49 on topic GEDCOM export with empty records
While reading the GEDCOM 7 approach (posted above), I was wondering how the same concept could be applied to GEDCOM 5.5.1.

What came to mind was to create a "void record" for each of the basic record types (e.g. one "void record" each for INDI, FAM, SOUR, ...) and point to these "void records". The pointers to the void records would be very similar to the void pointers in GEDCOM 7.

Example for a void record for SOUR:
0 @X12345@ SOUR
1 TITL Non disclosed source, which was removed from the GEDCOM export due to a privacy restriction
Last edit: 1 month 2 weeks ago by Jefferson49. Reason: Replaced null by void

1 month 1 week ago #9 by Peter_S
Replied by Peter_S on topic GEDCOM export with empty records
I don't think it is a good idea to use VOID records in GEDCOM 5.5.1, as this can lead to unwanted side effects.
If, for example, a VOID record is used as a replacement for removed INDI records, then FAMC and FAMS links inadvertently generate family relationships between individuals who are not related to each other. Reports on individuals related by blood, for example, become unusable as a result.
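
The side effect can be shown with a small Python sketch. The function name families_by_member and the identifier @X1@ for the shared void INDI record are assumptions for illustration.

```python
from collections import defaultdict


def families_by_member(family_records):
    """Map each individual's identifier to the families that reference it
    via HUSB, WIFE or CHIL."""
    members = defaultdict(list)
    for record in family_records:
        lines = record.split("\n")
        fam_xref = lines[0].split(" ")[1]  # header looks like "0 @F1@ FAM"
        for line in lines[1:]:
            parts = line.split(" ", 2)
            if len(parts) > 2 and parts[1] in ("HUSB", "WIFE", "CHIL"):
                members[parts[2]].append(fam_xref)
    return dict(members)


# Two unrelated families, each with one confidential child that was
# replaced by the same void INDI record @X1@ (hypothetical identifier).
families = [
    "0 @F1@ FAM\n1 HUSB @I1@\n1 CHIL @X1@",
    "0 @F2@ FAM\n1 HUSB @I7@\n1 CHIL @X1@",
]
shared = families_by_member(families)["@X1@"]
```

Here the void individual appears as a child in both @F1@ and @F2@, so any relationship report that walks these links would treat @I1@ and @I7@ as having a common child.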

With GEDCOM 7 there are only VOID pointers instead. This is harmless. 

Therefore, I am in favor of no empty records being written in GEDCOM 5.5.1 and pointers to these records being removed without replacement.

In webtrees with privacy settings for visitors, families with confidential persons are currently not displayed. In my opinion, this is correct.
When exporting a GEDCOM with the privacy setting Visitor, the complete family structures (FAMC, FAMS, HUSB, WIFE, CHIL) for confidential individuals are listed, but without content. In my opinion, this is questionable. 

The behavior regarding the privacy settings should be the same for the export as in the GUI.

Peter
