Question gedcom merge
14 years 3 months ago #1
by wdm001
gedcom merge was created by wdm001
There was a basic module for PGV that would merge GEDCOMs, but it had a number of issues, the least of which was putting a source tag on every tag on an individual or family record. The matching was sometimes quite difficult and would go into loops.
What I would like to see is a merge program that follows the FAMC and FAMS tags to identify families, and the individuals in each family that could be merged, and does not try to guess. Individuals from the two GEDCOMs in any family would be matched by manual alignment (drag and drop). Additional FAMC and FAMS tags would then be followed from the subsequently matched individuals.
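The tag-following idea can be sketched roughly as below. This is only an illustration in Python, not the PGV module or the attached outline: the function names (`parse_gedcom`, `linked_families`, `family_members`, `walk`) and the sample data are made up for the example, and the parser reads level-0 and level-1 lines only. It walks a single file from one starting individual; a real merge would run the walk over both GEDCOMs in step and continue only from individuals the user has matched.

```python
from collections import deque

def parse_gedcom(text):
    """Return {xref: [(tag, value), ...]} holding the level-1 lines of each record."""
    records = {}
    current = None
    for line in text.splitlines():
        parts = line.strip().split(" ", 2)
        if len(parts) < 2:
            continue
        level = parts[0]
        if level == "0":
            # Records with an xref look like "0 @I1@ INDI"; HEAD and TRLR have none.
            if parts[1].startswith("@") and len(parts) == 3:
                current = records.setdefault(parts[1], [("TYPE", parts[2])])
            else:
                current = None
        elif level == "1" and current is not None:
            current.append((parts[1], parts[2] if len(parts) > 2 else ""))
    return records

def linked_families(records, indi):
    """Families an individual points to, as child (FAMC) or spouse (FAMS)."""
    return [v for t, v in records[indi] if t in ("FAMC", "FAMS")]

def family_members(records, fam):
    """Individuals a family record points to."""
    return [v for t, v in records[fam] if t in ("HUSB", "WIFE", "CHIL")]

def walk(records, start):
    """Breadth-first walk yielding (family, members) reachable from start."""
    seen_fams, seen_indis = set(), {start}
    queue = deque([start])
    while queue:
        indi = queue.popleft()
        for fam in linked_families(records, indi):
            if fam in seen_fams or fam not in records:
                continue
            seen_fams.add(fam)
            members = family_members(records, fam)
            yield fam, members  # the point where a UI would ask for manual alignment
            for m in members:
                if m not in seen_indis and m in records:
                    seen_indis.add(m)
                    queue.append(m)

SAMPLE = """0 HEAD
0 @I1@ INDI
1 NAME John /Smith/
1 FAMS @F1@
0 @I2@ INDI
1 NAME Mary /Jones/
1 FAMS @F1@
0 @I3@ INDI
1 NAME Ann /Smith/
1 FAMC @F1@
1 FAMS @F2@
0 @I4@ INDI
1 NAME Tom /Brown/
1 FAMS @F2@
0 @F1@ FAM
1 HUSB @I1@
1 WIFE @I2@
1 CHIL @I3@
0 @F2@ FAM
1 HUSB @I4@
1 WIFE @I3@
0 TRLR
"""

records = parse_gedcom(SAMPLE)
for fam, members in walk(records, "@I1@"):
    print(fam, members)
```

Because each family is visited once and nothing is guessed, the walk cannot loop; the cost is that every family it yields needs a human decision.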
Once all individuals and families had been aligned, the process should start a new phase that aligns the other tags on each matched family and individual, with a tick choice of which ones to keep, like the current individual merge process. An option would be either to accept any new individuals (and their data) and families as they are, or to examine each one.
Where any NOTE, OBJE and SOUR tags are found on a FAM or INDI record (and subsequent SOUR, OBJE and NOTE tags), these are presented for matching as well.
You should be able to leave this and pick it up later.
Options on source tags could include: any new fact tag from the subsidiary GEDCOM would receive a source tag, and any new family or individual would receive a level-one source tag. Perhaps there could be a tick box (with a default) against each tag in the individual tag alignment phase.
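As a rough sketch of that source-tag option (Python, for illustration only): a wholly new record gets a level-1 SOUR, and each ticked new fact gets a level-2 SOUR at the end of its block. The function names, the set of ticked line indexes, and the source pointer `@S99@` are all hypothetical; a real tool would first create the SOUR record for the subsidiary GEDCOM.

```python
def tag_new_record(lines, source_xref):
    """Append a level-1 SOUR citation to a record that is entirely new."""
    return lines + [f"1 SOUR {source_xref}"]

def tag_new_facts(lines, ticked, source_xref):
    """Add a level-2 SOUR citation to each level-1 fact whose line index is in `ticked`."""
    out = []
    pending = False  # True while inside a ticked fact that still needs its citation
    for i, line in enumerate(lines):
        level = int(line.split(" ", 1)[0])
        if level <= 1 and pending:
            # The ticked fact's block has ended, so close it with the citation.
            out.append(f"2 SOUR {source_xref}")
            pending = False
        out.append(line)
        if level == 1 and i in ticked:
            pending = True
    if pending:
        out.append(f"2 SOUR {source_xref}")
    return out

merged = [
    "0 @I1@ INDI",
    "1 NAME John /Smith/",
    "1 BIRT",               # index 2: fact brought in from the subsidiary GEDCOM
    "2 DATE 1 JAN 1900",
    "1 DEAT",
    "2 DATE 2 FEB 1970",
]
for line in tag_new_facts(merged, {2}, "@S99@"):
    print(line)
```

Only the ticked BIRT fact is cited here, which avoids the old module's habit of putting a source tag on every tag.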
I am quite happy to look at the design logic for this but my coding skills are not that developed.
I started this in March 2009 when I had some spare time but never returned to it. I attach what can only be described as an outline that has some code elements.
14 years 3 months ago #2
by kiwi
Replied by kiwi on topic Re:gedcom merge
I'm not surprised to see this request. It's probably one of the most asked-for features in ANY family history software, and the fact of the matter is that no-one has ever totally cracked a solution, though a couple have come close, and I've tried most.
It's hard to respond without seeming to just criticise, which I don't mean to do. I DO think it's a good aspirational goal, but I have doubts it can be achieved.
Going very manual, such as (I think) you've proposed here, has a lot going for it - until you consider scaling. Take two 10,000 INDI GEDCOMs, with (say) 40% overlap. The number of families you will need to work through manually could be huge, and the work time-consuming.
What most of the better solutions try to use is fairly complex comparative algorithms that grade their results, so you can let the easy ones go through automatically, and only review the "hard" ones. But that presents its own problems - either the algorithms are hard-coded, so you just have to live with their interpretations; or they have incredibly complex configuration settings that need to be constantly tweaked.
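To make the "graded" idea concrete, here is a very small Python sketch. It is not taken from any particular program: the fields compared (name and birth year), the 0.7/0.3 weights, and the `auto` and `review` thresholds are all assumptions picked for the example, and they are exactly the kind of settings that end up either hard-coded or constantly tweaked.

```python
from difflib import SequenceMatcher

def score(a, b):
    """Score two individuals from 0.0 to 1.0 on name and birth year similarity."""
    name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    if a.get("birth_year") is None or b.get("birth_year") is None:
        year_sim = 0.5  # unknown year: neither helps nor hurts much
    else:
        # Full credit for the same year, none once the years are 10 or more apart.
        year_sim = max(0.0, 1.0 - abs(a["birth_year"] - b["birth_year"]) / 10)
    return 0.7 * name_sim + 0.3 * year_sim

def grade(a, b, auto=0.9, review=0.6):
    """Bucket a candidate pair: merge automatically, ask the user, or drop it."""
    s = score(a, b)
    if s >= auto:
        return "auto"
    if s >= review:
        return "review"
    return "reject"

john = {"name": "John Smith", "birth_year": 1900}
print(grade(john, {"name": "John Smith", "birth_year": 1900}))
print(grade(john, {"name": "Jon Smith", "birth_year": 1905}))
print(grade(john, {"name": "Zebedee Quux", "birth_year": 1700}))
```

Only the middle bucket reaches the user, which is what keeps large merges manageable, but every wrong "auto" is a silent bad merge.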
Just a few thoughts, based on experience.......
Nigel
www.our-families.info
14 years 3 months ago #3
by ToyGuy
Replied by ToyGuy on topic Re:gedcom merge
I commend the recommendation, but second the skepticism. I too have tried perhaps a dozen possible solutions and, without fail, each fails in its assumptions and/or because of the data and entry styles of the creator. While I understand that the standard provides some semblance of conformity, the variations allowed that still meet the standard, deprecated tags, and different fashions of recording places, notes, sources, etc. create an almost insurmountable obstacle to creating a manageable merge.
Daniel Kionka's GDBI perhaps had one of the best merges, as well as an outstanding pre-entry possible-match finder. I hope he is still about and will consider contributing both to this project, perhaps as add-on features.
I've found, after a couple of attempted merges with various programs, that it's far better to simply add the data manually, as it produces a much better result which conforms to your own personal data presentation standards.
Santa Stephen the Fabled Santa
Latest webtrees at MyArnolds.com
Hosted by webtreesonline.com , a division of GeneHosts LLC
MacOS 10.6.8, Apache 2.2+, PHP 5.4.16, MySQL 5.5.28
14 years 3 months ago #4
by WGroleau
Replied by WGroleau on topic Re:gedcom merge
Long ago, I used a program called Genealogical Information Manager (GIM). I felt that its merge/graft function succeeded. It is still available (gimsoft.com), but it is not pleasant to use in a DOS window on Win XP: some sort of incompatibility means frequent crashes.
Might be worth downloading just for the sake of reading that chapter of the manual.
--
Wes Groleau
UniGen.us/