Web based family history software

Question API for adding data

  • matthew
  • Topic Author
  • Visitor
10 years 7 months ago - 10 years 7 months ago #1 by matthew
API for adding data was created by matthew
I'm just starting to look at using webtrees. So far it has been much better than all the other things I have tested. But one thing I'm after is some way to add data without doing it manually through the interface. For example, I already have a few things in webtrees, and I really don't want to lose all the extra information in the webtrees database that an export to GEDCOM followed by an import from GEDCOM would dump. However, other parts of my family use things like ancestry.com, which can export GEDCOM files, but again I can't take those and integrate them with webtrees without losing what I already have.

So this is where some sort of API to webtrees would be very useful. A Perl API would be great, but I'm not really that picky about the language; I can adapt to most anything else easily enough. Even if, rather than a language binding, you opted for some SOAP or HTTP/CGI API, which might be more in line with webtrees, those are easy enough to deal with as well. Maybe something already exists and I just haven't bumped into it yet.

But with an API to look around and to add new content, I could take any GEDCOM file I might have and work out an appropriate way to graft in things other folks are maintaining in other ways. One example of how I might use this: let's say I know that all pedigree information for "Joe Bob" in webtrees comes from some other source that can generate a GEDCOM file. I could write a script that deletes every entry above "Joe Bob" and then reads the GEDCOM file and adds everything back in using the API. I would probably do something that actually traversed the current webtrees pedigree tree of "Joe Bob", comparing and updating, but in either case, with a supported API for webtrees this should be fairly easy to do.

I could work out my own solution by going directly to the database, or possibly by working out how each web page form currently works, but in the long term this would be a huge headache to maintain.
Last edit: 10 years 7 months ago by makitso.

  • bertkoor
  • Offline
  • Platinum Member
  • Greetings from Utrecht, Holland
10 years 7 months ago - 10 years 7 months ago #2 by bertkoor
Replied by bertkoor on topic API for adding data
You are basically describing a "tree merge" process. Regardless of the suggested API, webtrees is currently not capable of doing that. It is an often asked for feature though. But the major complication with tree merges (and the reason it is not implemented) is what to do with conflicting data. It gets very messy very quickly.

There are offline tools though that specialise in merging two GEDCOM files into one. Look and you will find. I hope others will chime in with recommendations. I have never bothered though. I'd rather import the "foreign" data into a separate tree, and make changes in my own data manually, according to my own standards, in my own time, after I have had a chance to validate their sources etc.

stamboom.BertKoor.nl runs on webtrees v1.7.13
Last edit: 10 years 7 months ago by bertkoor.

  • matthew
  • Topic Author
  • Visitor
10 years 7 months ago #3 by matthew
Replied by matthew on topic API for adding data
An API doesn't have to deal with the headaches involved in a merge (well, unless the API provided a merge function), and a merge is not the basis of what I would be after in an API. The API needs to support the same basics that the interface already supports: add, modify, delete, and select for each object type webtrees supports. With that I can write my own merge script that enforces all the rules you describe and can do a merge according to the specific scenario I have run into.

With the API I no longer have to manually touch anything other than running my script.

  • bertkoor
  • Offline
  • Platinum Member
  • Greetings from Utrecht, Holland
10 years 7 months ago - 10 years 7 months ago #4 by bertkoor
Replied by bertkoor on topic API for adding data
My guess is you haven't thought it through yet. So I'll have to take you by the hand and illustrate it.
Mind you, before moving on to an API the concept of how to do the process manually has to be crystal clear.

So in my tree I have a set of individuals & families with events, sources, media, etc.
Then I receive a snippet of GEDCOM data to merge in: one single person.
Before deciding whether to add a new person or merge with an existing one, one first has to look for a "similar" person.
You cannot use the XREF, since those are IDs that are unique only within a single tree. My I123 is not your I123.
Matching on a name would be a good start, but this is very unreliable. My John Doe is your Jonathan d'Ough. Who would have thought...
Not to mention that in webtrees (and every other mature genealogy application) one can record multiple names for a single person.
Some genealogists (or their applications) prefer maiden names for females, others use their married names.
Matching further on birth dates is complicated if I have only his baptism recorded and you have only his true birth date recorded.
Or I have an estimated birth date based on a recorded age at marriage or death, and it seems to match yours.
But later it turns out they were different persons: one born in January and a similarly named younger brother born in December.
Place names are a snake pit; don't get me started on that. Bringing in the parents & relatives makes it even more complicated.

So even this seemingly simple step of finding a similar person is quite complicated to automate reliably, with a good balance of false positives and false negatives.
You might argue that this won't be the case with your data. But it's the exceptional cases you have to think of, otherwise the whole process falls over. I have learned one thing about automated processes: if there is any trace of shit, sooner or later it will hit the fan.
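
The name-matching problem described above can be made concrete with a small sketch. This is not webtrees code: `name_similarity` is a hypothetical helper built on Python's standard `difflib`, and it only illustrates why a plain string comparison produces both false negatives and false positives.

```python
import difflib


def normalize(name: str) -> str:
    """Lowercase a name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def name_similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for two names."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()


if __name__ == "__main__":
    # False negative: the same man, recorded under two spellings, scores low.
    print(name_similarity("John Doe", "Jonathan d'Ough"))
    # False positive: two brothers given the same name score a perfect 1.0.
    print(name_similarity("Jan Jansen", "Jan  JANSEN"))
```

Whatever cut-off one picks for "similar enough", one of these two cases ends up on the wrong side of it, which is the balance problem described above.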

Shall I continue?

Last edit: 10 years 7 months ago by bertkoor.

  • bertkoor
  • Offline
  • Platinum Member
  • Greetings from Utrecht, Holland
10 years 7 months ago - 10 years 7 months ago #5 by bertkoor
Replied by bertkoor on topic API for adding data

matthew wrote: other parts of my family do use things like ancestry.com which can export GEDCOM files but again I can't take those and integrate with webtrees without losing what I already have.

After having thought about it a bit more: you could invite the other family members to become editors in your webtrees site, and let them enter their found data themselves. That is what webtrees was designed for: collaborative working on one single set of online data.

To add to my previous post: there are good reasons why the Smart Matching ™ technology is proprietary ;-)

Last edit: 10 years 7 months ago by bertkoor.

10 years 7 months ago - 10 years 7 months ago #6 by Jackie
Replied by Jackie on topic API for adding data
Hi matthew,

..I no longer have to manually touch anything other than running my script..


I am not sure if you really are a genealogist.. It sounds like you want to grab many GEDCOM files from other people and create a one single huge database. ... at least sounds pretty lazy.
Last edit: 10 years 7 months ago by Jackie.

10 years 7 months ago #7 by fisharebest
Replied by fisharebest on topic API for adding data
I have my own plans for "Smart Matching" technology...

It won't match on names, dates, places. This works well for unusual names and small villages. If you are descended from John McDonald of Glasgow, it won't work at all.

Instead, I intend to match on source-citations. If two people both refer to the same census entry, birth registration, etc., then they are almost certainly referring to the same person.

To do this, I want to introduce a series of standard/official/public/library sources. e.g. "1891 Census of England". These will be identified by a UID.

Sites can publish a list of the public source-citations that they use (as a one-way hash, for privacy and to save bandwidth). Other sites can query these to look for matches.
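
A minimal sketch of how such published hashes could be compared, assuming each citation is a pair of a source UID and a reference within that source. The UID strings, the reference format, and the helper names (`citation_fingerprint`, `publish_fingerprints`, `find_matches`) are all hypothetical; nothing here describes an existing webtrees feature.

```python
import hashlib


def citation_fingerprint(source_uid: str, reference: str) -> str:
    """Hash a normalized (source UID, reference) pair with SHA-256."""
    uid = source_uid.strip().lower()
    ref = " ".join(reference.lower().split())
    return hashlib.sha256(f"{uid}|{ref}".encode("utf-8")).hexdigest()


def publish_fingerprints(citations):
    """Return the set of hashes a site would publish for its citations."""
    return {citation_fingerprint(uid, ref) for uid, ref in citations}


def find_matches(local_citations, remote_fingerprints):
    """Return the local citations whose hash also appears on the remote site."""
    return [
        (uid, ref)
        for uid, ref in local_citations
        if citation_fingerprint(uid, ref) in remote_fingerprints
    ]


if __name__ == "__main__":
    # Hypothetical UIDs and references, for illustration only.
    mine = [("uid-census-1891-england", "RG12/1234 folio 56 page 7")]
    theirs = publish_fingerprints(
        [("UID-Census-1891-England", "rg12/1234  Folio 56 Page 7")]
    )
    print(find_matches(mine, theirs))
```

Both sites have to normalize citations identically before hashing, otherwise the same census entry yields different hashes. A bare hash of a short, guessable reference can also be recovered by trying candidates, so the privacy gain is limited.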

Greg Roach - greg@subaqua.co.uk - @fisharebest@phpc.social - fisharebest.webtrees.net

  • matthew
  • Topic Author
  • Visitor
10 years 7 months ago #8 by matthew
Replied by matthew on topic API for adding data
Hmmm, we seem to be getting lost in the noise of my initial example, which was (unfortunately) a merge one.

So let's try another example. Let's say I initially started out using full names for countries: Canada, Italy, ....

But some 1000 people later, I now decide I don't want to use those anymore and instead want to use the international standard two-letter abbreviations. If I had an API that supported the things I have outlined, I could easily write a script that goes through the entire list of families and individuals and makes this correction for me in every "place" field.
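
As a rough illustration of the kind of script meant here, the sketch below applies the change to raw GEDCOM text rather than through any webtrees API (none exists in this thread). The function names and the tiny `COUNTRY_CODES` table are assumptions; the codes themselves are the ISO 3166-1 two-letter codes for the two countries mentioned.

```python
import re

# Illustrative mapping only; a real script would cover every country in use.
COUNTRY_CODES = {"Canada": "CA", "Italy": "IT"}

# Matches a GEDCOM place line such as "2 PLAC Toronto, Ontario, Canada".
PLAC_LINE = re.compile(r"^(\s*\d+\s+PLAC\s+)(.*)$")


def abbreviate_place(place: str, codes: dict) -> str:
    """Replace the last comma-separated component if it is a known country."""
    parts = [part.strip() for part in place.split(",")]
    parts[-1] = codes.get(parts[-1], parts[-1])
    return ", ".join(parts)


def abbreviate_countries(gedcom_text: str, codes: dict = COUNTRY_CODES) -> str:
    """Rewrite PLAC lines only; every other line is passed through untouched."""
    out = []
    for line in gedcom_text.splitlines():
        match = PLAC_LINE.match(line)
        if match:
            line = match.group(1) + abbreviate_place(match.group(2), codes)
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "1 BIRT\n2 PLAC Toronto, Ontario, Canada"
    print(abbreviate_countries(sample))
```

Only the last component of a place is touched, so a surname or a town that happens to be called "Canada" is left alone. As a side effect, the spacing after commas in rewritten places is normalized to a single space.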

An API is not about merging data (although it can help with that issue and even solve that issue for some cases).

It is about being able to manipulate the existing data in code, to automate various tasks that might come up.

Suppose you decide you need to change all surnames of McGillis to Mc Gills, or suppose you decide you want a more complex report of your data than is currently provided by the GUI.

Suppose I decide to experiment with a new 3 dimensional representation of the data. With an API I could do this.

  • ToyGuy
  • Offline
  • Moderator
  • Live like it's Christmas every day - Santa Stephen
10 years 7 months ago #9 by ToyGuy
Replied by ToyGuy on topic API for adding data
What's wrong with Batch Update? You can accomplish precisely the same result you proffered in your PLAC example using it. I've done so many, many times. Same is true of the SURN manipulation.

Santa Stephen the Fabled Santa
Latest webtrees at MyArnolds.com
Hosted by webtreesonline.com , a division of GeneHosts LLC
MacOS 10.6.8, Apache 2.2+, PHP 5.4.16, MySQL 5.5.28

  • matthew
  • Topic Author
  • Visitor
10 years 7 months ago #10 by matthew
Replied by matthew on topic API for adding data
Batch update is nice but not nearly as flexible as an API. What I'm looking for is a supported hook into the system that does not depend on manually interacting with the GUI: something that can be scripted/automated.

Let's suppose I want to send out a message to everyone in my direct family who is still living, on their birthday. :)

Not really something the Batch Update will support.
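
To make the birthday example concrete, here is a minimal sketch that works on an exported GEDCOM file instead of a live API. All function names are hypothetical. It treats anyone without a DEAT record as living, which is a simplifying assumption, handles only exact dates of the form "12 MAR 1950", and leaves out the "direct family" filter and the actual sending of the message.

```python
import datetime

MONTHS = {
    name: number
    for number, name in enumerate(
        ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
         "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"],
        start=1,
    )
}


def parse_exact_date(text):
    """Return (day, month) for dates like '12 MAR 1950', else None."""
    tokens = text.split()
    if len(tokens) == 3 and tokens[0].isdigit() and tokens[1].upper() in MONTHS:
        return int(tokens[0]), MONTHS[tokens[1].upper()]
    return None


def parse_individuals(gedcom_text):
    """Collect name, birth (day, month) and a death flag for each INDI record."""
    people, person, event = [], None, None
    for raw in gedcom_text.splitlines():
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2:
            continue
        level, tag = parts[0], parts[1]
        value = parts[2] if len(parts) == 3 else ""
        if level == "0":
            # A new top-level record starts; only "0 @Xn@ INDI" is of interest.
            person, event = None, None
            if value == "INDI":
                person = {"name": "", "birth": None, "dead": False}
                people.append(person)
        elif person is not None and level == "1":
            event = tag
            if tag == "NAME":
                person["name"] = value.replace("/", "").strip()
            elif tag == "DEAT":
                person["dead"] = True
        elif person is not None and level == "2" and tag == "DATE" and event == "BIRT":
            person["birth"] = parse_exact_date(value)
    return people


def birthdays_today(people, today):
    """Names of people with no death record whose birthday falls on `today`."""
    return [
        p["name"]
        for p in people
        if not p["dead"] and p["birth"] == (today.day, today.month)
    ]
```

A scheduled job could call `birthdays_today(parse_individuals(text), datetime.date.today())` once a day and pass the resulting names to whatever mail tool is at hand.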

10 years 7 months ago #11 by Jackie
Replied by Jackie on topic API for adding data
Hello Matthew,

Would you please tell me the URL of your genealogy website and which country you are from?

  • Phred
  • Visitor
10 years 7 months ago #12 by Phred
Replied by Phred on topic API for adding data
I would like to see an API for adding facts and source citations, somewhat in the context of the PGV FamilySearch tool. Running in a local environment opens avenues like user-written clipboard tools or Firefox add-ons to “automate” copy operations; the difficulty is the paste into webtrees. I suspect designing and building a webtrees API wouldn’t be a trivial exercise, and there appears to be little demand.

Fred

  • ToyGuy
  • Offline
  • Moderator
  • Live like it's Christmas every day - Santa Stephen
10 years 7 months ago #13 by ToyGuy
Replied by ToyGuy on topic API for adding data
As Fred has suggested, and given all the necessary work ahead to develop webtrees v2 and normalize the database, it is very unlikely that development work on an API will be high on our priority list. I can't imagine that many would use it and it would not be trivial to develop and maintain.

  • matthew
  • Topic Author
  • Visitor
10 years 7 months ago #14 by matthew
Replied by matthew on topic API for adding data

ToyGuy wrote: As Fred has suggested, and given all the necessary work ahead to develop webtrees v2 and normalize the database, it is very unlikely that development work on an API will be high on our priority list.


Understood, but it is still something I would be very interested in ... v3 :)

ToyGuy wrote: I can't imagine that many would use it


I can imagine they would.... Just glancing through some of the recent feature-request issues: if an API existed, these items would become more doable by individuals outside of the core developers:

XMP fact-tags and geo-tags
Android or iOS app
Kinship Chart/Family View
Advanced search for couples with no children
syncing database with GEDCOM
.....

ToyGuy wrote: and it would not be trivial to develop and maintain.


Well, this is a double-edged sword. On the one hand, I agree it certainly is more development, but on the other hand it is also a powerful way to do your own regression testing and catch bugs and problems much earlier in the life cycle. So yes, it has its costs, but it also has its returns.

Oh well, future wish list. :)

  • bertkoor
  • Offline
  • Platinum Member
  • Greetings from Utrecht, Holland
10 years 7 months ago - 10 years 7 months ago #15 by bertkoor
Replied by bertkoor on topic API for adding data
Matthew,

Some things on your list can currently already be done. The rest can probably be implemented by writing a custom module in PHP. Nothing wrong with that, and it's the best possible way to access (and manipulate) your database. Ever tried that? Or were you overwhelmed when you tried? The API you suggest could start very simple but I can guarantee it would evolve into something very overwhelming as well ;-)

You're talking about regression testing. That can already be done. Firstly, there are unit tests the developers could write. Alas, if the developers don't start out coding in a test-driven way, writing the tests afterwards on a codebase of millions of lines is a costly task whose investment will never be earned back.
Then there are frameworks that can do full regression tests of websites, such as JMeter. It's all possible, but it costs time to implement. Would you really invest that much time?

Really, I think you take these issues too lightly, and it all seems a simple matter of programming to you.

Last edit: 10 years 7 months ago by bertkoor.

  • matthew
  • Topic Author
  • Visitor
10 years 7 months ago #16 by matthew
Replied by matthew on topic API for adding data
bertkoor

It is fairly clear we will simply have to agree to disagree. You have your opinion about software development and it does not match mine. I'm not looking to change your opinion; I simply tried to offer some suggestions for features that I thought would be valuable to this system from my perspective.

No one has to agree with me, but it seems at least some do.

I'm comfortable at this time that all three of them are understood in the way I intended. Which at this point is all I'm after.

  • norwegian_sardines
  • Offline
  • Platinum Member
10 years 7 months ago - 10 years 7 months ago #17 by norwegian_sardines
Replied by norwegian_sardines on topic API for adding data
Matthew,

An API would be a nice feature.......

BUT someone has to program it. We are all volunteers here; each of us has a job, a family and things that we do away from webtrees. webtrees is one of the fun things that we do, and if someone with coding experience thinks it would be fun and devotes the 100s of hours needed to write, support and update the code into future years, then they will do the programming. Open source software is a different model than commercial software. Open source is driven either by the ego of the programmer being patted on the back by a bunch of users who say the program is the best, or by a need that they (the programmer) have in their own work to make it happen in the real world.

If you are motivated by the need, you could try coding the interface yourself: make a fork of webtrees. If you can't do the programming, maybe your wallet can motivate someone to do it. Maybe you will get lucky and others are motivated in the same way as yourself, and you can all get it done as a group. webtrees was started by a group of motivated individuals who forked an existing software program because they needed it to do things that that program could not do. They (we) knew what it would take to make it happen then and into the future.

IF YOU LOOK in other threads on this forum, an API has been talked about by others.

Ken
Last edit: 10 years 7 months ago by norwegian_sardines.

  • matthew
  • Topic Author
  • Visitor
10 years 7 months ago #18 by matthew
Replied by matthew on topic API for adding data
Ken,

Totally agree!! What you have done is great and I'm very confident I will continue to use it with or without the things I have identified.

Who knows, at some point I may contribute something on a number of things, but who knows if what I do will be liked and/or ever included. :)

  • norwegian_sardines
  • Offline
  • Platinum Member
10 years 7 months ago #19 by norwegian_sardines
Replied by norwegian_sardines on topic API for adding data
I take no credit for the work that others did to create webtrees at the start or PGV before. I help here only so that others can do what they do best, while I contribute what I do best, hoping it is valuable.

Ken

10 years 6 months ago #20 by WGroleau
Replied by WGroleau on topic API for adding data

It won't match on names, dates, places. This works well for unusual names and small villages. If you are descended from John McDonald of Glasgow, it won't work at all.

I have to disagree with this. Decades ago, Genealogical Information Manager (GIM) had a match and merge function that worked very well.

It would compare everyone in the database against every other on numerous factors. A similarity in anything it looked at had some weight and the combination of weights gave a probability estimate of a match. The algorithm factored in a percentage of the predictions of immediate relatives.
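
A minimal sketch of weighted scoring along these lines. This is not GIM's actual algorithm, which is not documented in this thread; the field names, the weights in `WEIGHTS`, and the 25% share given to relatives are all assumptions made for illustration.

```python
# Hypothetical weights: how much agreement on each field counts.
WEIGHTS = {"surname": 3.0, "given": 2.0, "birth_year": 2.0, "birth_place": 1.0}


def own_score(a: dict, b: dict, weights: dict = WEIGHTS) -> float:
    """Weighted share of agreeing fields, over fields both records have."""
    total = matched = 0.0
    for field, weight in weights.items():
        if a.get(field) is None or b.get(field) is None:
            continue  # a missing value is neither evidence for nor against
        total += weight
        if a[field] == b[field]:
            matched += weight
    return matched / total if total else 0.0


def match_score(a, b, relative_pairs=(), relative_share=0.25):
    """Blend a pair's own score with the mean score of its paired relatives."""
    base = own_score(a, b)
    if not relative_pairs:
        return base
    relatives = sum(own_score(x, y) for x, y in relative_pairs) / len(relative_pairs)
    return (1 - relative_share) * base + relative_share * relatives


if __name__ == "__main__":
    mine = {"surname": "Doe", "given": "John", "birth_year": 1850}
    yours = {"surname": "Doe", "given": "Jon", "birth_year": 1850,
             "birth_place": "Glasgow"}
    father = {"surname": "Doe", "given": "James", "birth_year": 1820}
    print(own_score(mine, yours))
    print(match_score(mine, yours, relative_pairs=[(father, father)]))
```

Sorting candidates by `match_score` gives the "highest prediction on top" ordering; the merge decision itself stays with the person looking at the two records.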

Then it would display the names in two columns. Select a name in one column, and the other would resort with the highest prediction on top.

Select one in the second column and see the details of the two records side by side.

From that, decide whether to merge or not.

If merging, choose to keep/discard/merge/edit individual facts and events.

If merging two facts that were xrefs (two spouses, two children, etc.), the compare/decide feature was recursive.

At any point in recursion, you could stop and go back to the two top-level lists. Or even start over on the matching prediction.

And the prediction was actually reasonable, unlike what GenCircles called "smartMatch" which apparently was if (similar surname) and (lived on the same planet).

I'd recommend trying it out just to see what it looked like: gimsoft.com. Unfortunately, it is an ancient program. You'd have to run it in DOS on Windows 2000 or earlier. It is unstable on Windows XP.

I'm not really endorsing the program overall, but in the merge/graft features, they did what others have said is impossible.

--
Wes Groleau
UniGen.us/
