- Posts: 3
Question Converting/parsing MySql (MariaDB) to gedcom
- Ralle
- Topic Author
- Offline
- New Member
Less
More
1 year 1 month ago #1
by Ralle
Converting/parsing MySql (MariaDB) to gedcom was created by Ralle
I have a database with 16951 registered births in a pretty restricted area physically. It spans from 1733 to 1924, and contains birthdate and name of the child, parents and witnesses. In most cases either the age or the birthdates on the parents are given.
The database contains a single table with a primary key for each birth. Each row is a birth, and details are presented in columns. In spreadsheet format it stretches to column DJ, with placeholders for up to 8 witnesses.
I would like to convert this into a gedcom file. I'll be using php. I aim for a final database that is as tight as possible. I have some ideas, but would like to hear if someone other has done this.
My thaughts so far is to create a new table with primary key for each individual, linked against the primary key for each birth. Then there are two possible directions: 1 - Export a gedcom file with 16951 familys, and nothing more, and do the rest of the work in standalone gedcom-tools available, or 2 - Keep it in MySql and do the heavy lifting there until there is as few islands as possible.
I am aware there will be some human intervension needed, but I hope I can write some code that can help me a bit on the way. Is there anyone that have some php-code snippets lying around???
The database contains a single table with a primary key for each birth. Each row is a birth, and details are presented in columns. In spreadsheet format it stretches to column DJ, with placeholders for up to 8 witnesses.
I would like to convert this into a gedcom file. I'll be using php. I aim for a final database that is as tight as possible. I have some ideas, but would like to hear if someone other has done this.
My thaughts so far is to create a new table with primary key for each individual, linked against the primary key for each birth. Then there are two possible directions: 1 - Export a gedcom file with 16951 familys, and nothing more, and do the rest of the work in standalone gedcom-tools available, or 2 - Keep it in MySql and do the heavy lifting there until there is as few islands as possible.
I am aware there will be some human intervension needed, but I hope I can write some code that can help me a bit on the way. Is there anyone that have some php-code snippets lying around???
Please Log in or Create an account to join the conversation.
- bertkoor
- Offline
- Platinum Member
- Greetings from Utrecht, Holland
1 year 1 month ago #2
by bertkoor
stamboom.BertKoor.nl runs on webtrees v2.1.20
Replied by bertkoor on topic Converting/parsing MySql (MariaDB) to gedcom
I doubt PHP is a proper fit to solve this problem, since it is typically used to render html. That means there is typically a limit to how much work can be done before the browser gives a time-out.
I'd recommend a general purpose language such as Python. Or Java. There is a library I think called Gedcom4j. But writing gedcom text is not the challenge.
Also look at gedtool, which is a set of Excel macros.
I'd recommend a general purpose language such as Python. Or Java. There is a library I think called Gedcom4j. But writing gedcom text is not the challenge.
Also look at gedtool, which is a set of Excel macros.
stamboom.BertKoor.nl runs on webtrees v2.1.20
Please Log in or Create an account to join the conversation.
- Ralle
- Topic Author
- Offline
- New Member
Less
More
- Posts: 3
1 year 1 month ago #3
by Ralle
Replied by Ralle on topic Converting/parsing MySql (MariaDB) to gedcom
php is what I know best. It will all be done with my own LAMP at localhost, so I don't think browser time-out will be any problem. At least it should be possible to configure it, but it hasn't happened yet. I've already run scripts that keeps the cpu running for 15-20 minutes, and the browser just keeps on spinning. No time-outs. I recon there will be plenty of wildcard searching, and the browser is handy in giving previews.
Please Log in or Create an account to join the conversation.
- Jefferson49
- Offline
- Senior Member
1 year 1 month ago #4
by Jefferson49
Replied by Jefferson49 on topic Converting/parsing MySql (MariaDB) to gedcom
If I think about your conversion task, I am wondering if
GedTool
could provide an efficient approach:
- Use SQL to create a table with the wanted data from your database
- Export table to a CSV file
- Import CSV into EXCEL
- Manually create a GEDCOM text file with the GEDCOM structures and some example data you plan to create
- Alternatively to 4, create example data in webtrees and export a GEDCOM file
- Alternatively to 4, use one of the GedTool import templates , which are provided by GedTool
- Alternatively to 4, create a GedTool mapping file , in which specific conversion rules can be defined
- Import the created GEDCOM file into GedTool, which provides a table structure with colums for all the GEDCOM structures
- Apply some text conversion to your imported EXCEL data, e.g. search/replace, EXCEL date conversions, use EXCEL VLOOKUP function, VisualBasic programming, ...
- Copy the converted data from your imported EXCEL tables into the GEDCOM table of GedTool
- Alternatively to 10: Use EXCEL formulas or the EXCEL VLOOKUP function to link your imported EXCEL tables to the GEDCOM tables
- Export a GEDCOM file from GedTool
Please Log in or Create an account to join the conversation.
- Ralle
- Topic Author
- Offline
- New Member
Less
More
- Posts: 3
1 year 1 month ago #5
by Ralle
Replied by Ralle on topic Converting/parsing MySql (MariaDB) to gedcom
Thanks for input. I have looked into Gedtool, but it's mostly the wrong way around. In my case it will be to test the parents up against earlier registred parents, and once newborns - now adults. And printing a gedcom to a file is easy, strict rules.
If I put everybody in their own family, I'll get appr. 50K individuals, not incuding witnesses. The registred number of deaths in the same population in the same timespan is 13K. The number of births is 17K. By finding and merging duplicates I should end up someplace between 25-40K. Hard to tell. Will not omitt any data. Rather have thousands of islands.
I'll stick to php. In the end, I will have 5-10 scripts that are run once, the dataset itself is closed and will not change. Speed is not of the essence.
If I put everybody in their own family, I'll get appr. 50K individuals, not incuding witnesses. The registred number of deaths in the same population in the same timespan is 13K. The number of births is 17K. By finding and merging duplicates I should end up someplace between 25-40K. Hard to tell. Will not omitt any data. Rather have thousands of islands.
I'll stick to php. In the end, I will have 5-10 scripts that are run once, the dataset itself is closed and will not change. Speed is not of the essence.
Please Log in or Create an account to join the conversation.
- Peter_S
- Away
- Premium Member
1 year 1 month ago #6
by Peter_S
GedTool is also able to do this by searching and merging identical persons according to freely definable criteria.
But that would go beyond the scope of this forum. If you want further or more in-depth support for GedTool, please feel free to contact me directly. Email see my profile.
Peter
webtrees 2.1.22, vesta modules, chart modules of magicsunday, extended family and imprint of hartenthaler
PHP 8.3.12, MariaDB 10.11.8
Webhosting: genonline.de
Replied by Peter_S on topic Converting/parsing MySql (MariaDB) to gedcom
In my case it will be to test the parents up against earlier registred parents, and once newborns - now adults.
GedTool is also able to do this by searching and merging identical persons according to freely definable criteria.
But that would go beyond the scope of this forum. If you want further or more in-depth support for GedTool, please feel free to contact me directly. Email see my profile.
Peter
webtrees 2.1.22, vesta modules, chart modules of magicsunday, extended family and imprint of hartenthaler
PHP 8.3.12, MariaDB 10.11.8
Webhosting: genonline.de
Please Log in or Create an account to join the conversation.