- Posts: 224
Question Concepts for organizing a genealogy in webtrees as an archive
- Jefferson49
- Topic Author
- Offline
- Junior Member
Currently, I am halfway through reorganizing my genealogical data, documents, and photos with webtrees. After restructuring my digital files and folders as well as my physical documents, photos, and folders, I realized that I have to think of my genealogy as an “archive” and of webtrees as an important tool for organizing that archive. I found it very helpful to look at my archive the same way I would when visiting a public archive: it needs to be organized so that someone else (and I myself!) can find useful information efficiently.
In the following paragraphs, I want to share some of the concepts I have applied. Afterwards, I will share some draft ideas and concepts for further enhancements.
I would be happy if you could provide some feedback or share some of your experiences.
Some steps to take to organize an archive within webtrees:
1. Design a directory structure for your archive (in any text editor, word processor, or spreadsheet).
2. Create a dedicated repository <My Archive> as a REPO object to represent your archive in webtrees.
3. Enter all sources of your genealogy into webtrees as SOUR records and assign them to <My Archive>.
4. For each source, fill in the Call Number related to <My Archive>.
5. Include the directory (or better: abbreviations of the directory) from 1. in the Call Number of the sources in <My Archive>, e.g. “/Biogr/Mil/MilHe/Nr. 5” for the 5th source in the directory “Biographies/Miller/Miller, Henry/”. Note: This is a fairly common way to organize call numbers, which is also used by professional archives.
6. Put all media files related to a source into a webtrees media folder that corresponds directly to your archive directory structure from 1. and your Call Number structure from 5., e.g. “../webtrees/data/media/Biographies/Miller/Miller, Henry/Nr. 5 - University Degree Henry Miller from 1920.pdf”.
7. Use the same directory structure for your physical folders, documents, and photos, e.g. put a physical folder labelled “Biographies” on your bookshelf and arrange the documents within it exactly as in the archive structure from 1. (and also 5. and 6.).
8. To maintain and re-arrange the directory structure in webtrees, use “Control panel, Manage family trees, Data fixes, Search and Replace”, e.g. search for “/Biogr/Mil/MilHe/” and replace it with “/Biogr/UK/MilHe/”.
9. Enhance your source data with information about content, places, and date period, e.g. use the available webtrees (and GEDCOM) fields “Data”, “Data/Date”, “Data/Place”, “Data/Note”, “Data/Event” within the webtrees source objects. Note: “Data/Note” can be used for a general description of the source.
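To make the link between directory, abbreviations, and call number more concrete, here is a minimal Python sketch. The abbreviation table and the function names `call_number` and `rename_prefix` are only assumptions for illustration; the thread itself gives just the one example mapping, and the prefix rename merely imitates what the webtrees data fix does on the call numbers.

```python
from pathlib import PurePosixPath

# Hypothetical abbreviation table (assumption); only this one example
# mapping appears in the post above.
ABBREVIATIONS = {
    "Biographies": "Biogr",
    "Miller": "Mil",
    "Miller, Henry": "MilHe",
}


def call_number(directory, item_number):
    """Build a call number such as '/Biogr/Mil/MilHe/Nr. 5' from an archive directory."""
    parts = PurePosixPath(directory).parts
    # Directories without a known abbreviation are kept unchanged.
    short = [ABBREVIATIONS.get(part, part) for part in parts]
    return "/" + "/".join(short) + f"/Nr. {item_number}"


def rename_prefix(call_numbers, old_prefix, new_prefix):
    """Imitate a search-and-replace on call number prefixes."""
    return [
        new_prefix + cn[len(old_prefix):] if cn.startswith(old_prefix) else cn
        for cn in call_numbers
    ]


if __name__ == "__main__":
    cn = call_number("Biographies/Miller/Miller, Henry/", 5)
    print(cn)
    print(rename_prefix([cn], "/Biogr/Mil/MilHe/", "/Biogr/UK/MilHe/"))
```

Keeping the abbreviations in one table means the same mapping can be reused for labels on physical folders.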
Ideas for further steps and concepts:
- Generate a finding aid (unfortunately with several manual steps, but perhaps only once a year): e.g. export webtrees data via GEDCOM, convert it to csv or xlsx, do some spreadsheet or database magic, and generate a report (e.g. an MS Access report). Note: A “finding aid” is a standard overview document used in archives so that other users (and you yourself!) can quickly find sources and gain insight into the archive structure. Note: If all the data mentioned above is available in webtrees, it is a perfect basis for a finding aid.
- Automatically generate a finding aid as a webtrees report: e.g. use the available webtrees data from above and put it into a webtrees report. This would be a great webtrees feature, but needs someone to develop a plugin module for webtrees.
- Use a freely available professional archive software, e.g. AtoM ( www.accesstomemory.org/ ), which has a LAMPP architecture similar to that of webtrees: export webtrees source data via GEDCOM, convert it to csv or xlsx, do some spreadsheet or database magic, generate a csv file according to the AtoM csv import template, import it into AtoM, use the full feature set of the AtoM archive software, use (generated) hyperlinks from AtoM records to webtrees sources, and use the AtoM round trip feature for subsequent csv re-imports based on UIDs.
- Automatically create AtoM csv export/import files with a webtrees plug-in in order to automate AtoM import and round trip data exchange.
- Generate professional finding aid: import data to AtoM, generate a finding aid from AtoM.
- Generate standardized archive EAD xml file: import data to AtoM, generate standardized EAD xml file from AtoM.
- Make your archive public by providing an EAD xml file to an archive portal, e.g. www.archivesportaleurope.net
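The "export GEDCOM, convert to csv" part of the finding aid idea could be sketched roughly as follows. This is only a minimal illustration under assumptions: it reads just the SOUR, TITL, and CALN tags, ignores continuation lines (CONT/CONC), and the sample data and the function names `read_sources` and `finding_aid_csv` are made up for this example.

```python
import csv
import io
import re

# One GEDCOM line: level, optional @XREF@, tag, optional value.
GEDCOM_LINE = re.compile(r"^(\d+)\s+(?:(@[^@]+@)\s+)?(\S+)(?:\s(.*))?$")

# Made-up sample data (assumption), following the call number example above.
SAMPLE = """\
0 HEAD
0 @S1@ SOUR
1 TITL University Degree Henry Miller from 1920
1 REPO @R1@
2 CALN /Biogr/Mil/MilHe/Nr. 5
0 @S2@ SOUR
1 TITL Letters
1 REPO @R1@
2 CALN /Biogr/Mil/MilHe/Nr. 2
0 @R1@ REPO
1 NAME My Archive
0 TRLR
"""


def read_sources(gedcom_text):
    """Collect XREF, title, and call number of every SOUR record."""
    sources, current = [], None
    for raw in gedcom_text.splitlines():
        match = GEDCOM_LINE.match(raw.strip())
        if not match:
            continue
        level, xref, tag = int(match.group(1)), match.group(2), match.group(3)
        value = match.group(4) or ""
        if level == 0:
            # A new top-level record starts; only SOUR records are kept.
            current = None
            if tag == "SOUR" and xref:
                current = {"xref": xref.strip("@"), "title": "", "call_number": ""}
                sources.append(current)
        elif current is not None:
            if level == 1 and tag == "TITL":
                current["title"] = value
            elif level == 2 and tag == "CALN":
                current["call_number"] = value
    return sources


def finding_aid_csv(sources):
    """Return a CSV overview, ordered by call number (plain string sort)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Call number", "Title", "XREF"])
    for src in sorted(sources, key=lambda s: s["call_number"]):
        writer.writerow([src["call_number"], src["title"], src["xref"]])
    return out.getvalue()


if __name__ == "__main__":
    print(finding_aid_csv(read_sources(SAMPLE)))
```

The resulting CSV would still need to be reshaped to whatever template a tool such as AtoM expects; that mapping is not covered here.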
- hermann
- Offline
- Elite Member
I have two different types of archives: physical and virtual. The first consists of two boxes and a bookshelf at home, the second of a folder on my NAS (file server) and my e-mail archive. At the moment my archive box is structured so that the newest or most recently used documents are on top. So point 1 from your list is on my to-do list. For my virtual archives, I'm a fan of flat storage, i.e. no directory structure. There are only two folders, "done" and "to-do". I prefer tagging all documents because there are so many different views on the photos and documents; a directory structure supports only one view, and in my opinion this is not flexible enough. To find something, it is easier for me to search than to follow a predefined directory structure.
Your point 1: I have to do this for my archive boxes, but I will not do it for my virtual archives.
Your points 2, 3, 4, and 9: done
Regarding your further steps: I just had a look at AtoM, an interesting tool! Thank you for this hint.
I had never heard about www.archivesportaleurope.net , but when I just opened it, I found some very interesting links to documents in an archive I didn't know about. Cool!
Hermann
Designer of the custom module "Extended Family"
webtrees 2.1.20 (all custom modules installed, PHP 8.2, MariaDB 10.6) @ ahnen.hartenthaler.eu
- Jefferson49
- Topic Author
- Offline
- Junior Member
- Posts: 224
For my virtual archives, I'm a fan of flat storage, i.e. no directory structure. There are only two folders "done" and "to-do". I prefer tagging all documents because there are so many different views regarding the photos and documents and a directory structure supports only one view and this is in my opinion not flexible enough. To find something it is easier for me to search instead of following a predefined directory structure.
The point about multiple views on the virtual part of the archive is a really good argument. I have to think about this. Maybe it is also possible to use a combination, i.e. one view by directory and further views by additional attributes.
From your experience: do you have suggestions for a standard approach to document attributes, e.g. using PDF or JPEG attributes together with a freely available search tool?
- Jefferson49
- Topic Author
- Offline
- Junior Member
- Posts: 224
I had never heard about www.archivesportaleurope.net , but when I just opened it, I found some very interesting links to documents in an archive I didn't know about. Cool!
The archive portals really offer great possibilities. Since more and more archives have digital catalogs, they can share them and make them searchable for a wider audience.
There is also an interesting archive portal in Germany: www.archivportal-d.de
- miqrogroove
- Offline
- New Member
- Posts: 88
Other than photos, all "archives" have been placed via Apache Alias directive under "/lib" and meaningful subdirectories. In essence, the document library lies outside of webtrees.
Where each item is referenced in webtrees, I use a markdown link or links directly to the items.
To help with semantics and cross referencing, I place the markdown within a shared note when needed.
Pros:
- Native index browsing.
- Intuitive URL structure.
- Intuitive file-based management.
- URLs can be permanent, and would not change if webtrees changes.
- Markdown links survive the export process.
- Shared notes provide rich metadata and cross referencing.
Cons:
- MD Links are a one-way mechanism, so there's no direct navigation from the index browser back into webtrees.
- The webtrees media firewall is bypassed, which might or might not be desirable.
- The webtrees media and thumbnail system are stored and managed separately, which might or might not be desirable.
- Jefferson49
- Topic Author
- Offline
- Junior Member
- Posts: 224
According to the manual, CollectiveAccess provides sophisticated import/export features and web based APIs, which might offer possibilities for data exchange with webtrees and GEDCOM.
After spending significant time on a test installation of AtoM, the list of installation requirements for CollectiveAccess does not encourage me to try it right away. Unfortunately, these archive and media management systems are much more complex than webtrees. I will put it on my list for a weekend when I am in the mood for new experiments.
I would also be very interested to hear whether anyone has experience with it.
- miqrogroove
- Offline
- New Member
- Posts: 88
I haven't looked under the hood of the front end system yet.
I'm not going to need import features because I haven't documented my archives in any way yet. I do wonder, though, how easy or difficult it would be to set up common IDs or call numbers for webtrees and an archive system so they can deep link to each other without duplication of labor.
- miqrogroove
- Offline
- New Member
- Posts: 88
Just by looking at the default setup.php settings, I get an impression that the "front end" and "back end" applications are entirely separate scripts and intended to run on separate websites (sub-domains, maybe??). They would share a database and file storage, but the site-level settings could be different for each.
The "back end" documentation says to install it "to the root of the web server instance". So I don't think it would be good to try to put it on the same site as webtrees. And now that means we're talking about running 3 totally separate websites.
Edit: Curve ball. There's a relative root path variable mentioned in app/helpers/post-setup.php of both packages that says, "attempts to determine the relative URL path automatically." It wasn't mentioned in the documentation. This should offer plenty of flexibility.
- miqrogroove
- Offline
- New Member
- Posts: 88
The back end docs are here:
manual.collectiveaccess.org/setup/Installation.html
But the front end docs currently live in a separate place that I couldn't find without help from multiple other resources.
web.archive.org/web/20210214194551/https...nstalling_Pawtucket2
It makes so much sense now.
- miqrogroove
- Offline
- New Member
- Posts: 88
The direction I've taken is slightly different from before.
After setting up CollectiveAccess, I noticed it does a very good job of maintaining hierarchical "collections" (repositories and sources in gedcom jargon) as a separate table where they can be managed independently of other data.
As I add "objects" to the archive, they are assigned an autonumbered URL that is also independent of the object data. So I am continuing the use of Shared Notes in webtrees, and simply adding a markdown link to the archive object's URL, then using the shared note on individual source citations.
Within CollectiveAccess, I figured out how to add a list of links to the object editor, so I can also manually link an object back to the webtrees individuals. This part might be possible to automate as it is a minor duplication of info. However, I use the same feature to store links to external sites where I've found microfilm images and indexes and such.
- Jefferson49
- Topic Author
- Offline
- Junior Member
- Posts: 224
As I add "objects" to the archive, they are assigned an autonumbered URL that is also independent of the object data. So I am continuing the use of Shared Notes in webtrees, and simply adding a markdown link to the archive object's URL, then using the shared note on individual source citations.
Within CollectiveAccess, I figured out how to add a list of links to the object editor, so I can also manually link an object back to the webtrees individuals. This part might be possible to automate as it is a minor duplication of info. However, I use the same feature to store links to external sites where I've found microfilm images and indexes and such.
First of all, I think that manual linking (with URLs between webtrees and back to the archive management tool) is a good and pragmatic way. For private genealogical archives and smaller number of items/sources, the additional effort to insert the links is probably acceptable. And the resulting overall solution with linking forward and backward, is already very promising.
Regarding (partly automated) export/import, I tested the following approach with AtoM, which might also apply to CollectiveAccess. CA even seems to have a more sophisticated import interface. For AtoM, I tested the following process:
- Gedcom export of repository and sources from webtrees
- Gedcom to EXCEL converter, e.g. GedTool
- Some EXCEL transformation to the required import format
- Use import features of archive management system
However, while this showed the basic feasibility of export/import, the process is apparently too complicated.
While CollectiveAccess and AtoM seem to focus on specific EXCEL/CSV formats, it would be better to use a standard format that works with different archive management systems. What is available is EAD XML. In the documentation of both CollectiveAccess and AtoM, I found that EAD XML can be imported.
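To give a rough idea of what an EAD export from source data might look like, here is a minimal Python sketch. It is only an assumption-laden illustration: it uses a handful of EAD 2002 element names (`ead`, `archdesc`, `did`, `unitid`, `unittitle`, `dsc`, `c`) and leaves out the header and everything else a schema-valid file would need, so the output is not expected to import into AtoM or CollectiveAccess as is. The function name `build_ead` and the input dictionaries are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_ead(archive_title, sources):
    """Return a skeletal EAD-like XML string for a repository and its sources.

    Each source is a dict with the keys 'xref', 'title', and 'call_number'
    (a hypothetical structure chosen for this sketch).
    """
    ead = ET.Element("ead")
    archdesc = ET.SubElement(ead, "archdesc", level="collection")
    did = ET.SubElement(archdesc, "did")
    ET.SubElement(did, "unittitle").text = archive_title

    dsc = ET.SubElement(archdesc, "dsc")
    for src in sources:
        # The webtrees XREF is carried along as the component id, so it
        # could serve as a unique identifier on re-import.
        component = ET.SubElement(dsc, "c", level="item", id=src["xref"])
        component_did = ET.SubElement(component, "did")
        ET.SubElement(component_did, "unitid").text = src["call_number"]
        ET.SubElement(component_did, "unittitle").text = src["title"]
    return ET.tostring(ead, encoding="unicode")


if __name__ == "__main__":
    example = [{
        "xref": "S1",
        "title": "University Degree Henry Miller from 1920",
        "call_number": "/Biogr/Mil/MilHe/Nr. 5",
    }]
    print(build_ead("My Archive", example))
```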
- miqrogroove
- Offline
- New Member
- Posts: 88
Edit: I just saw there's a NOTE:REFN tag that might be appropriate to contain call numbers. In Gedcom 7 it changes to SNOTE:REFN. I might experiment with this.
Edit 2: In the CollectiveAccess front end, one alternative URL structure is /MultiSearch/Index?search=<identifier> which will redirect to the object URL. But it works in some situations and not others. I would need a more specific call number redirector. In webtrees, the NOTE:REFN field is not even displayed on individual facts/events. It would require some custom code to make it show up and point to the archive search address.
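A call number redirector of this kind could start from something like the following Python sketch, which builds the /MultiSearch/Index?search=<identifier> address and wraps it in a markdown link for a shared note. The base URL and the function names are assumptions for illustration, and as noted above the search address itself does not redirect reliably in every situation.

```python
from urllib.parse import quote


def archive_search_url(base_url, identifier):
    """Build the front end search address for an identifier or call number."""
    # Percent-encode everything, including '/', so that call numbers
    # containing slashes or spaces stay inside the query value.
    return base_url.rstrip("/") + "/MultiSearch/Index?search=" + quote(identifier, safe="")


def markdown_link(label, url):
    """Return a markdown link as it could be placed in a shared note."""
    return f"[{label}]({url})"


if __name__ == "__main__":
    # Hypothetical base URL of the archive front end (assumption).
    url = archive_search_url("https://archive.example.org/", "Biogr/Nr. 5")
    print(markdown_link("Archive object", url))
```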
- Jefferson49
- Topic Author
- Offline
- Junior Member
- Posts: 224
Edit: I just saw there's a NOTE:REFN tag that might be appropriate to contain call numbers. In Gedcom 7 it changes to SNOTE:REFN. I might experiment with this.
Edit 2: In the CollectiveAccess front end, one alternative URL structure is /MultiSearch/Index?search=<identifier> which will redirect to the object URL. But it works in some situations and not others. I would need a more specific call number redirector. In webtrees, the NOTE:REFN field is not even displayed on individual facts/events. It would require some custom code to make it show up and point to the archive search address.
For call numbers in Gedcom, there is a CALN tag within the SOURCE_REPOSITORY_CITATION.
A source record may look like this:
A (partly) automated synchronisation process between webtrees and an archive management tool (AMT) could be designed as follows:
- Export EAD XML from webtrees, which contains the data from a Gedcom repository with the described source records, and which also contains the XREFs of the sources and the webtrees rest URLs to the sources
- Import EAD XML to the AMT; in the easiest case, only new sources would be imported and existing sources would be skipped. The source XREFs could be used as unique identifiers
- The AMT needs to transfer the webtrees URLs to a suitable data structure in the AMT source records and show the webtrees link in the AMT front end
- ... do some archive work in the AMT ...
- Export EAD XML from the AMT, which contains a rest URL to the sources in the AMT
- Import EAD XML to webtrees
- Run a data fix service in webtrees to update call numbers (CALN), add the AMT URLs to Gedcom notes, and link the Gedcom notes to the Gedcom sources. The data fix service would require some intelligence for syncing/updating
Alternatively to using Gedcom notes, AMT URLs might also be appended to the call numbers, e.g.:
2 CALN my_amt_call_number, http://my_amt_url
webtrees recognizes URLs within call numbers and the link can be directly clicked in the webtrees front end.
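A data fix service or export tool would have to take such a combined value apart again. A minimal Python sketch of that step, assuming the "call number, URL" layout shown above (the function name `split_call_number` is hypothetical):

```python
import re

# Matches the first http:// or https:// address in the value.
URL_PATTERN = re.compile(r"https?://\S+")


def split_call_number(caln_value):
    """Split a CALN value into (call number, URL or None)."""
    match = URL_PATTERN.search(caln_value)
    if match is None:
        return caln_value.strip(), None
    # Everything before the URL is the call number; drop the separator.
    call_number = caln_value[:match.start()].rstrip(" ,")
    return call_number, match.group(0)


if __name__ == "__main__":
    print(split_call_number("my_amt_call_number, http://my_amt_url"))
    print(split_call_number("/Biogr/Mil/MilHe/Nr. 5"))
```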
- miqrogroove
- Offline
- New Member
- Posts: 88
My gedcom has always used source records for the series level collections. That's why I'm using source citations and shared notes to represent most of the objects.
Also, same concern applies to CALN where webtrees does not pull that value into the facts/events tab. This only happens with the PAGE value in citations, which in turn doesn't apply to notes. That leads me back to the need for either markdown or custom code.
- Jefferson49
- Topic Author
- Offline
- Junior Member
- Posts: 224
You're generating a "source record" in webtrees for every "object" in the archive? My gedcom has always used source records for the series level collections.
Basically, I create one source for one object. However, there is always room for specific decisions, e.g. whether to define a physical folder with certificates as one source or to define each certificate in the folder as a separate source. Overall, I have 470 sources in my webtrees database, of which 180 belong to my private archive.
That's why I'm using source citations and shared notes to represent most of the objects.
In my case, it seems to be very similar. I also use source citations for linking sources with Gedcom facts, individuals, etc. Indeed, I consider this the major advantage of using sources in webtrees. Overall, I count about 1700 source citations in my webtrees database.
At the moment, I do not use shared notes.
Also, same concern applies to CALN where webtrees does not pull that value into the facts/events tab.
I guess your point is about accessibility of the links in the webtrees front end and that links to an external archive management tool should be directly visible/clickable. What we would like to see is a tag in the facts/events views with a direct http:// link.
I understand your point that CALN is not shown in facts/events. It is only indirectly available by opening the link to the source first and then clicking on the call number. If the call number contains the URL, it is 2 clicks away.
Have you found a way to get a one click solution with shared notes? I did some testing in the facts/events tabs. If inserting a shared note close to a source, it is possible to click the link to the shared note. Afterwards, I need a second click to open an external URL.
- miqrogroove
- Offline
- New Member
- Posts: 88
Have you found a way to get a one click solution with shared notes? I did some testing in the facts/events tabs. If inserting a shared note close to a source, it is possible to click the link to the shared note. Afterwards, I need a second click to open an external URL.
Yes, there are several ways to use shared notes. For example, if the note is attached to a fact, or attached to a citation on a fact, it will appear in the facts/events tab. The title of the shared note is clickable, but there is also an expandable arrow next to the note that will display the text inline.
Taking this a step further, in the Control Panel > Tree (manage) > Preferences, there is a subheading "Individual pages". Under that, there are two preferences named "Automatically expand notes" and "Automatically expand sources". With these set to "yes" there will be no clicks needed to see the shared note. For cosmetic reasons, I set the first one to "yes" and the second to "no".
- Jefferson49
- Topic Author
- Offline
- Junior Member
- Posts: 224
Also, the first line of a webtrees shared note is supposed to be a title followed by two linefeeds. If you're putting a URL in the first line it might look strange.
Thank you for the additional information. After changing the settings in the control panel, I was able to show and click the shared notes. For providing a directly accessible link in the facts/events area, this seems to be the best option. From my first tests, I would prefer to put the link directly in the shared note title, because this is the option with the least text.
However, while thinking about the resulting data structures, adding shared notes to the source citations raises some doubts from the Gedcom data model point of view: The URL to a source should be directly assigned to the source itself and not to the source citations, which might multiply the information. It seems that assigning it to the citation is only necessary to see the link in the webtrees front end for facts/events.
Placing the information in SOUR:REPO:CALN, SOUR:REPO:NOTE, or SOUR:REFN seems to be more suitable. However, the drawback would be indirect linking from facts/events (i.e. open link to source first, then follow link to URL).
To get more insight, I also want to learn more about the effort for showing additional information in the front end.