Please do NOT expect all Feature Requests to be actioned automatically. Describing your proposal here will ensure the development team are aware of it, and they will give it careful consideration.

TOPIC: Extended Census Assistant

Extended Census Assistant 2 months 3 days ago #1

  • bluesky
  • Offline
  • New
  • Posts: 2
I'm a developer, and I have actually started development on what I propose here. I'm just curious whether it aligns with the goals of the software, and whether anything like this has been attempted previously.
I came to this when I wanted to start entering census records and looked for a convenient and natural way to coordinate the multiple people on a census record. The Census Assistant provides just that.
But then I began wondering about various ways to extend the Census Assistant (and as mentioned, developed it a bit).
The first step is to automatically create various facts based on the census. If the census marks an age or place of birth, you can create a birth record. If it marks immigration date, you can create an immigration record. Etc. It should probably be able to create individuals as well if the individual was not previously known.
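To make the first step concrete, here is a minimal sketch in Python (for illustration only, not Census Assistant code). The function name `birth_fact_from_census` is hypothetical. It estimates a birth year from the census year and the recorded age and emits it as a calculated (CAL) date, since a census age only pins the birth year down to within about a year.

```python
def birth_fact_from_census(census_year, age, birthplace=None):
    """Return GEDCOM lines for a BIRT fact estimated from one census entry.

    Hypothetical helper: the birth year is simply census_year - age, and it
    is marked CAL (calculated) because the real year may be one earlier.
    """
    lines = ["1 BIRT", f"2 DATE CAL {census_year - age}"]
    if birthplace:
        # Only add a place when the census actually recorded one.
        lines.append(f"2 PLAC {birthplace}")
    return lines


# A person aged 33 in the 1930 census, born in Russia.
print("\n".join(birth_fact_from_census(1930, 33, "Russia")))
```

The same pattern would apply to an immigration year, producing an IMMI fact instead of BIRT.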
The next step is to adapt the Census Assistant to other similar records, such as ship manifests. So instead of being just for censuses, it could also be used for immigration or emigration records.
Finally, other forms such as naturalization, birth, marriage, and death records could be adapted as well. It could become a generalized form assistant that automatically creates the appropriate records based on the data provided.
Because all of this is so useful (at least, it sounds very useful to me), I wonder if it was done previously or if there are additional considerations to take into account.
The administrator has disabled public write access.

Extended Census Assistant 2 months 3 days ago #2

  • fisharebest
  • Online
  • Administrator
  • Posts: 10445
1) Sounds interesting.
2) I'm not aware of anything similar that has been done before.

I didn't write the original census-assistant. If I had, there's one thing I *might* have done differently. I prefer to use GEDCOM structure wherever possible, and I think we might have been able to link the household members using ASSO links and store the transcript in a source (or source-citation) instead of a note object. The note object provides no semantics, and cannot be processed in an automated way.

As a genealogist, I'd like to see any automatically generated fields properly sourced. So, if you are creating an OCCU record based on a census, then I would like to see the OCCU containing the appropriate SOUR citation.
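As a rough illustration of what a sourced, generated fact could look like, here is a Python sketch that builds the GEDCOM lines. The helper name `sourced_fact` and the source identifier `S1` are assumptions made for the example, not part of webtrees.

```python
def sourced_fact(tag, value, source_xref, page=None):
    """Return GEDCOM lines for a level-1 fact carrying a SOUR citation.

    Hypothetical helper: source_xref is the id of an existing SOUR record
    (without the @ signs); page is optional "where within source" text.
    """
    lines = [f"1 {tag} {value}", f"2 SOUR @{source_xref}@"]
    if page:
        lines.append(f"3 PAGE {page}")
    return lines


# An occupation taken from a census, cited to the (assumed) census source S1.
print("\n".join(sourced_fact("OCCU", "proprietor", "S1")))
```

Every generated BIRT, OCCU or IMMI fact would go through the same helper, so none of them can be created without a citation.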

Slightly off-topic. Other digital representations of genealogy data allow a hierarchy of sources. GEDCOM has simply SOUR/REPO. This encourages us to have a single source "1881 census", and each household census becomes a citation within it. If, instead, the household census was a standalone source (and part of a larger source), then we would have a single place to store the transcript of the household census (i.e. a SOUR record), and could use it easily as a citation for all the BIRT/OCCU/IMMI/etc. facts that it identifies - including the CENS event.
Greg Roach - fisharebest.webtrees.net

Extended Census Assistant 2 months 3 days ago #3

  • bluesky
  • Offline
  • New
  • Posts: 2
Ok, so first of all, yes, I added sourcing to the note. You have the option in the dialog box of the assistant to specify the source, and if you do, then your generated note is sourced to the source you specified.
I would have liked it to be gedcom. The problem is that gedcom does not support top-level events. In fact, I came to the Census Assistant after experimenting with gedcom on webtrees to represent censuses in my research. This is, in my view, a shortcoming that we have to live with.
(It supports events and even specifically mentions census events on families, but censuses often include multiple families, such as a son and his wife, or the wife's in-laws, or cousins, etc.)
What the shared note gives us is that it is both shared and top-level.
What we could do is make the content of the generated note gedcom. A person can't really read a census-assistant note well anyway. It isn't even a CSV.
So in this new census-assistant, what could be done is something like this pseudo-code gedcom:
0 @I1@ INDI
1 NAME Juda /ASIMOV/
1 BIRT
2 DATE CAL 1897
3 NOTE @N2@
2 PLAC Russia
3 NOTE @N2@
2 DATE CAL 1897
3 NOTE @N3@
2 PLAC Russia
3 NOTE @N3@
1 CENS
2 TYPE Form-Assisted
2 NOTE @N17@

0 @N2@ NOTE
1 CONT 0 CENS 1930 Census of Juda Asimov's Household
...
1 CONT 1 ASSO @I1@
1 CONT 2 NAME Juda /ASIMOV/
1 CONT 2 RELA Head
1 CONT 2 SEX M
1 CONT 2 AGE 33
1 CONT 2 PROP Rented home (rent $90)
...
1 CONT 2 OCCU proprietor
1 CONT 3 AGNC candy shop
1 CONT 2 TEXT "Juda Asimov",R,90,,no,M,W,33,M,22, ...
1 CONT 1 ASSO ...
1 REFN Form-Assisted

(Based on the 1930 census of Isaac Asimov - familysearch.org/ark:/61903/1:1:X4VB-GH4 - The full generated output would include the entire family.)

Here, the note is actually in gedcom, so its own text can then be re-parsed as gedcom.
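A rough sketch of that re-parsing, in Python and assuming each embedded line is stored as `1 CONT <level> <tag> <value>` as in the pseudo-code above. `parse_embedded_gedcom` is a hypothetical helper, not an existing webtrees function.

```python
def parse_embedded_gedcom(note_record):
    """Extract (level, tag, value) tuples from the CONT lines of a note."""
    parsed = []
    for raw in note_record.splitlines():
        line = raw.strip()
        if not line.startswith("1 CONT "):
            continue  # record header, REFN, etc. are not embedded gedcom
        payload = line[len("1 CONT "):]
        level, _, rest = payload.partition(" ")
        if not level.isdigit():
            continue  # skip anything that is not a "<level> <tag> ..." line
        tag, _, value = rest.partition(" ")
        parsed.append((int(level), tag, value))
    return parsed


note = """0 @N2@ NOTE
1 CONT 0 CENS 1930 Census of Juda Asimov's Household
1 CONT 1 ASSO @I1@
1 CONT 2 NAME Juda /ASIMOV/
1 CONT 2 AGE 33
1 REFN Form-Assisted"""

for level, tag, value in parse_embedded_gedcom(note):
    print(level, tag, value)
```

The resulting tuples are enough to rebuild the household form for editing, or to find the facts that were generated from it.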
The above also includes my suggestion for handling conflicting dates or places of birth, marriage, etc. that might be discerned from the censuses. I saw a blog post about this a few years ago, and I think the best way to handle it is as multiple values inside the relevant event rather than as multiple events, even when multiple events are possible. (This also represents the real-world situation more accurately: you have one event, and multiple values based on different sources.)
In the above, there are multiple elements for the same values (CAL 1897, Russia). These can be merged either in the gedcom itself or only when it is displayed to the user. Keeping them separate in the gedcom makes them easier to interpret using code. Linking these facts to the note also makes it possible to edit the census later: if the census is edited and the facts change, the old facts can be searched out and replaced with the new ones.
Then display is a matter of doing something user-friendly. If we have an exact birth date in a birth record, we don't even need to show the census values. If no exact birth source is available and two censuses give the same birth year, then show it only once and mark it "Based on the 1930 Census of Juda Asimov's Household and the 1940 Census of Juda Asimov's Household." The separate values are still in the gedcom, but hidden unless you look at the raw gedcom.
Also in the above, the birth elements aren't directly sourced, but the census note itself can be sourced. Doing it this way can also sort of re-appropriate the note as another source level. For example:
repository - Ellis Island Manifests
source - 13 Feb 1923 Baltic Manifest
note - Juda Asimov's immigration record in that manifest
or:
repository - 1930 US Census
source - 651 Essex St Brooklyn NY
note - Juda Asimov's household in that address
Perhaps this is not the intended use of repository, but for me, it seems practical.
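Going back to the display rule for census-derived birth values, a minimal Python sketch could look like this. The function name `birth_display` and the input shape (a list of value/census-title pairs) are assumptions made for illustration.

```python
def birth_display(census_claims, exact_date=None):
    """Decide what to show for a birth date.

    census_claims: list of (value, census_title) pairs read from the gedcom.
    exact_date: a date from an actual birth record, if one exists.
    """
    if exact_date is not None:
        # An exact birth record wins; census estimates stay hidden.
        return [exact_date]
    grouped = {}
    for value, census_title in census_claims:
        # Identical values from different censuses are shown only once.
        grouped.setdefault(value, []).append(census_title)
    shown = []
    for value, titles in grouped.items():
        based_on = " and ".join("the " + title for title in titles)
        shown.append(f"{value} (Based on {based_on}.)")
    return shown


claims = [
    ("CAL 1897", "1930 Census of Juda Asimov's Household"),
    ("CAL 1897", "1940 Census of Juda Asimov's Household"),
]
print("\n".join(birth_display(claims)))
```

Conflicting values would simply come out as separate lines, each with its own list of censuses.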

Anyway, all of the above are basically suggestions. The purpose is to create something that is workable and acceptable.