Web based family history software

Question Date Display Formats for Ranges and Fuzzy Dates

1 year 4 months ago #1 by nilsonj
I'm very happy to have discovered that GEDCOM (and of course webtrees) supports fuzzy dates. That has made data entry much easier as I reach parts of my own family tree with less certainty. I now frequently use "about", "before", "after", "from" and "to", but I've noticed two things that make this a bit less functional on the front end:

1) webtrees still has some debates about sorting logic to work out, so I'll leave that alone for now. Suffice it to say that there's no perfect solution.
2) Often these fuzzy bits of logic are NOT displayed in the individual view where they would be highly useful to expose. For instance:
- if I store "BEFORE 1950", the interface will display "1950"
- if I store "AFTER 1950", the interface will again display "1950"
- if I store "BEFORE 1950 AFTER 1940", the interface will display "1950"
- if I store "AFTER 1940 BEFORE 1950", the interface will display "1940"
- if I store "FROM 1940 TO 1950", the interface will correctly display "from 1940 to 1950" and sort according to the first of the two dates.

Bottom line, there are too many cases to list exhaustively here, but I'd like for the interface to expose a more accurate view of the stored fuzzy dates and ranges so that the information is useful without resorting to the edit view. I'd like to be able to see, directly, "after 1940 and before 1950" or "between 1940 and 1950", etc. If you're looking at reworking this section, the next level feature here might be to include a visual distinction for dates which are specified by "ABOUT" or other fuzzy logic so that they can be more exactly specified when noticed.

1 year 4 months ago #2 by norwegian_sardines
Are you actually entering:
"AFTER 1940 BEFORE 1950"

This is not valid GEDCOM.

You should enter:

BET 1940 AND 1950

Ken

1 year 4 months ago #3 by norwegian_sardines
Please enter “AFT 1950”, “BEF 1940”

These are valid GEDCOM.

Ken

1 year 4 months ago #4 by nilsonj
Respectfully, I’d think these simple allowances would be handled by the UI. Excuse my lack of pedantry. It does display appropriate expansions in part, but only one half of the BEF/AFT pair when using the GEDCOM tokens directly. The UI so graciously transforms most other tokens in the date field that I never actually checked whether it was storing the full expansions “BEFORE” or “AFTER” instead of the native GEDCOM tokens BEF or AFT. Let’s just stop for a minute and consider that this is probably not what the end user should have to focus on. And since we are splitting these hairs with a meat cleaver, we can just chalk this up as a third feature request to go with my original post:

3) Translational expansion/contraction of common-language tokens to native GEDCOM syntax, in the same way that we see labels like “individual” in the interface and know that the program handles translating that to the GEDCOM spec “INDI”, i.e. “BEF” from “BEFORE”, etc. Whenever possible, tokens like these make the most sense in longhand rather than shorthand to the common user, who presumably uses this software because hand-writing a massive GEDCOM file in a text editor and sharing it with friends via newsgroups or a dial-up BBS wasn’t his or her cup of tea, but que sera sera.

1 year 4 months ago - 1 year 4 months ago #5 by bertkoor
I think I proposed this a decade ago already. Dates can be displayed in a compact notation:

BEF 1900 --> <1900
AFT 1900 --> >1900
ABT 1900 --> ~1900
EST 1900 --> ?1900
BET 1900 AND 1920 --> 1900-1920
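
For anyone who wants to experiment with this notation, here is a minimal sketch in Python. It is not webtrees code (webtrees is written in PHP), and the function name `compact_date` is made up for illustration. It only covers the five forms listed above and passes everything else through unchanged.

```python
import re

# Compact symbols proposed in the list above; the names here are hypothetical.
_PREFIX_SYMBOLS = {"BEF": "<", "AFT": ">", "ABT": "~", "EST": "?"}


def compact_date(gedcom_date: str) -> str:
    """Return a compact display form of a GEDCOM date value."""
    text = " ".join(gedcom_date.split())  # collapse repeated whitespace

    # BET <date> AND <date> becomes <date>-<date>
    between = re.fullmatch(r"BET (.+) AND (.+)", text)
    if between:
        return f"{between.group(1)}-{between.group(2)}"

    # BEF/AFT/ABT/EST <date> becomes <symbol><date>
    keyword, _, rest = text.partition(" ")
    if keyword in _PREFIX_SYMBOLS and rest:
        return _PREFIX_SYMBOLS[keyword] + rest

    # Exact dates and unrecognised forms pass through unchanged
    return text
```

For example, `compact_date("BEF 1900")` gives `<1900` and `compact_date("BET 1900 AND 1920")` gives `1900-1920`.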

stamboom.BertKoor.nl runs on webtrees v2.1.20
Last edit: 1 year 4 months ago by bertkoor.

1 year 4 months ago - 1 year 4 months ago #6 by norwegian_sardines
Nilsonj,

I don’t disagree that it would be “nice” if you could enter “BEFORE” rather than “BEF”, but then the next person will ask for language specific terms as well such as “FØR” in Norwegian.

In the current version, “< 1940” does work. There should be a space between the two parts!

As does “> 1940” and “~ 1940”
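
As a rough illustration of how such entry shortcuts could be expanded, here is a small Python sketch. The mapping of symbols to keywords is inferred from this thread and is an assumption, as is the function name `expand_shortcut`; it is not the actual webtrees implementation. Unlike the behaviour described above, this sketch also tolerates a missing space after the symbol.

```python
import re

# Assumed mapping, inferred from the shortcuts mentioned in this thread.
_SHORTCUTS = {"<": "BEF", ">": "AFT", "~": "ABT"}


def expand_shortcut(entry: str) -> str:
    """Expand a leading '<', '>' or '~' into the matching GEDCOM keyword."""
    text = entry.strip()
    # One shortcut symbol, optional whitespace, then the date itself
    match = re.fullmatch(r"([<>~])\s*(\S.*)", text)
    if match:
        return f"{_SHORTCUTS[match.group(1)]} {match.group(2)}"
    return text  # nothing to expand
```

With this, `expand_shortcut("< 1940")` returns `BEF 1940`, and an entry without a shortcut symbol is returned as typed.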

Ken
Last edit: 1 year 4 months ago by norwegian_sardines.

1 year 4 months ago #7 by bertkoor
I was talking about display of dates in compact ways.
Has the discussion shifted to data entry? I wasn't aware. There are data entry shortcuts available. These were documented in a help dialog. Has that been removed?

1 year 4 months ago #8 by nilsonj
Fair, though not satisfying. I currently understand the spec to include before and after in the same entry, or does this have to be entered as a “between … and” string instead?

Incidentally in reply to your initial point, the shorthand strings were chosen in English. Why stop there? Shall we localize the lexicon? :) I still think expansion entry would be a useful addition, even if it’s not “language neutral”. Obviously expansion on the display side should be localized.

1 year 4 months ago #9 by nilsonj
A bit wandering, yes.

Aware of the shorthand mentioned, as well as using a hyphen for ranges, which is quite handy.

1 year 4 months ago #10 by hermann
In my opinion, it is not a good idea to translate or convert user input. For example, if a source says that an event happened "just before Easter 1950", this should be entered as "INT 8 APR 1950 (just before Easter 1950)". If an algorithm changes "before" to "BEF", this might unintentionally become "INT 8 APR 1950 (just BEF Easter 1950)". OK, this can be programmed to avoid such special cases.

It would be helpful if the user gets a warning when entering text other than the "normal" keywords.

You can find a description of how to enter date values in the German webtrees manual.

Hermann
Designer of the custom module "Extended Family"

webtrees 2.1.21 (all custom modules installed, PHP 8.3.12, MariaDB 10.6) @ ahnen.hartenthaler.eu

1 year 4 months ago #11 by norwegian_sardines
Bert,

Yes this topic is about display, but the OP entered invalid GEDCOM and expected the display to understand the incorrect entry and display the desired output.

I was pointing out that when the entry is made correctly the desired output is achieved!

As far as your suggestion for display of “> 1940” when entering “after 1940”, I have no opinion at this time because I have not thought about it!


“Fair, though not satisfying. I currently understand the spec to include before and after in the same entry, or does this have to be entered as a “between … and” string instead?”

Yes the correct date-range would be “BET <date> AND <date>”. You can only select one of the possible DATE_RANGE options!

Ken

1 year 4 months ago #12 by hermann

“As far as your suggestion for display of “> 1940” when entering “after 1940”, I have no opinion at this time because I have not thought about it!”

For me, it is OK as it is in text-oriented tables, i.e. using translated words. But in diagrams space is important, so, for example, the custom module GVExport uses mathematical signs instead of words. To me that looks great and is understandable in any language (I hope).

Hermann

1 year 4 months ago #13 by nilsonj
Well this certainly went sideways. :) Thanks, Ken, for the direction toward my goal. The GEDCOM compliant tokens do work as programmed. The rest is a philosophical conversation.

For anyone following this thread and scratching their head about what is allowed or why, the formal grammar of GEDCOM 5.5.5 is found here: www.gedcom.org/gedcom.html ... and specifies on page 85 indeed that the DATE phrase for approximations can have only one of the following conventions:
ABT|EST|CAL <DATE>
BEF <DATE>
AFT <DATE>
BET <DATE> AND <DATE>
... and for ranges ...
FROM <DATE> TO <DATE>
... additionally there are several comments about implementing DATE_PHRASE which are almost immediately deprecated, but I digress since they are specifically not interpreted.

So other casual language like "circa" or before-and-after pairs will not be part of the formal grammar, and are therefore *not allowed in the GEDCOM representation of the data*. The programmer in me will now say "Yep, I'm doing it wrong!" and recant my original request for "appropriate display of stored dates" because, as Ken pointed out, those were not appropriate syntax.
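
To make the "only one of these forms" rule concrete, here is a minimal Python sketch that checks a date string against the forms listed above. It is an illustration under stated assumptions, not the webtrees parser: `<DATE>` is simplified to an optional day and month followed by a year, and the names `classify_date` and `_PATTERNS` are hypothetical. A plain date with no keyword is accepted as well.

```python
import re

# Simplified <DATE>: optional day and month, then a year (Gregorian months only).
# This is an assumption for illustration; the full grammar covers more cases.
_MONTHS = "JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC"
_DATE = rf"(?:(?:\d{{1,2}} )?(?:{_MONTHS}) )?\d{{3,4}}"

# One pattern per form listed in the post above, plus a plain date.
_PATTERNS = {
    "exact": rf"{_DATE}",
    "approximated": rf"(?:ABT|EST|CAL) {_DATE}",
    "before": rf"BEF {_DATE}",
    "after": rf"AFT {_DATE}",
    "between": rf"BET {_DATE} AND {_DATE}",
    "period": rf"FROM {_DATE} TO {_DATE}",
}


def classify_date(value: str):
    """Return the name of the matching form, or None if nothing matches."""
    for name, pattern in _PATTERNS.items():
        if re.fullmatch(pattern, value):
            return name
    return None
```

Under these assumptions, `classify_date("BET 1940 AND 1950")` returns `"between"`, while `classify_date("AFT 1940 BEF 1950")` and `classify_date("BEFORE 1950")` both return `None`, which is exactly the mistake I was making.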

NOW ... philosophically, the non-programmer (and part of the UI designer) in me will say that the data input interface is made better (not worse) by allowing more natural language interpretation and converting it to the appropriately stored GEDCOM format so people can be people and data can be data. I'll stop beating that horse now, as it's essentially what this feature request has become with appropriate correction and direction à la Ken. Perhaps the other angle to look at here would be a better UI hint when entering dates that shows what the parser sees and what it might suggest instead. The text below the input box is the static representation of the prior stored value. It could easily be taken as the live parsed version of what's being entered, so making it live would be a significant gain in feedback.
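
As a rough sketch of what such a hint could compute, here is a small Python function that turns English longhand into GEDCOM tokens and proposes "BET ... AND ..." for an after/before pair. Everything in it is hypothetical (the word list, the name `suggest_gedcom`, the English-only vocabulary), and it is not how webtrees works today. Text inside parentheses is left untouched, which addresses the "just before Easter" case raised earlier in the thread.

```python
import re

# Hypothetical English longhand map; not something webtrees is known to ship.
_LONGHAND = {
    "BEFORE": "BEF", "AFTER": "AFT", "ABOUT": "ABT", "CIRCA": "ABT",
    "BETWEEN": "BET", "ESTIMATED": "EST", "CALCULATED": "CAL",
}


def suggest_gedcom(entry: str) -> str:
    """Suggest a GEDCOM-style spelling for a typed date, for display as a hint."""
    # Split off parenthesised date phrases so their wording is never touched.
    parts = re.split(r"(\([^)]*\))", entry)
    converted = []
    for part in parts:
        if part.startswith("("):
            converted.append(part)  # free text: keep verbatim
            continue
        # Uppercase each word and swap longhand words for GEDCOM tokens
        words = [_LONGHAND.get(w.upper(), w.upper()) for w in part.split()]
        converted.append(" ".join(words))
    suggestion = " ".join(p for p in converted if p)

    # An AFT/BEF pair is not a valid form; propose BET ... AND ... instead.
    pair = re.fullmatch(r"AFT (.+) BEF (.+)", suggestion)
    if pair:
        suggestion = f"BET {pair.group(1)} AND {pair.group(2)}"
    return suggestion
```

With this sketch, typing `after 1940 before 1950` would produce the hint `BET 1940 AND 1950`, and `INT 8 APR 1950 (just before Easter 1950)` would come back unchanged. The stored value would still only change if the user accepts the suggestion.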

Since this has broadened in scope a bit, I should mention the following:
github.com/FamilySearch/gedcomx/blob/mas...mat-specification.md

1 year 4 months ago #14 by norwegian_sardines
“NOW ... philosophically, the non-programmer (and part of the UI designer) in me will say that the data input interface is made better (not worse) by allowing more natural language interpretation and converting it to the appropriately stored GEDCOM format so people can be people and data can be data.”

GEDCOM allows for natural language data entry; webtrees does not. The GEDCOM 5.5.1 specification uses the same field (tag) for both specific dates and date text, a bad design! If, as webtrees and other programs do, you perform calculations or use dates to organize data, having two kinds of data (controlled vocabulary, and open natural language) in the same location becomes very problematic. In GEDCOM v7 this problem has been “fixed” by having two separate tags (I’m not a fan of the implementation, but it is better).

webtrees could use a better, more structured, UI that forces the controlled vocabulary to be used, but this would mean an increased page footprint for the data entry area, and users like myself would complain about the “clicky” UI. I know what I want to enter, why force me to click and click!

webtrees already has a lot of “forgiveness” with dates. You can already key in April 1 1923 and it will convert it to proper GEDCOM. Some other entry concepts are also allowed, so you are not required to completely know GEDCOM vocabulary.

Ken

1 year 4 months ago #15 by nilsonj
Forgiveness I like. Meant to recognize that earlier if that wasn't clear. I'm not sure I'd call GEDCOM "natural language" (unless you're alluding to the DATE_PHRASE object), but I can see how that point has validity.

Not a fan of mixing data types, either. My argument here is for the data entry form (on the web frontend, not the DB or the GEDCOM storage facility) to do the parsing and correcting of aberrant entry so as to coax the user's truly natural language into something stored in the lowest common denominator, not to allow "BEFORE" and "AFTER" to make it down to the GEDCOM. As it is, the UI simply makes the entire date string uppercase and transforms different date formats into GEDCOM date format with dd MON yyyy. I wouldn't want the UI to be more structured (checkboxes for ABT, dropdowns for BEF/AFT, etc.) -- I agree, less clicky is better. I'd complain about that, too. The more time my hands are on the keyboard, the faster data entry goes. It's my mind that doesn't get used to typing BEF instead of BEFORE, especially when the interface capitalizes it and rearranges things so they look like they were parsed and accepted as valid. That's the UI hint that prevented me from understanding the failure of my data entry. That's, at minimum, the part that could be better.

Hopefully that clarifies things a bit?

Incidentally, you mention two separate tags for v7. I assume you mean to separate the DATE and DATE_PHRASE parts of the grammar? They were technically always separated by the use of parenthetical tokens, though the comments I've read suggest that software has been poorly implemented to ignore them for many years. Or were you highlighting a different aspect I'm unaware of?

Appreciate your willingness to explore the topic.

1 year 4 months ago #16 by norwegian_sardines
I’ve been entering GEDCOM syntax since the 1980s, long before webtrees, so it comes naturally to me.

Yes, I do see the date_phrase as “natural language” in that the user could enter exactly what the source says (a good thing) rather than interpret the data (problematic). This allows for the possibility of entering data in the primary language or source info like, “born on Christmas the year after they were wed.”

Obviously, this can’t be used in a calculation of age, and during data collection I may not yet know the date of the wedding, so the date information must be recorded as a NOTE until the wedding date is determined.

I do understand, however, that you want natural language to be accepted on entry and converted to native GEDCOM vocabulary. My question is: what languages do we support, where would the processing of this AI occur (server or client), and will that platform (and bandwidth) support the added cycles for ALL webtrees users?

Ken
