Question
DNA matching
- thomas52
Topic Author
- Premium Member
- Western North Carolina
Research is what I’m doing when I don’t know what I’m doing – Wernher von Braun
Please Log in or Create an account to join the conversation.
- RickM
- Junior Member
I'm not sure about lists, charts, etc., but someone might be able to point you in the right direction.
Well Done!!
Rick M
Tauranga
Nu Tirani - New Zealand
alineofmalkins.com/wt-malkins
Webtrees v2.2.1
Hosted by HostAmarada on litespeed Server, PHP v8.3
Offline: WebTrees v2.2.2-dev
Laragon v6, PHP v8.3.15
- hermann
- Elite Member
Hermann
Designer of the custom module "Extended Family"
webtrees 2.2.1 (all custom modules installed, PHP 8.3.12, MariaDB 10.6) @ ahnen.hartenthaler.eu
- drblam
- Junior Member
- Posts: 201
If your goal is merely to verify relationships between individuals you have already added to your tree, then by all means use the ASSOC tags to link existing people and show how much DNA they share. On the other hand, if you want to figure out how some previously unknown "DNA relative" is connected to your tree, then there is a lot more that you can do. However, it's not clear how to use webtrees to document the information.
We were very lucky to have gotten both of my wife's elderly parents to submit DNA tests to 23andMe before they passed away. Since the parents' DNA gets randomly shuffled as it is passed to each child, it is always MUCH easier to determine relationships using DNA evidence from older generations than younger ones.
23andMe gives you a list of approximately 1000 other people who share DNA with your profile, giving a numeric score for the amount of shared DNA between every pair of individuals on this list. I have found several people in this list whose profiles say that (a) they were adopted and don't know their biological parents, or (b) their mother never told them who was their biological father. In other cases, the common relative is too far back to be recognized. Using DNA relationships in combination with conventional family tree information, I have been able to figure out many of these connections.
The key step is to create a similarity matrix to show how much DNA each person shares with everyone else, in order to help answer questions about how you are related to unknown people. Once you have the data, you can reorder the rows and columns to group people with similar connection patterns. For example, in the case of my father-in-law, it is very easy to partition the similarity matrix according to whether the unknown person is related to his mother or father. For people related to the paternal grandmother, I can determine if the unknown person is related to the paternal grandmother's mother or father -- who were born almost 200 years ago. And for people related to the paternal grandfather, I can determine which of his siblings (who were born about 150-180 years ago) is the ancestor of the unknown individual.
- thomas52
Topic Author
- Premium Member
- Western North Carolina
- drblam
- Junior Member
- Posts: 201
"I don't follow your matrix construction. Would you please contact me via my website @ www.adkins.ws ?"
Sorry, I should have explained it better, but I was in a hurry.
23andMe has a feature that allows you to view a long list of "DNA relatives" (roughly 1000 people, in my experience), and for each one they give a numeric score between 0 and 1 for the DNA similarity between you and the other person. (It is a proprietary combination of factors, but it doesn't really matter how it's calculated.)
If you click on any of those people, you can see a secondary list of "Relatives in common", who share DNA with BOTH you and the target individual. For each person P in this list, they display the same kind of pairwise numeric DNA similarity score between you and P, and between the target person and P.
Using this data, one can construct a similarity matrix which has one row and one column for each of your "DNA relatives", including yourself. (In the sample matrix I provided, I hid the header row and column that shows each person's name for privacy - which makes it really hard to figure out what is going on. Sorry....) I found it was worth including all DNA relatives whose DNA score was at least 0.005 (i.e., 0.5 percent); beyond that things were getting too noisy to be interesting.
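The construction described above can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet itself: the names, the scores, and the helper functions `select_relatives` and `build_similarity_matrix` are all made up for the example, and it assumes the pairwise scores have already been copied out of 23andMe by hand into a dictionary.

```python
def select_relatives(focal, pair_scores, min_score=0.005):
    """Return the focal person plus everyone sharing at least min_score with them."""
    kept = [focal]
    for (a, b), score in pair_scores.items():
        if score < min_score:
            continue  # too little shared DNA with the focal person to be useful
        if a == focal and b not in kept:
            kept.append(b)
        elif b == focal and a not in kept:
            kept.append(a)
    return kept


def build_similarity_matrix(names, pair_scores):
    """Build a symmetric matrix with one row and one column per person."""
    index = {name: i for i, name in enumerate(names)}
    matrix = [[0.0] * len(names) for _ in names]
    for i in range(len(names)):
        matrix[i][i] = 1.0  # everyone matches themselves perfectly
    for (a, b), score in pair_scores.items():
        if a in index and b in index:
            matrix[index[a]][index[b]] = score
            matrix[index[b]][index[a]] = score
    return matrix


# Hypothetical scores: "Me" versus each relative, plus "relatives in common" pairs.
pair_scores = {
    ("Me", "A"): 0.12,
    ("Me", "B"): 0.03,
    ("Me", "C"): 0.002,  # below the 0.005 cutoff, so C is left out
    ("A", "B"): 0.25,
    ("B", "C"): 0.01,
}

names = select_relatives("Me", pair_scores)
matrix = build_similarity_matrix(names, pair_scores)
```

Pairs for which no score is available simply stay at 0.0, which matches how an empty spreadsheet cell would be read.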
To help me visualize the data, I color-coded the cells. The cells on the diagonal are pink and contain 1.0, because your DNA is a perfect match to itself. You can also see some bright turquoise cells with values of 0.5 or 0.25, which represent parent/child or sibling relationships. Note that these turquoise cells highlight people who are very closely related to each other -- but not to me. I also colored in all cells where the level of DNA matching was "significant".
The next step was to reorder the rows and columns to group people with similar DNA profiles together. In particular, people who share a lot of DNA with each other should be placed close together, so that the bright turquoise cells are clustered along the diagonal. Applying clustering algorithms to similarity matrices is a well-studied problem in Computer Science, but for this case I just did it by manually moving rows and columns around in a spreadsheet.
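For anyone who would rather not shuffle rows by hand, here is a minimal sketch of one way to automate the reordering. It is not what was done in the spreadsheet: it swaps in a simple greedy heuristic (always place next the person who shares the most DNA with the person just placed), and the function names and sample values are invented for the example.

```python
def greedy_order(matrix, start=0):
    """Order people so that each one follows the person they share the most DNA with."""
    order = [start]
    remaining = set(range(len(matrix))) - {start}
    while remaining:
        last = order[-1]
        # Ties are broken in favor of the lowest original index.
        nxt = max(sorted(remaining), key=lambda j: matrix[last][j])
        order.append(nxt)
        remaining.remove(nxt)
    return order


def reorder(matrix, order):
    """Apply the same permutation to rows and columns."""
    return [[matrix[i][j] for j in order] for i in order]


# Hypothetical data: Dad, then four matches P, Q, R, S in arbitrary order.
# P and R are closely related to each other, and so are Q and S.
matrix = [
    [1.00, 0.06, 0.05, 0.04, 0.03],  # Dad
    [0.06, 1.00, 0.00, 0.50, 0.00],  # P
    [0.05, 0.00, 1.00, 0.00, 0.25],  # Q
    [0.04, 0.50, 0.00, 1.00, 0.00],  # R
    [0.03, 0.00, 0.25, 0.00, 1.00],  # S
]

order = greedy_order(matrix)        # Dad, P, R, Q, S
reordered = reorder(matrix, order)  # close pairs now sit next to the diagonal
```

A greedy pass like this is crude compared with proper hierarchical clustering, but it is enough to pull closely related pairs next to each other in small examples.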
For most people, there is almost no DNA sharing between the maternal relatives and paternal relatives, so I split the data into a maternal similarity matrix and a separate paternal similarity matrix. (The sample I included is the paternal matrix.)
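The split can also be sketched mechanically, under the assumption stated above that the two sides share almost no DNA with each other. The idea in this sketch is to ignore the focal person, link any two matches whose score is at or above a cutoff, and collect the linked groups. The function name, the 0.005 cutoff, and the sample data are assumptions for illustration. Direct descendants of the focal person (who are related to both sides) would need to be left out first, and deciding which group is maternal and which is paternal still takes a known relative in each.

```python
def split_into_groups(names, matrix, focal=0, threshold=0.005):
    """Group matches that are linked to each other by shared DNA, ignoring the focal person."""
    unvisited = set(range(len(names))) - {focal}
    groups = []
    while unvisited:
        seed = min(unvisited)
        unvisited.remove(seed)
        stack, group = [seed], [seed]
        while stack:
            i = stack.pop()
            # Everyone not yet grouped who shares enough DNA with person i.
            linked = sorted(j for j in unvisited if matrix[i][j] >= threshold)
            for j in linked:
                unvisited.remove(j)
                group.append(j)
                stack.append(j)
        groups.append([names[k] for k in sorted(group)])
    return groups


# Hypothetical data: P and R match each other, Q and S match each other,
# and there is no sharing across the two pairs.
names = ["Dad", "P", "Q", "R", "S"]
matrix = [
    [1.00, 0.06, 0.05, 0.04, 0.03],
    [0.06, 1.00, 0.00, 0.50, 0.00],
    [0.05, 0.00, 1.00, 0.00, 0.25],
    [0.04, 0.50, 0.00, 1.00, 0.00],
    [0.03, 0.00, 0.25, 0.00, 1.00],
]

groups = split_into_groups(names, matrix)
```

With real data there may be more than two groups, for example when a distant match has no relatives in common above the cutoff; those would still need to be assigned to a side by hand.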
After all that, I'm left with the matrix in the form I showed you, which shows DNA relatives for "Dad" (my wife's father, who gave his DNA to 23andMe) but only on his father's side. Dad's data is shown in the first row and column, and my wife's data is shown in the second row and column. Dad's father had one sibling, who married several times. Data for his direct descendants is shown next in a block of orange rows and columns. After that, you can see people who are related to Dad's paternal grandfather (born in 1851) in blue, and Dad's paternal grandmother (born in 1855) in yellow. I have been able to further split the blue band into separate sections for descendants of each of the paternal grandfather's siblings, and split the yellow band into separate sections for descendants of the paternal grandmother's father and mother.
- thomas52
Topic Author
- Premium Member
- Western North Carolina
- drblam
- Junior Member
- Posts: 201
"So I understand this is a proprietary tool of 23andMe?"
The similarity matrix is not a "tool" -- just a spreadsheet I created at home, using pairwise similarity scores obtained from 23andMe. I don't have any experience with any DNA testing service besides 23andMe. I know the similarity score they use as a default is proprietary, but it's just a secret combination of the number of shared cMs (centimorgans), the number of chromosomes with some overlap, and perhaps other factors such as the lengths of individual chunks of shared base pairs. Any method of measuring DNA similarity between people could be used.
The important part is that 23andMe allows me to view not only the similarity score between me and person X, and between me and person Y, but also the similarity score between person X and person Y. If you can get that from Ancestry or whatever DNA testing service you use, then you can do the same thing.
BTW, creating the clusters is a mechanical process of reordering rows and columns to bring together people with similar DNA sharing patterns. However, labeling each cluster (e.g., to say this group is people descended from a particular sibling of Dad's paternal grandfather) involves some sleuthing. In some cases, a person belonging to that cluster is already in my tree. In others, the person's profile has a list of family surnames which includes a name that I recognize from my own tree. Occasionally, I contact the person through 23andMe and work with them to figure out how we are connected.
- RickM
- Junior Member
- Franz Frese
- Platinum Member
What is the use of such a private entry?
This message contains secure information
- bertkoor
- Platinum Member
- Greetings from Utrecht, Holland
stamboom.BertKoor.nl runs on webtrees v2.2.1
- RickM
- Junior Member
"What is the use of such a private entry?"
Yeah, sorry Franz. My query was a little off topic, and I could not find another way to PM the party. I did not want to hijack the topic.