Synteny depth pattern and blocks #449

Open
cdanmaigona opened this issue Mar 7, 2022 · 3 comments

cdanmaigona commented Mar 7, 2022

Hello Haibao,

A follow-up question on this. When I run
python -m jcvi.compara.synteny depth --histogram F1.F4.anchors --depthfile=F1.F4.depth
I get this

Genome F1 depths:
Depth 0: 704 of 16,800 (4.2%)
Depth 1: 15,612 of 16,800 (92.9%)
Depth 2: 173 of 16,800 (1.0%)
Depth 3: 138 of 16,800 (0.8%)
Depth 4: 106 of 16,800 (0.6%)
Depth 5: 57 of 16,800 (0.3%)
Depth 6: 10 of 16,800 (0.1%)
Genome F4 depths:
Depth 0: 2,998 of 19,588 (15.3%)
Depth 1: 16,130 of 19,588 (82.3%)
Depth 2: 340 of 19,588 (1.7%)
Depth 3: 120 of 19,588 (0.6%)
[08:41:07 PM] DEBUG Depth written to F1.F4. synteny.py:1773
F1 vs F4 syntenic depths
1:1 pattern

From the explanation on your wiki, there are up to 6 F4 blocks per F1 gene.

However, when I run
python -m jcvi.compara.synteny stats F1.F4.i6.blocks
to get statistics on my blocks and the actual duplicate genes, the numbers do not match.

Count 0: 1,450 of 16,800 (8.6%)
Count 1: 15,052 of 16,800 (89.6%)
Count 2: 87 of 16,800 (0.5%)
Count 3: 83 of 16,800 (0.5%)
Count 4: 48 of 16,800 (0.3%)
Count 5: 80 of 16,800 (0.5%)

Total lines with matches: 15,350 of 16,800 (91.4%)
Count 1: 15,052 of 15,350 (98.1%)
Count 2: 87 of 15,350 (0.6%)
Count 3: 83 of 15,350 (0.5%)
Count 4: 48 of 15,350 (0.3%)
Count 5: 80 of 15,350 (0.5%)

The numbers do not correspond to what I'm getting with the depth command. I can only see a maximum of 5 duplicates when the depth analysis shows up to 6. Please help me understand what I'm missing.

Thank you!!

Originally posted by @cdanmaigona in #235 (comment)
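
For anyone wanting to cross-check the `stats` tally by hand, here is a minimal sketch. It assumes the `.blocks` file is tab-delimited, with the reference (F1) gene in the first column, one column per `--iter` round after it, and `.` marking a round with no match. The function name `tally_block_matches` and the gene IDs in the example are hypothetical and are not part of jcvi.

```python
from collections import Counter
import io


def tally_block_matches(handle, missing="."):
    """Tally reference genes by the number of non-missing match columns.

    Assumes tab-delimited rows: reference gene first, then one column
    per iteration, with `missing` marking "no match" in that column.
    """
    counts = Counter()
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        _reference, *matches = line.split("\t")
        counts[sum(1 for m in matches if m != missing)] += 1
    return counts


# Hypothetical three-row example standing in for a real .blocks file.
example = io.StringIO(
    "geneA\tx1\t.\t.\n"
    "geneB\tx2\tx7\t.\n"
    "geneC\t.\t.\t.\n"
)
counts = tally_block_matches(example)
for n in sorted(counts):
    print(f"Count {n}: {counts[n]}")
```

On a real file, pass `open("F1.F4.i6.blocks")` instead of the in-memory example. The tally only counts columns that hold an actual gene ID for that row, so it need not match a per-gene depth figure.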

@cdanmaigona
Author

Hello Haibao

If I'm interested in extracting all possible duplicate genes in a comparison, would this command be the most appropriate?

python -m jcvi.compara.synteny mcscan F1.bed F1.F4.lifted.anchors --iter=6 -o F1.F4.i6.blocks

@tanghaibao
Owner

@cdanmaigona

This is possibly the easiest thing you can do.
However, the caveat is that you'll miss duplicate genes that are present in multiple copies only in F4 and not in F1.

Haibao

@cdanmaigona
Author

cdanmaigona commented Mar 9, 2022

Thanks Haibao,

That makes sense, but I am also interested in accounting for duplicates that are present only within F4 or only within F1. How can I extract those?

Catherine
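
One possible starting point, offered as a sketch rather than an answer from the maintainer, is to group the anchors file in both directions and list the genes that pair with more than one partner. It assumes the usual anchors layout: whitespace-delimited lines with the F1 gene first and the F4 gene second, any further columns ignored, and block separator lines starting with `#`. The helper names and gene IDs are hypothetical. Genes with no anchor at all never appear in the file, so copies that exist within a single genome and have no syntenic partner in the other would not be found this way.

```python
from collections import defaultdict
import io


def partners_by_gene(handle):
    """Map each gene to the set of genes it is anchored to, in both directions."""
    first_to_second = defaultdict(set)
    second_to_first = defaultdict(set)
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and block separators
        fields = line.split()
        if len(fields) < 2:
            continue  # not a gene pair
        a, b = fields[0], fields[1]
        first_to_second[a].add(b)
        second_to_first[b].add(a)
    return first_to_second, second_to_first


def multi_partner(mapping):
    """Keep only the genes anchored to more than one partner."""
    return {gene: sorted(p) for gene, p in mapping.items() if len(p) > 1}


# Hypothetical anchors content standing in for a real anchors file.
example = io.StringIO(
    "###\n"
    "f1_g1\tf4_g1\t900\n"
    "f1_g3\tf4_g3\t750\n"
    "###\n"
    "f1_g1\tf4_g9\t400\n"
    "f1_g4\tf4_g3\t620\n"
)
f1_to_f4, f4_to_f1 = partners_by_gene(example)
f1_multi = multi_partner(f1_to_f4)  # F1 genes with several F4 partners
f4_multi = multi_partner(f4_to_f1)  # F4 genes with several F1 partners
print(f1_multi)
print(f4_multi)
```

Reading the second mapping gives the F4-centred view that an F1-referenced blocks file does not show directly.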
