Dear Professor Gabor, I would like to thank you immensely. I'm working with runs of homozygosity in cattle and I've been learning a lot from your videos and the book. In an increasingly individualistic world, I congratulate you for sharing your knowledge.
@user-ri4wn7jh7v
8 months ago
Thank you so much for simplifying a process that seemed too complicated.
@mohammadj.shamim9342
2 years ago
Thank you so much. To be honest they are very informative and helpful. I was very interested in quality control as it is fundamental in GWAS analysis.
@artjamwithross4374
3 years ago
Thank you so much for these very useful videos! 🙏🏼👍🏼 They have guided me so much when using plink 👍🏼
@moslemmoghbeli4325
7 months ago
thank you, this is amazing
@samrawittsehay1610
3 years ago
Thank you Prof. Gabor for your continuous tutorial. I was wondering why the sum of observed homozygosity and observed heterozygosity is different from one, even though we did the QC. Theoretically, the sum of observed homozygosity and observed heterozygosity should give one.
@GenomicsBootCamp
3 years ago
Thanks for your comment! Could you point out which SNP you have in mind? I would like to check it in detail. For now, my guess is that the difference comes from how the missing values are counted. Even after QC, we are left with some missing values, because their rates were below the specified thresholds.
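The guess about missing values can be illustrated with a small sketch. It is not PLINK's own calculation: the genotype encoding ("00" for a missing call) and the function name `genotype_proportions` are assumptions made only for this example. It shows that when the counts are divided by the total number of samples, missing calls included, the two proportions add up to less than one.

```python
def genotype_proportions(genotypes, include_missing_in_denominator):
    """Return (homozygous, heterozygous) proportions for one SNP.

    genotypes: list of two-letter strings such as "AA" or "AG";
    "00" marks a missing call (an assumed encoding for this sketch).
    """
    called = [g for g in genotypes if g != "00"]
    hom = sum(1 for g in called if g[0] == g[1])
    het = len(called) - hom
    # The only difference between the two modes is the denominator.
    denom = len(genotypes) if include_missing_in_denominator else len(called)
    return hom / denom, het / denom


# Hypothetical SNP: 10 samples, 2 of them missing.
snp = ["AA", "AG", "GG", "AA", "00", "AG", "AA", "00", "GG", "AG"]

hom_all, het_all = genotype_proportions(snp, True)
hom_called, het_called = genotype_proportions(snp, False)

print("missing in denominator:", hom_all + het_all)
print("called genotypes only: ", hom_called + het_called)
```

With missing calls in the denominator the sum falls short of one by exactly the missing rate; dividing by called genotypes only brings it back to one.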
@samrawittsehay1610
3 years ago
@GenomicsBootCamp Okay, I will contact you via mail
@michegn
1 year ago
Dear Professor Gabor. Thanks for developing such a friendly tutorial series. I'm trying to run all your proposed quality control criteria as in line 15, but R gives me an error: "Error: --geno accepts at most 1 parameter". It seems it only allows one parameter at a time. My command was: system("plink --bfile AIM_ped_2023_03_21 --geno 0.1 --mind 0.1 --maf 0.05 --hwe 0.0000001 --allow-no-sex --nonfounders --make-bed --out afterQC"). I really appreciate your guidance. Once again, THANKS!
@GenomicsBootCamp
1 year ago
This error comes up when there is a missing space or something similar. In your case, there seems to be an odd dash just after the 0.1 that follows --geno. Try retyping that part. The command should work as you wrote it, so there is a typo or something similar...
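A stray dash like the one mentioned above is often a non-ASCII character (for example an en dash) that arrives with copy and paste. One possible way to find it, sketched here with made-up command strings and a hypothetical helper named `find_suspicious_characters`, is to list every character outside the plain ASCII range:

```python
import unicodedata


def find_suspicious_characters(command):
    """Return (index, character, Unicode name) for every non-ASCII character."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(command)
        if ord(ch) > 127
    ]


# Hypothetical example: the first hyphen of "--mind" was replaced by an en dash
# (U+2013), so the rest of that token is no longer recognised as a flag.
pasted = "plink --bfile mydata --geno 0.1 \u2013-mind 0.1 --make-bed --out afterQC"
retyped = "plink --bfile mydata --geno 0.1 --mind 0.1 --make-bed --out afterQC"

for position, char, name in find_suspicious_characters(pasted):
    print(position, repr(char), name)

print(find_suspicious_characters(retyped))  # empty list for the clean command
```

If the check reports anything, retyping that part of the command by hand is usually the quickest fix.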
@emmanouilathanasakis310
11 months ago
Dear Professor Gabor, great lessons! Could I ask if it is possible to illustrate how to convert PLINK genotyping data to the VCF file format? There are several pre-processing steps before the conversion, as well as after it, and clarifying those steps would be great! Thanks in advance.
@GenomicsBootCamp
11 months ago
There is a similar video on the channel; maybe this is what you are looking for? "Convert between PLINK to VCF file formats": kzitem.info/news/bejne/poB60aKogXOci6g
@jadecelis6838
1 year ago
Thank you for the clear explanation! For a SNP dataset I have to do QC for Hardy-Weinberg equilibrium and heterozygosity. Do you have a video on how to do this as well? And how do I choose the window when using the option '--indep-pairwise'?
@GenomicsBootCamp
1 year ago
Hi, the HW check is covered in the general quality control video. As for LD pruning, I am not sure there is a "best" window to choose; I go with the defaults here, tbh.
@jadecelis6838
1 year ago
Thank you! Your videos are extremely helpful!
@bellofolaniyi5546
1 year ago
Thank you, Prof. Gabor, for your explanation. Could the --dog option be used for chicken, considering that they both have an equal number of chromosomes?
@GenomicsBootCamp
1 year ago
Yes, it is technically possible, but it is not good practice (although you might see a similar approach in my older videos). So it works, but it is much better to specify the exact chromosome number using --chr-set. As you can see in the link below, --dog is just a shortcut for --chr-set 38. www.cog-genomics.org/plink/1.9/input#chr_set
@HuyHoiHay
2 years ago
thank you
@elielsonveloso7517
2 years ago
Dear Professor Gabor, thanks a lot for such an interesting tutorial! I was wondering which of these QC parameters and thresholds I should use when working with SNP panel data rather than exome or whole genome sequencing. Would you have any reference or guidelines for such an application?
@GenomicsBootCamp
2 years ago
Hi, to clarify: do you work with exome and whole genome sequence data? From your comment I understood that you work with SNP data, which is the same data type as in the video, so the thresholds there would suit you. If you work with WGS data of some kind, those should have their own quality control process, applied already before variant calling. If you go further and, for example, extract the SNPs from the WGS data, then you can make a PLINK file out of them and see how the current quality control criteria hold up.
@ccdj35
2 years ago
8:28 Something bothers me about deleting alleles that occur at less than 5%. Doesn't assuming that these differences could be a mistake create a situation where we overlook that we are losing some real alleles here?
@GenomicsBootCamp
2 years ago
Low-frequency alleles are not removed because they are assumed to be faulty. In some cases we simply do not need, or do not want, them in our data set. But each and every time we need to consider whether a given quality control parameter is needed or not. For example, using a MAF threshold carelessly in Fst calculations might actually delete the results we are looking for. There are more such examples. So one should be careful all the time!
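To make the point concrete, here is a minimal sketch of what a MAF threshold does. It is not PLINK's implementation; the genotype encoding, the SNP names, and the function `minor_allele_frequency` are invented for illustration, and SNPs are assumed to be biallelic. The rare SNP below has perfectly plausible genotypes and is still dropped at the 0.05 threshold.

```python
def minor_allele_frequency(genotypes):
    """Minor allele frequency for one biallelic SNP.

    genotypes: list of two-letter strings; "00" marks a missing call
    (an assumed encoding for this sketch).
    """
    counts = {}
    for g in genotypes:
        if g == "00":
            continue  # missing calls do not contribute alleles
        for allele in g:
            counts[allele] = counts.get(allele, 0) + 1
    if len(counts) < 2:
        return 0.0  # monomorphic or completely missing
    return min(counts.values()) / sum(counts.values())


# Hypothetical data: one common SNP and one rare but valid SNP.
snps = {
    "snp_common": ["AA", "AG", "GG", "AG"],
    "snp_rare": ["CC"] * 19 + ["CT"],
}

maf_threshold = 0.05
mafs = {name: minor_allele_frequency(g) for name, g in snps.items()}
kept = [name for name, maf in mafs.items() if maf >= maf_threshold]

print(mafs)
print(kept)
```

Whether losing `snp_rare` matters depends on the analysis, which is why the threshold deserves a conscious decision each time.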
@shivambhardwaj2683
2 years ago
Dear Professor Gabor, in genomic data analysis we are interested in identifying the loci undergoing selection. Most of the animal samples are taken from established herds, i.e. there is no random mating. Why, then, do we use the HWE quality check, where we keep only those loci that show no deviation from equilibrium?
@GenomicsBootCamp
2 years ago
Greetings! When implemented in the quality control, the HWE check is not supposed to keep only loci that strictly behave according to HWE expectations. It is rather a tool to remove those SNPs with a huge difference between observed and expected proportions. But one should be **careful** in the quality check all the time. Sometimes we are looking for exactly these SNPs, so removing them through HWE filtering in QC is disastrous for the results. Also, if we study multiple populations together, e.g. for admixture or Fst, HWE filtering on the joint data could remove the most interesting SNPs. So to summarize, the HWE check is implemented to filter out potentially wrongly genotyped SNPs, but its use should be considered on a case-by-case basis.
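The idea of a "huge difference between observed and expected proportions" can be sketched with a simple chi-square statistic. Note the substitution: PLINK's --hwe filter is documented as using an exact test, so the chi-square below is only a simpler stand-in, and the function name and genotype counts are made up for this example.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed genotype counts with HWE expectations."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical SNP 1: counts match the HWE expectation exactly.
in_equilibrium = hwe_chi_square(25, 50, 25)

# Hypothetical SNP 2: no heterozygotes at all, a pattern that may point
# to a genotyping problem (or to real population structure).
no_heterozygotes = hwe_chi_square(50, 0, 50)

print(in_equilibrium, no_heterozygotes)
```

A strict p-value threshold removes only SNPs that look like the second case, while mild deviations, which are common in selected or structured populations, stay in the data.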
@shivambhardwaj2683
2 years ago
@GenomicsBootCamp Thank you professor 👍
@getinetmekuriawtarekegn1916
3 years ago
Dear Gabor, thanks for the very important guidance on managing genomic data. I have one problem: I want to upgrade R in RStudio. PLINK works in recent R versions, and I installed R version 4.0.5; however, when I check the library in RStudio it has not changed (see lib="/Library/Frameworks/R.framework/Versions/4.0/Resources/library"). I upgraded RStudio too, but the R version still did not change from 4.0 to 4.0.5. Could you advise me, please?
@GenomicsBootCamp
3 years ago
Hi, PLINK should work with any R version, as we really only use the system() function here... To change to another R version in RStudio, go to Tools > Global Options > General tab on the left. Right at the top, you will see the "R version" currently in use. Just hit the "Change" button next to it and make your choice there.
@mwanganamubita9617
1 year ago
@GenomicsBootCamp Dear Prof. Gabor, when preparing the map file for PLINK, how should SNPs be ordered if there are SNPs on several chromosomes and genetic position is set to 0? Does ordering in terms of the physical positions take precedence over the chromosome number?
@GenomicsBootCamp
1 year ago
Hi, normally the map files are ordered by chromosome (ascending) and then by base pair position of the SNP within the chromosome (ascending). But tbh, I have seen map files ordered alphabetically by SNP name (so a total mess chromosome- and base-pair-position-wise) and it worked just fine. Thus I assume PLINK has an internal mechanism to figure out the correct order and just uses it.
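The usual ordering described above (chromosome first, then base pair position) can be sketched as follows. The sketch assumes the standard four-column .map layout (chromosome, SNP name, genetic distance, base pair position) and numeric chromosome codes; the SNP names and the helper `sort_map_records` are hypothetical. It only demonstrates the sort key: in a real .ped/.map pair the genotype columns would have to be reordered to match, which this example does not do.

```python
def sort_map_records(lines):
    """Sort .map lines by chromosome, then by base pair position.

    Assumes four whitespace-separated columns and numeric chromosome codes.
    """
    records = [line.split() for line in lines if line.strip()]
    records.sort(key=lambda r: (int(r[0]), int(r[3])))
    return ["\t".join(r) for r in records]


# Hypothetical map lines with the genetic distance set to 0, in mixed order.
unsorted_map = [
    "2 snpC 0 500",
    "1 snpB 0 9000",
    "1 snpA 0 150",
]

sorted_map = sort_map_records(unsorted_map)
for line in sorted_map:
    print(line)
```

The chromosome number takes precedence: snpC has the second-smallest position overall but comes last because it sits on chromosome 2.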
@mwanganamubita9617
1 year ago
Thanks, Prof. Gabor. Just a quick question: what is the generally accepted HWE threshold? p < 0.05 with Bonferroni correction?
@GenomicsBootCamp
1 year ago
That is an interesting approach... If HWE is to be used, I generally use it to see if there are some SNPs that do not comply with the expectations. For this I use a p-value of 10^-5 or 10^-6, so e.g. 0.00001, without stronger justification.
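For comparison with the Bonferroni idea from the question, here is the arithmetic as a short sketch. The panel size of 50,000 SNPs is a hypothetical number chosen for illustration, not a value from the video.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


n_snps = 50_000  # hypothetical number of SNPs tested
threshold = bonferroni_threshold(0.05, n_snps)

print(threshold)
```

For a panel of this size, the corrected threshold is of the same order as the fixed 10^-6 value mentioned in the reply, although it would shift with the number of SNPs.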
@sowadanognigamal134
1 year ago
Dear Prof, I really appreciate the work you are doing for us. I am currently working on rice SNP data. I downloaded the 3k SNP data from SNBI, but I don't know how to extract the ones that match my accessions and create my own files. Kindly guide me. Much appreciated.
@GenomicsBootCamp
1 year ago
Hi, your request to "match" data is not clear to me. Do you have your own genotypes, and you want to merge them with some open access data? Or do you just want to select a specific subset of the open access data you downloaded? Or do you mean something else?
@georgewanjala4605
3 years ago
It looked like R software.
@GenomicsBootCamp
3 years ago
The question is not clear to me. Do you mean the system() command in R? If yes, this can be invoked directly from R itself.
Comments: 35