Biostatistics - Exam
  • 1. Biostatistics is a branch of statistics that deals with data related to living organisms. It involves the design, analysis, and interpretation of data in fields such as biology, medicine, public health, and environmental science. Biostatistics plays a crucial role in research studies, clinical trials, and public health initiatives by providing statistical methods to analyze data, draw conclusions, and make informed decisions. It helps in understanding patterns of diseases, identifying risk factors, evaluating treatment interventions, and predicting health outcomes. Biostatisticians use their expertise in statistical theory and methods to address complex research questions and contribute to advancements in health science and policy.

    What is the purpose of hypothesis testing in biostatistics?
A) To determine if there is enough evidence to reject a null hypothesis.
B) To prove a hypothesis with 100% certainty.
C) To calculate standard deviation.
D) To estimate the population mean.
  • 2. In a clinical trial, what is the role of a control group?
A) To collect data from participants.
B) To analyze the results.
C) To provide a baseline for comparison to the treatment group.
D) To administer the treatment to participants.
  • 3. Which type of study design is best suited for determining cause and effect relationships?
A) Cross-Sectional Study
B) Observational Study
C) Case-Control Study
D) Randomized Controlled Trial
  • 4. Which statistical test can be used to compare more than two group means?
A) Two-Sample t-test
B) Chi-Square Test
C) ANOVA
D) Paired t-test
  • 5. What is the purpose of regression analysis?
A) To explore the relationship between a dependent variable and one or more independent variables.
B) To calculate probabilities.
C) To determine central tendency.
D) To estimate population parameters.
  • 6. Which type of sampling technique divides a population into subgroups and then samples each subgroup?
A) Systematic Sampling
B) Simple Random Sampling
C) Stratified Sampling
D) Cluster Sampling
  • 7. What does the p-value indicate in hypothesis testing?
A) The sample size required for the study.
B) The probability of obtaining results at least as extreme as the observed results, assuming the null hypothesis is true.
C) The confidence interval of the estimate.
D) The strength of the relationship between variables.
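Worked illustration for question 7: a minimal Python sketch of how a p-value can be computed for one simple case, an exact one-sided binomial test. The function name `binomial_upper_p_value` and the coin-flip numbers are hypothetical and chosen only for this example; other tests define "extreme" differently.

```python
from math import comb


def binomial_upper_p_value(successes, trials, p_null=0.5):
    """P(X >= successes) when X ~ Binomial(trials, p_null).

    This is the probability, under the null hypothesis, of a result
    at least as extreme (in the upper tail) as the one observed.
    """
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical data: 9 heads in 10 flips, null hypothesis "the coin is fair".
p_value = binomial_upper_p_value(successes=9, trials=10)
print(p_value)  # 11/1024, roughly 0.0107
```

A small value such as this one means the observed result would be unusual if the null hypothesis were true; it is not the probability that the null hypothesis is true.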
  • 8. What is sensitivity in the context of diagnostic testing?
A) The proportion of false positive results.
B) The proportion of false negative results.
C) The proportion of true negative results among all individuals without the condition.
D) The proportion of true positive results among all individuals with the condition.
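Worked illustration for question 8: a minimal Python sketch that computes the two proportions named in the options above from a 2x2 diagnostic table. The function names and the counts are hypothetical, used only to make the arithmetic concrete.

```python
def sensitivity(true_pos, false_neg):
    """True positives divided by everyone who has the condition."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """True negatives divided by everyone who does not have the condition."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts: 100 people with the condition, 200 without it.
tp, fn = 90, 10
tn, fp = 160, 40

print(sensitivity(tp, fn))  # 0.9
print(specificity(tn, fp))  # 0.8
```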
  • 9. What is biostatistics also referred to as?
A) Bioinformatics
B) Biomathematics
C) Biomechanics
D) Biometry
  • 10. Which field is closely related to medical statistics?
A) Biostatistics
B) Pharmacology
C) Epidemiology
D) Pathology
  • 11. Who started genetics studies by investigating segregation patterns in pea families?
A) Gregor Mendel
B) Charles Darwin
C) William Bateson
D) Francis Galton
  • 12. Who strongly disagreed with Galton's ideas on heredity?
A) Karl Pearson
B) Raphael Weldon
C) Arthur Dukinfield Darbishire
D) William Bateson
  • 13. Which group supported Mendel's ideas on genetic inheritance?
A) Mendelians
B) Neo-Darwinians
C) Biometricians
D) Darwinists
  • 14. Who developed the ANOVA and p-value concepts?
A) J. B. S. Haldane
B) Sewall G. Wright
C) Betty Allan
D) Ronald Fisher
  • 15. Who developed F-statistics and methods of computing them?
A) Ronald Fisher
B) Betty Allan
C) J. B. S. Haldane
D) Sewall G. Wright
  • 16. What did J. B. S. Haldane's book reestablish as the premier mechanism of evolution?
A) Mutation
B) Gene flow
C) Natural selection
D) Genetic drift
  • 17. Who banned the Friden calculator from his department at Caltech?
A) J. B. S. Haldane
B) Thomas Hunt Morgan
C) Sewall G. Wright
D) Ronald Fisher
  • 18. Which of the following is NOT a basic principle of experimental statistics?
A) Sample size determination
B) Local control
C) Randomization
D) Replication
  • 19. What should guide the formulation of a research question?
A) Data analysis perspectives.
B) An exhaustive literature review.
C) Cost considerations.
D) The experimental design.
  • 20. Which component of research planning involves defining how to ask a scientific question?
A) Experimental design.
B) Data analysis perspectives.
C) Costs involved.
D) The research question.
  • 21. Which principle of experimental statistics helps to eliminate bias?
A) Local control
B) Randomization
C) Cost estimation
D) Replication
  • 22. What is the first step in defining a research question according to the text?
A) Outlining experimental design.
B) Conducting an exhaustive literature review.
C) Estimating costs.
D) Determining data collection methods.
  • 23. In the formula for arithmetic mean, what does '∑' represent?
A) Product
B) Division
C) Summation
D) Difference
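Worked illustration for question 23: a minimal Python sketch of the arithmetic mean, written so that the role of '∑' in the formula x̄ = (∑ x_i) / n is visible. The function name and the sample values are hypothetical.

```python
def arithmetic_mean(values):
    """Return (sum of all x_i) / n for a non-empty sample."""
    if not values:
        raise ValueError("the mean of an empty sample is undefined")
    total = sum(values)       # this is the ∑ x_i term
    return total / len(values)  # divide by n, the number of observations


sample = [4.0, 8.0, 6.0, 2.0]  # hypothetical measurements
print(arithmetic_mean(sample))  # 5.0
```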
  • 24. Which cloud service provider is mentioned as a tool for statistical analysis in biological data?
A) IBM Cloud
B) Amazon Web Services
C) Microsoft Azure
D) Google Cloud Platform
  • 25. Which software is used for linear algebra computations?
A) SciPy
B) NumPy
C) SageMath
D) LAPACK
  • 26. What does a Pearson correlation coefficient value of -1 indicate?
A) A perfect positive correlation
B) An undefined relationship
C) No linear correlation
D) A perfect negative correlation
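Worked illustration for question 26: a minimal Python sketch of the Pearson correlation coefficient, computed directly from its definition (covariance divided by the product of the standard deviations). The function name `pearson_r` and the data are hypothetical; the second list falls on a straight line with negative slope, so r comes out at the lower bound of its range.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
decreasing = [10, 8, 6, 4, 2]   # y = 12 - 2x, an exact line with negative slope
increasing = [2, 4, 6, 8, 10]   # y = 2x, an exact line with positive slope

print(pearson_r(xs, decreasing))  # -1.0
print(pearson_r(xs, increasing))  # 1.0
```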
  • 27. Which database is dedicated to Arabidopsis thaliana?
A) Phytozome
B) KEGG
C) dbSNP
D) TAIR
  • 28. Which database stores assemblies and annotation files of dozens of plant genomes?
A) dbSNP
B) TAIR
C) KEGG
D) Phytozome
  • 29. What is another term for a scatter plot?
A) Pie chart
B) Line graph
C) Bar diagram
D) Scatter chart
  • 30. What is a scatter plot also known as?
A) Histogram
B) Pie chart
C) Scattergram
D) Bar chart
  • 31. Which distribution was initially used for RNA-Seq counts data but underestimated sample error?
A) Negative Binomial
B) Binomial
C) Poisson
D) Normal
  • 32. What major initiative relates data from DDBJ, EMBL-EBI, and NCBI?
A) World Data Exchange Program
B) Bioinformatics Data Consortium
C) Global Genome Initiative
D) International Nucleotide Sequence Database Collaboration (INSDC)
  • 33. What does a significance level (α) represent in hypothesis testing?
A) The range of values for a confidence interval
B) The probability that the null hypothesis is true
C) The correlation coefficient between two variables
D) The maximum acceptable probability of a Type I error (rejecting a true null hypothesis)
  • 34. Which statistical models are used to perform tests for statistical significance in RNA-Seq data analysis?
A) ANOVA
B) Linear regression models
C) Chi-square tests
D) Generalized linear models
  • 35. Which type of graph is best suited for showing changes over time?
A) Bar chart
B) Pie chart
C) Line graph
D) Histogram
  • 36. What is a genome-wide association study (GWAS) based on?
A) Linkage disequilibrium.
B) Recombination frequency.
C) Quantitative trait loci.
D) Genomic selection.
  • 37. Which biostatistical method has gained popularity for statistical classification?
A) Re-sampling methods
B) Bootstrapping
C) Decision trees
D) Random forests
  • 38. What does marker-assisted selection aim to improve?
A) Genomic selection models.
B) Clinical decision support systems.
C) Breeding outcomes in agriculture.
D) Quantitative trait mapping.
  • 39. Which software package allows for variance component estimation under a general linear mixed model using REML?
A) CycDesigN
B) Orange
C) ASReml
D) SAS
  • 40. How does a well-defined research question benefit the scientific community?
A) By simplifying data analysis.
B) By minimizing costs.
C) By adding value through novel insights.
D) By reducing the need for replication.
  • 41. In a line graph, which axis typically represents time?
A) Time is not represented in a line graph
B) Both axes equally represent time
C) The horizontal axis
D) The vertical axis
  • 42. What is the formula for calculating the total number of observations (N) in a frequency table?
A) N = fi / N
B) N = fi - N
C) N = fi * N
D) N = f1 + f2 + f3 + ... + fn
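Worked illustration for question 42: a minimal Python sketch that totals the class frequencies of a frequency table and derives relative frequencies from that total. The table contents and the function names are hypothetical.

```python
def total_observations(freq_table):
    """N is the sum of all class frequencies f1 + f2 + ... + fn."""
    return sum(freq_table.values())


def relative_frequencies(freq_table):
    """Each class frequency divided by N; the results sum to 1."""
    n = total_observations(freq_table)
    return {value: f / n for value, f in freq_table.items()}


# Hypothetical table: observed value -> frequency
freq_table = {0: 4, 1: 7, 2: 6, 3: 3}

print(total_observations(freq_table))    # 20
print(relative_frequencies(freq_table))
```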
  • 43. Which programming language is associated with deep learning and image analysis in bioinformatics?
A) R
B) SQL
C) SAS
D) Python
  • 44. Which programming language is known for its open-source environment and statistical computing capabilities, with packages available on CRAN?
A) SQL
B) MATLAB
C) Python
D) R
  • 45. Which symbol represents the arithmetic mean in mathematical notation?
A) x̄
B) i
C) Σ
D) n
  • 46. What technique considers the perturbation of whole gene sets rather than single genes?
A) Principal component analysis
B) Gene Set Enrichment Analysis (GSEA)
C) Next-generation sequencing
D) Linear discriminant analysis
  • 47. Which aspect of research planning involves determining how to collect data?
A) Research question formulation.
B) Hypothesis testing.
C) Data collection methods.
D) Cost estimation.
  • 48. Which database is used for indexing scientific articles?
A) Gene Ontology
B) PubMed
C) KEGG
D) dbSNP
  • 49. What is the term for high intercorrelation between predictors in biostatistical settings?
A) Gene Set Enrichment Analysis
B) Dimensionality reduction
C) Principal component analysis
D) Multicollinearity
  • 50. Which software is a Java-based tool for machine learning and data mining?
A) Orange
B) SAS
C) Weka
D) R
  • 51. Which biostatistical method helps in reducing dimensionality by transforming predictors into a smaller set of uncorrelated components?
A) Gene Set Enrichment Analysis
B) Principal component analysis
C) Linear regression
D) Logistic regression
  • 52. Which software supports Quantitative Response Assays for regulated environments such as drug testing?
A) PLA 3.0
B) SAS
C) Weka
D) Apache Spark
  • 53. In which field is the design and analysis of clinical trials particularly important?
A) Animal breeding
B) Quantitative genetics
C) Systems medicine
D) Public health
  • 54. Which tool is used for high-level data processing, data mining, and visualization?
A) ASReml
B) PLA 3.0
C) CycDesigN
D) Orange
  • 55. Which mapping algorithm is not commonly used in QTL mapping?
A) Interval Mapping
B) Composite Interval Mapping
C) None of the above
D) Multiple Interval Mapping
  • 56. Which database focuses on SNPs?
A) Gene Ontology
B) dbSNP
C) KEGG
D) PubMed
  • 57. Who introduced histograms as a graphical representation?
A) Francis Galton
B) John Tukey
C) Karl Pearson
D) Ronald Fisher