References

Abrams, B. 2003. The Pit of Success. https://blogs.msdn.microsoft.com/brada/2003/10/02/the-pit-of-success/.

Baggerly, K, and K Coombes. 2009. “Deriving Chemosensitivity from Cell Lines: Forensic Bioinformatics and Reproducible Research in High-Throughput Biology.” The Annals of Applied Statistics 3 (4): 1309–34.

Bartley, M, E Hanks, E Schliep, et al. 2019. “Identifying and Characterizing Extrapolation in Multivariate Response Data.” PLOS ONE 14 (December): 1–20.

Biecek, Przemyslaw, and Tomasz Burzykowski. 2021. Explanatory Model Analysis. Chapman & Hall/CRC, New York. https://ema.drwhy.ai/.

Bohachevsky, I, M Johnson, and M Stein. 1986. “Generalized Simulated Annealing for Function Optimization.” Technometrics 28 (3): 209–17.

Bolstad, B. 2004. Low-Level Analysis of High-Density Oligonucleotide Array Data: Background, Normalization and Summarization. University of California, Berkeley.

Box, GEP, W Hunter, and J Hunter. 2005. Statistics for Experimenters: An Introduction to Design, Data Analysis, and Model Building. Wiley.

Bradley, R, and M Terry. 1952. “Rank Analysis of Incomplete Block Designs: I. The Method of Paired Comparisons.” Biometrika 39 (3/4): 324–45.

Breiman, L. 1996a. “Bagging Predictors.” Machine Learning 24 (2): 123–40.

Breiman, L. 1996b. “Stacked Regressions.” Machine Learning 24 (1): 49–64.

Breiman, L. 2001a. “Random Forests.” Machine Learning 45 (1): 5–32.

Breiman, L. 2001b. “Statistical Modeling: The Two Cultures.” Statistical Science 16 (3): 199–231.

Carlson, B. 2012. “Putting Oncology Patients at Risk.” Biotechnology Healthcare 9 (3): 17–21.

Chambers, J. 1998. Programming with Data: A Guide to the S Language. Springer-Verlag.

Chambers, J, and T Hastie, eds. 1992. Statistical Models in S. CRC Press, Inc.

Claeskens, G. 2016. “Statistical Model Choice.” Annual Review of Statistics and Its Application 3: 233–56.

Cleveland, W. 1979. “Robust Locally Weighted Regression and Smoothing Scatterplots.” Journal of the American Statistical Association 74 (368): 829–36.

Craig-Schapiro, R, M Kuhn, C Xiong, et al. 2011. “Multiplexed Immunoassay Panel Identifies Novel CSF Biomarkers for Alzheimer’s Disease Diagnosis and Prognosis.” PLoS ONE 6 (4): e18850.

Cybenko, G. 1989. “Approximation by Superpositions of a Sigmoidal Function.” Mathematics of Control, Signals and Systems 2 (4): 303–14.

Danowski, T, J Aarons, J Hydovitz, and J Wingert. 1970. “Utility of Equivocal Glucose Tolerances.” Diabetes 19 (7): 524–26.

Davison, A, and D Hinkley. 1997. Bootstrap Methods and Their Application. Vol. 1. Cambridge University Press.

De Cock, D. 2011. “Ames, Iowa: Alternative to the Boston Housing Data as an End of Semester Regression Project.” Journal of Statistics Education 19 (3).

Dobson, A. 1999. An Introduction to Generalized Linear Models. Boca Raton.

Durrleman, S, and R Simon. 1989. “Flexible Regression Models with Cubic Splines.” Statistics in Medicine 8 (5): 551–61.

Faraway, J. 2016. Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models. CRC Press.

Fox, J. 2008. Applied Regression Analysis and Generalized Linear Models. 2nd ed. Sage.

Frazier, P. 2018. A Tutorial on Bayesian Optimization. https://arxiv.org/abs/1807.02811.

Freund, Y, and R Schapire. 1997. “A Decision-Theoretic Generalization of on-Line Learning and an Application to Boosting.” Journal of Computer and System Sciences 55 (1): 119–39.

Friedman, J. 1991. “Multivariate Adaptive Regression Splines.” The Annals of Statistics 19 (1): 1–141.

Friedman, J. 2001. “Greedy Function Approximation: A Gradient Boosting Machine.” The Annals of Statistics 29 (5): 1189–232.

Friedman, J, T Hastie, and R Tibshirani. 2010. “Regularization Paths for Generalized Linear Models via Coordinate Descent.” Journal of Statistical Software 33 (1): 1–22.

Geladi, P., and B Kowalski. 1986. “Partial Least-Squares Regression: A Tutorial.” Analytica Chimica Acta 185: 1–17.

Gentleman, R, V Carey, W Huber, R Irizarry, and S Dudoit. 2005. Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Springer-Verlag.

Good, I. J. 1985. “Weight of Evidence: A Brief Survey.” Bayesian Statistics 2: 249–70.

Goodfellow, I, Y Bengio, and A Courville. 2016. Deep Learning. MIT Press.

Guo, Cheng, and Felix Berkhahn. 2016. Entity Embeddings of Categorical Variables. http://arxiv.org/abs/1604.06737.

Hand, D, and R Till. 2001. “A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems.” Machine Learning 45 (August): 171–86.

Hill, A, P LaPan, Y Li, and S Haney. 2007. “Impact of Image Segmentation on High-Content Screening Data Quality for SK-BR-3 Cells.” BMC Bioinformatics 8 (1): 340.

Ho, T. 1995. “Random Decision Forests.” Proceedings of 3rd International Conference on Document Analysis and Recognition 1: 278–82.

Hosmer, D, and S Lemeshow. 2000. Applied Logistic Regression. John Wiley & Sons.

Hvitfeldt, E., and J. Silge. 2021. Supervised Machine Learning for Text Analysis in R. A Chapman & Hall Book. CRC Press. https://smltar.com/.

Hyndman, R, and G Athanasopoulos. 2018. Forecasting: Principles and Practice. OTexts.

Ismay, C, and A Kim. 2021. Statistical Inference via Data Science: A ModernDive into R and the Tidyverse. Chapman & Hall/CRC. https://moderndive.com/.

Jaworska, J, N Nikolova-Jeliazkova, and T Aldenberg. 2005. “QSAR Applicability Domain Estimation by Projection of the Training Set in Descriptor Space: A Review.” Alternatives to Laboratory Animals 33 (5): 445–59.

Johnson, D, P Eckart, N Alsamadisi, H Noble, C Martin, and R Spicer. 2018. “Polar Auxin Transport Is Implicated in Vessel Differentiation and Spatial Patterning During Secondary Growth in Populus.” American Journal of Botany 105 (2): 186–96.

Joseph, V, E Gul, and S Ba. 2015. “Maximum Projection Designs for Computer Experiments.” Biometrika 102 (2): 371–80.

Kim, J, D Basak, and D Holtzman. 2009. “The Role of Apolipoprotein E in Alzheimer’s Disease.” Neuron 63 (3): 287–303.

Kerleguer, A., J.-L. Koeck, M. Fabre, P. Gérôme, R. Teyssou, and V. Hervé. 2003. “Use of Equivocal Zone in Interpretation of Results of the Amplified Mycobacterium Tuberculosis Direct Test for Diagnosis of Tuberculosis.” Journal of Clinical Microbiology 41 (4): 1783–84.

Kirkpatrick, S, D Gelatt, and M Vecchi. 1983. “Optimization by Simulated Annealing.” Science 220 (4598): 671–80.

Koklu, M, and IA Ozkan. 2020. “Multiclass Classification of Dry Beans Using Computer Vision and Machine Learning Techniques.” Computers and Electronics in Agriculture 174: 105507.

Krueger, T, D Panknin, and M Braun. 2015. “Fast Cross-Validation via Sequential Testing.” Journal of Machine Learning Research 16 (33): 1103–55.

Kruschke, J, and T Liddell. 2018. “The Bayesian New Statistics: Hypothesis Testing, Estimation, Meta-Analysis, and Power Analysis from a Bayesian Perspective.” Psychonomic Bulletin and Review 25 (1): 178–206.

Kuhn, Max. 2014. Futility Analysis in the Cross-Validation of Machine Learning Models. https://arxiv.org/abs/1405.6974.

Kuhn, M, and K Johnson. 2013. Applied Predictive Modeling. Springer.

Kuhn, M, and K Johnson. 2020. Feature Engineering and Selection: A Practical Approach for Predictive Models. CRC Press.

Lambert, D. 1992. “Zero-Inflated Poisson Regression, with an Application to Defects in Manufacturing.” Technometrics 34 (1): 1–14.

Littell, R, J Pendergast, and R Natarajan. 2000. “Modelling Covariance Structure in the Analysis of Repeated Measures Data.” Statistics in Medicine 19 (13): 1793–819.

Long, J. 1992. “Measures of Sex Differences in Scientific Productivity.” Social Forces 71 (1): 159–78.

Lundberg, Scott M., and Su-In Lee. 2017. “A Unified Approach to Interpreting Model Predictions.” Proceedings of the 31st International Conference on Neural Information Processing Systems (Red Hook, NY, USA), NIPS’17, 4768–77.

Mangiafico, S. 2015. An R Companion for the Handbook of Biological Statistics. https://rcompanion.org/handbook/.

Maron, O, and A Moore. 1994. “Hoeffding Races: Accelerating Model Selection Search for Classification and Function Approximation.” Advances in Neural Information Processing Systems, 59–66.

McCullagh, P, and J Nelder. 1989. Generalized Linear Models. Chapman & Hall.

McDonald, J. 2009. Handbook of Biological Statistics. Sparky House Publishing.

McElreath, R. 2020. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. CRC Press.

McInnes, L, J Healy, and J Melville. 2020. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction.

McKay, M, R Beckman, and W Conover. 1979. “A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code.” Technometrics 21 (2): 239–45.

Micci-Barreca, Daniele. 2001. “A Preprocessing Scheme for High-Cardinality Categorical Attributes in Classification and Prediction Problems.” SIGKDD Explor. Newsl. (New York, NY, USA) 3 (1): 27–32. https://doi.org/10.1145/507533.507538.

Mingqiang, Y, K Kidiyo, and R Joseph. 2008. “A Survey of Shape Feature Extraction Techniques.” Chap. 3 in Pattern Recognition, edited by PY Yin. IntechOpen. https://doi.org/10.5772/6237.

Molnar, Christopher. 2020. Interpretable Machine Learning. Lulu.com. https://christophm.github.io/interpretable-ml-book/.

Mullahy, J. 1986. “Specification and Testing of Some Modified Count Data Models.” Journal of Econometrics 33 (3): 341–65.

Netzeva, T, A Worth, T Aldenberg, et al. 2005. “Current Status of Methods for Defining the Applicability Domain of (Quantitative) Structure-Activity Relationships: The Report and Recommendations of ECVAM Workshop 52.” Alternatives to Laboratory Animals 33 (2): 155–73.

Olsson, D, and L Nelson. 1975. “The Nelder-Mead Simplex Procedure for Function Minimization.” Technometrics 17 (1): 45–51.

Opitz, J, and S Burst. 2019. Macro F1 and Macro F1. https://arxiv.org/abs/1911.03347.

R Core Team. 2014. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. http://www.R-project.org/.

Rasmussen, C, and C Williams. 2006. Gaussian Processes for Machine Learning. MIT Press.

Santner, T, B Williams, and W Notz. 2003. The Design and Analysis of Computer Experiments. Springer.

Schmidberger, M, M Morgan, D Eddelbuettel, H Yu, L Tierney, and U Mansmann. 2009. “State of the Art in Parallel Computing with R.” Journal of Statistical Software 31 (1): 1–27. https://www.jstatsoft.org/v031/i01.

Schulz, E, M Speekenbrink, and A Krause. 2018. “A Tutorial on Gaussian Process Regression: Modelling, Exploring, and Exploiting Functions.” Journal of Mathematical Psychology 85: 1–16.

Shahriari, B., K. Swersky, Z. Wang, R. P. Adams, and N. de Freitas. 2016. “Taking the Human Out of the Loop: A Review of Bayesian Optimization.” Proceedings of the IEEE 104 (1): 148–75.

Shewry, M, and H Wynn. 1987. “Maximum Entropy Sampling.” Journal of Applied Statistics 14 (2): 165–70.

Shmueli, G. 2010. “To Explain or to Predict?” Statistical Science 25 (3): 289–310.

Symons, S, and RG Fulcher. 1988. “Determination of Wheat Kernel Morphological Variation by Digital Image Analysis: I. Variation in Eastern Canadian Milling Quality Wheats.” Journal of Cereal Science 8 (3): 211–18.

Thomas, R, and D Uminsky. 2020. The Problem with Metrics Is a Fundamental Problem for AI. https://arxiv.org/abs/2002.08512.

Tibshirani, Robert. 1996. “Regression Shrinkage and Selection via the Lasso.” Journal of the Royal Statistical Society. Series B (Methodological) 58 (1): 267–88. http://www.jstor.org/stable/2346178.

Van Laarhoven, P, and E Aarts. 1987. “Simulated Annealing.” In Simulated Annealing: Theory and Applications. Springer.

Wasserstein, R, and N Lazar. 2016. “The ASA Statement on p-Values: Context, Process, and Purpose.” The American Statistician 70 (2): 129–33.

Weinberger, K, A Dasgupta, J Langford, A Smola, and J Attenberg. 2009. “Feature Hashing for Large Scale Multitask Learning.” Proceedings of the 26th Annual International Conference on Machine Learning, 1113–20.

Wickham, H. 2019. Advanced R. 2nd ed. Chapman & Hall/CRC The R Series. Taylor & Francis. https://doi.org/10.1201/9781351201315.

Wickham, H, M Averick, J Bryan, et al. 2019. “Welcome to the Tidyverse.” Journal of Open Source Software 4 (43).

Wickham, H, and G Grolemund. 2016. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media, Inc.

Wolpert, D. 1992. “Stacked Generalization.” Neural Networks 5 (2): 241–59.

Wu, X, and Z Zhou. 2017. “A Unified View of Multi-Label Performance Measures.” International Conference on Machine Learning, 3780–88.

Wundervald, B, A Parnell, and K Domijan. 2020. Generalizing Gain Penalization for Feature Selection in Tree-Based Models. https://arxiv.org/abs/2006.07515.

Xu, Q, and Y Liang. 2001. “Monte Carlo Cross Validation.” Chemometrics and Intelligent Laboratory Systems 56 (1): 1–11.

Yeo, I-K, and R Johnson. 2000. “A New Family of Power Transformations to Improve Normality or Symmetry.” Biometrika 87 (4): 954–59.

Zeileis, A, C Kleiber, and S Jackman. 2008. “Regression Models for Count Data in R.” Journal of Statistical Software 27 (8): 1–25. https://www.jstatsoft.org/v027/i08.

Zumel, Nina, and John Mount. 2019. vtreat: A data.frame Processor for Predictive Modeling. http://arxiv.org/abs/1611.09477.