References
Abid, A., A. Abdalla, A. Ali, D. Khan, A. Alfozan, and J. Zou. 2022. Gradio: Hassle-Free Sharing and Testing of ML Models in the Wild. https://www.gradio.app/docs/.
Aggarwal, C. C. 2018. Machine Learning for Text. Springer.
Alam, S., L. Bălan, N. L. Chan, G. Comym, Y. Dada, I. Danov, L. Hoang, et al. 2022. Kedro. https://github.com/kedro-org/kedro.
Alnæs, M. S., and Project Jupyter. 2022. nbdime – Diffing and Merging of Jupyter Notebooks. https://nbdime.readthedocs.io.
Alquraan, A., H. Takruri, M. Alfatafta, and S. Al-Kiswany. 2018. “An Analysis of Network-Partitioning Failures in Cloud Systems.” In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), 51–68.
Altair. 2022. Altair: Declarative Visualization in Python. https://altair-viz.github.io/.
Amazon. 2021. Dynamic A/B Testing for Machine Learning Models with Amazon SageMaker MLOps Projects. https://aws.amazon.com/blogs/machine-learning/dynamic-a-b-testing-for-machine-learning-models-with-amazon-sagemaker-mlops-projects/.
Amazon. 2022a. Amazon Redshift Documentation. https://docs.aws.amazon.com/redshift/index.html.
———. 2022b. Amazon SageMaker Examples. https://sagemaker-examples.readthedocs.io/en/latest/sagemaker_model_monitor/index.html.
———. 2022c. AWS Cloud9 Documentation. https://docs.aws.amazon.com/cloud9.
———. 2022d. Machine Learning: Amazon Sagemaker. https://aws.amazon.com/sagemaker/.
Amazon Web Services. 2022a. Amazon Elastic Kubernetes Service Documentation. https://docs.aws.amazon.com/eks.
———. 2022b. Amazon Machine Images (AMI). https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AMIs.html.
———. 2022c. AWS Trainium. https://aws.amazon.com/machine-learning/trainium/.
Anaconda. 2022a. Conda for Data Scientists. https://docs.conda.io/projects/conda/en/latest/user-guide/concepts/data-science.html.
———. 2022b. Package, Dependency and Environment Management for Any Language. https://docs.conda.io.
Anderson, E., Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, et al. 1999. LAPACK Users’ Guide. 3rd ed. SIAM.
Ansible Project. 2022. Ansible Documentation. https://docs.ansible.com/ansible/latest/index.html.
Apache Software Foundation. 2022a. Celery Executor. https://airflow.apache.org/docs/apache-airflow/stable/executor/celery.html.
———. 2022b. Impala Documentation. https://impala.apache.org/impala-docs.html.
Apple. 2022. TensorFlow 2 Conversion. https://coremltools.readme.io/docs/tensorflow-2.
Aquasecurity. 2022. Trivy Documentation. https://aquasecurity.github.io/trivy/.
Argo Project. 2022. Argo Workflow Documentation. https://argoproj.github.io/argo-workflows.
Arisholm, E., H. Gallis, T. Dybå, and D. I. K. Sjøberg. 2007. “Evaluating Pair Programming with Respect to System Complexity and Programmer Expertise.” IEEE Transactions on Software Engineering 33 (2): 65–86.
Arpteg, A., B. Brinne, L. Crnkovic-Friis, and J. Bosch. 2018. “Software Engineering Challenges of Deep Learning.” In Euromicro Conference on Software Engineering and Advanced Applications, 50–59. IEEE.
ArXiv. 2022. arXiv API Access. https://arxiv.org/help/api.
Atom. 2022. A hackable text editor for the 21st Century. https://atom.io/.
Ayer, A. 2022. git-crypt: Transparent File Encryption in Git. https://github.com/AGWA/git-crypt.
Batchelder, N., et al. 2022. A Static Type Analyzer for Python Code. https://google.github.io/pytype.
Bates, D., and M. Maechler. 2021. Matrix: Sparse and Dense Matrix Classes and Methods. https://cran.r-project.org/web/packages/Matrix/.
BBC. 2018. Amazon Scrapped “Sexist AI” Tool. https://www.bbc.com/news/technology-45809919.
———. 2021a. Facebook Apology as AI Labels Black Men “Primates”. https://www.bbc.com/news/technology-58462511.
———. 2021b. Twitter Finds Racial Bias in Image-Cropping AI. https://www.bbc.com/news/technology-57192898.
Beam, A. L., A. K. Manrai, and M. Ghassemi. 2020. “Challenges to the Reproducibility of Machine Learning Models in Health Care.” Journal of the American Medical Association 323 (4): 305–6.
Beck, K. 2002. Test-Driven Development by Example. Addison-Wesley.
Beck, K., M. Beedle, A. Van Bennekum, A. Cockburn, W. Cunningham, M. Fowler, J. Grenning, et al. 2001. The Agile Manifesto. https://www.agilealliance.org/wp-content/uploads/2019/09/agile-manifesto-download-2019.pdf.
BentoML. 2022. Unified Model Serving Framework. https://docs.bentoml.org/en/latest/.
Bezanson, J., S. Karpinski, V. B. Shah, et al. 2022. Style Guide: The Julia Language. https://docs.julialang.org/en/v1/manual/style-guide/index.html.
Bhupinder, K., M. Dugré, A. Hanna, and T. Glatard. 2021. “An Analysis of Security Vulnerabilities in Container Images for Scientific Data Analysis.” GigaScience 10 (6): giab025.
Bishop, C. M. 1995. “Training with Noise is Equivalent to Tikhonov Regularization.” Neural Computation 7 (1): 108–16.
Blackford, L. S., J. Demmel, J. Dongarra, I. Duff, S. Hammarling, G. Henry, M. Heroux, et al. 2002. “An Updated Set of Basic Linear Algebra Subprograms (BLAS).” ACM Transactions on Mathematical Software 28 (2): 135–51.
Blagotic, A., D. Valle-Jones, J. Breen, J. Lundborg, J. M. White, J. Bode, K. White, et al. 2021. ProjectTemplate: Automates the Creation of New Statistical Analysis Projects. https://cran.r-project.org/web/packages/ProjectTemplate/.
Blei, D. M., A. Kucukelbir, and J. D. McAuliffe. 2017. “Variational Inference: A Review for Statisticians.” Journal of the American Statistical Association 112 (518): 859–77.
Blischak, J. D., P. Carbonetto, and M. Stephens. 2022. workflowr: A Framework for Reproducible and Collaborative Data Science. https://cran.r-project.org/web/packages/workflowr.
Bogner, J., R. Verdecchia, and I. Gerostathopoulos. 2021. “Characterizing Technical Debt and Antipatterns in AI-Based Systems: A Systematic Mapping Study.” In 2021 IEEE/ACM International Conference on Technical Debt (TechDebt), 64–73.
Bokeh. 2022. Bokeh Documentation. https://docs.bokeh.org/en/latest/.
Bonawitz, K., H. Eichner, W. Grieskamp, D. Huba, A. Ingerman, V. Ivanov, C. Kiddon, et al. 2019. “Towards Federated Learning at Scale: System Design.” In Proceedings of Machine Learning and Systems, 374–88.
Braiek, H. B., and F. Khomh. 2020. “On Testing Machine Learning Programs.” Journal of Systems and Software 164: 110542.
Brandl, G., and the Sphinx Team. 2022. Sphinx: Python Documentation Generator. https://www.sphinx-doc.org/en/master/.
Brass, P. 2008. Advanced Data Structures. Cambridge University Press.
Breck, E., S. Cai, E. Nielsen, M. Salib, and D. Sculley. 2017. “The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction.” In IEEE International Conference on Big Data, 1123–32.
Breiman, L. 1996. Out-of-Bag Estimation. https://www.stat.berkeley.edu/pub/users/breiman/OOBestimation.pdf.
———. 2001a. “Random Forests.” Machine Learning 45 (1): 5–32.
———. 2001b. “Statistical Modeling: The Two Cultures.” Statistical Science 16 (3): 199–231.
Callon, R. 1996. The Twelve Networking Truths. https://rfc-editor.org/rfc/rfc1925.txt.
Canonical. 2022a. Cloud-Init Documentation. https://cloudinit.readthedocs.io/en/latest.
———. 2022b. MicroK8s Documentation. https://microk8s.io/docs.
Carpenter, B., A. Gelman, M. D. Hoffman, D. Lee, B. Goodrich, M. Betancourt, M. Brubaker, J. Guo, P. Li, and A. Riddell. 2017. “Stan: A Probabilistic Programming Language.” Journal of Statistical Software 76 (1): 1–32.
Cass, S. 2019. “Taking AI to the Edge: Google’s TPU Now Comes in a Maker-Friendly Package.” IEEE Spectrum 56 (5): 16–17.
Castillo, E., J. M. Gutiérrez, and A. S. Hadi. 1997. Expert Systems and Probabilistic Network Models. Springer.
Chang, A. C., and P. Li. 2015. “Is Economics Research Replicable? Sixty Published Papers from Thirteen Journals Say ‘Usually Not’.” In Federal Reserve Board Finance and Economics Discussion Paper, 083.
Chang, W., J. Cheng, J. J. Allaire, C. Sievert, B. Schloerke, Y. Xie, J. Allen, J. McPherson, A. Dipert, and B. Borges. 2022. shiny: Web Application Framework for R. https://cran.r-project.org/web/packages/shiny.
Cheney, J., L. Chiticariu, and W.-C. Tan. 2009. “Provenance in Databases: Why, How and Where.” Foundations and Trends in Databases 1 (4): 379–474.
Clements, P., F. Bachmann, L. Bass, D. Garlan, J. Ivers, R. Little, P. Merson, R. Nord, and J. Stafford. 2011. Documenting Software Architectures: Views and Beyond. 2nd ed. Addison-Wesley.
Cloudera. 2022. Cloudera: The Hybrid Data Company. https://www.cloudera.com/.
Cohen, V., A. Gokaslan, E. Pavlick, and S. Tellex. 2019. OpenGPT-2: We Replicated GPT-2 Because You Can Too. https://blog.usejournal.com/opengpt-2-we-replicated-gpt-2-because-you-can-too-45e34e6d36dc.
Comet. 2022. Comet Documentation. https://www.comet.com/docs/v2.
Cormen, T. H. 2013. Algorithms Unlocked. The MIT Press.
Cortes-Ortuno, D., O. Laslett, T. Kluyver, V. Fauske, M. Albert, MinRK, O. Hovorka, and H. Fangohr. 2022. IPython Notebook Validation for py.test: Documentation. https://nbval.readthedocs.io.
CRAN Team. 2022. The Comprehensive R Archive Network. https://cran.r-project.org/.
Crook, J., and J. Banasik. 2004. “Does Reject Inference Really Improve the Performance of Application Scoring Models?” Journal of Banking and Finance 28: 857–74.
Crosley, T. 2022. A Python Utility and Library to Sort Imports. https://pycqa.github.io/isort/.
Cunningham, W. 1992. “The WyCash Portfolio Management System.” In Addendum to the Proceedings of ACM Object-Oriented Programming, Systems, Languages & Applications Conference, 29–30.
———. 2011. Ward Explains the Debt Metaphor. https://wiki.c2.com/?WardExplainsDebtMetaphor.
DagsHub. 2022. Welcome to the DagsHub Docs. https://dagshub.com/docs/.
Databricks. 2022. Databricks Documentation. https://docs.databricks.com/applications/machine-learning/index.html.
de Lima Salge, C. A., and N. Berente. 2016. “Pair Programming vs. Solo Programming: What Do We Know After 15 Years of Research?” In Proceedings of the Annual Hawaii International Conference on System Sciences, 5398–5406.
Devlin, J., M.-W. Chang, K. Lee, and K. Toutanova. 2019. “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.” In Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT), 4171–86.
Dimakopoulou, M., Z. Zhou, S. Athey, and G. Imbens. 2018. Estimation Considerations in Contextual Bandits. https://arxiv.org/abs/1711.07077.
———. 2019. “Balanced Linear Contextual Bandits.” In Proceedings of the AAAI Conference on Artificial Intelligence, 3445–53.
DMTF. 2022. Open Virtualization Format. https://www.dmtf.org/standards/ovf.
Docker. 2022a. Docker. https://www.docker.com/.
Docker. 2022b. Docker Registry HTTP API V2 Documentation. https://docs.docker.com/registry/spec/api/.
———. 2022c. Overview of Docker Compose. https://docs.docker.com/compose.
Dusserre, E., and M. Padró. 2017. “Bigger Does Not Mean Better! We Prefer Specificity.” In Proceedings of the 12th International Conference on Computational Semantics, 1–6.
Duvall, P. M., S. Matyas, and A. Glover. 2007. Continuous Integration: Improving Software Quality and Reducing Risk. Addison-Wesley.
Eclipse Che. 2022. Run your favorite IDE on Kubernetes. https://www.eclipse.org/che/technology/.
Eclipse Foundation. 2022a. Desktop IDEs. https://www.eclipse.org/ide/.
———. 2022b. Theia: Cloud & Desktop IDE. https://theia-ide.org/docs/.
Edmundson, A. 2021. The Rise (and Lessons Learned) of ML Models to Personalize Content on Home.
Elasticsearch. 2022. Free and Open Search: The Creators of Elasticsearch, ELK & Kibana. https://www.elastic.co/.
Elementl. 2022. Dagster Documentation. https://docs.dagster.io.
Espe, L., A. Jindal, V. Podolskiy, and M. Gerndt. 2020. “Performance Evaluation of Container Runtimes.” In Proceedings of the 10th International Conference on Cloud Computing and Services Science, 273–81.
Espeholt, L., H. Soyer, R. Munos, K. Simonyan, V. Mnih, T. Ward, Y. Doron, et al. 2018. “IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures.” In Proceedings of the 35th International Conference on Machine Learning (ICML), 1407–16.
IETF OAuth Working Group. 2022. OAuth 2.0. https://oauth.net/2/.
Evans, E. 2003. Domain-Driven Design: Tackling Complexity in the Heart of Software. Addison-Wesley.
Explosion. 2021. spaCy: Industrial-Strength Natural Language Processing. https://spacy.io/.
Feast Authors. 2022. Feast Documentation. https://docs.feast.dev/.
Fenniak, M. 2022. PyPDF2 Documentation. https://pypdf2.readthedocs.io/en/latest/.
Fernandez, A., S. Garcia, F. Herrera, and N. V. Chawla. 2018. “SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-Year Anniversary.” Journal of Artificial Intelligence Research 61: 863–905.
Firke, S., B. Denney, C. Haid, R. Knight, M. Grosser, and J. Zadra. 2022. janitor: Simple Tools for Examining and Cleaning Dirty Data. https://cran.r-project.org/web/packages/janitor.
Formagrid. 2022. Airtable Is a Modern Spreadsheet Platform with Database Functionalities. https://airtable.com.
Fortin, P., A. Fleury, F. Lemaire, and M. Monagan. 2021. “High-Performance SIMD Modular Arithmetic for Polynomial Evaluation.” Concurrency and Computation: Practice and Experience 33 (16): e6270.
Fowler, M. 2003. UML Distilled. 3rd ed. Addison-Wesley.
———. 2018. Refactoring: Improving the Design of Existing Code. 2nd ed. Addison-Wesley.
Galassi, M., J. Davies, J. Theiler, B. Gough, G. Jungman, P. Alken, M. Booth, F. Rossi, and R. Ulerich. 2021. GNU Scientific Library. https://www.gnu.org/software/gsl/doc/latex/gsl-ref.pdf.
Gama, J., I. Žliobaitė, A. Bifet, M. Pechenizkiy, and A. Bouchachia. 2014. “A Survey on Concept Drift Adaptation.” ACM Computing Surveys 46 (4): 44.
Ganiev, A., C. Chapin, A. Andrade, and C. Liu. 2021. “An Architecture for Accelerated Large-Scale Inference of Transformer-Based Language Models.” In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics, 163–69.
Gelman, A., B. Carlin, H. S. Stern, D. B. Dunson, and A. Vehtari. 2013. Bayesian Data Analysis. 3rd ed. CRC Press.
Ghahramani, Z. 2015. “Probabilistic Machine Learning and Artificial Intelligence.” Nature 521: 452–59.
GitHub. 2022a. GitHub Codespaces. https://github.com/features/codespaces.
———. 2022b. Storing Workflow Data as Artifacts. https://docs.github.com/en/actions/using-workflows/storing-workflow-data-as-artifacts.
———. 2022c. Working with the Container Registry. https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry.
Gitlab. 2022a. GitLab Runner Documentation. https://docs.gitlab.com/runner/.
———. 2022b. What Is GitOps? https://about.gitlab.com/topics/gitops.
GitLab. 2022a. GitLab Artifacts.
———. 2022b. GitLab Container Registry. https://docs.gitlab.com/ee/user/packages/container_registry/.
GitLab. 2022c. Group Direction: MLOps. https://about.gitlab.com/direction/modelops/mlops/.
Gitpod. 2022. Gitpod: Always Ready to Code. https://www.gitpod.io.
GNU Project. 2022. GNU Emacs. https://www.gnu.org/software/emacs/.
Gong, M., Y. Xie, K. Pan, and K. Feng. 2020. “A Survey on Differentially Private Machine Learning.” IEEE Computational Intelligence Magazine 15 (2): 49–64.
Goodfellow, I., Y. Bengio, and A. Courville. 2016. Deep Learning. MIT Press.
Goodfellow, I., J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. 2014. “Generative Adversarial Nets.” In Advances in Neural Information Processing Systems (NIPS), 2672–80.
Goodger, D., and G. van Rossum. 2022. PEP 257: Docstring Conventions. https://peps.python.org/pep-0257/#what-is-a-docstring.
Google. 2022a. BigQuery Documentation. https://cloud.google.com/bigquery/docs.
Google. 2022b. Deep Learning Containers. https://cloud.google.com/deep-learning-containers.
———. 2022c. Google Kubernetes Engine. https://cloud.google.com/kubernetes-engine/docs.
———. 2022d. Google Python Style Guide. https://google.github.io/styleguide/pyguide.html.
———. 2022e. repo: The Multiple Git Repository Tool. https://github.com/GerritCodeReview/git-repo.
———. 2022f. Vertex AI Documentation. https://cloud.google.com/vertex-ai/docs.
———. 2022g. Welcome to Colab! https://colab.research.google.com.
GrafanaLabs. 2022. Grafana: The Open Observability Platform. https://grafana.com/.
Grafana Labs. 2022. Grafana Loki Documentation. https://grafana.com/docs/loki/latest/.
Greenfeld, A. R. 2022. Cookiecutter Data Science. https://drivendata.github.io/cookiecutter-data-science/.
Gregg, B. 2021. Systems Performance: Enterprise and the Cloud. 2nd ed. Addison-Wesley.
Grotov, K., S. Titov, V. Sotnikov, Y. Golubev, and T. Bryksin. 2022. “A Large-Scale Comparison of Python Code in Jupyter Notebooks and Scripts.” In Proceedings of the 19th Working Conference on Mining Software Repositories, 1–12.
Groves, R. M., F. J. Fowler, M. P. Couper, J. M. Lepkowski, E. Singer, and R. Tourangeau. 2009. Survey Methodology. Wiley.
Hammant, P. 2020. Trunk Based Development. https://trunkbaseddevelopment.com/.
Hao, J., T. Jiang, W. Wang, and I. K. Kim. 2021. “An Empirical Analysis of VM Startup Times in Public IaaS Clouds: An Extended Report.” In Proceedings of the 14th IEEE International Conference on Cloud Computing, 398–403.
Harbor. 2022. Harbor Documentation. https://goharbor.io/docs/.
Harris, C. R., K. J. Millman, S. J. van der Walt, R. Gommers, P. Virtanen, D. Cournapeau, E. Wieser, et al. 2020. “Array Programming with NumPy.” Nature 585 (7825): 357–62.
Harvard Business Review. 2012. Data Scientist: The Sexiest Job of the 21st Century. https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century.
HashiCorp. 2022a. Packer Documentation. https://www.packer.io/docs.
HashiCorp. 2022b. Terraform Documentation. https://www.terraform.io/docs.
———. 2022c. Terraform Registry. https://registry.terraform.io.
———. 2022d. Vagrant Documentation. https://www.vagrantup.com/docs.
Hastie, T., R. Tibshirani, and J. Friedman. 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. Springer.
Hazelwood, K., S. Bird, D. Brooks, S. Chintala, U. Diril, D. Dzhulgakov, M. Fawzy, et al. 2018. “Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective.” In Proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA), 620–29.
He, X., K. Zhao, and X. Chu. 2021. “AutoML: A Survey of the State-of-the-Art.” Knowledge-Based Systems 212: 106622.
Hester, J., F. Angly, R. Hyde, M. Chirico, K. Ren, and A. Rosenstock. 2022. A Linter for R Code. https://cran.r-project.org/web/packages/lintr.
Holoviz. 2022. Panel User Guide. https://panel.holoviz.org/user_guide/index.html.
Hopsworks. 2022. Hopsworks Documentation. https://docs.hopsworks.ai.
Humble, J., and D. Farley. 2011. Continuous Delivery. Addison-Wesley.
Hunt, E. 2016. Tay, Microsoft’s AI Chatbot, Gets a Crash Course in Racism from Twitter. https://www.theguardian.com/technology/2016/mar/24/tay-microsofts-ai-chatbot-gets-a-crash-course-in-racism-from-twitter.
Hunter, J. D. 2022. Matplotlib API Reference. https://matplotlib.org/stable/api/index.
Intel. 2021. Intel oneAPI Math Kernel Library. https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onemkl.html.
Iterative. 2022a. CML Documentation. https://cml.dev/doc.
Iterative. 2022b. DVC: Data Version Control. Git for Data & Models. https://github.com/iterative/dvc.
———. 2022c. DVC Python API. https://dvc.org/doc/api-reference.
———. 2022d. MLEM Documentation. https://mlem.ai/doc.
Jacek, C., M. Greiler, C. Bird, L. Panjer, and T. Coatta. 2018. “CodeFlow: Improving the Code Review Process at Microsoft.” ACM Queue 16 (5): 1–20.
Jain, P., X. Mo, A. Jain, H. Subbaraj, R. Durrani, A. Tumanov, J. Gonzalez, and I. Stoica. 2018. “Dynamic Space-Time Scheduling for GPU Inference.” In Workshop on Systems for ML and Open Source Software, NeurIPS 2018, 1–9.
Jenkins. 2022a. A Command Line Tool to Run Jenkinsfile as a Function. https://github.com/jenkinsci/jenkinsfile-runner.
———. 2022b. Jenkins User Documentation. https://www.jenkins.io/doc/.
JetBrains. 2022a. IntelliJ IDEA. https://www.jetbrains.com/idea/.
———. 2022b. PyCharm. https://www.jetbrains.com/pycharm/.
Jouppi, N. P., D. H. Yoon, G. Kurian, S. Li, N. Patil, J. Laudon, C. Young, and D. Patterson. 2020. “A Domain-Specific Supercomputer for Training Deep Neural Networks.” Communications of the ACM 63 (7): 67–78.
Jouppi, N. P., C. Young, N. Patil, and D. Patterson. 2018. “A Domain-Specific Architecture for Deep Neural Networks.” Communications of the ACM 61 (9): 50–59.
JuliaLang. 2022. Pkg: Package Manager for the Julia Programming Language. https://pkgdocs.julialang.org/v1/.
Julia VS Code. 2022a. An Implementation of the Microsoft Language Server Protocol for the Julia Language. https://juliapackages.com/p/languageserver.
———. 2022b. Julia for Visual Studio Code. https://www.julia-vscode.org.
Kaji, N., and H. Kobayashi. 2017. “Incremental Skip-gram Model with Negative Sampling.” In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 363–71.
Kalnytskyi, I. 2022a. Poetry Documentation. https://python-poetry.org/docs.
———. 2022b. sphinxcontrib-openapi Is a Sphinx Extension to Generate APIs Docs from OpenAPI. https://sphinxcontrib-openapi.readthedocs.io.
———. 2022c. The Sphinx Extension that Renders OpenAPI Specs Using ReDoc. https://sphinxcontrib-redoc.readthedocs.io/en/stable.
Kanagawa, M., P. Hennig, D. Sejdinovic, and B. K. Sriperumbudur. 2018. Gaussian Processes and Kernel Methods: A Review on Connections and Equivalences. https://arxiv.org/abs/1807.02582.
Kang, S., R. Jin, X. Deng, and R. S. Kenett. 2021. “Challenges of Modeling and Analysis in Cybermanufacturing: A Review from a Machine Learning and Computation Perspective.” Journal of Intelligent Manufacturing Online first.
Katal, A., M. Wazid, and R. H. Goudar. 2013. “Big Data: Issues, Challenges, Tools and Good Practices.” In Proceedings of the International Conference on Contemporary Computing, 404–9.
Kenett, R. S., and T. C. Redman. 2019. The Real Work of Data Science. Wiley.
Kernighan, B. W., and R. Pike. 1999. The Practice of Programming. Addison-Wesley.
Khan, W. Z., E. Ahmed, S. Hakak, I. Yaqoob, and A. Ahmed. 2019. “Edge Computing: A Survey.” Future Generation Computer Systems 97: 219–35.
Kibirige, H. 2022. Plotnine API Reference. https://plotnine.readthedocs.io/en/stable/api.html.
Knuth, D. E. 1976. “Big Omicron and Big Omega and Big Theta.” ACM Sigact News 8 (2): 18–24.
———. 1997. The Art of Computer Programming, Volume 1: Fundamental Algorithms. 3rd ed. Addison-Wesley.
Krämer, S. 2022. Julia Autodoc. https://bastikr.github.io/sphinx-julia/juliaautodoc.html#julia-autodoc.
Kriasoft. 2016. Folder Structure Conventions. https://github.com/kriasoft/Folder-Structure-Conventions.
Kuhn, D. R., R. N. Kacker, and Y. Lei. 2013. Introduction to Combinatorial Testing. CRC Press.
Kuhn, M., and K. Johnson. 2013. Applied Predictive Modeling. Springer.
Lai, R., and K. Ren. 2022. An Implementation of the Language Server Protocol for R. https://cran.r-project.org/web/packages/languageserver.
Langa, Ł., et al. 2022. Black: The Uncompromising Code Formatter. https://black.readthedocs.io/en/stable/.
Li, J., X. Chen, E. Hovy, and D. Jurafsky. 2016. “Visualizing and Understanding Neural Models in NLP.” In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 681–91. Association for Computational Linguistics.
Li, Q., Z. Wen, Z. Wu, S. Hu, N. Wang, Y. Li, X. Liu, and B. He. 2021. “A Survey on Federated Learning Systems: Vision, Hype and Reality for Data Privacy and Protection.” IEEE Transactions on Knowledge and Data Engineering Advance publication.
Linardatos, P., V. Papastefanopoulos, and S. Kotsiantis. 2021. “Explainable AI: A Review of Machine Learning Interpretability Methods.” Entropy 23 (1): 18.
Linux Kernel Organization. 2022. The Linux Kernel Archives. https://kernel.org/.
Lipizzi, C., H. Behrooz, M. Dressman, A. G. Vishwakumar, and K. Batra. 2022. “Acquisition Research: Creating Synergy for Informed Change.” In Proceedings of the 19th Annual Acquisition Research Symposium, 242–55.
Lipizzi, C., D. Borrelli, and F. de Oliveira Capela. 2021. A Computational Model Implementing Subjectivity with the “Room Theory”: The case of Detecting Emotion from Text. https://arxiv.org/abs/2005.06059.
Logilab and PyCQA and contributors. 2022. Pylint is a Static Code Analyser for Python 2 or 3. https://pylint.pycqa.org/en/latest/.
Lohr, S. L. 2021. Sampling: Design and Analysis. 3rd ed. CRC Press.
Lopes, C. V. 2020. Exercises in Programming Style. CRC Press.
Loria, S., et al. 2022. A Pluggable API Specification Generator. https://apispec.readthedocs.io/en/latest.
Lundberg, S. M., and S.-I. Lee. 2017. “A Unified Approach to Interpreting Model Predictions.” In Advances in Neural Information Processing Systems (NIPS), 4765–74.
Lyttle, I., H. Jeppson, and Altair Developers. 2022. altair: Interface to Altair. https://cran.r-project.org/web/packages/altair.
Manohar, A. 2022. asdf Documentation. https://asdf-vm.com/guide/getting-started.html.
Marin, J.-M., and C. P. Robert. 2014. Bayesian Essentials with R. 2nd ed. Springer.
Martin, R. C. 2008. Clean Code. Prentice Hall.
McConnell, S. 2004. Code Complete. 2nd ed. Microsoft Press.
McKinney, W. 2017. Python for Data Analysis. 2nd ed. O’Reilly.
Mehrabi, N., F. Morstatter, N. Saxena, K. Lerman, and A. Galstyan. 2021. “A Survey on Bias and Fairness in Machine Learning.” ACM Computing Surveys 54 (6): 115.
Melançon, G., I. Dutour, and M. Bousquet-Mélou. 2001. “Random Generation of Directed Acyclic Graphs.” Electronic Notes in Discrete Mathematics 10: 202–7.
Meta Platforms. 2022a. A Performant Type-Checker for Python 3. https://pyre-check.org.
———. 2022b. React: A JavaScript Library for Building User Interfaces. https://reactjs.org/.
Microsoft. 2022a. A Performant, Feature-Rich Language Server for Python in VS Code. https://marketplace.visualstudio.com/items?itemName=ms-python.vscode-pylance.
Microsoft. 2022b. Azure Kubernetes Service (AKS). https://docs.microsoft.com/en-gb/azure/aks.
———. 2022c. Azure Machine Learning. https://azure.microsoft.com/en-us/services/machine-learning/.
———. 2022d. Code editing. Redefined. https://code.visualstudio.com/.
———. 2022e. Data Visualization: Microsoft PowerBI. https://powerbi.microsoft.com/.
———. 2022f. Language Server Protocol. https://microsoft.github.io/language-server-protocol.
———. 2022g. Shadow Testing. https://microsoft.github.io/code-with-engineering-playbook/automated-testing/shadow-testing/.
———. 2022h. Virtualization Documentation. https://docs.microsoft.com/en-us/virtualization/.
———. 2022i. Visual Studio Code: Code Editing, Redefined. https://code.visualstudio.com/.
———. 2022j. VS Code in the Web. https://vscode.dev.
Microsoft Research Cambridge. 2022. Project InnerEye–Democratizing Medical Imaging AI.
Miłkowski, M., W. M. Hensel, and M. Hohol. 2018. “Replicability or Reproducibility? On the Replication Crisis in Computational Neuroscience and Sharing Only Relevant Detail.” Journal of Computational Neuroscience 45: 163–72.
MinIO. 2022. MinIO Documentation. https://docs.min.io/docs.
Montgomery, D. C. 20AD. Design and Analysis of Experiments. 10th ed. Wiley.
Mood, C. 2010. “Logistic Regression: Why We Cannot Do What We Think We Can Do, and What We Can Do About It.” European Sociological Review 26 (1): 67–82.
Mujtaba, H. 2018. Samsung Powers NVIDIA Quadro RTX Graphics Cards with 16Gb GDDR6 Memory. https://wccftech.com/nvidia-quadro-rtx-turing-gpu-samsung-gddr6-memory/.
Müller, K., and L. Walthert. 2022a. Non-Invasive Pretty Printing of R Code. https://cran.r-project.org/web/packages/styler.
———. 2022b. Third-Party Integrations. https://styler.r-lib.org/articles/third-party-integrations.html.
Muth, C., Z. Oravecz, and J. Gabry. 2018. “User-Friendly Bayesian Regression Modeling: A Tutorial with rstanarm and shinystan.” The Quantitative Methods for Psychology 14 (2): 99–119.
Myers, G. J., T. Badgett, and C. Sandler. 2012. The Art of Software Testing. 3rd ed. Wiley.
Narayanan, A., and V. Shmatikov. 2008. “Robust De-Anonymization of Large Sparse Datasets.” In Proceedings of the IEEE Symposium on Security and Privacy, 111–25.
Natekin, A., and A. Knoll. 2013. “Gradient Boosting Machines, a Tutorial.” Frontiers in Neurorobotics 7 (21): 1–21.
Nature. 2016. “Reality Check on Reproducibility.” Nature 533: 437.
nbQA Team. 2022. Run isort, pyupgrade, mypy, pylint, flake8, and More on Jupyter Notebooks. https://github.com/nbQA-dev/nbQA.
Nektos. 2022. Run Your GitHub Actions Locally. https://github.com/nektos/act.
Neovim. 2022. Hyperextensible Vim-Based Text Editor. https://neovim.io/.
Neptune Labs. 2022. Neptune Documentation. https://docs.neptune.ai/.
Newman, S. 2021. Building Microservices: Designing Fine-Grained Systems. O’Reilly.
NLTK Team. 2021. NLTK: A Natural Language Toolkit. https://www.nltk.org/.
Nteract Team. 2022a. Papermill Is a Tool for Parameterizing and Executing Jupyter Notebooks. https://papermill.readthedocs.io.
———. 2022b. Testbook. https://testbook.readthedocs.io/en/latest/.
Nvidia. 2018. Nvidia Turing GPU Architecture: Graphics Reinvented. https://images.nvidia.com/aem-dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf.
———. 2021. CUDA Toolkit Documentation. https://docs.nvidia.com/cuda/.
Oanda. 2018. A C++ Fixed Point Math Library Suitable for Financial Applications. https://github.com/oanda/libfixed.
Odena, A., C. Olsson, D. Andersen, and I. Goodfellow. 2019. “TensorFuzz: Debugging Neural Networks with Coverage-Guided Fuzzing.” Proceedings of Machine Learning Research (ICML 2018) 97: 4901–11.
ONNX. 2021. Open Neural Network Exchange. https://github.com/onnx/onnx.
Open Container Initiative. 2022. Open Container Initiative. https://opencontainers.org/.
OpenRefine. 2022. A Free, Open Source, Powerful Tool for Working with Messy Data. https://openrefine.org.
Open Virtualization Alliance. 2022. Documents. https://www.linux-kvm.org/page/Documents.
Oracle. 2022. Oracle VM Virtualbox. https://www.virtualbox.org/.
Ousterhout, J. 2018. A Philosophy of Software Design. Yaknyam Press.
Overton, M. L. 2001. Numerical Computing with IEEE Floating Point Arithmetic. SIAM.
Pachyderm. 2022. Data-Centric Pipelines and Data Versioning. https://docs.pachyderm.com/latest.
PagerDuty. 2022. PagerDuty: Uptime Is Money. https://www.pagerduty.com/.
Palantir. 2022. Python Language Server. https://github.com/palantir/python-language-server.
Pallets Team. 2022. Flask Documentation. https://flask.palletsprojects.com/en/latest.
Papernot, N., P. McDaniel, A. Sinha, and M. P. Wellman. 2018. “SoK: Security and Privacy in Machine Learning.” In Proceedings of the IEEE European Symposium on Security and Privacy, 399–414.
Paszke, A., S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, et al. 2019. “PyTorch: An Imperative Style, High-Performance Deep Learning Library.” In Advances in Neural Information Processing Systems (NIPS), 32:8026–37.
Pennington, J., R. Socher, and C. Manning. 2014. “GloVe: Global Vectors for Word Representation.” In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1532–43.
Pineau, J., P. Vincent-Lamarre, K. Sinha, V. Larivière, A. Beygelzimer, F. d’Alché-Buc, E. Fox, and H. Larochelle. 2021. “Improving Reproducibility in Machine Learning Research (A Report from the NeurIPS 2019 Reproducibility Program).” Journal of Machine Learning Research 22: 1–20.
Plotly. 2022a. Analytical Web Apps for Python, R, Julia, and Jupyter. No JavaScript Required. https://github.com/plotly/dash.
———. 2022b. Dash Python User Guide. https://dash.plotly.com/.
———. 2022c. Plotly Open Source Graphing Library for Python. https://plotly.com/python/.
Polyaxon. 2022. Polyaxon Documentation. https://github.com/polyaxon/polyaxon.
Popejoy, A. B., and S. M. Fullerton. 2016. “Genomics Is Failing on Diversity.” Nature 538: 161–64.
Popescu, M. 2019. Pair Programming Explained. https://shopify.engineering/pair-programming-explained.
Poseidon Laboratories. 2022. Typhoon Documentation. https://typhoon.psdn.io/#documentation.
Potvin, R., and J. Levenberg. 2016. “Why Google Stores Billions of Lines of Code in a Single Repository.” Communications of the ACM 59 (7): 78–87.
Prefect. 2022. Prefect 2.0 Documentation. https://docs.prefect.io.
Preston-Werner, T. 2022. Semantic Versioning. https://semver.org/.
Prinz, F., T. Schlange, and K. Asadullah. 2011. “Believe It or Not: How Much Can We Rely on Published Data on Potential Drug Targets?” Nature Reviews Drug Discovery 10: 712.
Progress Software. 2022. Chef Documentation. https://docs.chef.io.
Project Jupyter. 2022. Jupyter. https://jupyter.org/.
Prometheus Authors, and The Linux Foundation. 2022. Prometheus: Monitoring System and Time Series Databases. https://prometheus.io/.
Puppet. 2022. Puppet Documentation. https://puppet.com/docs.
Pyright. 2022. Static Type Checker for Python. https://github.com/microsoft/pyright.
Python Packaging Authority. 2022. Building and Distributing Packages with Setuptools. https://setuptools.pypa.io/en/latest/userguide/index.html.
Python Packaging Authority (PyPA). 2022. Virtualenv Documentation. https://virtualenv.pypa.io/en/latest/.
Python Software Foundation. 2022a. PyPI: The Python Package Index. https://pypi.org/.
———. 2022b. Test Interactive Python Examples. https://docs.python.org/3/library/doctest.html.
———. 2022c. unittest: Unit Testing Framework. https://docs.python.org/3/library/unittest.html.
QS Quacquarelli Symonds. 2022. QS World University Rankings. https://www.topuniversities.com/qs-world-university-rankings.
Quest, K. 2022. Standard Go Project Layout. https://github.com/golang-standards/project-layout.
Radford, A., J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever. 2019. Language Models Are Unsupervised Multitask Learners. https://openai.com/blog/better-language-models/.
Ramírez, S. 2022. FastAPI Framework, High Performance, Easy to Learn, Fast to Code, Ready for Production. https://fastapi.tiangolo.com.
Ranganathan, P., C. S. Pramesh, and R. Aggarwal. 2017. “Common Pitfalls in Statistical Analysis: Logistic Regression.” Perspectives in Clinical Research 8 (3): 148–51.
Rasmussen, C. E., and C. K. I. Williams. 2006. Gaussian Processes for Machine Learning. MIT Press.
Rathgeber, F. 2022. Strip Output from Jupyter and IPython Notebooks. https://github.com/kynan/nbstripout.
Read the Docs. 2022. Read the Docs: Documentation Simplified. https://docs.readthedocs.io.
REditorSupport. 2022. R in Visual Studio Code. https://marketplace.visualstudio.com/items?itemName=REditorSupport.r.
Řehůřek, R., and P. Sojka. 2022a. Gensim Documentation. https://radimrehurek.com/gensim/auto_examples/index.html.
———. 2022b. Gensim Documentation. https://radimrehurek.com/gensim/models/word2vec.html.
Reitz, K., and Python Packaging Authority (PyPA). 2022. Pipenv: Python Dev Workflow for Humans. https://pipenv.pypa.io.
Reuther, A., P. Michaleas, M. Jones, V. Gadepally, S. Samsi, and J. Kepner. 2020. “Survey of Machine Learning Accelerators.” In Proceedings of the 2020 IEEE High Performance Extreme Computing Conference (HPEC), 1–12.
Ribeiro, M. T., S. Singh, and C. Guestrin. 2016. “Why Should I Trust You? Explaining the Predictions of Any Classifier.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–44. ACM.
Rice, L. 2020. Container Security: Fundamental Technology Concepts that Protect Containerized Applications. O’Reilly.
Rigby, P., and C. Bird. 2013. “Convergent Contemporary Software Peer Review Practices.” In Proceedings of the 9th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, 202–12.
Rong, X. 2014. “Word2vec Parameter Learning Explained.” arXiv Preprint arXiv:1411.2738.
Royce, W. W. 1987. “Managing the Development of Large Software Systems: Concepts and Techniques.” In Proceedings of the 9th International Conference on Software Engineering, 328–38.
RStudio. 2022a. Open Source and Enterprise-Ready Professional Software for Data Science. https://www.rstudio.com.
———. 2022b. RStudio Server. https://www.rstudio.com/products/rstudio/#rstudio-server.
Rump, S. M. 2020a. “Addendum to ’On Recurrences Converging to the Wrong Limit in Finite Precision’.” Electronic Transactions on Numerical Analysis 52: 571–75.
———. 2020b. “On Recurrences Converging to the Wrong Limit in Finite Precision.” Electronic Transactions on Numerical Analysis 52: 358–69.
Russell, S. J., and P. Norvig. 2009. Artificial Intelligence: A Modern Approach. 3rd ed. Prentice Hall.
Sadowski, C., E. Söderberg, L. Church, M. Sipko, and A. Bacchelli. 2018. “Modern Code Review: A Case Study at Google.” In Proceedings of the 40th International Conference on Software Engineering: Software Engineering in Practice, 181–90.
Saltz, J. S., and I. Shamshurin. 2017. “Does Pair Programming Work in a Data Science Context? An Initial Case Study.” In Proceedings of the IEEE International Conference on Big Data, 2348–54.
Santner, T. J., B. J. Williams, and E. I. Notz. 2018. The Design and Analysis of Computer Experiments. 2nd ed. Springer.
Satyanarayan, A., D. Moritz, K. Wongsuphasawat, and J. Heer. 2022. A High-Level Grammar of Interactive Graphics. https://vega.github.io/vega-lite/docs/.
Schubert, E., J. Sander, M. Ester, H. P. Kriegel, and X. Xu. 2017. “DBSCAN Revisited, Revisited: Why and How You Should (Still) Use DBSCAN.” ACM Transactions on Database Systems 42 (3): 19.
Scikit-learn Developers. 2022. Scikit-learn: Machine Learning in Python. https://scikit-learn.org/.
Sculley, D., G. Holt, D. Golovin, E. Davydov, T. Phillips, D. Ebner, V. Chaudhary, and M. Young. 2014. “Machine Learning: The High Interest Credit Card of Technical Debt.” In SE4ML: Software Engineering for Machine Learning (NIPS 2014 Workshop).
Sculley, D., G. Holt, D. Golovin, E. Davydov, T. Phillips, D. Ebner, V. Chaudhary, M. Young, J.-F. Crespo, and D. Dennison. 2015. “Hidden Technical Debt in Machine Learning Systems.” In Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS), 2:2503–11.
Scutari, M., and J.-B. Denis. 2021. Bayesian Networks with Examples in R. 2nd ed. Chapman & Hall.
Seldon Technologies. 2022. Seldon Core. https://docs.seldon.io/projects/seldon-core/en/latest/.
Amazon Web Services. 2022. AWS Deep Learning Containers. https://aws.amazon.com/en/machine-learning/containers/.
Seven, D. 2014. Knightmare: A DevOps Cautionary Tale. https://dougseven.com/2014/04/17/knightmare-a-devops-cautionary-tale/.
Shelton, K. 2017. The Value of Search Results Rankings. https://www.forbes.com/sites/forbesagencycouncil/2017/10/30/the-value-of-search-results-rankings/.
Sherman, E. 2022. What Zillow’s Failed Algorithm Means for the Future of Data Science. https://fortune.com/education/business/articles/2022/02/01/what-zillows-failed-algorithm-means-for-the-future-of-data-science/.
Shinyama, Y., P. Guglielmetti, and P. Marsman. 2022. Pdfminer.six’s Documentation. https://pdfminersix.readthedocs.io/en/latest/.
Shiraishi, M., H. Washizaki, Y. Fukazawa, and J. Yoder. 2019. “Mob Programming: A Systematic Literature Review.” In Proceedings of the IEEE 43rd Annual Computer Software and Applications Conference, 616–21.
SIGHUP. 2022. Kubernetes Fury Distribution. https://docs.kubernetesfury.com/docs/distribution/.
Silverlake Software. 2022. Velocity: The Documentation and Docset Viewer for Windows. https://velocity.silverlakesoftware.com/.
Simmons, A. J., S. Barnett, J. Rivera-Villicana, A. Bajaj, and R. Vasa. 2020. “A Large-Scale Comparative Analysis of Coding Standard Conformance in Open-Source Data Science Projects.” In Proceedings of the 14th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), 1–11.
Simonyan, K., A. Vedaldi, and A. Zisserman. 2014. “Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps.” In Proceedings of the 2nd International Conference on Learning Representations (ICLR), Workshop Track.
SmartBear Software. 2021. OpenAPI Specification. https://swagger.io/specification/.
Snowflake. 2022. Snowflake Documentation. https://docs.snowflake.com.
Sonatype. 2022. Nexus Repository Manager. https://www.sonatype.com/products/nexus-repository.
Spotify. 2022a. Luigi Documentation. https://luigi.readthedocs.io/en/stable/.
———. 2022b. Spotify Engineering Blog. https://engineering.atspotify.com/.
Stapleton Cordasco, I. 2022. Flake8: Your Tool for Style Guide Enforcement. https://flake8.pycqa.org/en/latest/.
Stevens, J. R. 2017. “Replicability and Reproducibility in Comparative Psychology.” Frontiers in Psychology 8: 862.
Streamlit. 2022. Streamlit Documentation. https://docs.streamlit.io/.
Superconductive. 2022. Great Expectations. https://docs.greatexpectations.io/docs.
Swoboda, S. 2021. Connecting with Mob Programming. https://shopify.engineering/mob-programming.
Tableau. 2022. Execute Python Code on The Fly and Display Results in Tableau Visualizations. https://tableau.github.io/TabPy.
Tableau Software. 2022. Tableau. https://www.tableau.com/.
Tabuchi, A., A. Kasagi, M. Yamazaki, T. Honda, M. Miwa, T. Shiraishi, M. Kosaki, et al. 2019. “Extremely Accelerated Deep Learning: ResNet-50 Training in 70.4 Seconds.” https://sc19.supercomputing.org/proceedings/tech_poster/poster_files/rpost203s2-file3.pdf.
Tang, Y., R. Khatchadourian, M. Bagherzadeh, R. Singh, A. Stewart, and A. Raja. 2021. “An Empirical Study of Refactorings and Technical Debt in Machine Learning Systems.” In Proceedings of the 2021 IEEE/ACM 43rd International Conference on Software Engineering, 238–50.
Tatman, R., J. VanderPlas, and S. Dane. 2018. “A Practical Taxonomy of Reproducibility for Machine Learning Research.” In Proceedings of the 2nd Reproducibility in Machine Learning Workshop at ICML 2018.
TensorFlow. 2021. XLA: Optimizing Compiler for Machine Learning. https://www.tensorflow.org/xla.
TensorFlow. 2021a. TensorFlow. https://www.tensorflow.org/overview/.
———. 2021b. TensorFlow Extended (TFX). https://www.tensorflow.org/tfx/.
TensorFlow. 2022a. ML Metadata. https://www.tensorflow.org/tfx/guide/mlmd.
———. 2022b. Serving Models. https://www.tensorflow.org/tfx/guide/serving.
———. 2022c. TensorBoard: TensorFlow’s Visualization Toolkit. https://www.tensorflow.org/tensorboard.
———. 2022d. The TFX User Guide. https://www.tensorflow.org/tfx/guide.
The Apache Software Foundation. 2022a. Airflow Documentation. https://airflow.apache.org/docs/.
———. 2022b. Apache Beam Documentation. https://beam.apache.org/documentation/.
———. 2022c. Apache Hadoop. https://hadoop.apache.org/.
———. 2022d. Apache Hive Documentation. https://cwiki.apache.org/confluence/display/Hive.
———. 2022e. Apache Pig Documentation. https://pig.apache.org/docs/latest.
———. 2022f. Apache Spark Documentation. https://spark.apache.org/docs/latest.
The Containers Organization. 2022. podman. https://podman.io.
The Delta Lake Project Authors. 2022a. Delta Lake Documentation. https://docs.delta.io.
———. 2022b. Zeal Is an Offline Documentation Browser for Software Developers. https://zealdocs.org.
The Economist. 2017. The World’s Most Valuable Resource Is No Longer Oil, but Data. https://www.economist.com/leaders/2017/05/06/the-worlds-most-valuable-resource-is-no-longer-oil-but-data.
———. 2020. An Understanding of AI’s Limitations Is Starting to Sink In. https://www.economist.com/technology-quarterly/2020/06/11/an-understanding-of-ais-limitations-is-starting-to-sink-in.
The Fluentd Project. 2022. Fluentd: Open Source Data Collector. https://www.fluentd.org/.
The Git Development Team. 2022. Git Source Code Mirror. https://github.com/git/git.
The Hadolint Project. 2022. Hadolint: Haskell Dockerfile Linter Documentation. https://github.com/hadolint/hadolint.
The KServe Authors. 2022. KServe Control Plane. https://kserve.github.io/website/latest/modelserving/control_plane.
The Kubeflow Authors. 2022. All of Kubeflow Documentation. https://www.kubeflow.org/docs/.
The Kubernetes Authors. 2022a. Kubernetes. https://kubernetes.io/.
———. 2022b. Kubernetes Documentation: Schedule GPUs. https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/.
———. 2022c. minikube. https://minikube.sigs.k8s.io/docs.
The mypy Project. 2014. mypy: Optional Static Typing for Python. http://mypy-lang.org/.
The Register. 2020. Twilio: Someone Waltzed into Our Unsecured AWS S3 Silo, Added Dodgy Code to Our JavaScript SDK for Customers. https://www.theregister.com/2020/07/21/twilio_javascript_sdk_code_injection.
Thomas, D., and A. Hunt. 2019. The Pragmatic Programmer: Your Journey to Mastery. 20th Anniversary ed. Addison-Wesley.
Tian, Y., Y. Zhang, K.-J. Stol, L. Jiang, and H. Liu. 2022. “What Makes a Good Commit Message?” In Proceedings of the 44th International Conference on Software Engineering, 1–13.
Tornhill, A., and M. Borg. 2022. “Code Red: The Business Impact of Code Quality: A Quantitative Study of 39 Proprietary Production Codebases.” In Proceedings of International Conference on Technical Debt, 1–10.
Toro, A. L. 2020. Great Code Reviews–the Superpower Your Team Needs. https://shopify.engineering/great-code-reviews.
Tremel, E. 2017. Deployment Strategies on Kubernetes. https://www.cncf.io/wp-content/uploads/2020/08/CNCF-Presentation-Template-K8s-Deployment.pdf.
Trifacta. 2022. Profile, Prepare, and Pipeline Data for Analytics and Machine Learning. https://www.trifacta.com.
Tsay, R. S. 2010. Analysis of Financial Time Series. 3rd ed. Wiley.
Uber. 2022. Piranha: A Tool for Refactoring Code Related to Feature Flag APIs. https://github.com/uber/piranha.
Uber Technologies. 2022. Uber Engineering Blog. https://eng.uber.com/.
Unicode. 2021. Unicode Technical Documentation. https://www.unicode.org/main.html.
Ushey, K., J. J. Allaire, and Y. Tang. 2022. reticulate: Interface to Python. https://cran.r-project.org/web/packages/reticulate.
Ushey, K., J. McPherson, J. Cheng, A. Atkins, J. J. Allaire, and T. Allen. 2022. Packrat: Reproducible Package Management for R. https://rstudio.github.io/packrat/.
van der Schaar, M., A. M. Alaa, A. Floto, A. Gimson, S. Scholtes, A. Wood, E. McKinney, D. Jarrett, P. Liò, and A. Ercole. 2021. “How Artificial Intelligence and Machine Learning Can Help Healthcare Systems Respond to COVID-19.” Machine Learning 110: 1–14.
van Heesch, D. 2022. Doxygen. https://www.doxygen.nl/index.html.
van Oort, B., L. Cruz, M. Aniche, and A. van Deursen. 2021. “The Prevalence of Code Smells in Machine Learning Projects.” In Proceedings of the 2021 IEEE/ACM 1st Workshop on AI Engineering: Software Engineering for AI, 35–42.
van Rossum, G., B. Warsaw, and N. Coghlan. 2001. PEP 8: Style Guide for Python Code. https://peps.python.org/pep-0008/.
van Vliet, H. 2008. Software Engineering: Principles and Practice. Wiley.
Velero Authors. 2022a. Postman Documentation. https://learning.postman.com/docs.
———. 2022b. Velero Documentation. https://velero.io/docs.
Virtanen, P., R. Gommers, T. E. Oliphant, M. Haberland, T. Reddy, D. Cournapeau, E. Burovski, et al. 2020. “SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python.” Nature Methods 17: 261–72.
VMware. 2022. VMware vSphere Documentation. https://docs.vmware.com/en/VMware-vSphere/index.html.
VMware. 2022. VMware Workstation Pro. https://www.vmware.com/products/workstation-pro.html.
Voilà Dashboards. 2022. From Notebooks to Standalone Web Applications and Dashboards. https://voila.readthedocs.io/en/stable.
Volkov, V., and J. W. Demmel. 2008. “Benchmarking GPUs to Tune Dense Linear Algebra.” In Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, 1–11.
Voulodimos, A., N. Doulamis, A. Doulamis, and E. Protopapadakis. 2018. “Deep Learning for Computer Vision: A Brief Review.” Computational Intelligence and Neuroscience 2018 (7068349): 1–13.
VPNOverview. 2022. Fintech App Switch Leaks Users’ Transactions, Personal IDs. https://vpnoverview.com/news/fintech-app-switch-leaks-users-transactions-personal-ids.
Walters, M., and P. Lee Scott. 2021. meta-git: Manage Your Meta Repo and Child Git Repositories. https://www.npmjs.com/package/meta-git.
Waskom, M. 2022. Seaborn: Statistical Data Visualization. https://seaborn.pydata.org/.
Weights & Biases. 2022. Weights & Biases Documentation. https://docs.wandb.ai/.
Weisberg, S. 2014. Applied Linear Regression. 4th ed. Wiley.
Wickham, H. 2022a. ggplot2: Elegant Graphics for Data Analysis. https://cran.r-project.org/web/packages/ggplot2.
———. 2022b. The tidyverse Style Guide. https://style.tidyverse.org/.
Wickham, H., P. Danenberg, G. Csárdi, M. Eugster, and RStudio. 2022. roxygen2: In-Line Documentation for R.
Wickham, H., R. François, L. Henry, and K. Müller. 2022. A Fast, Consistent Tool for Working with Data Frame Like Objects, Both in Memory and Out of Memory. https://cloud.r-project.org/web/packages/dplyr.
Wickham, H., M. Girlich, and RStudio. 2022. tidyr: Tidy Messy Data. https://cloud.r-project.org/web/packages/tidyr.
Wickham, H., RStudio, and R Core Team. 2022. Unit Testing for R. https://cloud.r-project.org/web/packages/testthat.
Widgren, S., et al. 2022. git2r: Provides Access to Git Repositories. https://cran.r-project.org/web/packages/git2r/index.html.
Wiggins, A. 2017. The Twelve Factor App. https://12factor.net.
Wikipedia. 2021a. Cholesky Decomposition. https://en.wikipedia.org/wiki/Cholesky_decomposition.
———. 2021b. Matrix Multiplication Algorithm. https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm.
———. 2021c. QR Decomposition. https://en.wikipedia.org/wiki/QR_decomposition.
Wilkinson, L. 2005. The Grammar of Graphics. 2nd ed. Springer.
Williams, L., R. R. Kessler, and W. Cunningham. 2000. “Strengthening the Case for Pair Programming.” IEEE Software 17 (4): 19–25.
Williams-Young, D. B., and X. Li. 2019. On the Efficacy and High-Performance Implementation of Quaternion Matrix Multiplication. https://arxiv.org/abs/1903.05575.
Wu, X., L. Xiao, Y. Sun, J. Zhang, T. Ma, and L. He. 2022. “A Survey of Human-in-the-Loop for Machine Learning.” Future Generation Computer Systems 135: 364–81.
Xie, Y. 2015. Dynamic Documents with R and knitr. 2nd ed. CRC Press.
Xie, Y., J. J. Allaire, and G. Grolemund. 2022. R Markdown: The Definitive Guide. https://bookdown.org/yihui/rmarkdown/.
Xin, D., L. Ma, J. Liu, S. Song, and A. Parameswaran. 2018. “Accelerating Human-in-the-Loop Machine Learning: Challenges and Opportunities.” In Proceedings of the Second Workshop on Data Management for End-to-End Machine Learning, 1–4.
Yamashita, Y., S. Stephenson, et al. 2022. pyenv: Simple Python Version Management. https://github.com/pyenv/pyenv.
Yang, Z., Z. Dai, Y. Yang, J. Carbonell, R. R. Salakhutdinov, and Q. V. Le. 2019. “XLNet: Generalized Autoregressive Pretraining for Language Understanding.” In Advances in Neural Information Processing Systems (NeurIPS), 5753–63.
You, E. 2022. Vue.js: The Progressive JavaScript Framework. https://vuejs.org/.
Zaharia, M., and The Linux Foundation. 2022. MLflow Documentation. https://www.mlflow.org/docs/latest/index.html.
Zellers, R., A. Holtzman, H. Rashkin, Y. Bisk, A. Farhadi, F. Roesner, and Y. Choi. 2019. “Defending against Neural Fake News.” In Advances in Neural Information Processing Systems (NeurIPS), 9054–65.
Zelvenskiy, S., G. Harisinghani, T. Yu, E. Ng, and R. Wei. 2022. Project Radar: Intelligent Early-Fraud Detection. https://eng.uber.com/project-radar-intelligent-early-fraud-detection/.
Zhang, H., L. Cruz, and A. van Deursen. 2022. “Code Smells for Machine Learning Applications.” In Proceedings of the 1st International Conference on AI Engineering: Software Engineering for AI, 1–12.
Zhang, J. M., M. Harman, L. Ma, and Y. Liu. 2020. “Machine Learning Testing: Survey, Landscapes and Horizons.” IEEE Transactions on Software Engineering 48 (1): 1–36.
Zheng, A. 2015. Evaluating Machine Learning Models. O’Reilly.